prototype

Commit 8534b15c20
8 changed files with 3048 additions and 0 deletions

README.md (normal file, 167 lines)

@@ -0,0 +1,167 @@
# Nanobot SuperTonic Wisper Web

Standalone Python web project that:
- uses a local `supertonic_gateway` orchestration layer,
- uses a local `wisper` event bus,
- spawns `nanobot agent` in a pseudo-TTY (TUI behavior),
- streams TUI output to a browser chat page over WebSocket,
- supports WebRTC voice input/output with host-side STT/TTS processing.

This project is separate from the `nanobot` repository and only talks to Nanobot as an external command.

## 1) Setup

```bash
cd /home/kacper/nanobot-supertonic-wisper-web
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

## 2) Point to your Nanobot command

Default behavior:
- if `~/nanobot/.venv/bin/python` exists, the app uses:
  - `NANOBOT_COMMAND="~/nanobot/.venv/bin/python -m nanobot agent --no-markdown"`
  - `NANOBOT_WORKDIR="~/nanobot"`
- else if `~/nanobot/venv/bin/python` exists, the app uses:
  - `NANOBOT_COMMAND="~/nanobot/venv/bin/python -m nanobot agent --no-markdown"`
  - `NANOBOT_WORKDIR="~/nanobot"`
- otherwise it falls back to `nanobot agent --no-markdown` from `PATH`.
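
The resolution order above can be sketched in Python. This is a minimal illustration (the project's actual logic lives in `supertonic_gateway.py` / `start.sh` and may differ in details):

```python
from pathlib import Path

def resolve_nanobot_command(workdir: Path) -> str:
    """Pick the Nanobot command the way the defaults above describe:
    prefer .venv, then venv, else fall back to the PATH entry point."""
    for venv_dir in (".venv", "venv"):
        interpreter = workdir / venv_dir / "bin" / "python"
        if interpreter.exists():
            return f"{interpreter} -m nanobot agent --no-markdown"
    # Neither virtualenv exists: rely on `nanobot` being on PATH.
    return "nanobot agent --no-markdown"
```

The same order means a stray `venv/` is ignored whenever `.venv/` exists.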

Optional override (for any custom location):

```bash
export NANOBOT_COMMAND="/home/kacper/nanobot/venv/bin/python -m nanobot agent --no-markdown"
export NANOBOT_WORKDIR="/home/kacper/nanobot"
```

Optional TUI output filtering (reduces spinner/thinking/tool-stream noise in the web console):

```bash
export NANOBOT_SUPPRESS_NOISY_UI='1'
export NANOBOT_OUTPUT_DEDUP_WINDOW_S='1.5'
```

## 3) Run the web app

```bash
uvicorn app:app --reload --host 0.0.0.0 --port 8080
```

Open: `http://localhost:8080`
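
The page talks to the server over a single WebSocket at `/ws/chat`, which accepts small JSON messages (see `app.py` below). For illustration, the accepted shapes and the server's decode behavior look roughly like this (a sketch, not the server's actual code):

```python
import json

# Message shapes accepted by the /ws/chat endpoint in app.py.
spawn_msg = json.dumps({"type": "spawn"})                     # start the Nanobot TUI
stop_msg = json.dumps({"type": "stop"})                       # stop it again
ptt_msg = json.dumps({"type": "voice-ptt", "pressed": True})  # push-to-talk state

def decode(raw: str) -> dict:
    """Mirror the server's tolerance: invalid JSON yields a system error payload."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {"role": "system", "text": "Invalid JSON message.", "timestamp": ""}
```

Any WebSocket client can send these; the browser UI is just one such client.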
|
||||
Or use the helper script (recommended for voice on iOS Safari):
|
||||
|
||||
```bash
|
||||
./start.sh
|
||||
```
|
||||
|
||||
`start.sh` enables HTTPS by default (`ENABLE_HTTPS=1`), auto-generates a local self-signed cert at `.certs/local-cert.pem` and key at `.certs/local-key.pem`, and serves `https://localhost:8000`.
|
||||
For iPhone access by LAN IP, open `https://<your-lan-ip>:8000` and trust the certificate on the device.
|
||||
Set `ENABLE_HTTPS=0` to run plain HTTP.
|
||||
|
||||
## How it works
|
||||
|
||||
- Click **Spawn Nanobot TUI** to start the agent process in a PTY.
|
||||
- Type messages in the input and press Enter, or click **Connect Voice Channel** and hold **Push-to-Talk** while speaking.
|
||||
- The browser receives streamed PTY output and displays it live.
|
||||
- When **Host Voice Output** is enabled, Nanobot output is synthesized on the host and streamed back over WebRTC audio.
|
||||
- For isolated RTC/TTS debugging, connect voice and click **Play Voice Test Script** to synthesize a sample line directly over the same WebRTC output path.
|
||||
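
The "spawn in a PTY" step can be illustrated with the standard library. This is a simplified, blocking sketch assuming a POSIX host; the project's actual bridge in `supertonic_gateway.py` is asynchronous and keeps the process alive:

```python
import os
import pty
import subprocess

def run_in_pty(argv: list[str]) -> str:
    """Run a command attached to a pseudo-terminal and collect its output.

    Programs detect the PTY and produce TUI-style output, which is what
    gets streamed to the browser in this project.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True)
    os.close(slave_fd)  # only the child keeps the slave end open
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 4096)
        except OSError:  # Linux raises EIO once the child closes the PTY
            break
        if not data:
            break
        chunks.append(data)
    proc.wait()
    os.close(master_fd)
    return b"".join(chunks).decode(errors="replace")
```

A long-running bridge would forward each `os.read` chunk to subscribers instead of buffering.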

## Voice features

- Browser voice transport uses `RTCPeerConnection` + microphone capture (`getUserMedia`).
- Voice input is explicit push-to-talk (hold the button to capture, release to transcribe) instead of host-side silence segmentation.
- Optional test mode can echo each released push-to-talk segment back to the user over WebRTC output.
- Host receives raw audio and performs speech-to-text using:
  - `faster-whisper` directly by default (`HOST_STT_PROVIDER=faster-whisper`), or
  - `HOST_STT_COMMAND` (if `HOST_STT_PROVIDER=command`).
- Host performs text-to-speech using:
  - `supertonic` Python library by default (`HOST_TTS_PROVIDER=supertonic`), or
  - `HOST_TTS_COMMAND` (if `HOST_TTS_PROVIDER=command`), or
  - `espeak` (if available in `PATH`).
- Voice test mode sends a dedicated `voice-test-script` command over WebSocket and plays host TTS on the active WebRTC audio track (no Nanobot output required).
- If STT/TTS is not configured, text chat still works and system messages explain what is missing.
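
One coordination problem in this pipeline: while host TTS is playing, an open microphone can feed the synthesized speech straight back into STT. A minimal gate of the kind `HOST_STT_SUPPRESS_DURING_TTS` / `HOST_STT_SUPPRESS_MS_AFTER_TTS` suggest might look like this (a hypothetical sketch, not the actual implementation in `voice_rtc.py`):

```python
import time

class SttSuppressionGate:
    """Drop STT segments while TTS is playing and for a short tail afterwards."""

    def __init__(self, tail_ms: float = 300.0, clock=time.monotonic):
        self.tail_s = tail_ms / 1000.0
        self.clock = clock  # injectable for testing
        self._tts_active = False
        self._tts_ended_at = float("-inf")

    def tts_started(self) -> None:
        self._tts_active = True

    def tts_finished(self) -> None:
        self._tts_active = False
        self._tts_ended_at = self.clock()

    def should_transcribe(self) -> bool:
        if self._tts_active:
            return False
        # Reject segments that end inside the post-TTS suppression window.
        return (self.clock() - self._tts_ended_at) >= self.tail_s
```

The tail window absorbs audio that is still in flight through the WebRTC output path when TTS finishes.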

### Optional host voice configuration

If you use `./start.sh`, you can put these in `.env.voice` and they will be loaded automatically.

Default direct STT (faster-whisper):

```bash
export HOST_STT_PROVIDER='faster-whisper'
export HOST_STT_MODEL='base.en'
export HOST_STT_DEVICE='auto'
export HOST_STT_COMPUTE_TYPE='int8'
export HOST_STT_LANGUAGE='en'
export HOST_STT_BEAM_SIZE='2'
export HOST_STT_BEST_OF='2'
export HOST_STT_VAD_FILTER='0'
export HOST_STT_TEMPERATURE='0.0'
export HOST_STT_LOG_PROB_THRESHOLD='-1.0'
export HOST_STT_NO_SPEECH_THRESHOLD='0.6'
export HOST_STT_COMPRESSION_RATIO_THRESHOLD='2.4'
export HOST_STT_INITIAL_PROMPT='Transcribe brief spoken English precisely. Prefer common words over sound effects.'
export HOST_STT_MIN_PTT_MS='220'
export HOST_STT_MAX_PTT_MS='12000'
export HOST_STT_PTT_PLAYBACK_TEST='0'
export HOST_STT_SEGMENT_QUEUE_SIZE='2'
export HOST_STT_BACKLOG_NOTICE_INTERVAL_S='6.0'
export HOST_STT_SUPPRESS_DURING_TTS='1'
export HOST_STT_SUPPRESS_MS_AFTER_TTS='300'
```
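
On the host side, variables like these might be read into a typed config along the following lines (illustrative only; the field names below are a subset and the real parsing lives in `voice_rtc.py`):

```python
import os
from dataclasses import dataclass

@dataclass
class SttConfig:
    provider: str
    model: str
    beam_size: int
    vad_filter: bool
    no_speech_threshold: float

def load_stt_config(env=os.environ) -> SttConfig:
    """Read HOST_STT_* variables, falling back to the documented defaults."""
    return SttConfig(
        provider=env.get("HOST_STT_PROVIDER", "faster-whisper"),
        model=env.get("HOST_STT_MODEL", "base.en"),
        beam_size=int(env.get("HOST_STT_BEAM_SIZE", "2")),
        vad_filter=env.get("HOST_STT_VAD_FILTER", "0") == "1",
        no_speech_threshold=float(env.get("HOST_STT_NO_SPEECH_THRESHOLD", "0.6")),
    )
```

Passing `env` explicitly keeps the parsing testable without touching the process environment.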

Legacy compatibility: `HOST_STT_MIN_SEGMENT_MS` / `HOST_STT_MAX_SEGMENT_MS` are still read as fallbacks.

Note: the first run may download the selected Whisper model weights.

To use command-based STT instead:

```bash
export HOST_STT_PROVIDER='command'
export HOST_STT_COMMAND='whisper_cli --input {input_wav}'
```

Command contract:
- `{input_wav}` is replaced with a temporary WAV file path.
- The command must print the transcript text to stdout.
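
The host side of this contract amounts to substituting the placeholder and capturing stdout. A minimal sketch (a hypothetical helper; the project's real invocation may differ):

```python
import shlex
import subprocess

def transcribe_with_command(command_template: str, wav_path: str) -> str:
    """Run an STT command per the contract above:
    {input_wav} -> temporary WAV path, transcript text on stdout."""
    argv = [
        part.replace("{input_wav}", wav_path)
        for part in shlex.split(command_template)
    ]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Substituting after `shlex.split` keeps the WAV path as a single argument even if it contains spaces.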

Set TTS (optional; overrides the `espeak` fallback):

```bash
export HOST_TTS_PROVIDER='supertonic'
export SUPERTONIC_MODEL='supertonic-2'
export SUPERTONIC_VOICE_STYLE='M1'
export SUPERTONIC_LANG='en'
export SUPERTONIC_INTRA_OP_THREADS='1'
export SUPERTONIC_INTER_OP_THREADS='1'
export HOST_TTS_FLUSH_DELAY_S='0.45'
export HOST_TTS_SENTENCE_FLUSH_DELAY_S='0.15'
export HOST_TTS_MIN_CHARS='10'
export HOST_TTS_MAX_WAIT_MS='1800'
export HOST_TTS_MAX_CHUNK_CHARS='140'
export HOST_RTC_OUTBOUND_LEAD_IN_MS='120'
export HOST_RTC_OUTBOUND_IDLE_S='0.6'
```

To use command-based TTS instead:

```bash
export HOST_TTS_PROVIDER='command'
export HOST_TTS_COMMAND='my_tts --text {text} --out {output_wav}'
```

Command contract:
- `{text}` is replaced with the quoted text.
- `{output_wav}` is replaced with a temporary WAV output path.
- If `{output_wav}` is omitted, the command's stdout must be WAV bytes.
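
Both branches of this contract can be sketched in one helper (hypothetical code, not the project's actual implementation):

```python
import shlex
import subprocess
import tempfile
from pathlib import Path

def synthesize_with_command(command_template: str, text: str) -> bytes:
    """Run a TTS command per the contract above.

    {text} becomes the utterance; if {output_wav} appears, the command
    writes a WAV file there, otherwise its stdout must be WAV bytes.
    """
    with tempfile.TemporaryDirectory() as tmp:
        out_path = Path(tmp) / "tts-out.wav"
        argv = []
        for part in shlex.split(command_template):
            part = part.replace("{text}", text)
            part = part.replace("{output_wav}", str(out_path))
            argv.append(part)
        result = subprocess.run(argv, capture_output=True, check=True)
        if "{output_wav}" in command_template:
            return out_path.read_bytes()
        # No {output_wav} placeholder: stdout itself is the WAV payload.
        return result.stdout
```

Replacing `{text}` after splitting means spaces in the utterance never break the argument vector, which matches the "quoted text" wording above.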

## Files

- `app.py`: FastAPI app and WebSocket endpoint.
- `voice_rtc.py`: WebRTC signaling/session handling and host-side STT/TTS audio pipeline.
- `supertonic_gateway.py`: process orchestration and PTY bridge.
- `wisper.py`: event/message bus used by WebSocket streaming.
- `static/index.html`: simple chat UI.

app.py (normal file, 103 lines)

@@ -0,0 +1,103 @@
import asyncio
import contextlib
import json
from pathlib import Path
from typing import Any, Awaitable, Callable

from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from fastapi.responses import FileResponse, JSONResponse

from supertonic_gateway import SuperTonicGateway
from voice_rtc import WebRTCVoiceSession


BASE_DIR = Path(__file__).resolve().parent
INDEX_PATH = BASE_DIR / "static" / "index.html"

app = FastAPI(title="Nanobot SuperTonic Wisper Web")
gateway = SuperTonicGateway()


@app.get("/health")
async def health() -> JSONResponse:
    return JSONResponse({"status": "ok"})


@app.get("/")
async def index() -> FileResponse:
    return FileResponse(INDEX_PATH)


@app.websocket("/ws/chat")
async def websocket_chat(websocket: WebSocket) -> None:
    await websocket.accept()
    send_lock = asyncio.Lock()

    async def safe_send_json(payload: dict[str, Any]) -> None:
        async with send_lock:
            await websocket.send_json(payload)

    queue = await gateway.subscribe()
    voice_session = WebRTCVoiceSession(gateway=gateway, send_json=safe_send_json)
    sender = asyncio.create_task(_sender_loop(safe_send_json, queue, voice_session))
    try:
        while True:
            raw_message = await websocket.receive_text()
            try:
                message = json.loads(raw_message)
            except json.JSONDecodeError:
                await safe_send_json(
                    {"role": "system", "text": "Invalid JSON message.", "timestamp": ""}
                )
                continue

            msg_type = str(message.get("type", "")).strip()
            if msg_type == "spawn":
                await gateway.spawn_tui()
            elif msg_type == "stop":
                await gateway.stop_tui()
            elif msg_type == "rtc-offer":
                await voice_session.handle_offer(message)
            elif msg_type == "rtc-ice-candidate":
                await voice_session.handle_ice_candidate(message)
            elif msg_type == "voice-ptt":
                voice_session.set_push_to_talk_pressed(
                    bool(message.get("pressed", False))
                )
            else:
                await safe_send_json(
                    {
                        "role": "system",
                        "text": (
                            "Unknown message type. Use spawn, stop, rtc-offer, "
                            "rtc-ice-candidate, or voice-ptt."
                        ),
                        "timestamp": "",
                    }
                )
    except WebSocketDisconnect:
        pass
    finally:
        sender.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await sender
        await voice_session.close()
        await gateway.unsubscribe(queue)


@app.on_event("shutdown")
async def on_shutdown() -> None:
    await gateway.shutdown()


async def _sender_loop(
    send_json: Callable[[dict[str, Any]], Awaitable[None]],
    queue: asyncio.Queue,
    voice_session: WebRTCVoiceSession,
) -> None:
    while True:
        event = await queue.get()
        if event.role == "nanobot-tts":
            await voice_session.queue_output_text(event.text)
            continue
        await send_json(event.to_dict())

requirements.txt (normal file, 5 lines)

@@ -0,0 +1,5 @@
fastapi>=0.116.0,<1.0.0
uvicorn[standard]>=0.35.0,<1.0.0
aiortc>=1.8.0,<2.0.0
supertonic>=1.1.2,<2.0.0
faster-whisper>=1.1.0,<2.0.0

start.sh (executable file, 126 lines)

@@ -0,0 +1,126 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

if [[ ! -d ".venv" ]]; then
  python3 -m venv .venv
fi

source .venv/bin/activate
pip install -r requirements.txt >/dev/null

# Optional local voice settings. Example file: .env.voice
if [[ -f ".env.voice" ]]; then
  set -a
  # shellcheck disable=SC1091
  source ".env.voice"
  set +a
fi

# Nanobot command defaults (prefer Nanobot's own virtualenv interpreter).
: "${NANOBOT_WORKDIR:=${HOME}/nanobot}"
if [[ -z "${NANOBOT_COMMAND:-}" ]]; then
  if [[ -x "${NANOBOT_WORKDIR}/.venv/bin/python" ]]; then
    NANOBOT_COMMAND="${NANOBOT_WORKDIR}/.venv/bin/python -m nanobot agent --no-markdown"
  elif [[ -x "${NANOBOT_WORKDIR}/venv/bin/python" ]]; then
    NANOBOT_COMMAND="${NANOBOT_WORKDIR}/venv/bin/python -m nanobot agent --no-markdown"
  fi
fi
export NANOBOT_WORKDIR NANOBOT_COMMAND
: "${NANOBOT_SUPPRESS_NOISY_UI:=1}"
: "${NANOBOT_OUTPUT_DEDUP_WINDOW_S:=1.5}"
export NANOBOT_SUPPRESS_NOISY_UI NANOBOT_OUTPUT_DEDUP_WINDOW_S

# Host voice pipeline env vars (safe defaults).
: "${HOST_STT_PROVIDER:=faster-whisper}"
: "${HOST_STT_COMMAND:=}"
: "${HOST_STT_MODEL:=base.en}"
: "${HOST_STT_DEVICE:=auto}"
: "${HOST_STT_COMPUTE_TYPE:=int8}"
: "${HOST_STT_LANGUAGE:=en}"
: "${HOST_STT_BEAM_SIZE:=2}"
: "${HOST_STT_BEST_OF:=2}"
: "${HOST_STT_VAD_FILTER:=0}"
: "${HOST_STT_TEMPERATURE:=0.0}"
: "${HOST_STT_LOG_PROB_THRESHOLD:=-1.0}"
: "${HOST_STT_NO_SPEECH_THRESHOLD:=0.6}"
: "${HOST_STT_COMPRESSION_RATIO_THRESHOLD:=2.4}"
: "${HOST_STT_INITIAL_PROMPT:=Transcribe brief spoken English precisely. Prefer common words over sound effects.}"
: "${HOST_TTS_PROVIDER:=supertonic}"
: "${HOST_TTS_COMMAND:=}"
: "${SUPERTONIC_MODEL:=supertonic-2}"
: "${SUPERTONIC_VOICE_STYLE:=M1}"
: "${SUPERTONIC_LANG:=en}"
: "${SUPERTONIC_TOTAL_STEPS:=5}"
: "${SUPERTONIC_SPEED:=1.05}"
: "${SUPERTONIC_INTRA_OP_THREADS:=1}"
: "${SUPERTONIC_INTER_OP_THREADS:=1}"
: "${SUPERTONIC_AUTO_DOWNLOAD:=1}"
: "${HOST_STT_MIN_PTT_MS:=220}"
: "${HOST_STT_MAX_PTT_MS:=12000}"
: "${HOST_STT_SEGMENT_QUEUE_SIZE:=2}"
: "${HOST_STT_BACKLOG_NOTICE_INTERVAL_S:=6.0}"
: "${HOST_STT_SUPPRESS_DURING_TTS:=1}"
: "${HOST_STT_SUPPRESS_MS_AFTER_TTS:=300}"
: "${HOST_RTC_OUTBOUND_LEAD_IN_MS:=120}"
: "${HOST_RTC_OUTBOUND_IDLE_S:=0.6}"
: "${HOST_TTS_FLUSH_DELAY_S:=0.45}"
: "${HOST_TTS_SENTENCE_FLUSH_DELAY_S:=0.15}"
: "${HOST_TTS_MIN_CHARS:=10}"
: "${HOST_TTS_MAX_WAIT_MS:=1800}"
: "${HOST_TTS_MAX_CHUNK_CHARS:=140}"

export HOST_STT_PROVIDER HOST_STT_COMMAND HOST_STT_MODEL HOST_STT_DEVICE
export HOST_STT_COMPUTE_TYPE HOST_STT_LANGUAGE HOST_STT_BEAM_SIZE HOST_STT_BEST_OF HOST_STT_VAD_FILTER
export HOST_STT_TEMPERATURE HOST_STT_LOG_PROB_THRESHOLD HOST_STT_NO_SPEECH_THRESHOLD
export HOST_STT_COMPRESSION_RATIO_THRESHOLD
export HOST_STT_INITIAL_PROMPT
export HOST_TTS_PROVIDER HOST_TTS_COMMAND
export SUPERTONIC_MODEL SUPERTONIC_VOICE_STYLE SUPERTONIC_LANG
export SUPERTONIC_TOTAL_STEPS SUPERTONIC_SPEED
export SUPERTONIC_INTRA_OP_THREADS SUPERTONIC_INTER_OP_THREADS SUPERTONIC_AUTO_DOWNLOAD
export HOST_STT_MIN_PTT_MS HOST_STT_MAX_PTT_MS HOST_STT_SEGMENT_QUEUE_SIZE
export HOST_STT_BACKLOG_NOTICE_INTERVAL_S
export HOST_STT_SUPPRESS_DURING_TTS HOST_STT_SUPPRESS_MS_AFTER_TTS
export HOST_RTC_OUTBOUND_LEAD_IN_MS HOST_RTC_OUTBOUND_IDLE_S
export HOST_TTS_FLUSH_DELAY_S HOST_TTS_SENTENCE_FLUSH_DELAY_S
export HOST_TTS_MIN_CHARS HOST_TTS_MAX_WAIT_MS HOST_TTS_MAX_CHUNK_CHARS

: "${UVICORN_HOST:=0.0.0.0}"
: "${UVICORN_PORT:=8000}"
: "${ENABLE_HTTPS:=1}"
: "${SSL_DAYS:=365}"
: "${SSL_CERT_FILE:=.certs/local-cert.pem}"
: "${SSL_KEY_FILE:=.certs/local-key.pem}"

if [[ "$ENABLE_HTTPS" == "1" ]]; then
  mkdir -p "$(dirname "$SSL_CERT_FILE")"
  mkdir -p "$(dirname "$SSL_KEY_FILE")"

  if [[ ! -f "$SSL_CERT_FILE" || ! -f "$SSL_KEY_FILE" ]]; then
    LOCAL_IP="$(hostname -I 2>/dev/null | awk '{print $1}')"
    SAN_ENTRIES="DNS:localhost,IP:127.0.0.1"
    if [[ -n "${LOCAL_IP:-}" ]]; then
      SAN_ENTRIES="${SAN_ENTRIES},IP:${LOCAL_IP}"
    fi

    echo "Generating local TLS certificate at '$SSL_CERT_FILE' (SAN: ${SAN_ENTRIES})"
    openssl req -x509 -newkey rsa:2048 -sha256 -nodes -days "$SSL_DAYS" \
      -keyout "$SSL_KEY_FILE" \
      -out "$SSL_CERT_FILE" \
      -subj "/CN=localhost" \
      -addext "subjectAltName=${SAN_ENTRIES}" \
      -addext "keyUsage=digitalSignature,keyEncipherment" \
      -addext "extendedKeyUsage=serverAuth"
  fi

  echo "Starting HTTPS server on https://localhost:${UVICORN_PORT}"
  exec uvicorn app:app --host "$UVICORN_HOST" --port "$UVICORN_PORT" \
    --ssl-certfile "$SSL_CERT_FILE" \
    --ssl-keyfile "$SSL_KEY_FILE"
fi

echo "Starting HTTP server on http://localhost:${UVICORN_PORT}"
exec uvicorn app:app --host "$UVICORN_HOST" --port "$UVICORN_PORT"

static/index.html (normal file, 566 lines)

@@ -0,0 +1,566 @@
<!doctype html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8" />
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||
<title>Nanobot Chat (SuperTonic + Wisper)</title>
|
||||
<style>
|
||||
:root {
|
||||
--bg: #f6f8fa;
|
||||
--panel: #ffffff;
|
||||
--text: #1f2937;
|
||||
--muted: #6b7280;
|
||||
--accent: #0d9488;
|
||||
--border: #d1d5db;
|
||||
}
|
||||
* {
|
||||
box-sizing: border-box;
|
||||
}
|
||||
body {
|
||||
margin: 0;
|
||||
font-family: "SF Mono", ui-monospace, Menlo, Consolas, monospace;
|
||||
background: linear-gradient(180deg, #eef6ff 0%, var(--bg) 100%);
|
||||
color: var(--text);
|
||||
}
|
||||
.wrap {
|
||||
max-width: 980px;
|
||||
margin: 24px auto;
|
||||
padding: 0 16px;
|
||||
}
|
||||
h1 {
|
||||
margin: 0 0 12px;
|
||||
font-size: 1.2rem;
|
||||
}
|
||||
.panel {
|
||||
background: var(--panel);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 10px;
|
||||
padding: 12px;
|
||||
}
|
||||
.controls {
|
||||
display: flex;
|
||||
gap: 8px;
|
||||
margin-bottom: 12px;
|
||||
}
|
||||
button {
|
||||
border: 1px solid var(--border);
|
||||
background: white;
|
||||
border-radius: 8px;
|
||||
padding: 8px 12px;
|
||||
cursor: pointer;
|
||||
}
|
||||
button:disabled {
|
||||
opacity: 0.6;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
button.primary {
|
||||
background: var(--accent);
|
||||
color: white;
|
||||
border-color: var(--accent);
|
||||
}
|
||||
button.ptt-active {
|
||||
background: #be123c;
|
||||
color: white;
|
||||
border-color: #be123c;
|
||||
}
|
||||
.log {
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
min-height: 420px;
|
||||
max-height: 420px;
|
||||
overflow: auto;
|
||||
padding: 10px;
|
||||
background: #0b1020;
|
||||
color: #d6e2ff;
|
||||
white-space: pre-wrap;
|
||||
}
|
||||
.line {
|
||||
margin-bottom: 8px;
|
||||
}
|
||||
.line.user {
|
||||
color: #9be5ff;
|
||||
}
|
||||
.line.system {
|
||||
color: #ffd28f;
|
||||
}
|
||||
.line.wisper {
|
||||
color: #c4f0be;
|
||||
}
|
||||
|
||||
.voice {
|
||||
display: flex;
|
||||
gap: 8px;
|
||||
align-items: center;
|
||||
margin-top: 8px;
|
||||
}
|
||||
.voice-status {
|
||||
color: var(--muted);
|
||||
font-size: 12px;
|
||||
}
|
||||
|
||||
.hint {
|
||||
margin-top: 10px;
|
||||
color: var(--muted);
|
||||
font-size: 12px;
|
||||
}
|
||||
@media (max-width: 700px) {
|
||||
.controls,
|
||||
.voice {
|
||||
flex-direction: column;
|
||||
align-items: stretch;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="wrap">
|
||||
<h1>Nanobot Web Chat (SuperTonic + Wisper)</h1>
|
||||
<div class="panel">
|
||||
<div class="controls">
|
||||
<button id="spawnBtn" class="primary">Spawn Nanobot TUI</button>
|
||||
<button id="stopBtn">Stop TUI</button>
|
||||
</div>
|
||||
<div id="log" class="log"></div>
|
||||
<div class="voice">
|
||||
<button id="recordBtn">Connect Voice Channel</button>
|
||||
<button id="pttBtn" disabled>Hold to Talk</button>
|
||||
<span id="voiceStatus" class="voice-status"></span>
|
||||
</div>
|
||||
<audio id="remoteAudio" autoplay playsinline hidden></audio>
|
||||
<div class="hint">
|
||||
Voice input and output run over a host WebRTC audio channel. Hold Push-to-Talk to send microphone audio for host STT.
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<script>
|
||||
const logEl = document.getElementById("log");
|
||||
const spawnBtn = document.getElementById("spawnBtn");
|
||||
const stopBtn = document.getElementById("stopBtn");
|
||||
const recordBtn = document.getElementById("recordBtn");
|
||||
const pttBtn = document.getElementById("pttBtn");
|
||||
const voiceStatus = document.getElementById("voiceStatus");
|
||||
const remoteAudio = document.getElementById("remoteAudio");
|
||||
|
||||
const wsProto = location.protocol === "https:" ? "wss" : "ws";
|
||||
const ws = new WebSocket(`${wsProto}://${location.host}/ws/chat`);
|
||||
|
||||
let peerConnection = null;
|
||||
let micStream = null;
|
||||
let remoteStream = null;
|
||||
let voiceConnected = false;
|
||||
let disconnectedTimer = null;
|
||||
let reconnectTimer = null;
|
||||
let reconnectAttempts = 0;
|
||||
let voiceDesired = false;
|
||||
let connectingVoice = false;
|
||||
let pttPressed = false;
|
||||
let rtcAnswerApplied = false;
|
||||
let pendingRemoteCandidates = [];
|
||||
const MAX_RECONNECT_ATTEMPTS = 2;
|
||||
|
||||
const appendLine = (role, text, timestamp) => {
|
||||
const line = document.createElement("div");
|
||||
line.className = `line ${role || "system"}`;
|
||||
const time = timestamp ? new Date(timestamp).toLocaleTimeString() : "";
|
||||
line.textContent = `[${time}] ${role}: ${text}`;
|
||||
logEl.appendChild(line);
|
||||
logEl.scrollTop = logEl.scrollHeight;
|
||||
};
|
||||
|
||||
const sendJson = (payload) => {
|
||||
if (ws.readyState !== WebSocket.OPEN) {
|
||||
appendLine("system", "Socket not ready.", new Date().toISOString());
|
||||
return;
|
||||
}
|
||||
ws.send(JSON.stringify(payload));
|
||||
};
|
||||
|
||||
const setVoiceState = (connected) => {
|
||||
voiceConnected = connected;
|
||||
recordBtn.textContent = connected ? "Disconnect Voice Channel" : "Connect Voice Channel";
|
||||
pttBtn.disabled = !connected;
|
||||
if (!connected) {
|
||||
pttBtn.textContent = "Hold to Talk";
|
||||
pttBtn.classList.remove("ptt-active");
|
||||
}
|
||||
};
|
||||
|
||||
const setMicCaptureEnabled = (enabled) => {
|
||||
if (!micStream) return;
|
||||
micStream.getAudioTracks().forEach((track) => {
|
||||
track.enabled = enabled;
|
||||
});
|
||||
};
|
||||
|
||||
const setPushToTalkState = (pressed, notifyServer = true) => {
|
||||
pttPressed = pressed;
|
||||
pttBtn.textContent = pressed ? "Release to Send" : "Hold to Talk";
|
||||
pttBtn.classList.toggle("ptt-active", pressed);
|
||||
setMicCaptureEnabled(pressed);
|
||||
if (notifyServer && ws.readyState === WebSocket.OPEN) {
|
||||
ws.send(JSON.stringify({ type: "voice-ptt", pressed }));
|
||||
}
|
||||
};
|
||||
|
||||
const beginPushToTalk = (event) => {
|
||||
if (event) event.preventDefault();
|
||||
if (!voiceConnected || !peerConnection || !micStream) {
|
||||
voiceStatus.textContent = "Connect voice channel first.";
|
||||
return;
|
||||
}
|
||||
if (pttPressed) return;
|
||||
setPushToTalkState(true);
|
||||
voiceStatus.textContent = "Listening while button is held...";
|
||||
};
|
||||
|
||||
const endPushToTalk = (event) => {
|
||||
if (event) event.preventDefault();
|
||||
if (!pttPressed) return;
|
||||
setPushToTalkState(false);
|
||||
if (voiceConnected) {
|
||||
voiceStatus.textContent = "Voice channel connected. Hold Push-to-Talk to speak.";
|
||||
}
|
||||
};
|
||||
|
||||
const clearReconnectTimer = () => {
|
||||
if (reconnectTimer) {
|
||||
clearTimeout(reconnectTimer);
|
||||
reconnectTimer = null;
|
||||
}
|
||||
};
|
||||
|
||||
const scheduleReconnect = (reason, delayMs = 1200) => {
|
||||
if (!voiceDesired) return;
|
||||
if (voiceConnected || connectingVoice) return;
|
||||
if (reconnectTimer) return;
|
||||
if (reconnectAttempts >= MAX_RECONNECT_ATTEMPTS) {
|
||||
voiceStatus.textContent = "Voice reconnect attempts exhausted.";
|
||||
return;
|
||||
}
|
||||
reconnectAttempts += 1;
|
||||
voiceStatus.textContent = `${reason} Retrying (${reconnectAttempts}/${MAX_RECONNECT_ATTEMPTS})...`;
|
||||
reconnectTimer = setTimeout(async () => {
|
||||
reconnectTimer = null;
|
||||
await connectVoiceChannel();
|
||||
}, delayMs);
|
||||
};
|
||||
|
||||
const stopVoiceChannel = async (statusText = "", clearDesired = false) => {
|
||||
if (clearDesired) {
|
||||
voiceDesired = false;
|
||||
reconnectAttempts = 0;
|
||||
clearReconnectTimer();
|
||||
}
|
||||
|
||||
if (disconnectedTimer) {
|
||||
clearTimeout(disconnectedTimer);
|
||||
disconnectedTimer = null;
|
||||
}
|
||||
|
||||
pendingRemoteCandidates = [];
|
||||
rtcAnswerApplied = false;
|
||||
setPushToTalkState(false);
|
||||
|
||||
if (peerConnection) {
|
||||
peerConnection.ontrack = null;
|
||||
peerConnection.onicecandidate = null;
|
||||
peerConnection.onconnectionstatechange = null;
|
||||
peerConnection.close();
|
||||
peerConnection = null;
|
||||
}
|
||||
|
||||
if (micStream) {
|
||||
micStream.getTracks().forEach((track) => track.stop());
|
||||
micStream = null;
|
||||
}
|
||||
|
||||
if (remoteStream) {
|
||||
remoteStream.getTracks().forEach((track) => track.stop());
|
||||
remoteStream = null;
|
||||
}
|
||||
|
||||
remoteAudio.srcObject = null;
|
||||
setVoiceState(false);
|
||||
if (statusText) {
|
||||
voiceStatus.textContent = statusText;
|
||||
}
|
||||
};
|
||||
|
||||
const applyRtcAnswer = async (message) => {
|
||||
if (!peerConnection) return;
|
||||
const rawSdp = (message.sdp || "").toString();
|
||||
if (!rawSdp.trim()) return;
|
||||
const sdp = `${rawSdp
|
||||
.replace(/\r\n/g, "\n")
|
||||
.replace(/\r/g, "\n")
|
||||
.split("\n")
|
||||
.map((line) => line.trimEnd())
|
||||
.join("\r\n")
|
||||
.trim()}\r\n`;
|
||||
try {
|
||||
await peerConnection.setRemoteDescription({
|
||||
type: message.rtcType || "answer",
|
||||
sdp,
|
||||
});
|
||||
rtcAnswerApplied = true;
|
||||
const queued = pendingRemoteCandidates;
|
||||
pendingRemoteCandidates = [];
|
||||
for (const candidate of queued) {
|
||||
try {
|
||||
await peerConnection.addIceCandidate(candidate);
|
||||
} catch (candidateErr) {
|
||||
appendLine("system", `Queued ICE apply error: ${candidateErr}`, new Date().toISOString());
|
||||
}
|
||||
}
|
||||
reconnectAttempts = 0;
|
||||
voiceStatus.textContent = "Voice channel negotiated.";
|
||||
} catch (err) {
|
||||
await stopVoiceChannel("Failed to apply WebRTC answer.");
|
||||
scheduleReconnect("Failed to apply answer.");
|
||||
const preview = sdp
|
||||
.split(/\r\n/)
|
||||
.slice(0, 6)
|
||||
.join(" | ");
|
||||
appendLine(
|
||||
"system",
|
||||
`RTC answer error: ${err}. SDP preview: ${preview}`,
|
||||
new Date().toISOString()
|
||||
);
|
||||
}
|
||||
};
|
||||
|
||||
const applyRtcIceCandidate = async (message) => {
|
||||
if (!peerConnection) return;
|
||||
if (message.candidate == null) {
|
||||
if (!rtcAnswerApplied || !peerConnection.remoteDescription) {
|
||||
pendingRemoteCandidates.push(null);
|
||||
return;
|
||||
}
|
||||
try {
|
||||
await peerConnection.addIceCandidate(null);
|
||||
} catch (err) {
|
||||
appendLine("system", `RTC ICE end error: ${err}`, new Date().toISOString());
|
||||
}
|
||||
return;
|
||||
}
|
||||
try {
|
||||
if (!rtcAnswerApplied || !peerConnection.remoteDescription) {
|
||||
pendingRemoteCandidates.push(message.candidate);
|
||||
return;
|
||||
}
|
||||
await peerConnection.addIceCandidate(message.candidate);
|
||||
} catch (err) {
|
||||
appendLine("system", `RTC ICE error: ${err}`, new Date().toISOString());
|
||||
}
|
||||
};
|
||||
|
||||
const connectVoiceChannel = async () => {
|
||||
if (voiceConnected || peerConnection || connectingVoice) return;
|
||||
if (!window.RTCPeerConnection) {
|
||||
voiceStatus.textContent = "WebRTC unavailable in this browser.";
|
||||
return;
|
||||
}
|
||||
if (!navigator.mediaDevices?.getUserMedia) {
|
||||
voiceStatus.textContent = "Microphone capture is unavailable.";
|
||||
return;
|
||||
}
|
||||
if (ws.readyState !== WebSocket.OPEN) {
|
||||
voiceStatus.textContent = "Socket not ready yet.";
|
||||
return;
|
||||
}
|
||||
|
||||
connectingVoice = true;
|
||||
try {
|
||||
clearReconnectTimer();
|
||||
rtcAnswerApplied = false;
|
||||
pendingRemoteCandidates = [];
|
||||
|
||||
try {
|
||||
micStream = await navigator.mediaDevices.getUserMedia({
|
||||
audio: {
|
||||
channelCount: 1,
|
||||
sampleRate: 48000,
|
||||
sampleSize: 16,
|
||||
latency: 0,
|
||||
echoCancellation: true,
|
||||
noiseSuppression: true,
|
||||
autoGainControl: false,
|
||||
},
|
||||
video: false,
|
||||
});
|
||||
} catch (_constraintErr) {
|
||||
micStream = await navigator.mediaDevices.getUserMedia({
|
||||
audio: true,
|
||||
video: false,
|
||||
});
|
||||
voiceStatus.textContent = "Using browser default microphone settings.";
|
||||
}
|
||||
setMicCaptureEnabled(false);
|
||||
|
||||
peerConnection = new RTCPeerConnection({
|
||||
iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
|
||||
});
|
||||
remoteStream = new MediaStream();
|
||||
remoteAudio.srcObject = remoteStream;
|
||||
|
||||
peerConnection.ontrack = (event) => {
|
||||
if (event.track.kind !== "audio") return;
|
||||
remoteStream.addTrack(event.track);
|
||||
remoteAudio.play().catch(() => {
|
||||
voiceStatus.textContent = "Tap the page once to allow voice playback.";
|
||||
});
|
||||
};
|
||||
|
||||
peerConnection.onicecandidate = (event) => {
|
||||
if (!event.candidate) {
|
||||
sendJson({ type: "rtc-ice-candidate", candidate: null });
|
||||
return;
|
||||
}
|
||||
sendJson({
|
||||
type: "rtc-ice-candidate",
|
||||
candidate: event.candidate.toJSON(),
|
||||
});
|
||||
};
|
||||
|
||||
peerConnection.onconnectionstatechange = () => {
|
||||
const state = peerConnection?.connectionState || "new";
|
||||
if (state === "connected") {
|
||||
if (disconnectedTimer) {
|
||||
clearTimeout(disconnectedTimer);
|
||||
disconnectedTimer = null;
|
||||
}
|
||||
clearReconnectTimer();
|
||||
reconnectAttempts = 0;
|
||||
voiceStatus.textContent = "Voice channel connected. Hold Push-to-Talk to speak.";
|
||||
return;
|
||||
}
|
||||
if (state === "failed" || state === "closed") {
|
||||
stopVoiceChannel(`Voice channel ${state}.`);
|
||||
scheduleReconnect(`Voice channel ${state}.`);
|
||||
return;
|
||||
}
|
||||
if (state === "disconnected") {
|
||||
if (disconnectedTimer) clearTimeout(disconnectedTimer);
|
||||
voiceStatus.textContent = "Voice channel disconnected. Waiting to recover...";
|
||||
disconnectedTimer = setTimeout(() => {
|
||||
if (peerConnection?.connectionState === "disconnected") {
|
||||
stopVoiceChannel("Voice channel disconnected.");
|
||||
scheduleReconnect("Voice channel disconnected.");
|
||||
}
|
||||
}, 8000);
|
||||
return;
|
||||
}
|
||||
voiceStatus.textContent = `Voice channel ${state}...`;
|
||||
};
|
||||
|
||||
micStream.getAudioTracks().forEach((track) => {
|
||||
peerConnection.addTrack(track, micStream);
|
||||
});
|
||||
|
||||
setVoiceState(true);
|
||||
voiceStatus.textContent = "Connecting voice channel...";
|
||||
setPushToTalkState(false);
|
||||
|
||||
const offer = await peerConnection.createOffer();
|
||||
await peerConnection.setLocalDescription(offer);
|
||||
sendJson({
|
||||
type: "rtc-offer",
|
||||
sdp: offer.sdp,
|
||||
rtcType: offer.type,
|
||||
});
|
||||
} catch (err) {
|
||||
await stopVoiceChannel("Voice channel setup failed.");
|
||||
scheduleReconnect("Voice setup failed.");
|
||||
appendLine("system", `Voice setup error: ${err}`, new Date().toISOString());
|
||||
} finally {
|
||||
connectingVoice = false;
|
||||
}
|
||||
};
|
||||
|
||||
ws.onopen = () => {
  appendLine("system", "WebSocket connected.", new Date().toISOString());
};
ws.onclose = async () => {
  appendLine("system", "WebSocket disconnected.", new Date().toISOString());
  await stopVoiceChannel("Voice channel disconnected.", true);
};
ws.onerror = () => appendLine("system", "WebSocket error.", new Date().toISOString());
ws.onmessage = async (event) => {
  try {
    const msg = JSON.parse(event.data);

    if (msg.type === "rtc-answer") {
      await applyRtcAnswer(msg);
      return;
    }
    if (msg.type === "rtc-ice-candidate") {
      await applyRtcIceCandidate(msg);
      return;
    }
    if (msg.type === "rtc-state") {
      const state = (msg.state || "").toString();
      if (state) {
        if (state === "connected") {
          voiceStatus.textContent = "Voice channel connected. Hold Push-to-Talk to speak.";
        } else {
          voiceStatus.textContent = `Voice channel ${state}.`;
        }
      }
      return;
    }
    if (msg.type === "rtc-error") {
      const text = (msg.message || "Unknown WebRTC error.").toString();
      voiceStatus.textContent = `Voice error: ${text}`;
      appendLine("system", `Voice error: ${text}`, new Date().toISOString());
      await stopVoiceChannel("Voice channel error.");
      scheduleReconnect("Voice channel error.");
      return;
    }

    appendLine(msg.role || "system", msg.text || "", msg.timestamp || "");
  } catch (_err) {
    appendLine("system", event.data, new Date().toISOString());
  }
};

spawnBtn.onclick = () => sendJson({ type: "spawn" });
stopBtn.onclick = () => sendJson({ type: "stop" });
pttBtn.onpointerdown = (event) => {
  if (event.button !== 0) return;
  if (pttBtn.setPointerCapture) {
    pttBtn.setPointerCapture(event.pointerId);
  }
  beginPushToTalk(event);
};
pttBtn.onpointerup = (event) => endPushToTalk(event);
pttBtn.onpointercancel = (event) => endPushToTalk(event);
pttBtn.onlostpointercapture = (event) => endPushToTalk(event);
pttBtn.addEventListener("keydown", (event) => {
  const isSpace = event.code === "Space" || event.key === " ";
  if (!isSpace || event.repeat) return;
  beginPushToTalk(event);
});
pttBtn.addEventListener("keyup", (event) => {
  const isSpace = event.code === "Space" || event.key === " ";
  if (!isSpace) return;
  endPushToTalk(event);
});
recordBtn.onclick = async () => {
  if (voiceConnected || peerConnection || connectingVoice) {
    await stopVoiceChannel("Voice channel disconnected.", true);
    return;
  }
  voiceDesired = true;
  reconnectAttempts = 0;
  await connectVoiceChannel();
};
document.body.addEventListener("click", () => {
  if (remoteAudio.srcObject && remoteAudio.paused) {
    remoteAudio.play().catch(() => {});
  }
});
setVoiceState(false);
</script>
</body>
</html>
388
supertonic_gateway.py
Normal file

@@ -0,0 +1,388 @@
import asyncio
import contextlib
import os
import pty
import re
import shlex
import signal
import subprocess
import time
from collections import deque
from pathlib import Path

from wisper import WisperBus, WisperEvent


ANSI_ESCAPE_RE = re.compile(r"\x1B\[[0-?]*[ -/]*[@-~]")
CONTROL_CHAR_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")
BRAILLE_SPINNER_RE = re.compile(r"[\u2800-\u28ff]")
SPINNER_ONLY_RE = re.compile(r"^[\s|/\\\-]+$")
BOX_DRAWING_ONLY_RE = re.compile(r"^[\s\u2500-\u257f]+$")
THINKING_LINE_RE = re.compile(r"^(?:agent|nanobot)\s+is\s+thinking\b", re.IGNORECASE)
TOOL_STREAM_LINE_RE = re.compile(
    r"^(?:tool(?:\s+call|\s+output)?|calling\s+tool|running\s+tool|executing\s+tool)\b",
    re.IGNORECASE,
)
EMOJI_RE = re.compile(
    "["  # Common emoji and pictograph blocks.
    "\U0001F1E6-\U0001F1FF"
    "\U0001F300-\U0001F5FF"
    "\U0001F600-\U0001F64F"
    "\U0001F680-\U0001F6FF"
    "\U0001F700-\U0001F77F"
    "\U0001F780-\U0001F7FF"
    "\U0001F800-\U0001F8FF"
    "\U0001F900-\U0001F9FF"
    "\U0001FA00-\U0001FAFF"
    "\u2600-\u26FF"
    "\u2700-\u27BF"
    "\uFE0F"
    "\u200D"
    "]"
)

def _clean_output(text: str) -> str:
    # Convert carriage returns first: CONTROL_CHAR_RE also matches \r, so
    # stripping control characters before the replace would make it a no-op.
    cleaned = text.replace("\r", "\n")
    cleaned = ANSI_ESCAPE_RE.sub("", cleaned)
    cleaned = BRAILLE_SPINNER_RE.sub(" ", cleaned)
    return CONTROL_CHAR_RE.sub("", cleaned)

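The cleaning step above can be exercised in isolation. This is a standalone sketch (regexes copied from the module; the sample string is made up) showing ANSI escapes, a braille spinner frame, a bell character, and a carriage return all being normalized:

```python
import re

# Same patterns as in supertonic_gateway.py.
ANSI_ESCAPE_RE = re.compile(r"\x1B\[[0-?]*[ -/]*[@-~]")
CONTROL_CHAR_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")
BRAILLE_SPINNER_RE = re.compile(r"[\u2800-\u28ff]")


def clean_output(text: str) -> str:
    # Carriage returns become newlines before control chars are stripped,
    # since CONTROL_CHAR_RE matches \r as well.
    cleaned = text.replace("\r", "\n")
    cleaned = ANSI_ESCAPE_RE.sub("", cleaned)       # drop color/cursor codes
    cleaned = BRAILLE_SPINNER_RE.sub(" ", cleaned)  # blank spinner glyphs
    return CONTROL_CHAR_RE.sub("", cleaned)         # drop remaining controls


raw = "\x1b[32mhello\x1b[0m\r\u280bworld\x07"
print(clean_output(raw))
```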
def _resolve_nanobot_command_and_workdir() -> tuple[str, Path]:
    command_override = os.getenv("NANOBOT_COMMAND")
    workdir_override = os.getenv("NANOBOT_WORKDIR")

    if workdir_override:
        default_workdir = Path(workdir_override).expanduser()
    else:
        default_workdir = Path.home()

    if command_override:
        return command_override, default_workdir

    nanobot_dir = Path.home() / "nanobot"
    nanobot_python_candidates = [
        nanobot_dir / ".venv" / "bin" / "python",
        nanobot_dir / "venv" / "bin" / "python",
    ]
    for nanobot_venv_python in nanobot_python_candidates:
        if nanobot_venv_python.exists():
            if not workdir_override:
                default_workdir = nanobot_dir
            return f"{nanobot_venv_python} -m nanobot agent --no-markdown", default_workdir

    return "nanobot agent --no-markdown", default_workdir


def _infer_venv_root(command_parts: list[str], workdir: Path) -> Path | None:
    if not command_parts:
        return None

    binary = Path(command_parts[0]).expanduser()
    if binary.is_absolute() and binary.name.startswith("python") and binary.parent.name == "bin":
        return binary.parent.parent

    for candidate in (workdir / ".venv", workdir / "venv"):
        if (candidate / "bin" / "python").exists():
            return candidate
    return None


def _build_process_env(command_parts: list[str], workdir: Path) -> tuple[dict[str, str], Path | None]:
    env = os.environ.copy()
    env.pop("PYTHONHOME", None)

    venv_root = _infer_venv_root(command_parts, workdir)
    if not venv_root:
        return env, None

    venv_bin = str(venv_root / "bin")
    path_entries = [entry for entry in env.get("PATH", "").split(os.pathsep) if entry]
    path_entries = [entry for entry in path_entries if entry != venv_bin]
    path_entries.insert(0, venv_bin)
    env["PATH"] = os.pathsep.join(path_entries)
    env["VIRTUAL_ENV"] = str(venv_root)
    return env, venv_root

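The PATH manipulation in `_build_process_env` can be shown on its own. A minimal sketch (the helper name and sample paths are illustrative, not part of the module):

```python
import os


def prepend_venv_bin(path_value: str, venv_bin: str) -> str:
    # Mirror the gateway's PATH handling: drop any existing copy of the
    # venv bin dir, then put it first so the venv's interpreter wins.
    entries = [e for e in path_value.split(os.pathsep) if e and e != venv_bin]
    entries.insert(0, venv_bin)
    return os.pathsep.join(entries)


print(prepend_venv_bin(os.pathsep.join(["/usr/bin", "/bin"]), "/opt/app/.venv/bin"))
```

Deduplicating before prepending keeps PATH from growing on repeated spawns of the child process.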
class NanobotTUIProcess:
    def __init__(self, bus: WisperBus, command: str, workdir: Path) -> None:
        self._bus = bus
        self._command = command
        self._workdir = workdir
        self._process: subprocess.Popen[bytes] | None = None
        self._master_fd: int | None = None
        self._read_task: asyncio.Task[None] | None = None
        self._pending_output = ""
        self._suppress_noisy_ui = os.getenv("NANOBOT_SUPPRESS_NOISY_UI", "1").strip() not in {
            "0",
            "false",
            "False",
            "no",
            "off",
        }
        self._dedup_window_s = max(0.2, float(os.getenv("NANOBOT_OUTPUT_DEDUP_WINDOW_S", "1.5")))
        self._recent_lines: deque[tuple[str, float]] = deque()
        self._last_tts_line = ""

    @property
    def running(self) -> bool:
        return self._process is not None and self._process.poll() is None

    async def start(self) -> None:
        if self.running:
            await self._bus.publish(WisperEvent(role="system", text="Nanobot TUI is already running."))
            return

        command_parts = [
            os.path.expandvars(os.path.expanduser(part)) for part in shlex.split(self._command)
        ]
        if not command_parts:
            await self._bus.publish(WisperEvent(role="system", text="NANOBOT_COMMAND is empty."))
            return

        if not self._workdir.exists():
            await self._bus.publish(
                WisperEvent(
                    role="system",
                    text=f"NANOBOT_WORKDIR does not exist: {self._workdir}",
                )
            )
            return

        master_fd, slave_fd = pty.openpty()
        child_env, child_venv_root = _build_process_env(command_parts=command_parts, workdir=self._workdir)
        try:
            self._process = subprocess.Popen(
                command_parts,
                stdin=slave_fd,
                stdout=slave_fd,
                stderr=slave_fd,
                cwd=str(self._workdir),
                start_new_session=True,
                env=child_env,
            )
        except FileNotFoundError as exc:
            os.close(master_fd)
            os.close(slave_fd)
            await self._bus.publish(
                WisperEvent(
                    role="system",
                    text=(
                        "Could not start Nanobot process "
                        f"(command='{command_parts[0]}', workdir='{self._workdir}'): {exc}. "
                        "Check NANOBOT_COMMAND and NANOBOT_WORKDIR."
                    ),
                )
            )
            return
        except Exception as exc:
            os.close(master_fd)
            os.close(slave_fd)
            await self._bus.publish(
                WisperEvent(role="system", text=f"Failed to spawn TUI process: {exc}")
            )
            return

        os.close(slave_fd)
        os.set_blocking(master_fd, False)
        self._master_fd = master_fd
        self._read_task = asyncio.create_task(self._read_output(), name="nanobot-tui-reader")
        await self._bus.publish(
            WisperEvent(
                role="system",
                text=f"Spawned Nanobot TUI with command: {' '.join(command_parts)}",
            )
        )
        if child_venv_root:
            await self._bus.publish(
                WisperEvent(
                    role="system",
                    text=f"Nanobot runtime venv: {child_venv_root}",
                )
            )

    async def send(self, text: str) -> None:
        if not self.running or self._master_fd is None:
            await self._bus.publish(
                WisperEvent(role="system", text="Nanobot TUI is not running. Click spawn first.")
            )
            return
        message = text.rstrip("\n") + "\n"
        try:
            os.write(self._master_fd, message.encode())
        except OSError as exc:
            await self._bus.publish(WisperEvent(role="system", text=f"Failed to write to TUI: {exc}"))

    async def stop(self) -> None:
        if self._read_task:
            self._read_task.cancel()
            with contextlib.suppress(asyncio.CancelledError):
                await self._read_task
            self._read_task = None

        if self.running and self._process:
            try:
                os.killpg(self._process.pid, signal.SIGTERM)
            except ProcessLookupError:
                pass
            except Exception:
                self._process.terminate()
            try:
                self._process.wait(timeout=3)
            except Exception:
                self._process.kill()
                self._process.wait(timeout=1)

        if self._master_fd is not None:
            try:
                os.close(self._master_fd)
            except OSError:
                pass
            self._master_fd = None
        self._process = None
        self._pending_output = ""
        self._recent_lines.clear()
        self._last_tts_line = ""
        await self._bus.publish(WisperEvent(role="system", text="Stopped Nanobot TUI."))

    async def _read_output(self) -> None:
        if self._master_fd is None:
            return
        while self.running:
            try:
                chunk = os.read(self._master_fd, 4096)
            except BlockingIOError:
                await asyncio.sleep(0.05)
                continue
            except OSError:
                break

            if not chunk:
                await asyncio.sleep(0.05)
                continue

            text = _clean_output(chunk.decode(errors="ignore"))
            if not text.strip():
                continue

            displayable, tts_publishable = self._consume_output_chunk(text)
            if displayable:
                await self._bus.publish(WisperEvent(role="nanobot", text=displayable))
            if tts_publishable:
                await self._bus.publish(WisperEvent(role="nanobot-tts", text=tts_publishable))

        trailing_display, trailing_tts = self._consume_output_chunk("\n")
        if trailing_display:
            await self._bus.publish(WisperEvent(role="nanobot", text=trailing_display))
        if trailing_tts:
            await self._bus.publish(WisperEvent(role="nanobot-tts", text=trailing_tts))

        if self._process is not None:
            exit_code = self._process.poll()
            await self._bus.publish(
                WisperEvent(role="system", text=f"Nanobot TUI exited (code={exit_code}).")
            )

    def _consume_output_chunk(self, text: str) -> tuple[str, str]:
        self._pending_output += text

        lines = self._pending_output.split("\n")
        self._pending_output = lines.pop()

        if len(self._pending_output) > 1024:
            lines.append(self._pending_output)
            self._pending_output = ""

        kept_lines: list[str] = []
        tts_lines: list[str] = []
        for line in lines:
            normalized = self._normalize_line(line)
            if not normalized:
                continue
            if self._suppress_noisy_ui and self._is_noisy_ui_line(normalized):
                continue
            if normalized != self._last_tts_line:
                tts_lines.append(normalized)
                self._last_tts_line = normalized
            if self._is_recent_duplicate(normalized):
                continue
            kept_lines.append(normalized)

        return "\n".join(kept_lines).strip(), "\n".join(tts_lines).strip()

    def _normalize_line(self, line: str) -> str:
        without_emoji = EMOJI_RE.sub(" ", line)
        return re.sub(r"\s+", " ", without_emoji).strip()

    def _is_noisy_ui_line(self, line: str) -> bool:
        if SPINNER_ONLY_RE.fullmatch(line):
            return True
        if BOX_DRAWING_ONLY_RE.fullmatch(line):
            return True

        candidate = re.sub(r"^[^\w]+", "", line)
        if THINKING_LINE_RE.match(candidate):
            return True
        if TOOL_STREAM_LINE_RE.match(candidate):
            return True
        return False

    def _is_recent_duplicate(self, line: str) -> bool:
        now = time.monotonic()
        normalized = line.lower()

        while self._recent_lines and (now - self._recent_lines[0][1]) > self._dedup_window_s:
            self._recent_lines.popleft()

        for previous, _timestamp in self._recent_lines:
            if previous == normalized:
                return True

        self._recent_lines.append((normalized, now))
        return False

class SuperTonicGateway:
    def __init__(self) -> None:
        self.bus = WisperBus()
        self._lock = asyncio.Lock()
        self._tui: NanobotTUIProcess | None = None

    async def subscribe(self) -> asyncio.Queue[WisperEvent]:
        return await self.bus.subscribe()

    async def unsubscribe(self, queue: asyncio.Queue[WisperEvent]) -> None:
        await self.bus.unsubscribe(queue)

    async def spawn_tui(self) -> None:
        async with self._lock:
            if self._tui and self._tui.running:
                await self.bus.publish(WisperEvent(role="system", text="Nanobot TUI is already running."))
                return

            command, workdir = _resolve_nanobot_command_and_workdir()
            self._tui = NanobotTUIProcess(bus=self.bus, command=command, workdir=workdir)
            await self._tui.start()

    async def send_user_message(self, text: str) -> None:
        message = text.strip()
        if not message:
            return
        await self.bus.publish(WisperEvent(role="user", text=message))
        async with self._lock:
            if not self._tui:
                await self.bus.publish(
                    WisperEvent(role="system", text="Nanobot TUI is not running. Click spawn first.")
                )
                return
            await self._tui.send(message)

    async def stop_tui(self) -> None:
        async with self._lock:
            if self._tui:
                await self._tui.stop()

    async def shutdown(self) -> None:
        await self.stop_tui()
1656
voice_rtc.py
Normal file
File diff suppressed because it is too large

37
wisper.py
Normal file

@@ -0,0 +1,37 @@
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(slots=True)
class WisperEvent:
    role: str
    text: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat(timespec="seconds")
    )

    def to_dict(self) -> dict[str, str]:
        return {"role": self.role, "text": self.text, "timestamp": self.timestamp}


class WisperBus:
    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue[WisperEvent]] = set()
        self._lock = asyncio.Lock()

    async def subscribe(self) -> asyncio.Queue[WisperEvent]:
        queue: asyncio.Queue[WisperEvent] = asyncio.Queue()
        async with self._lock:
            self._subscribers.add(queue)
        return queue

    async def unsubscribe(self, queue: asyncio.Queue[WisperEvent]) -> None:
        async with self._lock:
            self._subscribers.discard(queue)

    async def publish(self, event: WisperEvent) -> None:
        async with self._lock:
            subscribers = list(self._subscribers)
        for queue in subscribers:
            queue.put_nowait(event)