
# PhoneWork

A Feishu bot that lets users control the Claude Code CLI from their phone.

## Architecture

PhoneWork uses a **Router + Host Client** architecture that supports both single-machine and multi-host deployments:

```
┌─────────────────┐         ┌──────────┐  WebSocket   ┌────────────────────────────────────┐
│  Feishu App     │         │  Feishu  │◄────────────►│          Router (public VPS)       │
│  (User's Phone) │◄───────►│  Cloud   │              │  - Feishu event handler            │
└─────────────────┘         └──────────┘              │  - Router LLM (routing only)       │
                                                      │  - Node registry + active node map │
                                                      └───────────┬────────────────────────┘
                                                                  │ WebSocket (host clients connect in)
                                                      ┌───────────┴────────────────────────┐
                                                      │                                    │
                                           ┌──────────▼──────────┐           ┌────────────▼────────┐
                                           │   Host Client A     │           │   Host Client B     │
                                           │   (home-pc)         │           │   (work-server)     │
                                           │  - Mailboy LLM      │           │  - Mailboy LLM      │
                                           │  - CC sessions      │           │  - CC sessions      │
                                           │  - Shell / files    │           │  - Shell / files    │
                                           └─────────────────────┘           └─────────────────────┘
```

**Key design decisions:**
- Host clients connect TO the router (outbound WebSocket) — NAT-transparent
- A user can be registered on multiple nodes simultaneously
- The **router LLM** decides *which node* to route each message to
- The **node mailboy LLM** handles the full orchestration loop
- Each node maintains its own conversation history per user
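
To make the registration model concrete, here is a minimal sketch of a registry in which one user can be served by several nodes while one of them is active. The names `NodeRegistry`, `register`, `unregister`, and `set_active` are hypothetical and are not taken from `router/nodes.py`; falling back to the first remaining node on disconnect is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRegistry:
    """Hypothetical in-memory registry: which nodes serve which user."""

    # open_id -> ordered list of node ids that serve this user
    nodes_by_user: dict[str, list[str]] = field(default_factory=dict)
    # open_id -> node id currently selected for this user
    active_node: dict[str, str] = field(default_factory=dict)

    def register(self, node_id: str, serves_users: list[str]) -> None:
        """Record a newly connected node for every user it serves."""
        for open_id in serves_users:
            nodes = self.nodes_by_user.setdefault(open_id, [])
            if node_id not in nodes:
                nodes.append(node_id)
            # The first node to register becomes the default active node.
            self.active_node.setdefault(open_id, node_id)

    def unregister(self, node_id: str) -> None:
        """Drop a disconnected node and repair any active-node pointers."""
        for open_id, nodes in self.nodes_by_user.items():
            if node_id in nodes:
                nodes.remove(node_id)
            if self.active_node.get(open_id) == node_id:
                if nodes:
                    self.active_node[open_id] = nodes[0]
                else:
                    del self.active_node[open_id]

    def set_active(self, open_id: str, node_id: str) -> bool:
        """Switch the user's active node; False if the node does not serve them."""
        if node_id not in self.nodes_by_user.get(open_id, []):
            return False
        self.active_node[open_id] = node_id
        return True
```

In this sketch, `register` would be called when a host client connects and announces the users it serves, and `set_active` corresponds to an explicit node switch by the user.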

**Deployment modes:**
- **Standalone (`python standalone.py`):** Runs router + host client at localhost. Same architecture, simpler setup for single-machine use.
- **Multi-host:** Router on a public VPS, host clients behind NAT on different machines.

**Components:**

| Module | Purpose |
|--------|---------|
| `standalone.py` | Single-process entry point: runs router + host client together |
| `main.py` | FastAPI entry point for router-only mode |
| `shared/protocol.py` | Wire protocol for router-host communication |
| `router/main.py` | FastAPI app factory, mounts `/ws/node` endpoint |
| `router/nodes.py` | Node registry, connection management, user-to-node mapping |
| `router/ws.py` | WebSocket endpoint for host clients, heartbeat, message routing |
| `router/rpc.py` | Request correlation with asyncio.Future, timeout handling |
| `router/routing_agent.py` | Single-shot routing LLM to decide which node handles each message |
| `host_client/main.py` | WebSocket client connecting to router, message handling, reconnection |
| `host_client/config.py` | Host client configuration loader |
| `bot/handler.py` | Receives Feishu events via long-connection WebSocket |
| `bot/feishu.py` | Sends text/file replies back to Feishu |
| `bot/commands.py` | Slash command handler (`/new`, `/status`, `/shell`, `/remind`, `/tasks`, `/nodes`, `/node`) |
| `orchestrator/agent.py` | LangChain agent with per-user history + direct/smart mode + direct Q&A |
| `orchestrator/tools.py` | Tools: session management, shell, file ops, web search, scheduler, task status |
| `agent/manager.py` | Session registry with persistence, idle timeout, and auto-background tasks |
| `agent/pty_process.py` | Runs `claude -p` headlessly, manages session continuity via `--resume` |
| `agent/task_runner.py` | Background task runner with Feishu notifications |
| `agent/scheduler.py` | Reminder scheduler with persistence |
| `agent/audit.py` | Audit log of all interactions |

**Flow:** User message → Feishu WebSocket → Handler → (passthrough or LLM) → Session Manager → `claude -p` → Response back to Feishu
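
The component table describes `router/rpc.py` as request correlation with `asyncio.Future` plus timeout handling. The following is a minimal sketch of that general technique, not the module's actual code: the class name `PendingRequests`, the `id` field, and the 30-second default are all assumptions.

```python
import asyncio
import uuid


class PendingRequests:
    """Hypothetical correlation table: request id -> Future awaiting the reply."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def call(self, send, payload: dict, timeout: float = 30.0) -> dict:
        """Send a request and wait for the reply that carries the same id."""
        request_id = uuid.uuid4().hex
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await send({"id": request_id, **payload})
            # Raises asyncio.TimeoutError if no reply arrives in time.
            return await asyncio.wait_for(future, timeout)
        finally:
            # Always forget the id so that late replies are ignored.
            self._pending.pop(request_id, None)

    def resolve(self, message: dict) -> bool:
        """Deliver an incoming reply; False if nobody is waiting for it."""
        future = self._pending.get(message.get("id"))
        if future is None or future.done():
            return False
        future.set_result(message)
        return True
```

Here `send` stands for any coroutine that writes a message to the WebSocket, and the receive loop would pass every incoming reply to `resolve`.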

---

## Feishu App Setup

### 1. Create App

Go to [Feishu Open Platform](https://open.feishu.cn/app) → **Create App** → **Custom App**.

Record the **App ID** and **App Secret** from the Credentials page.

### 2. Enable Bot Capability

**App Features** → **Bot** → Enable.

### 3. Subscribe to Events (Long-connection mode)

**Event Subscriptions** → **Request URL** tab:

- Switch to **"Use long connection to receive events"** (长连接接收事件)
- No public URL required

Add event subscription:

| Event | Event Type |
|-------|-----------|
| Receive messages | `im.message.receive_v1` |

### 4. Required Permissions (API Scopes)

Go to **Permissions & Scopes** and add the following:

| Permission | API Scope | Used For |
|------------|-----------|----------|
| Read private messages sent to the bot | `im:message` (read) | Receiving user messages via WebSocket |
| Send messages | `im:message:send_as_bot` | Sending text replies |
| Upload files | `im:resource` | Uploading files before sending |
| Send messages in private chats | `im:message` (write) | Sending file messages |

Minimal scope list to request:

```
im:message
im:message:send_as_bot
im:resource
```

> **Note:** `im:resource` covers both file upload (`im.v1.file.create`) and sending
> file-type messages. Without it, `send_file()` will fail with a permission error.

### 5. Publish App

After adding all permissions:

1. **Version Management** → Create a new version → Submit for review (or self-publish if in the same org)
2. Install the app to your workspace

---

## Configuration

Copy and fill in credentials:

```bash
cp keyring.example.yaml keyring.yaml
```

`keyring.yaml` fields:

```yaml
FEISHU_APP_ID: cli_xxxxxxxxxxxxxxxx
FEISHU_APP_SECRET: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

OPENAI_BASE_URL: https://open.bigmodel.cn/api/paas/v4/
OPENAI_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL: glm-4.7

# Server configuration
# Only used in router mode (python main.py) or standalone mode (python standalone.py)
# Default: 8000
PORT: 8000

# Root directory for all project sessions (absolute path)
# Only used in standalone mode (python standalone.py)
# In router mode (python main.py), this field is ignored
WORKING_DIR: C:/Users/yourname/projects

# Allowlist of Feishu open_ids that may use the bot.
# Leave empty to allow all users.
ALLOWED_OPEN_IDS:
  - ou_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# Optional: Metaso (秘塔AI) Search API key for web search functionality
# Get your key at: https://metaso.cn/search-api/api-keys
METASO_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# Optional: Multi-host mode configuration
# Set ROUTER_MODE to true to enable router mode (deploy on public VPS)
ROUTER_MODE: false
ROUTER_SECRET: your-shared-secret-for-router-host-auth
```

### Host Client Configuration (for multi-host mode)

Copy and fill in credentials:

```bash
cp host_config.example.yaml host_config.yaml
```

Then fill in `host_config.yaml` on each host client machine:

```yaml
NODE_ID: home-pc
DISPLAY_NAME: Home PC
ROUTER_URL: ws://192.168.1.100:8000/ws/node
ROUTER_SECRET: <shared_secret>

OPENAI_BASE_URL: https://open.bigmodel.cn/api/paas/v4/
OPENAI_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL: glm-4.7

WORKING_DIR: C:/Users/me/projects
METASO_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# Which Feishu open_ids this node serves
SERVES_USERS:
  - ou_abc123def456
```

#### Determining `ROUTER_URL`

`ROUTER_URL` is the WebSocket address of the machine running the router (`python main.py`).

| Scenario | Value |
|---|---|
| Router and host client on the same LAN | `ws://192.168.x.x:8000/ws/node` |
| Router on a public VPS (no TLS) | `ws://your-server-ip:8000/ws/node` |
| Router behind a reverse proxy with TLS | `wss://yourdomain.com/ws/node` |

To find the router machine's LAN IP on Windows: `ipconfig` → look for IPv4 Address under your active adapter.

The router binds to `0.0.0.0:8000`, so any IP or hostname that reaches that machine on port 8000 will work.

#### Determining `ROUTER_SECRET`

`ROUTER_SECRET` is a shared secret that authenticates host client connections. Generate it once on any machine:

```bash
python -c "import secrets; print(secrets.token_hex(32))"
```

Copy the output. Set the **same value** in:
- `keyring.yaml` on the router machine (under `ROUTER_SECRET`)
- `host_config.yaml` on every host client machine (under `ROUTER_SECRET`)

If the secrets don't match, the router will reject the connection with code 4001.

In standalone mode (`python standalone.py`), the secret is auto-generated at startup and never needs to be configured.


---

## Installation & Run

**Requirements:** Python 3.11+, [Claude Code CLI](https://claude.ai/code) installed and authenticated.

```bash
python -m venv .venv
source .venv/Scripts/activate   # Windows (Git Bash)
# source .venv/bin/activate     # Linux/macOS

pip install -r requirements.txt
```

### Standalone mode (single machine)

Runs router + host client in one process. This is the normal setup for personal use.

```bash
cp keyring.example.yaml keyring.yaml
# Fill in keyring.yaml, then:
python standalone.py
```

### Multi-host mode

**Router** (public VPS or any reachable machine — runs the Feishu bot):

```bash
# keyring.yaml must have ROUTER_MODE: true and ROUTER_SECRET set
python main.py
```

**Host client** (your dev machine behind NAT — runs Claude Code):

```bash
# Fill in host_config.yaml with ROUTER_URL and ROUTER_SECRET, then:
python -m host_client.main
```

The router and every host client must use the same `ROUTER_SECRET`. See **Determining `ROUTER_SECRET`** under Configuration for how to generate one.

### Health check

```
GET /health
```

---

## Bot Commands

| Command | Description |
|---------|-------------|
| `/new <dir> [msg]` | Create a new Claude Code session in `<dir>` |
| `/new <dir> [msg] --timeout N` | Create with custom CC timeout (seconds) |
| `/new <dir> [msg] --idle N` | Create with custom idle timeout (seconds) |
| `/status` | Show your sessions and current mode |
| `/switch <n>` | Switch active session to number `<n>` from `/status` |
| `/close [n]` | Close active session (or session `<n>`) |
| `/direct` | Direct mode: messages go straight to Claude Code (no LLM overhead) |
| `/smart` | Smart mode: messages go through LLM for intelligent routing (default) |
| `/shell <cmd>` | Run a shell command directly (bypasses LLM) |
| `/remind <time> <msg>` | Set a reminder (e.g., `/remind 10m check build`) |
| `/tasks` | List background tasks with status |
| `/nodes` | List connected host nodes (multi-host mode) |
| `/node <name>` | Switch active node (multi-host mode) |
| `/help` | Show command reference |

### Message Routing Modes

**Smart mode (default):** Messages are analyzed by the LLM, which decides whether to create a new session, send to an existing one, or ask for clarification. Useful when you want the bot to understand natural language requests.

**Direct mode:** Messages go straight to the active Claude Code session, bypassing the LLM. Faster and more predictable, but requires an active session. Use `/direct` to enable.

### Claude Code Commands

Claude Code slash commands (like `/help`, `/clear`, `/compact`, `/cost`) are passed through to Claude Code when you have an active session. Bot commands (`/new`, `/status`, `/switch`, etc.) are handled by the bot first.
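
The precedence described above (bot commands first, other slash commands passed through to an active session) can be sketched as a small classifier. The function `dispatch` and its return labels are hypothetical, and the `"unhandled"` result for an unknown slash command without a session is an assumption, since the README does not describe that case.

```python
# Bot commands listed in the command table of this README.
BOT_COMMANDS = {
    "/new", "/status", "/switch", "/close", "/direct", "/smart",
    "/shell", "/remind", "/tasks", "/nodes", "/node", "/help",
}


def dispatch(text: str, has_active_session: bool) -> str:
    """Classify a message as 'bot', 'claude', 'chat', or 'unhandled'."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return "chat"  # ordinary message, routed by the current mode
    command = stripped.split(maxsplit=1)[0].lower()
    if command in BOT_COMMANDS:
        return "bot"  # bot commands always win
    # Any other slash command is forwarded only when a session exists.
    return "claude" if has_active_session else "unhandled"
```

Note that `/help` appears in both lists, so under this ordering the bot's own help takes priority.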

---

## Features

### Prototype Consolidation (Milestone 1)

#### Core Reliability

- **Message splitting** - Long responses automatically split into multiple messages instead of getting cut off
- **Concurrent handling** - Multiple users can message the bot simultaneously without conflicts
- **Session persistence** - Active sessions survive server restarts (saved to disk)
- **Direct mode** - Messages go straight to Claude Code, skipping the LLM for faster responses
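
Message splitting can be illustrated with a short sketch. The 4000-character default is a placeholder rather than Feishu's documented limit, and preferring newline boundaries is a design assumption, not necessarily what the bot does.

```python
def split_message(text: str, limit: int = 4000) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to break at the last newline inside the window so that
    lines are not cut in the middle when that can be avoided.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```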

#### Better Interaction

- **Slash commands** - Direct control via `/new`, `/status`, `/switch`, `/close`, `/direct`, `/smart`
- **Multi-session switching** - Multiple projects open simultaneously, switch between them
- **Interactive cards** - Session status displayed in Feishu message cards

#### Operational Quality

- **Health checks** - `/health` endpoint shows WebSocket status and can test Claude Code connectivity
- **Auto-reconnection** - WebSocket automatically reconnects if the connection drops
- **Configurable timeouts** - Each session can have custom idle and execution timeout settings
- **Audit logging** - All conversations logged to files for debugging and accountability
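
The README states only that the WebSocket reconnects automatically. One common way to pace reconnection attempts is capped exponential backoff, sketched below as an assumption; the generator name and default values are hypothetical.

```python
from collections.abc import Iterator


def backoff_delays(base: float = 1.0, cap: float = 60.0) -> Iterator[float]:
    """Yield reconnect delays: base, 2*base, 4*base, ... capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)
```

A reconnect loop would sleep for the next delay after each failed attempt and create a fresh generator once a connection succeeds.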

#### Security

- **User allowlist** - Configure which Feishu users are allowed to use the bot
- **Session isolation** - Each user can only see and access their own sessions
- **Path sandboxing** - Sessions can only run inside the allowed working directory, blocking path traversal attacks
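
Path sandboxing of this kind is typically implemented by resolving the requested path and checking that it stays under the allowed root. The sketch below shows that technique with `pathlib`; the function name `resolve_in_sandbox` is hypothetical and the project's actual check may differ.

```python
from pathlib import Path


def resolve_in_sandbox(working_dir: str, requested: str) -> Path:
    """Resolve `requested` under `working_dir`, rejecting any escape."""
    root = Path(working_dir).resolve()
    # An absolute `requested` replaces `root` when joined; the check below catches it.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{requested!r} is outside the working directory")
    return candidate
```

Resolving before comparing is what defeats `..` segments and symlinks that point outside the root.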

### Versatile Assistant (Milestone 2)

#### Direct Q&A
- Ask general knowledge questions without creating a Claude Code session
- The LLM answers directly using its own knowledge (e.g., "what is a Python generator?")
- Automatic detection of question-like messages

#### Shell Access
- Execute shell commands remotely via `/shell` or through the LLM
- Safety guards block destructive commands (`rm -rf /`, `sudo rm`, `mkfs`, etc.)
- Configurable timeout (max 120 seconds)
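
The guard-then-run behaviour above can be sketched as follows. The patterns are illustrative only, since the real blocklist is not documented here, and the names `is_blocked` and `run_guarded` are hypothetical. A pattern blocklist is a safety net, not a security boundary.

```python
import re
import subprocess

MAX_TIMEOUT_SECONDS = 120  # upper bound stated in this README

# Illustrative patterns only; the project's actual blocklist may differ.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-\w*r\w*\s+/(\s|$)"),  # rm -rf / and flag variants
    re.compile(r"\bsudo\s+rm\b"),
    re.compile(r"\bmkfs(\.\w+)?\b"),
]


def is_blocked(command: str) -> bool:
    """Return True if the command matches a destructive pattern."""
    return any(pattern.search(command) for pattern in BLOCKED_PATTERNS)


def run_guarded(command: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a shell command unless it is blocked, clamping the timeout."""
    if is_blocked(command):
        raise PermissionError(f"blocked command: {command!r}")
    effective = min(max(timeout, 1.0), MAX_TIMEOUT_SECONDS)
    return subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=effective
    )
```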

#### File Operations
- **Read files** - View file content with line numbers
- **Write files** - Create or append to files
- **List directories** - Browse project structure
- **Search content** - Grep-like search across text files
- **Send files** - Deliver files directly to Feishu chat

#### Background Tasks
- Long-running tasks (timeout > 60s) automatically run in background
- Immediate acknowledgment with task ID
- Feishu notification on completion
- Track task status with `/tasks` command

#### Web Search
- Search the web via Metaso (秘塔AI) Search (requires `METASO_API_KEY`)
- Fetch and extract content from URLs
- Ask questions with RAG-powered answers
- Supports multiple scopes: webpage, paper, document, video, podcast

#### Scheduling & Reminders
- Set one-time reminders: `/remind 10m check the build`
- Schedule recurring reminders
- Notifications delivered to Feishu
- Persistent across server restarts
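
The `/remind 10m check the build` form suggests a compact duration followed by free text. Only the `10m` form is documented; the other units (`s`, `h`, `d`) and combined forms such as `1h30m` in this sketch are assumptions, as are the function names.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PART = re.compile(r"(\d+)([smhd])")


def parse_duration(spec: str) -> timedelta:
    """Parse strings such as '10m' or '1h30m' into a timedelta."""
    spec = spec.strip().lower()
    if not re.fullmatch(r"(?:\d+[smhd])+", spec):
        raise ValueError(f"unrecognized duration: {spec!r}")
    total = timedelta()
    for amount, unit in _PART.findall(spec):
        total += timedelta(**{_UNITS[unit]: int(amount)})
    return total


def parse_remind(args: str) -> tuple[timedelta, str]:
    """Split the arguments of '/remind' into (delay, message)."""
    time_part, _, message = args.strip().partition(" ")
    return parse_duration(time_part), message.strip()
```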

### Multi-Host Architecture (Milestone 3)

#### Deployment Options

**Single-Machine Mode:**
```bash
python standalone.py
```
Runs both the router and the host client in one process. The user experience is identical to the pre-Milestone 3 setup.

**Router Mode (Public VPS):**
```bash
# Set ROUTER_MODE: true in keyring.yaml
python main.py
```
Runs only the router: Feishu handler + routing LLM + node registry.

**Host Client Mode (Behind NAT):**
```bash
# Create host_config.yaml with ROUTER_URL and ROUTER_SECRET
python -m host_client.main
```
Connects to router via WebSocket, runs full mailboy stack locally.

#### Node Management
- `/nodes` — View all connected host nodes with status
- `/node <name>` — Switch active node for your user
- Automatic routing: LLM decides which node handles each message
- Health monitoring: Router tracks node heartbeats
- Reconnection: Host clients auto-reconnect on disconnect
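
Heartbeat-based health monitoring can be sketched with a last-seen table and a staleness threshold. The class `HeartbeatMonitor`, its method names, and the 30-second timeout are hypothetical; the README does not specify the actual interval or data structure.

```python
import time


class HeartbeatMonitor:
    """Hypothetical tracker: a node is healthy if it pinged recently."""

    def __init__(self, timeout: float = 30.0, clock=time.monotonic) -> None:
        self._timeout = timeout
        self._clock = clock  # injectable so tests can control time
        self._last_seen: dict[str, float] = {}

    def beat(self, node_id: str) -> None:
        """Record a heartbeat from a node."""
        self._last_seen[node_id] = self._clock()

    def is_alive(self, node_id: str) -> bool:
        """True if the node sent a heartbeat within the timeout window."""
        last = self._last_seen.get(node_id)
        return last is not None and self._clock() - last <= self._timeout

    def stale_nodes(self) -> list[str]:
        """Return ids of nodes whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(n for n, t in self._last_seen.items() if now - t > self._timeout)
```

Using a monotonic clock avoids false timeouts when the system wall clock is adjusted.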

#### Security
- Shared secret authentication between router and host clients
- User isolation: Each node only serves configured users
- Path sandboxing: Sessions restricted to WORKING_DIR