PhoneWork/README.md
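Yuyao Huang (Sam) 8ecc701d5e feat: add a task scheduler, a background task runner, and support for multiple tools
Implement a background task scheduler (scheduler.py) and task runner (task_runner.py) that support asynchronous execution and status tracking of long-running tasks
Add several tools: shell command execution, file operations (read/write/search/send), web search/Q&A, scheduled reminders, and more
Extend the README and ROADMAP documents to describe the new features and the planned multi-host architecture
Add METASO_API_KEY to the configuration file to support the 秘塔AI (Metaso) search feature
Improve the agent logic so that general questions are recognized automatically and answered directly without creating a session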
2026-03-28 13:45:20 +08:00


# PhoneWork
Feishu bot that lets users control Claude Code CLI from their phone.
## Architecture
```
┌─────────────┐    WebSocket     ┌──────────────┐    LangChain     ┌─────────────┐
│   Feishu    │ ◄──────────────► │   FastAPI    │ ◄──────────────► │   LLM API   │
│  (client)   │                  │   (server)   │                  │  (ZhipuAI)  │
└─────────────┘                  └──────────────┘                  └─────────────┘
                                 ┌─────────────┐
                                 │ Claude Code │
                                 │ (headless)  │
                                 └─────────────┘
```
**Components:**
| Module | Purpose |
|--------|---------|
| `main.py` | FastAPI entry point, starts WebSocket client + session manager + scheduler |
| `bot/handler.py` | Receives Feishu events via long-connection WebSocket |
| `bot/feishu.py` | Sends text/file/card replies back to Feishu |
| `bot/commands.py` | Slash command handler (`/new`, `/status`, `/shell`, `/remind`, `/tasks`, etc.) |
| `orchestrator/agent.py` | LangChain agent with per-user history + direct/smart mode + direct Q&A |
| `orchestrator/tools.py` | Tools: session management, shell, file ops, web search, scheduler, task status |
| `agent/manager.py` | Session registry with persistence, idle timeout, and auto-background tasks |
| `agent/pty_process.py` | Runs `claude -p` headlessly, manages session continuity via `--resume` |
| `agent/task_runner.py` | Background task runner with Feishu notifications |
| `agent/scheduler.py` | Reminder scheduler with persistence |
| `agent/audit.py` | Audit log of all interactions |
**Flow:** User message → Feishu WebSocket → Handler → (passthrough or LLM) → Session Manager → `claude -p` → Response back to Feishu
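
The last hop of the flow is a headless `claude -p` call that reuses a session via `--resume`. A minimal sketch of how such a command line could be assembled is shown here; `build_claude_argv` is a hypothetical helper, not the code in `agent/pty_process.py`, and it uses only the two flags named in the table above.

```python
from typing import List, Optional


def build_claude_argv(prompt: str, session_id: Optional[str] = None) -> List[str]:
    """Build the argv for one headless Claude Code turn (illustrative helper)."""
    argv = ["claude", "-p", prompt]
    if session_id:
        # Continue an earlier conversation instead of starting a fresh one.
        argv += ["--resume", session_id]
    return argv
```

Passing the list to `subprocess.run(argv, cwd=project_dir)` avoids shell quoting problems with user-supplied prompts.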
---
## Feishu App Setup
### 1. Create App
Go to [Feishu Open Platform](https://open.feishu.cn/app) → **Create App** → **Custom App**.
Record the **App ID** and **App Secret** from the Credentials page.
### 2. Enable Bot Capability
**App Features** → **Bot** → Enable.
### 3. Subscribe to Events (Long-connection mode)
**Event Subscriptions** → **Request URL** tab:
- Switch to **"Use long connection to receive events"** (长连接接收事件)
- No public URL required
Add event subscription:
| Event | Event Type |
|-------|-----------|
| Receive messages | `im.message.receive_v1` |
### 4. Required Permissions (API Scopes)
Go to **Permissions & Scopes** and add the following:
| Permission | API Scope | Used For |
|------------|-----------|----------|
| Read private messages sent to the bot | `im:message` (read) | Receiving user messages via WebSocket |
| Send messages | `im:message:send_as_bot` | Sending text replies |
| Upload files | `im:resource` | Uploading files before sending |
| Send messages in private chats | `im:message` (write) | Sending file messages |
Minimal scope list to request:
```
im:message
im:message:send_as_bot
im:resource
```
> **Note:** `im:resource` covers both file upload (`im.v1.file.create`) and sending
> file-type messages. Without it, `send_file()` will fail with a permission error.
### 5. Publish App
After adding all permissions:
1. **Version Management** → Create a new version → Submit for review (or self-publish if in the same org)
2. Install the app to your workspace
---
## Configuration
Copy and fill in credentials:
```bash
cp keyring.example.yaml keyring.yaml
```
`keyring.yaml` fields:
```yaml
FEISHU_APP_ID: cli_xxxxxxxxxxxxxxxx
FEISHU_APP_SECRET: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_BASE_URL: https://open.bigmodel.cn/api/paas/v4/
OPENAI_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL: glm-4.7
# Root directory for all project sessions (absolute path)
WORKING_DIR: C:/Users/yourname/projects
# Allowlist of Feishu open_ids that may use the bot.
# Leave empty to allow all users.
ALLOWED_OPEN_IDS:
- ou_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional: 秘塔AI Search API key for web search functionality
# Get your key at: https://metaso.cn/search-api/api-keys
METASO_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
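
The `ALLOWED_OPEN_IDS` comment says an empty list allows everyone. A small sketch of that rule follows; `is_allowed` is a hypothetical name, and the real check in the bot may be structured differently.

```python
from typing import List, Optional


def is_allowed(open_id: str, allowed_open_ids: Optional[List[str]]) -> bool:
    """Return True when the sender may use the bot."""
    if not allowed_open_ids:
        # Missing or empty allowlist: every user is accepted.
        return True
    return open_id in allowed_open_ids
```

Note the consequence of this rule: deleting the last entry from the list opens the bot to every user who can message it.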
---
## Installation & Run
**Requirements:** Python 3.11+, [Claude Code CLI](https://claude.ai/code) installed and authenticated.
```bash
python -m venv .venv
source .venv/Scripts/activate # Windows (Git Bash)
# source .venv/bin/activate # Linux/macOS
pip install -r requirements.txt
python main.py
```
Server listens on `http://0.0.0.0:8000`.
Health check: `GET /health`
Claude smoke test: `GET /health/claude`
Active sessions: `GET /sessions`
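
The three endpoints can be polled from a script. The sketch below uses only the standard library and assumes the server is reachable on `127.0.0.1:8000`; the response bodies are not documented here, so they are returned as raw text rather than parsed.

```python
import urllib.request
from typing import Tuple

# Assumption: the server runs locally on the default port.
BASE_URL = "http://127.0.0.1:8000"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch(path: str, timeout: float = 5.0) -> Tuple[int, str]:
    """GET an endpoint and return (HTTP status, raw body text)."""
    with urllib.request.urlopen(endpoint_url(path), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

For example, `fetch("/health")` returns the status code and body of the health check once the server is running.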
---
## Bot Commands
| Command | Description |
|---------|-------------|
| `/new <dir> [msg]` | Create a new Claude Code session in `<dir>` |
| `/new <dir> [msg] --timeout N` | Create with custom CC timeout (seconds) |
| `/new <dir> [msg] --idle N` | Create with custom idle timeout (seconds) |
| `/status` | Show your sessions and current mode |
| `/switch <n>` | Switch active session to number `<n>` from `/status` |
| `/close [n]` | Close active session (or session `<n>`) |
| `/direct` | Direct mode: messages go straight to Claude Code (no LLM overhead) |
| `/smart` | Smart mode: messages go through LLM for intelligent routing (default) |
| `/shell <cmd>` | Run a shell command directly (bypasses LLM) |
| `/remind <time> <msg>` | Set a reminder (e.g., `/remind 10m check build`) |
| `/tasks` | List background tasks with status |
| `/help` | Show command reference |
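
To illustrate the `/new` syntax in the table, here is a minimal parsing sketch. `NewCommand` and `parse_new` are hypothetical names, and the sketch assumes whitespace-separated tokens, so directories containing spaces are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NewCommand:
    directory: str
    message: str
    timeout: Optional[int] = None  # Claude Code timeout in seconds
    idle: Optional[int] = None     # idle timeout in seconds


def parse_new(text: str) -> NewCommand:
    """Parse '/new <dir> [msg] [--timeout N] [--idle N]' (illustrative only)."""
    tokens = text.split()
    if len(tokens) < 2 or tokens[0] != "/new":
        raise ValueError("usage: /new <dir> [msg] [--timeout N] [--idle N]")
    options: Dict[str, int] = {}
    words: List[str] = []
    i = 2
    while i < len(tokens):
        token = tokens[i]
        if token in ("--timeout", "--idle"):
            if i + 1 >= len(tokens):
                raise ValueError(f"{token} needs a value")
            options[token[2:]] = int(tokens[i + 1])  # raises ValueError if not numeric
            i += 2
        else:
            words.append(token)
            i += 1
    return NewCommand(tokens[1], " ".join(words), options.get("timeout"), options.get("idle"))
```

With this sketch, `/new myproj fix the tests --timeout 300` yields the directory `myproj`, the message `fix the tests`, and a 300-second timeout.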
### Message Routing Modes
**Smart mode (default):** Messages are analyzed by the LLM, which decides whether to answer directly, create a new session, forward the message to an existing session, or ask for clarification. Useful when you want the bot to understand natural language requests.
**Direct mode:** Messages go straight to the active Claude Code session, bypassing the LLM. Faster and more predictable, but requires an active session. Use `/direct` to enable.
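
The difference between the two modes can be sketched as a small routing function. The `Mode` enum and the returned labels are assumptions made for illustration; they do not mirror the actual handler code.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    SMART = "smart"
    DIRECT = "direct"


def route(mode: Mode, active_session: Optional[str]) -> str:
    """Decide where a plain (non-command) message goes."""
    if mode is Mode.DIRECT:
        if active_session is None:
            # Direct mode has nothing to forward to without a session.
            return "error:no-active-session"
        return f"session:{active_session}"
    # Smart mode: the LLM picks the next step.
    return "llm"
```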
### Claude Code Commands
Claude Code slash commands (like `/help`, `/clear`, `/compact`, `/cost`) are passed through to Claude Code when you have an active session. Bot commands (`/new`, `/status`, `/switch`, etc.) are handled by the bot first.
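
The precedence rule can be sketched as follows. The `BOT_COMMANDS` set is taken from the command table; treating a name that both sides recognize (such as `/help`) as a bot command follows the "handled by the bot first" wording and is otherwise an assumption.

```python
BOT_COMMANDS = {
    "/new", "/status", "/switch", "/close", "/direct",
    "/smart", "/shell", "/remind", "/tasks", "/help",
}


def classify(text: str, has_active_session: bool) -> str:
    """Return 'message', 'bot', 'claude', or 'unknown' for an incoming text."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return "message"
    name = stripped.split(maxsplit=1)[0]
    if name in BOT_COMMANDS:
        return "bot"  # bot commands win over pass-through
    # Anything else is forwarded to Claude Code, which needs a session.
    return "claude" if has_active_session else "unknown"
```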
---
## Features
### Core Reliability
- **Message splitting** - Long responses automatically split into multiple messages instead of getting cut off
- **Concurrent handling** - Multiple users can message the bot simultaneously without conflicts
- **Session persistence** - Active sessions survive server restarts (saved to disk)
- **Direct mode** - Messages go straight to Claude Code, skipping the LLM for faster responses
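
As an illustration of message splitting, the sketch below cuts a long reply into chunks and prefers to break at a newline. The 4000-character limit and the name `split_message` are assumptions; the actual limit used for Feishu messages is not stated here.

```python
from typing import List


def split_message(text: str, limit: int = 4000) -> List[str]:
    """Split text into chunks of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    remaining = text
    while len(remaining) > limit:
        # Prefer the last newline that keeps the chunk within the limit.
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut
        chunks.append(remaining[:cut])
        # Drop the newline(s) at the boundary so the next chunk starts on text.
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks
```

Breaking at newlines keeps code lines and list items intact where possible; a stricter version would also avoid splitting inside fenced code blocks.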
### Better Interaction
- **Slash commands** - Direct control via `/new`, `/status`, `/switch`, `/close`, `/direct`, `/smart`
- **Multi-session switching** - Multiple projects open simultaneously, switch between them
- **Interactive cards** - Session status displayed in Feishu message cards
### Operational Quality
- **Health checks** - `/health` reports WebSocket status, and `/health/claude` runs a Claude Code smoke test
- **Auto-reconnection** - WebSocket automatically reconnects if the connection drops
- **Configurable timeouts** - Each session can have custom idle and execution timeout settings
- **Audit logging** - All conversations logged to files for debugging and accountability
### Security
- **User allowlist** - Configure which Feishu users are allowed to use the bot
- **Session isolation** - Each user can only see and access their own sessions
- **Path sandboxing** - Sessions can only run inside the allowed working directory, blocking path traversal attacks
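
A common way to implement path sandboxing is to resolve the requested path and check that it still lies under the working directory. The sketch below shows that technique with `pathlib`; `resolve_in_sandbox` is a hypothetical helper, not necessarily how the session manager does it.

```python
from pathlib import Path


def resolve_in_sandbox(working_dir: str, requested: str) -> Path:
    """Resolve `requested` under `working_dir`, rejecting escapes."""
    root = Path(working_dir).resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"{requested!r} is outside the working directory")
    return candidate
```

Resolving before comparing matters: a plain string-prefix check would accept `projects/../../etc`.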
### Versatile Assistant (Milestone 2)
#### Direct Q&A
- Ask general knowledge questions without creating a Claude Code session
- The LLM answers directly using its own knowledge (e.g., "what is a Python generator?")
- Automatic detection of question-like messages
#### Shell Access
- Execute shell commands remotely via `/shell` or through the LLM
- Safety guards block destructive commands (`rm -rf /`, `sudo rm`, `mkfs`, etc.)
- Configurable timeout (max 120 seconds)
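
The guard and the timeout cap could look roughly like this. The patterns cover only the three examples listed above and are illustrative; a pattern blocklist is easy to bypass and is not a complete defense. All names here are hypothetical.

```python
import re
import subprocess

MAX_TIMEOUT = 120  # seconds, the cap stated above

# Illustrative patterns for the examples in the list; not exhaustive.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),
    re.compile(r"\bsudo\s+rm\b"),
    re.compile(r"\bmkfs\b"),
]


def is_blocked(cmd: str) -> bool:
    return any(pattern.search(cmd) for pattern in BLOCKED_PATTERNS)


def clamp_timeout(seconds: int) -> int:
    """Keep the requested timeout within 1..MAX_TIMEOUT."""
    return max(1, min(seconds, MAX_TIMEOUT))


def run_shell(cmd: str, timeout: int = 30) -> str:
    if is_blocked(cmd):
        raise PermissionError(f"blocked command: {cmd!r}")
    done = subprocess.run(
        cmd, shell=True, capture_output=True, text=True,
        timeout=clamp_timeout(timeout),
    )
    return done.stdout + done.stderr
```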
#### File Operations
- **Read files** - View file content with line numbers
- **Write files** - Create or append to files
- **List directories** - Browse project structure
- **Search content** - Grep-like search across text files
- **Send files** - Deliver files directly to Feishu chat
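
For reference, reading a file with line numbers can be sketched in a few lines. `read_numbered`, its output format, and the 200-line default window are assumptions made for this example, not the tool's actual behavior.

```python
from pathlib import Path


def read_numbered(path: str, start: int = 1, max_lines: int = 200) -> str:
    """Return up to `max_lines` lines from `path`, each prefixed with its 1-based number."""
    start = max(1, start)
    lines = Path(path).read_text(encoding="utf-8", errors="replace").splitlines()
    window = lines[start - 1 : start - 1 + max_lines]
    return "\n".join(f"{n:>5}  {line}" for n, line in enumerate(window, start=start))
```

Limiting the window keeps large files from flooding the chat with one oversized reply.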
#### Background Tasks
- Tasks with a timeout above 60 seconds automatically run in the background
- Immediate acknowledgment with task ID
- Feishu notification on completion
- Track task status with `/tasks` command
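
The behavior above can be sketched with a thread and an in-memory registry. This is a simplified illustration under stated assumptions: `run_or_background`, the `tasks` dict, and the `notify` callback are hypothetical, and the real `agent/task_runner.py` may use a different concurrency model.

```python
import threading
import uuid
from typing import Callable, Dict, Tuple

BACKGROUND_THRESHOLD = 60  # seconds, from the rule above

tasks: Dict[str, dict] = {}  # task_id -> {"status", "result", "thread"}


def run_or_background(
    job: Callable[[], str],
    timeout: int,
    notify: Callable[[str, str], None],
) -> Tuple[str, str]:
    """Run `job` inline, or in a background thread when timeout > 60 s."""
    if timeout <= BACKGROUND_THRESHOLD:
        return ("done", job())

    task_id = uuid.uuid4().hex[:8]
    tasks[task_id] = {"status": "running", "result": None}

    def worker() -> None:
        try:
            tasks[task_id].update(status="done", result=job())
        except Exception as exc:  # record the failure instead of losing it
            tasks[task_id].update(status="failed", result=str(exc))
        notify(task_id, tasks[task_id]["status"])  # e.g. send a Feishu message

    thread = threading.Thread(target=worker, daemon=True)
    tasks[task_id]["thread"] = thread
    thread.start()
    # Immediate acknowledgment: the caller gets the task ID right away.
    return ("background", task_id)
```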
#### Web Search
- Search the web via 秘塔AI Search (requires `METASO_API_KEY`)
- Fetch and extract content from URLs
- Ask questions with RAG-powered answers
- Supports multiple scopes: webpage, paper, document, video, podcast
#### Scheduling & Reminders
- Set one-time reminders: `/remind 10m check the build`
- Schedule recurring reminders
- Notifications delivered to Feishu
- Persistent across server restarts
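
A sketch of how a one-time reminder such as `/remind 10m check the build` could be parsed into a due time is shown below. Only the `10m` form appears in this README; the `s`, `h`, and `d` units and both function names are assumptions.

```python
import re
from datetime import datetime, timedelta
from typing import Tuple

# Assumed unit suffixes; only "m" is confirmed by the examples above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_delay(spec: str) -> timedelta:
    """Turn a spec like '10m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", spec.strip().lower())
    if not match:
        raise ValueError(f"unrecognized delay: {spec!r}")
    return timedelta(seconds=int(match.group(1)) * UNIT_SECONDS[match.group(2)])


def parse_remind(text: str, now: datetime) -> Tuple[datetime, str]:
    """Parse '/remind <time> <msg>' into (due time, message)."""
    parts = text.split(maxsplit=2)
    if len(parts) < 3 or parts[0] != "/remind":
        raise ValueError("usage: /remind <time> <msg>")
    return now + parse_delay(parts[1]), parts[2]
```

Storing the absolute due time rather than the relative delay is what lets a reminder survive a server restart.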