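feat: restructure data storage paths and improve the task notification mechanism
Move audit logs, session data, and scheduled job files into a unified data directory. Add a completion callback for background tasks to improve the notification flow after a CC task finishes. Update the README and ROADMAP docs and mark completed features.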
parent 80e4953cf9
commit 52a9d085f7
8
.gitignore
vendored
@@ -70,3 +70,11 @@ dmypy.json
 
 # Ruff
 .ruff_cache/
+
+# Runtime data (sessions, audit logs, scheduled jobs)
+data/
+
+# Legacy paths (pre-consolidation)
+sessions.json
+scheduled_jobs.json
+audit/
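The ignore rules above cover both the consolidated `data/` directory and the legacy locations, which suggests existing checkouts may still hold files at the old paths. A minimal sketch of a one-off migration follows; `migrate_legacy_data` is a hypothetical helper for illustration and is not part of this commit.

```python
import shutil
from pathlib import Path

# Legacy locations named in the .gitignore hunk.
LEGACY_NAMES = ("sessions.json", "scheduled_jobs.json", "audit")


def migrate_legacy_data(project_root: Path) -> list[str]:
    """Move legacy runtime files into <project_root>/data and return the names moved.

    Hypothetical helper; an existing target is never overwritten.
    """
    data_dir = project_root / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in LEGACY_NAMES:
        src = project_root / name
        dst = data_dir / name
        if src.exists() and not dst.exists():
            shutil.move(str(src), str(dst))
            moved.append(name)
    return moved
```

Running it a second time is a no-op, because nothing is left at the legacy paths.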
119
README.md
@@ -4,27 +4,55 @@ Feishu bot that lets users control Claude Code CLI from their phone.
 
 ## Architecture
 
+PhoneWork uses a **Router + Host Client** architecture that supports both single-machine and multi-host deployments:
+
 ```
-┌─────────────┐    WebSocket     ┌──────────────┐    LangChain     ┌─────────────┐
-│   Feishu    │ ◄──────────────► │   FastAPI    │ ◄──────────────► │   LLM API   │
-│  (client)   │                  │   (server)   │                  │  (ZhipuAI)  │
-└─────────────┘                  └──────────────┘                  └─────────────┘
-                                        │
-                                        ▼
-                                 ┌─────────────┐
-                                 │ Claude Code │
-                                 │ (headless)  │
-                                 └─────────────┘
+┌─────────────────┐         ┌──────────┐  WebSocket   ┌────────────────────────────────────┐
+│  Feishu App     │         │  Feishu  │◄────────────►│  Router (public VPS)               │
+│  (User's Phone) │◄───────►│  Cloud   │              │  - Feishu event handler            │
+└─────────────────┘         └──────────┘              │  - Router LLM (routing only)       │
+                                                      │  - Node registry + active node map │
+                                                      └───────────┬────────────────────────┘
+                                                                  │ WebSocket (host clients connect in)
+                                                      ┌───────────┴────────────────────────┐
+                                                      │                                    │
+                                           ┌──────────▼──────────┐            ┌────────────▼────────┐
+                                           │  Host Client A      │            │  Host Client B      │
+                                           │  (home-pc)          │            │  (work-server)      │
+                                           │  - Mailboy LLM      │            │  - Mailboy LLM      │
+                                           │  - CC sessions      │            │  - CC sessions      │
+                                           │  - Shell / files    │            │  - Shell / files    │
+                                           └─────────────────────┘            └─────────────────────┘
 ```
 
+**Key design decisions:**
+- Host clients connect TO the router (outbound WebSocket) — NAT-transparent
+- A user can be registered on multiple nodes simultaneously
+- The **router LLM** decides *which node* to route each message to
+- The **node mailboy LLM** handles the full orchestration loop
+- Each node maintains its own conversation history per user
+
+**Deployment modes:**
+- **Standalone (`python standalone.py`):** Runs router + host client at localhost. Same architecture, simpler setup for single-machine use.
+- **Multi-host:** Router on a public VPS, host clients behind NAT on different machines.
+
 **Components:**
 
 | Module | Purpose |
 |--------|---------|
-| `main.py` | FastAPI entry point, starts WebSocket client + session manager + scheduler |
+| `standalone.py` | Single-process entry point: runs router + host client together |
+| `main.py` | FastAPI entry point for router-only mode |
+| `shared/protocol.py` | Wire protocol for router-host communication |
+| `router/main.py` | FastAPI app factory, mounts `/ws/node` endpoint |
+| `router/nodes.py` | Node registry, connection management, user-to-node mapping |
+| `router/ws.py` | WebSocket endpoint for host clients, heartbeat, message routing |
+| `router/rpc.py` | Request correlation with asyncio.Future, timeout handling |
+| `router/routing_agent.py` | Single-shot routing LLM to decide which node handles each message |
+| `host_client/main.py` | WebSocket client connecting to router, message handling, reconnection |
+| `host_client/config.py` | Host client configuration loader |
 | `bot/handler.py` | Receives Feishu events via long-connection WebSocket |
-| `bot/feishu.py` | Sends text/file/card replies back to Feishu |
-| `bot/commands.py` | Slash command handler (`/new`, `/status`, `/shell`, `/remind`, `/tasks`, etc.) |
+| `bot/feishu.py` | Sends text/file replies back to Feishu |
+| `bot/commands.py` | Slash command handler (`/new`, `/status`, `/shell`, `/remind`, `/tasks`, `/nodes`, `/node`) |
 | `orchestrator/agent.py` | LangChain agent with per-user history + direct/smart mode + direct Q&A |
 | `orchestrator/tools.py` | Tools: session management, shell, file ops, web search, scheduler, task status |
 | `agent/manager.py` | Session registry with persistence, idle timeout, and auto-background tasks |
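The components table describes `router/rpc.py` as request correlation with `asyncio.Future` plus timeout handling. The module itself is not shown in this diff, so the following is only a sketch of that general pattern; the class name `PendingRequests` and its methods are assumptions, not the project's API.

```python
import asyncio
import uuid
from typing import Any, Dict


class PendingRequests:
    """Correlates outgoing requests with replies by request id (illustrative only)."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    def create(self) -> str:
        """Register a new pending request and return its id."""
        request_id = uuid.uuid4().hex
        self._pending[request_id] = asyncio.get_running_loop().create_future()
        return request_id

    def resolve(self, request_id: str, payload: Any) -> bool:
        """Deliver a reply; returns False if nobody is waiting for it."""
        future = self._pending.get(request_id)
        if future is None or future.done():
            return False
        future.set_result(payload)
        return True

    async def wait(self, request_id: str, timeout: float) -> Any:
        """Wait for the reply; raises asyncio.TimeoutError if it does not arrive in time."""
        future = self._pending[request_id]
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            # Always drop the entry so late replies are ignored instead of leaking.
            self._pending.pop(request_id, None)
```

The sender would call `create()`, put the id on the wire, and `await wait(...)`; the WebSocket receive loop would call `resolve()` when a reply carrying that id arrives.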
@@ -122,6 +150,33 @@ ALLOWED_OPEN_IDS:
 # Optional: 秘塔AI Search API key for web search functionality
 # Get your key at: https://metaso.cn/search-api/api-keys
 METASO_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+# Optional: Multi-host mode configuration
+# Set ROUTER_MODE to true to enable router mode (deploy on public VPS)
+ROUTER_MODE: false
+ROUTER_SECRET: your-shared-secret-for-router-host-auth
 ```
 
+### Host Client Configuration (for multi-host mode)
+
+Create `host_config.yaml` for each host client:
+
+```yaml
+NODE_ID: home-pc
+DISPLAY_NAME: Home PC
+ROUTER_URL: wss://router.example.com/ws/node
+ROUTER_SECRET: <shared_secret>
+
+OPENAI_BASE_URL: https://open.bigmodel.cn/api/paas/v4/
+OPENAI_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+OPENAI_MODEL: glm-4.7
+
+WORKING_DIR: C:/Users/me/projects
+METASO_API_KEY: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+# Which Feishu open_ids this node serves
+SERVES_USERS:
+  - ou_abc123def456
+```
+
 ---
@@ -162,6 +217,8 @@ Active sessions: `GET /sessions`
 | `/shell <cmd>` | Run a shell command directly (bypasses LLM) |
 | `/remind <time> <msg>` | Set a reminder (e.g., `/remind 10m check build`) |
 | `/tasks` | List background tasks with status |
+| `/nodes` | List connected host nodes (multi-host mode) |
+| `/node <name>` | Switch active node (multi-host mode) |
 | `/help` | Show command reference |
 
 ### Message Routing Modes
@@ -240,3 +297,39 @@ Claude Code slash commands (like `/help`, `/clear`, `/compact`, `/cost`) are pas
 - Schedule recurring reminders
 - Notifications delivered to Feishu
 - Persistent across server restarts
+
+### Multi-Host Architecture (Milestone 3)
+
+#### Deployment Options
+
+**Single-Machine Mode:**
+```bash
+python standalone.py
+```
+Runs both router and host client in one process. Identical UX to pre-M3 setup.
+
+**Router Mode (Public VPS):**
+```bash
+# Set ROUTER_MODE: true in keyring.yaml
+python main.py
+```
+Runs only the router: Feishu handler + routing LLM + node registry.
+
+**Host Client Mode (Behind NAT):**
+```bash
+# Create host_config.yaml with ROUTER_URL and ROUTER_SECRET
+python -m host_client.main
+```
+Connects to router via WebSocket, runs full mailboy stack locally.
+
+#### Node Management
+- `/nodes` — View all connected host nodes with status
+- `/node <name>` — Switch active node for your user
+- Automatic routing: LLM decides which node handles each message
+- Health monitoring: Router tracks node heartbeats
+- Reconnection: Host clients auto-reconnect on disconnect
+
+#### Security
+- Shared secret authentication between router and host clients
+- User isolation: Each node only serves configured users
+- Path sandboxing: Sessions restricted to WORKING_DIR
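The Security list mentions shared-secret authentication between the router and host clients, but the check itself is not part of this diff. One common way to compare such a secret is in constant time with the standard library's `hmac.compare_digest`; the helper name `secret_matches` below is an assumption for illustration.

```python
import hmac


def secret_matches(presented: str, expected: str) -> bool:
    """Compare two secrets in constant time; an empty expected secret never matches."""
    if not expected:
        # Refuse to authenticate when no secret is configured at all.
        return False
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A router-side handler could call this with the value a host client presents and reject the connection when it returns `False`.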
46
ROADMAP.md
@@ -1,9 +1,9 @@
 # PhoneWork — Roadmap
 
-## Milestone 2: Mailboy as a Versatile Assistant
+## ✅ Milestone 2: Mailboy as a Versatile Assistant (COMPLETED)
 
 **Goal:** Elevate the mailboy (GLM-4.7 orchestrator) from a mere Claude Code relay into a
-fully capable phone assistant. Users should be able to control their machine, manage files,
+fully capable phone assistant. Users can control their machine, manage files,
 search the web, get direct answers, and track long-running tasks — all without necessarily
 opening a Claude Code session.
 
@@ -177,21 +177,21 @@ args: action ("remind" | "repeat"), delay_seconds (int), interval_seconds (int),
 
 ## Verification Checklist
 
-- [ ] M2.1: Ask "what is a Python generator?" — mailboy replies directly, no tool call
-- [ ] M2.2: Send "check git status in todo_app" — `ShellTool` runs, output returned
-- [ ] M2.2: Send "rm -rf /" — blocked by safety guard
-- [ ] M2.3: Send "show me the last 50 lines of audit/abc123.jsonl" — file content returned
-- [ ] M2.3: Send "send me the sessions.json file" — file arrives in Feishu chat
-- [ ] M2.4: Start a long CC task (e.g. `--timeout 120`) — bot replies immediately, notifies on finish
-- [ ] M2.4: `/tasks` — lists running task with elapsed time
-- [ ] M2.5: "Python 3.13 有哪些新特性?" — `web ask` returns RAG answer from metaso
-- [ ] M2.5: "帮我读取这个URL: https://example.com" — page content extracted as markdown
-- [ ] M2.6: `/remind 10m deploy check` — 10 min later, message arrives in Feishu
+- [x] M2.1: Ask "what is a Python generator?" — mailboy replies directly, no tool call
+- [x] M2.2: Send "check git status in todo_app" — `ShellTool` runs, output returned
+- [x] M2.2: Send "rm -rf /" — blocked by safety guard
+- [x] M2.3: Send "show me the last 50 lines of audit/abc123.jsonl" — file content returned
+- [x] M2.3: Send "send me the sessions.json file" — file arrives in Feishu chat
+- [x] M2.4: Start a long CC task (e.g. `--timeout 120`) — bot replies immediately, notifies on finish
+- [x] M2.4: `/tasks` — lists running task with elapsed time
+- [x] M2.5: "Python 3.13 有哪些新特性?" — `web ask` returns RAG answer from metaso
+- [x] M2.5: "帮我读取这个URL: https://example.com" — page content extracted as markdown
+- [x] M2.6: `/remind 10m deploy check` — 10 min later, message arrives in Feishu
 
 ---
 ---
 
-## Milestone 3: Multi-Host Architecture (Router / Host Client Split)
+## ✅ Milestone 3: Multi-Host Architecture (Router / Host Client Split) (COMPLETED)
 
 **Goal:** Split PhoneWork into two deployable components — a public-facing **Router** and
 one or more **Host Clients** behind NAT. A user can be served by multiple nodes simultaneously.
@@ -519,16 +519,16 @@ PhoneWork/
 
 ## M3 Verification Checklist
 
-- [ ] `python standalone.py` — works identically to current `python main.py`
-- [ ] Router starts, host client connects, registration logged
-- [ ] Feishu message → routing LLM selects node → forwarded → reply returned
-- [ ] `/nodes` shows all connected nodes with active marker
-- [ ] `/node work-server` — switches active node, confirmed in next message
-- [ ] Two nodes serving same user — message routed to active node
-- [ ] Kill host client → router marks offline, user sees "Node home-pc is offline"
-- [ ] Host client reconnects → re-registered, messages flow again
-- [ ] Long CC task on node finishes → router forwards completion notification to Feishu
-- [ ] Wrong `ROUTER_SECRET` → connection rejected with 401
+- [x] `python standalone.py` — works identically to current `python main.py`
+- [x] Router starts, host client connects, registration logged
+- [x] Feishu message → routing LLM selects node → forwarded → reply returned
+- [x] `/nodes` shows all connected nodes with active marker
+- [x] `/node work-server` — switches active node, confirmed in next message
+- [x] Two nodes serving same user — message routed to active node
+- [x] Kill host client → router marks offline, user sees "Node home-pc is offline"
+- [x] Host client reconnects → re-registered, messages flow again
+- [x] Long CC task on node finishes → router forwards completion notification to Feishu
+- [x] Wrong `ROUTER_SECRET` → connection rejected with 401
 
 ---
 
@@ -10,7 +10,7 @@ from typing import Optional
 
 logger = logging.getLogger(__name__)
 
-AUDIT_DIR = Path(__file__).parent.parent / "audit"
+AUDIT_DIR = Path(__file__).parent.parent / "data" / "audit"
 
 
 def _ensure_audit_dir() -> None:
@@ -17,7 +17,7 @@ logger = logging.getLogger(__name__)
 
 DEFAULT_IDLE_TIMEOUT = 30 * 60
 DEFAULT_CC_TIMEOUT = 300.0
-PERSISTENCE_FILE = Path(__file__).parent.parent / "sessions.json"
+PERSISTENCE_FILE = Path(__file__).parent.parent / "data" / "sessions.json"
 
 
 @dataclass
@@ -105,7 +105,7 @@ class SessionManager:
 
         if cc_timeout > 60:
             from agent.task_runner import task_runner
-            from orchestrator.tools import get_current_chat
+            from orchestrator.tools import get_current_chat, set_current_chat, set_current_user
 
             chat_id = get_current_chat()
 
@@ -126,10 +126,29 @@ class SessionManager:
                 )
                 return output
 
+            async def on_task_complete(task) -> None:
+                if not chat_id or not user_id or not task.result:
+                    return
+                set_current_user(user_id)
+                set_current_chat(chat_id)
+                from orchestrator.agent import agent
+                follow_up = (
+                    f"CC task completed. Output:\n{task.result}\n\n"
+                    f"Original request was: {message}\n\n"
+                    "If the user asked you to send a file, use send_file now. "
+                    "Otherwise just acknowledge completion."
+                )
+                reply = await agent.run(user_id, follow_up)
+                if reply:
+                    from bot.feishu import send_text
+                    await send_text(chat_id, "chat_id", reply)
+
             task_id = await task_runner.submit(
-                run_task,
+                run_task(),
                 description=f"CC session {conv_id}: {message[:50]}",
                 notify_chat_id=chat_id,
+                user_id=user_id,
+                on_complete=on_task_complete,
             )
             return f"⏳ Task #{task_id} started (timeout: {int(cc_timeout)}s). I'll notify you when it's done."
 
@@ -183,6 +202,7 @@ class SessionManager:
     def _save(self) -> None:
         try:
             data = {cid: s.to_dict() for cid, s in self._sessions.items()}
+            PERSISTENCE_FILE.parent.mkdir(parents=True, exist_ok=True)
             with open(PERSISTENCE_FILE, "w", encoding="utf-8") as f:
                 json.dump(data, f, indent=2)
             logger.debug("Saved %d sessions to %s", len(data), PERSISTENCE_FILE)
@@ -14,7 +14,7 @@ from typing import Any, Callable, Dict, Optional
 
 logger = logging.getLogger(__name__)
 
-PERSISTENCE_FILE = Path(__file__).parent.parent / "scheduled_jobs.json"
+PERSISTENCE_FILE = Path(__file__).parent.parent / "data" / "scheduled_jobs.json"
 
 
 class JobStatus(str, Enum):
@@ -98,6 +98,7 @@ class Scheduler:
         """Save jobs to persistence file."""
         try:
             data = {jid: job.to_dict() for jid, job in self._jobs.items()}
+            PERSISTENCE_FILE.parent.mkdir(parents=True, exist_ok=True)
             with open(PERSISTENCE_FILE, "w", encoding="utf-8") as f:
                 json.dump(data, f, indent=2, ensure_ascii=False)
         except Exception:
@@ -57,6 +57,7 @@ class TaskRunner:
         description: str,
         notify_chat_id: Optional[str] = None,
         user_id: Optional[str] = None,
+        on_complete: Optional[Callable[[BackgroundTask], Awaitable[None]]] = None,
     ) -> str:
         """Submit a coroutine as a background task."""
         task_id = str(uuid.uuid4())[:8]
@@ -72,11 +73,11 @@ class TaskRunner:
         async with self._lock:
             self._tasks[task_id] = task
 
-        asyncio.create_task(self._run_task(task_id, coro))
+        asyncio.create_task(self._run_task(task_id, coro, on_complete))
         logger.info("Submitted background task %s: %s", task_id, description)
         return task_id
 
-    async def _run_task(self, task_id: str, coro: Awaitable[Any]) -> None:
+    async def _run_task(self, task_id: str, coro: Awaitable[Any], on_complete: Optional[Callable[[BackgroundTask], Awaitable[None]]] = None) -> None:
         """Execute a task and send notification on completion."""
         async with self._lock:
             task = self._tasks.get(task_id)
@@ -107,6 +108,12 @@ class TaskRunner:
         else:
             await self._send_notification(task)
 
+        if on_complete and task.status == TaskStatus.COMPLETED:
+            try:
+                await on_complete(task)
+            except Exception:
+                logger.exception("on_complete callback failed for task %s", task_id)
+
     async def _send_notification(self, task: BackgroundTask) -> None:
         """Send Feishu notification about task completion."""
         from bot.feishu import send_text
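The hunks above thread an optional `on_complete` callback through `submit` and `_run_task`, invoke it only when the task completed successfully, and log callback failures instead of propagating them. A stripped-down sketch of that control flow follows; `DemoTask` and the string statuses are stand-ins for the project's `BackgroundTask` and `TaskStatus`, whose full definitions are not shown in this diff.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


@dataclass
class DemoTask:
    """Stand-in for BackgroundTask; only the fields this sketch needs."""
    task_id: str
    status: str = "pending"
    result: Any = None


async def run_task(
    task: DemoTask,
    coro: Awaitable[Any],
    on_complete: Optional[Callable[[DemoTask], Awaitable[None]]] = None,
) -> None:
    """Await coro, record the outcome, then fire on_complete only on success."""
    try:
        task.result = await coro
        task.status = "completed"
    except Exception:
        task.status = "failed"
    if on_complete and task.status == "completed":
        try:
            await on_complete(task)
        except Exception:
            # A failing callback is logged and never propagates to the runner.
            logger.exception("on_complete callback failed for task %s", task.task_id)
```

Keeping the callback outside the task's own `try` block means a callback error cannot flip a completed task to failed.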
@@ -184,17 +184,15 @@ async def send_file(receive_id: str, receive_id_type: str, file_path: str, file_
     loop = asyncio.get_running_loop()
 
     # Step 1: Upload file → get file_key
-    with open(path, "rb") as f:
-        file_data = f.read()
-
     def _upload():
+        with open(path, "rb") as f:
             req = (
                 CreateFileRequest.builder()
                 .request_body(
                     CreateFileRequestBody.builder()
                     .file_type(file_type)
                     .file_name(file_name)
-                    .file(file_data)
+                    .file(f)
                     .build()
                 )
                 .build()
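The hunk above moves the `open()` call into `_upload` and passes the file object rather than pre-read bytes. Since `send_file` obtains the running loop first, `_upload` is presumably handed to an executor, although that call sits outside the visible hunk. A sketch of the pattern under that assumption; `upload_in_thread` and its `upload` parameter are hypothetical names.

```python
import asyncio
from pathlib import Path
from typing import BinaryIO, Callable


async def upload_in_thread(path: Path, upload: Callable[[BinaryIO], str]) -> str:
    """Run a blocking upload in the default executor, opening the file inside the worker."""
    loop = asyncio.get_running_loop()

    def _upload() -> str:
        # The handle is opened and closed inside the worker thread, so it stays
        # valid for exactly as long as the blocking call needs it.
        with open(path, "rb") as f:
            return upload(f)

    return await loop.run_in_executor(None, _upload)
```

Opening the file inside the worker also keeps the blocking disk read off the event loop.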
@@ -190,14 +190,14 @@ class NodeClient:
         """Periodic heartbeat loop."""
         while self._running:
             await asyncio.sleep(30)
-            if self.ws and self.ws.open:
+            if self.ws:
                 await self.send_heartbeat()
 
     async def status_loop(self) -> None:
         """Periodic status update loop."""
         while self._running:
             await asyncio.sleep(60)
-            if self.ws and self.ws.open:
+            if self.ws:
                 await self.send_status()
 
     async def run(self) -> None:
@@ -51,6 +51,10 @@ Your responsibilities:
 6. WEB / SEARCH: Use the `web` tool when the user needs current information. \
    Call it ONCE (or at most twice with a refined query). Then synthesize and reply — \
    do NOT keep searching in a loop. If the first search returns results, use them.
+7. BACKGROUND TASKS: When `create_conversation` or `send_to_conversation` returns a \
+   "Task #... started" message, the task is running in the background. \
+   Immediately reply to the user that the task has started and they will be notified. \
+   Do NOT call `task_status` in a loop waiting for it — the system sends a notification when done.
 
 Guidelines:
 - Relay Claude Code's output verbatim.
@@ -191,6 +195,7 @@ class OrchestrationAgent:
         reply = ""
         try:
             web_calls = 0
+            task_status_calls = 0
             for iteration in range(MAX_ITERATIONS):
                 logger.debug(" LLM call #%d", iteration)
                 ai_msg: AIMessage = await self._llm_with_tools.ainvoke(messages)
@@ -221,6 +226,16 @@ class OrchestrationAgent:
                             )
                             continue
 
+                    if tool_name == "task_status":
+                        task_status_calls += 1
+                        if task_status_calls > 1:
+                            result = "Task is still running in the background. Stop polling and tell the user they will be notified when it completes."
+                            logger.warning(" task_status poll limit exceeded, blocking")
+                            messages.append(
+                                ToolMessage(content=str(result), tool_call_id=tool_id)
+                            )
+                            continue
+
                     tool_obj = _TOOL_MAP.get(tool_name)
                     if tool_obj is None:
                         result = f"Unknown tool: {tool_name}"
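The hunk above blocks a second `task_status` call within one agent run by answering with a canned tool message instead of executing the tool; the `web_calls` counter in the same loop appears to do something similar for `web`. A generic sketch of that guard follows. The `ToolCallGuard` class is hypothetical, and the `web` limit of 2 is an assumption taken from the system prompt wording, not from code shown here.

```python
from collections import Counter
from typing import Dict, Optional

# Per-run call limits. task_status=1 mirrors the hunk above; web=2 is assumed
# from the prompt text ("at most twice") and may differ from the real code.
TOOL_LIMITS: Dict[str, int] = {"task_status": 1, "web": 2}


class ToolCallGuard:
    """Counts tool calls within one agent run and blocks calls past a limit."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self._limits = limits
        self._calls: Counter = Counter()

    def check(self, tool_name: str) -> Optional[str]:
        """Count the call; return a blocking message once the limit is exceeded, else None."""
        self._calls[tool_name] += 1
        limit = self._limits.get(tool_name)
        if limit is not None and self._calls[tool_name] > limit:
            return f"Call limit for {tool_name} reached. Stop calling it and reply to the user."
        return None
```

Inside the tool loop, a non-`None` return value would be appended as the tool result and the real tool call skipped.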