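Yuyao Huang (Sam) 29c0f2e403 docs: update the project architecture documentation and add a roadmap file

- Restructure README.md, using diagrams to show the system architecture and component interactions
- Add ROADMAP.md, which details future development plans in four phases
- Improve the project setup instructions so they are clearer and easier to read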

2026-03-28 08:16:55 +08:00

PhoneWork Roadmap

Issues observed in real usage, grouped by impact. No priority order within each phase.

Phase 1 — Core Reliability

These are friction points that directly break or degrade the basic send-message → get-reply loop.

1.1 Long output splitting

Problem: Feishu truncates messages at 4000 chars. Long code output is silently cut. Fix: Automatically split into multiple sequential messages with [1/3], [2/3] headers.
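
A minimal sketch of the splitting step, assuming the 4000-character limit stated above and a hypothetical helper name `split_message`. It reserves room for the `[i/n]` header inside each chunk so that no part exceeds the limit.

```python
def split_message(text: str, limit: int = 4000) -> list:
    """Split text into parts of at most `limit` chars, each prefixed with [i/n]."""
    if len(text) <= limit:
        return [text]  # short messages are sent unchanged, without a header
    digits = 1
    while True:
        # Worst-case header for this digit count, e.g. "[9/9]\n" or "[99/99]\n".
        header_len = len(f"[{'9' * digits}/{'9' * digits}]\n")
        body = limit - header_len
        if body <= 0:
            raise ValueError("limit is too small to fit a header")
        chunks = [text[i:i + body] for i in range(0, len(text), body)]
        if len(str(len(chunks))) <= digits:
            break  # the part count fits in the space we reserved
        digits += 1
    total = len(chunks)
    return [f"[{i}/{total}]\n{chunk}" for i, chunk in enumerate(chunks, 1)]
```

This version cuts at fixed character offsets, so a code line can be split across two parts. Preferring the nearest newline before the cut would be a natural refinement.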

1.2 Concurrent message handling

Problem: If the user sends two messages quickly, both fire agent.run() simultaneously for the same user, causing race conditions in _active_conv and interleaved --resume calls to the same CC session. Fix: Per-user async lock (or queue) so messages process one at a time per user.
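
The lock variant could look roughly like this. `handle_message` and the `run_agent` callable are hypothetical stand-ins for the real handler and `agent.run()`; only the per-user `asyncio.Lock` pattern is the point.

```python
import asyncio
from collections import defaultdict

# One lock per user, created lazily the first time that user sends a message.
_user_locks = defaultdict(asyncio.Lock)

async def handle_message(user_id: str, text: str, run_agent):
    """Serialize agent runs per user; different users still run concurrently."""
    async with _user_locks[user_id]:
        return await run_agent(user_id, text)
```

`asyncio.Lock` wakes waiters in FIFO order, so a second message from the same user waits for the first to finish and then runs, which keeps the `--resume` calls to one CC session from interleaving.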

1.3 Session persistence across restarts

Problem: manager._sessions is in-memory. A server restart loses all active sessions. Users have to recreate them. Fix: Persist {conv_id, cwd, cc_session_id} to a JSON file on disk; reload on startup.
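
A sketch of the save and reload steps, assuming each session can be reduced to a dict with `cwd` and `cc_session_id` keyed by `conv_id`. The real `manager._sessions` may hold richer objects, and the function names here are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

def save_sessions(sessions: dict, path: Path) -> None:
    """Write sessions via a temp file so a crash mid-write cannot corrupt `path`."""
    records = [
        {"conv_id": cid, "cwd": s["cwd"], "cc_session_id": s["cc_session_id"]}
        for cid, s in sessions.items()
    ]
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
    os.replace(tmp, path)  # atomic when both paths are on the same filesystem

def load_sessions(path: Path) -> dict:
    """Return {conv_id: {...}}; a missing file simply means no saved sessions."""
    if not path.exists():
        return {}
    records = json.loads(path.read_text(encoding="utf-8"))
    return {
        r["conv_id"]: {"cwd": r["cwd"], "cc_session_id": r["cc_session_id"]}
        for r in records
    }
```

Calling `save_sessions` after every create or close keeps the file current; `load_sessions` would run once at startup.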

1.4 Mail boy passthrough mode

Problem: The mail boy (GLM) sometimes paraphrases or summarizes instead of relaying verbatim, losing code blocks and exact output. Fix: Bypass the mail boy entirely for follow-up messages — detect that there's an active session and call manager.send() directly without an LLM round-trip.
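
The routing decision itself is small. In this sketch `send_direct` stands in for `manager.send()` and `ask_llm` for the mail boy call; both signatures are assumptions, not the project's actual API.

```python
from typing import Callable, Optional

def route_message(
    text: str,
    active_conv: Optional[str],
    send_direct: Callable[[str, str], str],
    ask_llm: Callable[[str], str],
) -> str:
    """Relay verbatim when a session is active; otherwise let the LLM interpret."""
    if active_conv is not None:
        # No LLM round-trip: the text reaches Claude Code exactly as typed.
        return send_direct(active_conv, text)
    return ask_llm(text)
```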

Phase 2 — Better Interaction Model

Reducing the number of messages needed to get things done.

2.1 Slash commands

Problem: Users must phrase everything as natural language for the mail boy to interpret. Fix: Recognize a small set of commands directly in handler.py before hitting the LLM:

/new <dir> — create session
/list — list sessions
/close — close active session
/switch <n> — switch active session by number
/retry — resend last message to CC
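
One way to recognize these commands before the LLM is reached, sketched with a hypothetical `parse_command` helper. Anything that is not one of the five listed commands returns `None` and falls through to the mail boy.

```python
from typing import NamedTuple, Optional

COMMANDS = {"new", "list", "close", "switch", "retry"}

class Command(NamedTuple):
    name: str
    arg: str

def parse_command(text: str) -> Optional[Command]:
    """Return a Command for a recognized slash command, else None."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return None
    name, _, arg = stripped[1:].partition(" ")
    if name not in COMMANDS:
        return None  # unknown commands and plain paths go to the LLM
    return Command(name, arg.strip())
```

Because the name must match exactly, a message that merely starts with a path such as `/usr/bin is broken` is not mistaken for a command.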

2.2 Multi-session switching

Problem: Only one "active session" per user. To switch projects, the user must remember conv_ids. Fix: /list shows numbered sessions; /switch 2 activates session #2. The system prompt shows all open sessions, not just the active one.
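
A sketch of the numbering, assuming sessions are kept in a stable ordered list of dicts with `conv_id` and `cwd` keys (a hypothetical shape). The same list must back both `/list` and `/switch` so that the numbers agree.

```python
from typing import Optional

def format_sessions(sessions: list, active_id: Optional[str]) -> str:
    """Render sessions as a 1-based numbered list, marking the active one with *."""
    if not sessions:
        return "No open sessions."
    lines = []
    for n, s in enumerate(sessions, 1):
        marker = "*" if s["conv_id"] == active_id else " "
        lines.append(f"{marker} {n}. {s['cwd']} ({s['conv_id']})")
    return "\n".join(lines)

def resolve_switch(sessions: list, arg: str) -> Optional[str]:
    """Map the number given to /switch onto a conv_id, or None if it is invalid."""
    try:
        index = int(arg) - 1
    except ValueError:
        return None
    if 0 <= index < len(sessions):
        return sessions[index]["conv_id"]
    return None
```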

2.3 Feishu message cards

Problem: Plain text is hard to scan — code blocks, file paths, and status info all look the same. Fix: Use Feishu Interactive Cards (msg_type: interactive) to render:

Session status as a structured card (project name, cwd, session ID)
Action buttons: Continue, Close session, Run again
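
A rough sketch of such a card as a Python dict. The field names (`header`, `elements`, `lark_md`, `action`, `button`) follow the Feishu card JSON format as best recalled and should be checked against the current Feishu documentation; the `action` values and the `build_session_card` helper are hypothetical.

```python
import json

def build_session_card(project: str, cwd: str, session_id: str, conv_id: str) -> dict:
    """Build a card payload intended to be sent with msg_type "interactive"."""
    def button(label: str, action: str, style: str = "default") -> dict:
        return {
            "tag": "button",
            "text": {"tag": "plain_text", "content": label},
            "type": style,
            # Assumed to be echoed back in the card callback for the handler.
            "value": {"action": action, "conv_id": conv_id},
        }

    return {
        "header": {"title": {"tag": "plain_text", "content": f"Session: {project}"}},
        "elements": [
            {
                "tag": "div",
                "text": {
                    "tag": "lark_md",
                    "content": f"**cwd:** {cwd}\n**session ID:** {session_id}",
                },
            },
            {
                "tag": "action",
                "actions": [
                    button("Continue", "continue", "primary"),
                    button("Close session", "close", "danger"),
                    button("Run again", "retry"),
                ],
            },
        ],
    }
```

The card would be serialized with `json.dumps` before being passed as the message content.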

Phase 3 — Operational Quality

Making it reliable enough to leave running 24/7.

3.1 Health check improvements

Problem: /health only reports session count. No way to know if the Feishu WS connection is alive, or if CC is callable. Fix: Add to /health:

WebSocket connection status
Last message received timestamp
A claude -p "ping" smoke test result
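
The extended payload might be assembled like this. The response field names and both helper functions are assumptions; only the `claude -p "ping"` invocation comes from the list above.

```python
import subprocess
from datetime import datetime, timezone
from typing import Optional

def cc_smoke_test(cmd=("claude", "-p", "ping"), timeout: float = 30.0) -> bool:
    """Return True if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(list(cmd), capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # binary missing, not executable, or too slow
    return result.returncode == 0

def build_health(session_count: int, ws_connected: bool,
                 last_message_at: Optional[float], cc_ok: bool) -> dict:
    """Assemble the /health response; last_message_at is a Unix timestamp."""
    return {
        "status": "ok" if (ws_connected and cc_ok) else "degraded",
        "sessions": session_count,
        "ws_connected": ws_connected,
        "last_message_at": (
            datetime.fromtimestamp(last_message_at, tz=timezone.utc).isoformat()
            if last_message_at is not None else None
        ),
        "cc_ok": cc_ok,
    }
```

Since the smoke test starts a real CC process, caching its result for a few minutes is probably preferable to running it on every `/health` request.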

3.2 Automatic reconnection

Problem: The Feishu WebSocket thread is a daemon — if it dies silently (network blip), no messages are received and there's no recovery. Fix: Wrap ws_client.start() in a retry loop with exponential backoff and log reconnection events.
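
A sketch of the retry loop, assuming `start` is a blocking callable like `ws_client.start()` that either returns or raises when the connection is lost. The `sleep` and `max_attempts` parameters are hypothetical additions that make the loop testable.

```python
import logging
import time

log = logging.getLogger("phonework.ws")

def run_with_reconnect(start, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, max_attempts=None):
    """Call start() repeatedly, doubling the wait after each exit up to max_delay."""
    delay = base_delay
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        try:
            start()  # blocks for as long as the connection is healthy
            log.warning("WebSocket client returned; reconnecting")
        except Exception:
            log.exception("WebSocket client crashed; reconnecting")
        attempts += 1
        log.info("reconnect attempt %d in %.1fs", attempts, delay)
        sleep(delay)
        delay = min(max_delay, delay * 2)
```

This version never resets the delay. A fuller implementation would likely reset it to `base_delay` once a connection has stayed up for some minimum time.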

3.3 Per-session timeout configuration

Problem: All sessions share a 30-min idle timeout and 300s CC timeout. Long-running tasks (e.g. running tests) may need more; quick chats need less. Fix: Allow per-session timeout overrides; expose via /new <dir> --timeout 600.
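
Parsing the argument string of `/new` could be sketched as follows. The `parse_new_args` helper is hypothetical, and the 300-second default mirrors the CC timeout mentioned above.

```python
import shlex

DEFAULT_CC_TIMEOUT = 300  # seconds; the shared default described above

def parse_new_args(arg: str):
    """Parse '<dir> [--timeout SECONDS]' into (dir, timeout_seconds)."""
    timeout = DEFAULT_CC_TIMEOUT
    positional = []
    tokens = iter(shlex.split(arg))  # shlex keeps quoted paths with spaces intact
    for tok in tokens:
        if tok == "--timeout":
            value = next(tokens, None)
            if value is None or not value.isdecimal() or int(value) <= 0:
                raise ValueError("--timeout needs a positive integer (seconds)")
            timeout = int(value)
        else:
            positional.append(tok)
    if len(positional) != 1:
        raise ValueError("usage: /new <dir> [--timeout SECONDS]")
    return positional[0], timeout
```

The `ValueError` message can be sent back to the user as the reply, so a mistyped command gets immediate feedback.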

3.4 Audit log

Problem: No record of what was sent to Claude Code or what it did. Impossible to debug after the fact. Fix: Append each (timestamp, conv_id, prompt, response) to a JSONL file per session under the project directory.
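
A minimal sketch of the append step. The `append_audit` name and the `<conv_id>.jsonl` file naming are assumptions, and `conv_id` is assumed to be safe to use as a file name.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def append_audit(log_dir: Path, conv_id: str, prompt: str, response: str) -> Path:
    """Append one JSON object per exchange to <log_dir>/<conv_id>.jsonl."""
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{conv_id}.jsonl"
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "conv_id": conv_id,
        "prompt": prompt,
        "response": response,
    }
    with path.open("a", encoding="utf-8") as f:
        # json.dumps escapes newlines, so each record stays on a single line.
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

Keeping one record per line means the log can be tailed, grepped, or loaded line by line without parsing the whole file.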

Phase 4 — Multi-user & Security

For sharing the bot with teammates.

4.1 User allowlist

Problem: Anyone who can message the bot can run arbitrary code via Claude Code. Fix: ALLOWED_OPEN_IDS list in keyring.yaml; reject messages from unknown users.
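
The check could be sketched like this, assuming `keyring.yaml` has already been parsed into a dict elsewhere. Both helper names are hypothetical. The sketch fails closed: a missing or empty list admits nobody.

```python
def load_allowlist(config: dict) -> frozenset:
    """Read ALLOWED_OPEN_IDS from the parsed keyring.yaml contents."""
    ids = config.get("ALLOWED_OPEN_IDS") or []
    return frozenset(str(i) for i in ids)

def is_allowed(open_id: str, allowlist: frozenset) -> bool:
    """Return True only for senders explicitly listed in the allowlist."""
    return open_id in allowlist
```

The handler would call `is_allowed` on the sender's open_id before any other processing and drop or refuse the message when it returns `False`.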

4.2 Per-user session isolation

Problem: All users share the same manager singleton — user A could theoretically send to user B's session by guessing a conv_id. Fix: Namespace sessions by user_id; send_to_conversation validates that the requesting user owns the session.
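
One way to namespace sessions is to key them by `(user_id, conv_id)`, so that a guessed `conv_id` alone never resolves to another user's session. The `SessionStore` class below is a hypothetical sketch, not the existing manager.

```python
class SessionStore:
    """Sessions keyed by (user_id, conv_id) so lookups are scoped to the owner."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id: str, conv_id: str, cwd: str) -> dict:
        session = {"conv_id": conv_id, "cwd": cwd}
        self._sessions[(user_id, conv_id)] = session
        return session

    def get_owned(self, user_id: str, conv_id: str) -> dict:
        """Return the session, or raise if this user does not own such a session."""
        try:
            return self._sessions[(user_id, conv_id)]
        except KeyError:
            # Same error for "does not exist" and "owned by someone else",
            # so the caller cannot probe for valid conv_ids.
            raise PermissionError(f"no session {conv_id!r} for this user") from None
```

`send_to_conversation` would then call `get_owned` with the requesting user's ID before forwarding anything to CC.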

4.3 Working directory sandboxing

Problem: The safety check in _resolve_dir blocks paths outside WORKING_DIR, but Claude Code itself runs with --dangerously-skip-permissions and can write anywhere. Fix: Consider running CC in a restricted user account or container; or drop --dangerously-skip-permissions and implement a permission-approval flow via Feishu buttons.
