PhoneWork Roadmap
Issues observed in real usage, grouped by impact. No priority order within each phase.
Phase 1 — Core Reliability
These are friction points that directly break or degrade the basic send-message → get-reply loop.
1.1 Long output splitting
Problem: Feishu truncates messages at 4000 chars. Long code output is silently cut.
Fix: Automatically split into multiple sequential messages with [1/3], [2/3] headers.
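A minimal sketch of the splitting step, assuming the 4000-character limit stated above. The name split_message and the choice to break at the last newline inside each window are illustrative, not existing code.

```python
def split_message(text: str, limit: int = 4000) -> list[str]:
    """Split text into chunks of at most `limit` chars, each prefixed with [i/n]."""
    if len(text) <= limit:
        return [text]
    if limit <= 16:
        raise ValueError("limit too small to fit a header plus content")
    # Reserve 16 chars for the "[i/n]\n" header (enough for [9999/9999]).
    body_limit = limit - 16
    chunks = []
    rest = text
    while rest:
        if len(rest) <= body_limit:
            chunks.append(rest)
            break
        # Prefer breaking at the last newline inside the window so lines stay whole.
        cut = rest.rfind("\n", 0, body_limit)
        if cut <= 0:
            cut = body_limit  # no usable newline: hard cut
        chunks.append(rest[:cut])
        # Drop the newline(s) at the break; blank lines at a break are lost.
        rest = rest[cut:].lstrip("\n")
    total = len(chunks)
    return [f"[{i}/{total}]\n{chunk}" for i, chunk in enumerate(chunks, 1)]
```

Breaking on newlines keeps individual lines of code output intact, but a fenced code block can still be split across two messages. A fuller version would probably close and reopen the fence at each break.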
1.2 Concurrent message handling
Problem: If the user sends two messages quickly, both fire agent.run() simultaneously for the same user, causing race conditions in _active_conv and interleaved --resume calls to the same CC session.
Fix: Per-user async lock (or queue) so messages process one at a time per user.
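One way the per-user lock could look, assuming the handler is asyncio-based. handle_message and the run_agent parameter are hypothetical stand-ins for the real call into agent.run().

```python
import asyncio
from collections import defaultdict

# One lock per user, created lazily on first use.
_user_locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)


async def handle_message(user_id: str, text: str, run_agent):
    """Run the agent for one message; messages from the same user are serialized."""
    async with _user_locks[user_id]:
        return await run_agent(user_id, text)
```

asyncio.Lock wakes waiters in FIFO order, so a second message from the same user waits for the first to finish, while different users still run concurrently. The dictionary grows with the number of users seen; a long-running deployment may want to evict idle entries.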
1.3 Session persistence across restarts
Problem: manager._sessions is in-memory. A server restart loses all active sessions. Users have to recreate them.
Fix: Persist {conv_id, cwd, cc_session_id} to a JSON file on disk; reload on startup.
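A sketch of the save/load pair, assuming each session can be reduced to the three fields named above. The function names and the in-memory dict shape are assumptions for illustration.

```python
import json
import os
from pathlib import Path


def save_sessions(sessions: dict, path) -> None:
    """Write {conv_id: {"cwd", "cc_session_id"}} to disk as a JSON list."""
    path = Path(path)
    records = [
        {"conv_id": conv_id, "cwd": s["cwd"], "cc_session_id": s["cc_session_id"]}
        for conv_id, s in sessions.items()
    ]
    # Write to a temp file first, then swap it in, so a crash mid-write
    # does not leave a truncated sessions file behind.
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(records, indent=2), encoding="utf-8")
    os.replace(tmp, path)


def load_sessions(path) -> dict:
    """Reload sessions saved by save_sessions; a missing file means no sessions."""
    path = Path(path)
    if not path.exists():
        return {}
    records = json.loads(path.read_text(encoding="utf-8"))
    return {
        r["conv_id"]: {"cwd": r["cwd"], "cc_session_id": r["cc_session_id"]}
        for r in records
    }
```

Calling save_sessions after every create, close, or session-ID change keeps the file current without a background task.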
1.4 Mail boy passthrough mode
Problem: The mail boy (GLM) sometimes paraphrases or summarizes instead of relaying verbatim, losing code blocks and exact output.
Fix: Bypass the mail boy entirely for follow-up messages — detect that there's an active session and call manager.send() directly without an LLM round-trip.
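The routing decision might be sketched as follows. route_message, active_conv, send_to_session and ask_mail_boy are hypothetical names; in the real code the first branch would call manager.send().

```python
async def route_message(user_id, text, active_conv, send_to_session, ask_mail_boy):
    """Relay follow-ups straight to the active session; use the LLM otherwise."""
    conv_id = active_conv.get(user_id)
    if conv_id is not None:
        # Active session: forward the text verbatim, no LLM round-trip.
        return await send_to_session(conv_id, text)
    # No active session yet: let the mail boy interpret the request.
    return await ask_mail_boy(user_id, text)
```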
Phase 2 — Better Interaction Model
Reducing the number of messages needed to get things done.
2.1 Slash commands
Problem: Users must phrase everything as natural language for the mail boy to interpret.
Fix: Recognize a small set of commands directly in handler.py before hitting the LLM:
- /new <dir>: create session
- /list: list sessions
- /close: close active session
- /switch <n>: switch active session by number
- /retry: resend last message to CC
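A minimal sketch of the pre-LLM command check. parse_command is a hypothetical helper; how handler.py would dispatch the result is not shown.

```python
COMMANDS = {"/new", "/list", "/close", "/switch", "/retry"}


def parse_command(text: str):
    """Return (command, args) for a known slash command, or None for plain text."""
    parts = text.strip().split()
    if not parts or parts[0] not in COMMANDS:
        return None
    return parts[0], parts[1:]
```

A None result means the message is not a recognized command and should continue down the normal LLM path, so an unknown word starting with a slash is still treated as natural language.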
2.2 Multi-session switching
Problem: Only one "active session" per user. To switch projects, the user must remember conv_ids.
Fix: /list shows numbered sessions; /switch 2 activates session #2. The system prompt shows all open sessions, not just the active one.
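The numbering could be sketched like this, assuming sessions are kept in a stable order as a list of dicts with conv_id and cwd keys. Both function names are illustrative.

```python
from typing import Optional


def format_session_list(sessions: list, active_id: Optional[str]) -> str:
    """Render open sessions as a numbered list; '*' marks the active one."""
    if not sessions:
        return "No open sessions."
    lines = []
    for n, s in enumerate(sessions, 1):
        marker = "*" if s["conv_id"] == active_id else " "
        lines.append(f"{marker} {n}. {s['cwd']} ({s['conv_id']})")
    return "\n".join(lines)


def resolve_switch(sessions: list, n: int) -> str:
    """Map the 1-based number shown by /list back to a conv_id."""
    if not 1 <= n <= len(sessions):
        raise ValueError(f"no session #{n}; use /list to see open sessions")
    return sessions[n - 1]["conv_id"]
```

The numbers only stay meaningful if the list order is the same between /list and /switch, so the order should not change while sessions stay open.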
2.3 Feishu message cards
Problem: Plain text is hard to scan — code blocks, file paths, and status info all look the same.
Fix: Use Feishu Interactive Cards (msg_type: interactive) to render:
- Session status as a structured card (project name, cwd, session ID)
- Action buttons: Continue, Close session, Run again
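A rough sketch of a card payload for the session status. The field names follow my understanding of Feishu's card JSON and should be checked against the official documentation before use. build_session_card and the action values are hypothetical.

```python
import json


def build_session_card(project: str, cwd: str, session_id: str) -> dict:
    """Build a status card with Continue / Close session / Run again buttons."""

    def button(label: str, action: str, style: str = "default") -> dict:
        return {
            "tag": "button",
            "text": {"tag": "plain_text", "content": label},
            "type": style,
            # Payload returned to the bot on click (assumed shape).
            "value": {"action": action, "session_id": session_id},
        }

    return {
        "config": {"wide_screen_mode": True},
        "header": {"title": {"tag": "plain_text", "content": project}},
        "elements": [
            {
                "tag": "div",
                "text": {
                    "tag": "lark_md",
                    "content": f"**cwd:** {cwd}\n**session:** {session_id}",
                },
            },
            {
                "tag": "action",
                "actions": [
                    button("Continue", "continue", "primary"),
                    button("Close session", "close", "danger"),
                    button("Run again", "retry"),
                ],
            },
        ],
    }


# The card would be sent with msg_type "interactive" and the JSON string as content.
card_content = json.dumps(build_session_card("demo", "/work/demo", "abc123"))
```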
Phase 3 — Operational Quality
Making it reliable enough to leave running 24/7.
3.1 Health check improvements
Problem: /health only reports session count. No way to know if the Feishu WS connection is alive, or if CC is callable.
Fix: Add to /health:
- WebSocket connection status
- Last message received timestamp
- A claude -p "ping" smoke test result
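A sketch of how the extra health fields might be assembled. The smoke test shells out to claude -p "ping" as described above. The function names, the response keys, and the 30-second default timeout are assumptions.

```python
import subprocess
from typing import Optional


def cc_smoke_test(timeout: float = 30.0, cmd=("claude", "-p", "ping")) -> bool:
    """Return True if the CC command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or it hung.
        return False
    return result.returncode == 0


def build_health(session_count: int, ws_connected: bool,
                 last_message_at: Optional[float], cc_ok: bool) -> dict:
    """Assemble the /health response body (key names are illustrative)."""
    return {
        "status": "ok" if ws_connected and cc_ok else "degraded",
        "sessions": session_count,
        "ws_connected": ws_connected,
        "last_message_at": last_message_at,  # Unix timestamp, or None if none yet
        "cc_callable": cc_ok,
    }
```

The smoke test starts a real CC invocation, so running it on every /health request may be slow or costly. Caching the result for a few minutes is one option.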
3.2 Automatic reconnection
Problem: The Feishu WebSocket thread is a daemon — if it dies silently (network blip), no messages are received and there's no recovery.
Fix: Wrap ws_client.start() in a retry loop with exponential backoff and log reconnection events.
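A minimal sketch of the retry loop, assuming ws_client.start() blocks until the connection ends. run_with_reconnect, the logger name, and the delay defaults are illustrative. The sleep parameter exists only to make the loop testable.

```python
import logging
import time

log = logging.getLogger("phonework.ws")


def run_with_reconnect(start, base_delay=1.0, max_delay=60.0,
                       max_retries=None, sleep=time.sleep):
    """Call start() repeatedly, doubling the wait after each exit, up to max_delay."""
    failures = 0
    while True:
        try:
            start()  # expected to block while the connection is alive
            log.warning("WebSocket client returned; reconnecting")
        except Exception:
            log.exception("WebSocket client crashed; reconnecting")
        failures += 1
        if max_retries is not None and failures > max_retries:
            raise RuntimeError(f"giving up after {max_retries} reconnect attempts")
        delay = min(max_delay, base_delay * 2 ** (failures - 1))
        log.info("reconnect attempt %d in %.1fs", failures, delay)
        sleep(delay)
```

In the daemon thread this would be called as run_with_reconnect(ws_client.start). This version never resets the backoff; a fuller one would reset failures once a connection has stayed up for some time.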
3.3 Per-session timeout configuration
Problem: All sessions share a 30-min idle timeout and 300s CC timeout. Long-running tasks (e.g. running tests) may need more; quick chats need less.
Fix: Allow per-session timeout overrides; expose via /new <dir> --timeout 600.
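Parsing the override might look like this, assuming --timeout sets the CC timeout in seconds and the 300s value above remains the default. parse_new_args is a hypothetical helper that takes the already-split arguments of /new.

```python
def parse_new_args(args: list, default_timeout: int = 300) -> tuple:
    """Parse the arguments of /new <dir> [--timeout SECONDS] into (dir, timeout)."""
    timeout = default_timeout
    positional = []
    it = iter(args)
    for arg in it:
        if arg == "--timeout":
            value = next(it, None)
            if value is None or not value.isdigit() or int(value) == 0:
                raise ValueError("--timeout needs a positive integer (seconds)")
            timeout = int(value)
        else:
            positional.append(arg)
    if len(positional) != 1:
        raise ValueError("usage: /new <dir> [--timeout SECONDS]")
    return positional[0], timeout
```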
3.4 Audit log
Problem: No record of what was sent to Claude Code or what it did. Impossible to debug after the fact.
Fix: Append each (timestamp, conv_id, prompt, response) to a JSONL file per session under the project directory.
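A sketch of the append step. The file name audit-<conv_id>.jsonl and the function name are assumptions; only the four record fields come from the description above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_audit(project_dir, conv_id: str, prompt: str, response: str) -> Path:
    """Append one JSON record per exchange to a per-session JSONL file."""
    log_path = Path(project_dir) / f"audit-{conv_id}.jsonl"  # hypothetical name
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "conv_id": conv_id,
        "prompt": prompt,
        "response": response,
    }
    with log_path.open("a", encoding="utf-8") as f:
        # One JSON object per line; newlines inside strings are escaped by json.dumps.
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return log_path
```

Because the log lives under the project directory, it may need an entry in .gitignore to keep prompts and responses out of commits.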
Phase 4 — Multi-user & Security
For sharing the bot with teammates.
4.1 User allowlist
Problem: Anyone who can message the bot can run arbitrary code via Claude Code.
Fix: Add an ALLOWED_OPEN_IDS list to keyring.yaml and reject messages from any sender whose open_id is not on it.
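The check itself is small. This sketch assumes the list has already been loaded from keyring.yaml and that an empty or missing list should reject everyone. is_allowed is a hypothetical name.

```python
def is_allowed(open_id: str, allowed_open_ids) -> bool:
    """Fail closed: an empty or missing allowlist rejects all senders."""
    if not allowed_open_ids:
        return False
    return open_id in set(allowed_open_ids)
```

Failing closed means a misconfigured deployment refuses all messages instead of silently accepting them.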
4.2 Per-user session isolation
Problem: All users share the same manager singleton — user A could theoretically send to user B's session by guessing a conv_id.
Fix: Namespace sessions by user_id; send_to_conversation validates that the requesting user owns the session.
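One possible shape for the ownership check, assuming each session records the user who created it. SessionRegistry and its methods are illustrative and do not reflect the existing manager API.

```python
class SessionRegistry:
    """Tracks which user owns each conversation (hypothetical helper)."""

    def __init__(self):
        self._owners = {}  # conv_id -> user_id

    def register(self, conv_id: str, user_id: str) -> None:
        self._owners[conv_id] = user_id

    def check_owner(self, conv_id: str, user_id: str) -> None:
        """Raise PermissionError unless user_id owns conv_id."""
        # Unknown and foreign sessions get the same error, so a caller
        # cannot probe which conv_ids exist.
        if self._owners.get(conv_id) != user_id:
            raise PermissionError("no such session for this user")
```

send_to_conversation would call check_owner before forwarding anything to CC.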
4.3 Working directory sandboxing
Problem: The safety check in _resolve_dir blocks paths outside WORKING_DIR, but Claude Code itself runs with --dangerously-skip-permissions and can write anywhere.
Fix: Consider running CC in a restricted user account or container; or drop --dangerously-skip-permissions and implement a permission-approval flow via Feishu buttons.