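docs: add 秘塔AI Search (Metaso) MCP service documentation and update README
Add metaso.md, a detailed document for the 秘塔AI Search MCP service that covers the API and the configuration guide. Update the command reference and feature descriptions in README. Add ROADMAP.md to record future development plans.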
parent de6205d2fd
commit 6a0d409dd6
51
README.md
@ -24,8 +24,8 @@ Feishu bot that lets users control Claude Code CLI from their phone.
| `main.py` | FastAPI entry point, starts WebSocket client + session manager |
| `bot/handler.py` | Receives Feishu events via long-connection WebSocket |
| `bot/feishu.py` | Sends text/file/card replies back to Feishu |
| `bot/commands.py` | Slash command handler (`/new`, `/list`, `/close`, `/switch`, `/help`) |
| `orchestrator/agent.py` | LangChain agent with per-user history + passthrough mode |
| `bot/commands.py` | Slash command handler (`/new`, `/status`, `/switch`, `/direct`, `/smart`, etc.) |
| `orchestrator/agent.py` | LangChain agent with per-user history + direct/smart mode toggle |
| `orchestrator/tools.py` | Tools: `create_conversation`, `send_to_conversation`, `list_conversations`, `close_conversation` |
| `agent/manager.py` | Session registry with persistence and idle timeout reaper |
| `agent/pty_process.py` | Runs `claude -p` headlessly, manages session continuity via `--resume` |
@ -148,18 +148,49 @@ Active sessions: `GET /sessions`
| `/new <dir> [msg]` | Create a new Claude Code session in `<dir>` |
| `/new <dir> [msg] --timeout N` | Create with custom CC timeout (seconds) |
| `/new <dir> [msg] --idle N` | Create with custom idle timeout (seconds) |
| `/list` | List your active sessions |
| `/switch <n>` | Switch active session to number `<n>` from `/list` |
| `/status` | Show your sessions and current mode |
| `/switch <n>` | Switch active session to number `<n>` from `/status` |
| `/close [n]` | Close active session (or session `<n>`) |
| `/direct` | Direct mode: messages go straight to Claude Code (no LLM overhead) |
| `/smart` | Smart mode: messages go through LLM for intelligent routing (default) |
| `/help` | Show command reference |

Any message without a `/` prefix is forwarded directly to the active Claude Code session.
### Message Routing Modes

**Smart mode (default):** Messages are analyzed by the LLM, which decides whether to create a new session, send to an existing one, or ask for clarification. Useful when you want the bot to understand natural language requests.

**Direct mode:** Messages go straight to the active Claude Code session, bypassing the LLM. Faster and more predictable, but requires an active session. Use `/direct` to enable.

### Claude Code Commands

Claude Code slash commands (like `/help`, `/clear`, `/compact`, `/cost`) are passed through to Claude Code when you have an active session. Bot commands (`/new`, `/status`, `/switch`, etc.) are handled by the bot first.
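
The routing rules described in this section can be sketched roughly as follows. This is a minimal illustration, not the code in `bot/commands.py`: the names `Route`, `BOT_COMMANDS`, and `route_message` are hypothetical, and falling back to the LLM when direct mode has no active session is an assumption.

```python
from enum import Enum

# Hypothetical set of commands the bot handles itself (taken from the command table).
BOT_COMMANDS = {"/new", "/status", "/switch", "/close", "/direct", "/smart", "/help"}


class Route(Enum):
    BOT = "bot"        # handled by the bot's own slash-command handler
    CLAUDE = "claude"  # forwarded to the active Claude Code session
    LLM = "llm"        # analyzed by the orchestrator LLM (smart mode)


def route_message(text: str, direct_mode: bool, has_active_session: bool) -> Route:
    """Decide where a message goes; bot commands are always checked first."""
    stripped = text.strip()
    first_word = stripped.split(maxsplit=1)[0] if stripped else ""
    if first_word in BOT_COMMANDS:
        return Route.BOT
    if first_word.startswith("/") and has_active_session:
        return Route.CLAUDE  # e.g. /clear, /compact, /cost
    if direct_mode and has_active_session:
        return Route.CLAUDE
    # Assumption: anything else goes to the LLM, including direct mode
    # without an active session.
    return Route.LLM
```

Because bot commands are checked first, a name that exists on both sides, such as `/help`, reaches the bot in this sketch.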

---

## Security
## Features

- **Allowlist** (`ALLOWED_OPEN_IDS`): only listed `open_id`s can use the bot. Empty = open to all.
- **Path sandbox**: all session directories must be under `WORKING_DIR`; path traversal is blocked.
- **Session ownership**: sessions are tied to the creating user; other users cannot send to them.
- **Audit log**: all prompts and responses are written to `agent/audit.log`.
### Core Reliability

- **Message splitting** - Long responses automatically split into multiple messages instead of getting cut off
- **Concurrent handling** - Multiple users can message the bot simultaneously without conflicts
- **Session persistence** - Active sessions survive server restarts (saved to disk)
- **Direct mode** - Messages go straight to Claude Code, skipping the LLM for faster responses
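
As an illustration of the message-splitting item above, here is a minimal sketch of one way to chunk a long reply. The function name `split_message` and the 4000-character default are assumptions made for the example; the real per-message limit and the bot's actual splitting logic are not stated in this README.

```python
from typing import List


def split_message(text: str, limit: int = 4000) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to cut at a newline so lines stay intact; falls back to a
    hard cut when a single line is longer than the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    remaining = text
    while len(remaining) > limit:
        # Look for the last newline at or before the limit.
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut
        chunks.append(remaining[:cut])
        # Drop the newline(s) at the cut point; this sketch does not
        # preserve blank lines that fall exactly on a boundary.
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk would then be sent as its own Feishu message, in order.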

### Better Interaction

- **Slash commands** - Direct control via `/new`, `/status`, `/switch`, `/close`, `/direct`, `/smart`
- **Multi-session switching** - Multiple projects open simultaneously, switch between them
- **Interactive cards** - Session status displayed in Feishu message cards

### Operational Quality

- **Health checks** - `/health` endpoint shows WebSocket status and can test Claude Code connectivity
- **Auto-reconnection** - WebSocket automatically reconnects if the connection drops
- **Configurable timeouts** - Each session can have custom idle and execution timeout settings
- **Audit logging** - All conversations logged to files for debugging and accountability

### Security

- **User allowlist** - Configure which Feishu users are allowed to use the bot
- **Session isolation** - Each user can only see and access their own sessions
- **Path sandboxing** - Sessions can only run inside the allowed working directory, blocking path traversal attacks
189
ROADMAP.md
Normal file
@ -0,0 +1,189 @@
# PhoneWork Roadmap

## Milestone 2: Mailboy as a Versatile Assistant

**Goal:** Elevate the mailboy (GLM-4.7 orchestrator) from a mere Claude Code relay into a
fully capable phone assistant. Users should be able to control their machine, manage files,
search the web, get direct answers, and track long-running tasks, all without necessarily
opening a Claude Code session.

### M2.1: Direct Q&A (no CC session required)

The mailboy already has an LLM; it just needs permission to answer.
When no active session exists and the user asks a general question, the mailboy should
reply with its own knowledge instead of asking "which project?".

**Changes:**
- Update the system prompt: give the mailboy explicit permission to answer questions directly
- Add a heuristic in `_run_locked`: if there's no active session and the message looks like a
  question (ends with `?`, contains `what/how/why/explain`), skip the tool loop and reply directly
- No new files; a prompt change plus a small logic tweak in `orchestrator/agent.py`
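
A minimal sketch of the question heuristic described above. The function names are hypothetical, and two details are assumptions rather than part of the plan: keywords are matched on word boundaries, and the full-width `？` is accepted alongside `?`.

```python
import re

# Keywords named in the roadmap item; word-boundary matching is an assumption.
_QUESTION_WORDS = re.compile(r"\b(what|how|why|explain)\b", re.IGNORECASE)


def looks_like_question(message: str) -> bool:
    """Cheap check: trailing question mark or a question keyword."""
    text = message.strip()
    if not text:
        return False
    # Assumption: also accept the full-width question mark used in Chinese text.
    if text.endswith(("?", "？")):
        return True
    return bool(_QUESTION_WORDS.search(text))


def should_answer_directly(message: str, has_active_session: bool) -> bool:
    """Skip the tool loop only when there is no active session."""
    return not has_active_session and looks_like_question(message)
```

A keyword check like this will produce false positives (for example "show me what changed"), so the system prompt change remains the main mechanism and the heuristic is only a shortcut.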

---

### M2.2: Shell Tool

Run arbitrary shell commands on the host machine and return stdout/stderr.
Covers: check git status, tail logs, kill a process, `ps aux | grep`, `pip list`, etc.

**New tool:** `orchestrator/tools.py` → `ShellTool`
```
name: "run_shell"
args: command (str), cwd (str, optional, defaults to WORKING_DIR), timeout (int, default 30)
returns: {stdout, stderr, exit_code}
```

**Safety guards:**
- Blocklist of destructive patterns (`rm -rf /`, `format`, `mkfs`, `shutdown`, `reboot`,
  `dd if=`, `:(){:|:&};:`); refuse with a clear error
- `cwd` must be under `WORKING_DIR` (reuse `_resolve_dir`) or be an explicit absolute path
  approved by the user (raise a confirmation request)
- Timeout hard cap: 120 s; for longer tasks see M2.4

**New slash command:** `/shell <command>` (bypasses LLM; runs directly)
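
The blocklist and timeout cap from the safety guards could look roughly like the sketch below. The helper names `check_command` and `clamp_timeout` and the exact regular expressions are assumptions; only the blocked items and the 30 s / 120 s values come from the plan above.

```python
import re
from typing import Optional

# Regex forms are assumptions; the blocked items come from the list above.
_BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+(-\w+\s+)*/(\s|$)"),          # rm -rf /
    re.compile(r"\bmkfs(\.\w+)?\b"),                  # mkfs, mkfs.ext4
    re.compile(r"\bdd\s+if="),                        # dd if=
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\}\s*;\s*:"),   # fork bomb
]
# Blocked only when they appear as the command name of a pipeline segment.
_BLOCKED_COMMANDS = {"format", "shutdown", "reboot"}
MAX_TIMEOUT = 120  # hard cap in seconds


def check_command(command: str) -> Optional[str]:
    """Return a refusal reason, or None if the command passes the blocklist."""
    for pattern in _BLOCKED_PATTERNS:
        if pattern.search(command):
            return f"blocked pattern: {pattern.pattern}"
    for segment in re.split(r"[;&|]+", command):
        words = segment.split()
        if words and words[0] in _BLOCKED_COMMANDS:
            return f"blocked command: {words[0]}"
    return None


def clamp_timeout(requested: int = 30) -> int:
    """Keep the timeout between 1 second and the hard cap."""
    return max(1, min(requested, MAX_TIMEOUT))
```

A pattern blocklist is easy to bypass (for example `sudo shutdown` passes this sketch), so it is best treated as a guard against accidents, with the `cwd` restriction and the allowlist doing the real containment.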

---

### M2.3: File Operations Tool

Read files, list directories, search content, and send files back to the user in Feishu.
Covers: "show me the error log", "what files are in my project?", "search for TODO comments"

**New tool:** `orchestrator/tools.py` → `FileOpsTool` (single tool, `action` dispatch)
```
name: "file_ops"
args: action ("read" | "list" | "search" | "send"), path (str), query (str, for search),
      max_bytes (int, default 8000)
```

- `read`: read file, truncate to `max_bytes`, return content
- `list`: recursive directory tree (depth-limited to 3), file sizes
- `search`: grep-like search (ripgrep or Python) for `query` in `path`
- `send`: upload and send file via `bot/feishu.py::send_file()` (already implemented);
  the tool receives `chat_id` via a context var (add alongside `current_user_id`)

**Safety:** all paths must be under `WORKING_DIR`
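
A minimal sketch of the path check and the truncated `read` action. `resolve_under` is a hypothetical stand-in for the existing `_resolve_dir` helper, whose real signature is not shown here, and the `[truncated]` marker is an assumption.

```python
from pathlib import Path


def resolve_under(root: Path, user_path: str) -> Path:
    """Resolve user_path against root and reject anything that escapes it."""
    root = root.resolve()
    # Joining with an absolute user_path discards root, so absolute paths
    # outside the root are caught by the same check as "../" traversal.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{user_path!r} is outside the working directory")
    return candidate


def read_truncated(path: Path, max_bytes: int = 8000) -> str:
    """Read at most max_bytes from path and decode leniently."""
    with path.open("rb") as fh:
        data = fh.read(max_bytes + 1)  # one extra byte to detect truncation
    truncated = len(data) > max_bytes
    text = data[:max_bytes].decode("utf-8", errors="replace")
    return text + ("\n[truncated]" if truncated else "")
```

Resolving before comparing means symlinks that point outside `WORKING_DIR` are rejected as well.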

---

### M2.4: Long-Running Task Manager

This is the key UX upgrade. `claude -p` and shell commands that take minutes need
fire-and-forget execution with a completion notification.

**Design:**
- Add `BackgroundTask` dataclass: `{task_id, description, started_at, status, conv_id_or_none}`
- `TaskRunner` singleton in `agent/task_runner.py`:
  - `submit(coro, description, notify_chat_id) -> task_id`
  - wraps coroutine in `asyncio.create_task`; on completion sends a Feishu notification
    via `send_text(notify_chat_id, ...)`
  - stores tasks in-memory dict `{task_id: BackgroundTask}`
- When `manager.send()` is called for a CC session:
  - if `cc_timeout > 60`: automatically run in background, return immediately with
    `"⏳ Task #<id> started. I'll notify you when it's done."`
  - otherwise run inline as today
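
The design above might be sketched as follows. This is an illustration under assumptions, not the planned `agent/task_runner.py`: the `notify` callable stands in for `send_text`, the `result` field and the failure branch (including the ❌ icon) are additions for the example, and task IDs are short random hex strings.

```python
import asyncio
import time
import uuid
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional, Set


@dataclass
class BackgroundTask:
    task_id: str
    description: str
    started_at: float
    status: str = "running"          # running | done | failed
    conv_id: Optional[str] = None
    result: Optional[str] = None     # assumption: keep the output for later lookup


class TaskRunner:
    def __init__(self, notify: Callable[[str, str], Awaitable[None]]):
        self._notify = notify        # stand-in for send_text(chat_id, text)
        self.tasks: Dict[str, BackgroundTask] = {}
        self._handles: Set[asyncio.Task] = set()  # keep references alive

    def submit(self, coro, description: str, notify_chat_id: str) -> str:
        """Schedule coro in the running event loop and return its task_id."""
        task_id = uuid.uuid4().hex[:6]
        self.tasks[task_id] = BackgroundTask(task_id, description, time.time())
        handle = asyncio.create_task(self._run(task_id, coro, notify_chat_id))
        self._handles.add(handle)
        handle.add_done_callback(self._handles.discard)
        return task_id

    async def _run(self, task_id: str, coro, chat_id: str) -> None:
        task = self.tasks[task_id]
        try:
            task.result = str(await coro)
            task.status, icon = "done", "✅"
        except Exception as exc:     # report failures instead of losing them
            task.result = repr(exc)
            task.status, icon = "failed", "❌"
        elapsed = int(time.time() - task.started_at)
        await self._notify(
            chat_id,
            f"{icon} Task #{task_id} {task.status} ({elapsed}s)\n"
            f"{task.description}\n\n{task.result[:800]}",
        )
```

`submit` must be called from inside a running event loop, since `asyncio.create_task` requires one. Holding the task handles in a set avoids the task being garbage-collected before it finishes.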

**New tool:** `run_background` explicitly submits any shell command or CC prompt as a
background task and returns `task_id` immediately.

**New slash command:** `/tasks` lists running and completed background tasks with their status.

**New tool:** `task_status` checks the status of a specific `task_id` and can optionally return the output so far.

**Notification format:**
```
✅ Task #abc123 done (42s)
/new todo_app: fix the login bug

[CC output truncated to 800 chars]...
```

---

### M2.5: Web Tool

Let the mailboy fetch URLs and search the web for quick answers.
Covers: "What changed in the latest LangChain?", "fetch this GitHub issue", "search for this paper for me"

**Backend:** 秘塔AI Search MCP (`https://metaso.cn/api/mcp`), an official API that is accessible
from mainland China and uses Bearer token auth. Requires `METASO_API_KEY` in `keyring.yaml`.
Get a key at: https://metaso.cn/search-api/api-keys

**New tool:** `WebTool` (three actions dispatched via one tool)
```
name: "web"
args: action ("search" | "fetch" | "ask"), query (str), url (str), scope (str), max_chars (int)
```

- `search`: calls `metaso_web_search`; returns top results (title + snippet + URL)
  - `scope` options: `webpage` (default), `paper`, `document`, `video`, `podcast`
- `fetch`: calls `metaso_web_reader` with `format=markdown`; extracts clean content from the URL
- `ask`: calls `metaso_chat`; returns a RAG answer combining search + generation (for quick factual Q&A)

**Implementation:** HTTP POST to `https://metaso.cn/api/mcp` with a JSON-RPC body and an
`Authorization: Bearer <METASO_API_KEY>` header. Use `httpx.AsyncClient` (already installed).

**New config key:** `METASO_API_KEY` in `keyring.yaml` and `config.py` (optional; WebTool
is disabled gracefully if not set)
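
A minimal sketch of how the three actions could be turned into a JSON-RPC request. It only builds the request and sends nothing. The `tools/call` method and its `name`/`arguments` params follow the MCP convention, and the argument names come from `docs/metaso.md`; the helper name `build_request`, the return shape, and the assumption that the endpoint accepts a plain POST without a prior MCP `initialize` handshake are not confirmed by this roadmap.

```python
import itertools
from typing import Any, Dict, Optional

MCP_URL = "https://metaso.cn/api/mcp"
_request_ids = itertools.count(1)

# Maps WebTool actions to the tool names listed in docs/metaso.md.
_ACTION_TO_TOOL = {
    "search": "metaso_web_search",
    "fetch": "metaso_web_reader",
    "ask": "metaso_chat",
}


def build_request(action: str, *, query: str = "", url: str = "",
                  scope: str = "webpage",
                  api_key: Optional[str] = None) -> Optional[Dict[str, Any]]:
    """Return {'url', 'headers', 'json'} for the call, or None without a key."""
    if not api_key:
        return None  # WebTool disabled gracefully when METASO_API_KEY is unset
    if action == "search":
        arguments: Dict[str, Any] = {"q": query, "scope": scope}
    elif action == "fetch":
        arguments = {"url": url, "format": "markdown"}
    elif action == "ask":
        arguments = {"message": query}
    else:
        raise ValueError(f"unknown action: {action}")
    body = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": _ACTION_TO_TOOL[action], "arguments": arguments},
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return {"url": MCP_URL, "headers": headers, "json": body}
```

The result can be passed to `httpx.AsyncClient().post(req["url"], headers=req["headers"], json=req["json"])`. Whether the server also expects an `Accept` header or a session handshake should be checked against the MCP transport documentation before relying on this.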

---

### M2.6: Scheduling & Reminders

Set a one-shot reminder or run a recurring check.
Covers: "remind me in 30 minutes", "check if the tests pass every 5 minutes"

**Design:** `agent/scheduler.py`, a thin wrapper around `asyncio` with:
- `schedule_once(delay_seconds, coro, description)`: fire once
- `schedule_recurring(interval_seconds, coro_factory, description, max_runs)`: repeat up to `max_runs` times
- All scheduled jobs send a Feishu notification on completion (same as M2.4)
- Jobs stored in-memory; cleared on server restart (acceptable for now)

**New tool:** `scheduler`
```
args: action ("remind" | "repeat"), delay_seconds (int), interval_seconds (int),
      message (str), conv_id (str, optional; if set, forward to that CC session)
```

**New slash command:** `/remind <N>m|h|s <message>` sets a reminder without the LLM
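
Parsing the `/remind` syntax above could be as small as the sketch below. The function name `parse_remind` and the error message are hypothetical; the result is the `delay_seconds` and `message` pair that `schedule_once` would need.

```python
import re
from typing import Tuple

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_REMIND = re.compile(r"^/remind\s+(\d+)([smh])\s+(.+)$", re.DOTALL)


def parse_remind(text: str) -> Tuple[int, str]:
    """Turn '/remind 10m deploy check' into (600, 'deploy check')."""
    match = _REMIND.match(text.strip())
    if not match:
        raise ValueError("usage: /remind <N>m|h|s <message>")
    amount, unit, message = match.groups()
    return int(amount) * _UNIT_SECONDS[unit], message.strip()
```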

---

## Implementation Order

1. **M2.1**: Direct Q&A (prompt + 10-line logic change; highest ROI, zero risk)
2. **M2.4**: Background task runner (unblocks long CC jobs; foundational for M2.5/M2.6)
3. **M2.2**: Shell tool (most-used phone use case)
4. **M2.3**: File ops tool (`send_file` already done; rest is straightforward)
5. **M2.5**: Web tool (秘塔AI MCP; needs `METASO_API_KEY`)
6. **M2.6**: Scheduling (builds on M2.4 notification infra)

---

## Files to Create / Modify

| File | Change |
|------|--------|
| `orchestrator/agent.py` | M2.1 prompt update + question heuristic |
| `orchestrator/tools.py` | Add `ShellTool`, `FileOpsTool`, `WebTool`, `TaskStatusTool`, `SchedulerTool` |
| `agent/task_runner.py` | New: `TaskRunner` singleton, `BackgroundTask` dataclass |
| `agent/scheduler.py` | New: `schedule_once`, `schedule_recurring` |
| `bot/commands.py` | Add `/shell`, `/tasks`, `/remind` commands |
| `bot/feishu.py` | Add `chat_id` context var for file send from tool |
| `bot/handler.py` | Pass `chat_id` into context var alongside `user_id` |
| `requirements.txt` | Add `httpx` (if not already present as transitive dep) |

---

## Verification Checklist

- [ ] M2.1: Ask "what is a Python generator?": mailboy replies directly, no tool call
- [ ] M2.2: Send "check git status in todo_app": `ShellTool` runs, output returned
- [ ] M2.2: Send "rm -rf /": blocked by safety guard
- [ ] M2.3: Send "show me the last 50 lines of audit/abc123.jsonl": file content returned
- [ ] M2.3: Send "send me the sessions.json file": file arrives in Feishu chat
- [ ] M2.4: Start a long CC task (e.g. `--timeout 120`): bot replies immediately, notifies on finish
- [ ] M2.4: `/tasks`: lists running task with elapsed time
- [ ] M2.5: "What are the new features in Python 3.13?": `web ask` returns RAG answer from metaso
- [ ] M2.5: "Read this URL for me: https://example.com": page content extracted as markdown
- [ ] M2.6: `/remind 10m deploy check`: 10 min later, message arrives in Feishu
134
docs/metaso.md
Normal file
@ -0,0 +1,134 @@
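秘塔AI Search MCP Service
Introduction
The 秘塔AI Search MCP service is an intelligent search and question-answering service built on the Model Context Protocol (MCP). It gives AI assistants web search, content reading, and question-answering capabilities. By integrating it, an AI assistant can retrieve information from the web in real time, read web page content, and answer questions accurately using RAG.

Service addresses
ModelScope page: https://www.modelscope.cn/mcp/servers/metasota/metaso-search

API endpoint: https://metaso.cn/api/mcp

Features
🔍 Multi-dimensional search
Searches web pages, documents, papers, images, videos, podcasts, and other content types
Flexible search scope configuration
Configurable number of returned results
📖 Web page content reading
Extracts web page content from any URL
Provides two output formats: JSON and Markdown
Intelligent content parsing and structuring
💬 Question-answering service
Based on RAG (retrieval-augmented generation)
Supports multiple models; uses the fast model by default
Combines search results to give accurate answers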
Tools
1. metaso_web_search - web search tool
Description: searches web pages, documents, papers, images, videos, podcasts, and other content by keyword

Parameters:

q (required, string): search query keywords
scope (optional, string): search scope; one of webpage, document, paper, image, video, podcast
includeSummary (optional, boolean): use page summary information to improve search recall
includeRawContent (optional, boolean): fetch the original text of all source pages
size (optional, integer): number of results to return; defaults to 10
Example:

{
  "q": "latest developments in artificial intelligence",
  "scope": "paper",
  "includeSummary": true,
  "size": 5
}
2. metaso_web_reader - web page reader tool
Description: reads the web page content at the given URL

Parameters:

url (required, string): the URL to read
format (required, string): output format; one of json, markdown
Example:

{
  "url": "https://example.com/article",
  "format": "markdown"
}
3. metaso_chat - question-answering tool
Description: RAG-based question-answering service

Parameters:

message (required, string): the user's question
model (optional, string): the model to use; defaults to "fast"
Example:

{
  "message": "Please explain the basic principles of quantum computing",
  "model": "fast"
}
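
The parameter rules above can be checked on the client side before a request is sent. The sketch below is only an illustration: the function names are hypothetical, and the defaults chosen for `scope`, `includeSummary`, and `includeRawContent` are assumptions, since this document states a default only for `size`.

```python
from typing import Any, Dict

# Allowed values as listed in the parameter descriptions above.
SEARCH_SCOPES = {"webpage", "document", "paper", "image", "video", "podcast"}
READER_FORMATS = {"json", "markdown"}


def search_arguments(q: str, scope: str = "webpage", size: int = 10,
                     include_summary: bool = False,
                     include_raw_content: bool = False) -> Dict[str, Any]:
    """Build the arguments object for metaso_web_search."""
    if not q.strip():
        raise ValueError("q is required")
    if scope not in SEARCH_SCOPES:
        raise ValueError(f"scope must be one of {sorted(SEARCH_SCOPES)}")
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        "q": q,
        "scope": scope,
        "includeSummary": include_summary,
        "includeRawContent": include_raw_content,
        "size": size,
    }


def reader_arguments(url: str, format: str = "markdown") -> Dict[str, str]:
    """Build the arguments object for metaso_web_reader."""
    if not url:
        raise ValueError("url is required")
    if format not in READER_FORMATS:
        raise ValueError(f"format must be one of {sorted(READER_FORMATS)}")
    return {"url": url, "format": format}
```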
Configuration
1. Basic configuration
Add the following to your MCP client configuration file:

Replace YOUR_API_KEY with your own API key.

{
  "mcpServers": {
    "metaso": {
      "url": "https://metaso.cn/api/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
VSCode configuration

{
  "servers": {
    "metaso": {
      "url": "https://metaso.cn/api/mcp",
      "type": "http",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  },
  "inputs": []
}
2. Getting an API key
Visit the 秘塔AI Search website
Register and log in to your account
Get an API key from the API console (https://metaso.cn/search-api/api-keys)
Replace YOUR_API_KEY in the configuration with your key
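Use cases
📚 Academic research
Search for the latest papers and research material
Retrieve academic documents in a specific field
Quickly gather background information for research
📰 Information retrieval
Real-time news and information search
Fast extraction of web page content
Discovery of multimedia content
🤖 AI enhancement
Give AI assistants real-time information retrieval
Extend the knowledge base of dialogue systems
Support question answering based on up-to-date information
💼 Business applications
Market research and competitor analysis
Industry trend monitoring
Building customer service knowledge bases
Technical advantages
High performance: built on the 秘塔AI Search engine
Multi-format support: supports multiple content types and output formats
RAG: combines retrieval and generation to give accurate answers
Easy to integrate: standard MCP protocol with broad compatibility
Flexible configuration: a rich set of parameters for different needs
Notes
API quota: mind the API call quota limits and use the service reasonably.
Content compliance: comply with applicable laws and regulations when searching for and retrieving content.
Caching: implement an appropriate caching strategy to improve efficiency.
Error handling: implement suitable error handling in your application.
Support and feedback
If you run into problems or have suggestions for improvement, you can contact us through the following channels:

Official technical support email: support-1@metasota.ai
Website customer service: 19980541467 (same number on WeChat)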