添加Pdf读取mcp

This commit is contained in:
zqm
2025-10-22 16:24:07 +08:00
parent 0d8520123e
commit 64d1e220d4
48 changed files with 21213 additions and 0 deletions

View File

@@ -0,0 +1,104 @@
<!-- Version: 1.36 | Last Updated: 2025-04-07 | Updated By: Sylph -->
# Active Context: PDF Reader MCP Server (Guidelines Alignment)
## 1. Current Focus
Project alignment and documentation according to Sylph Lab Playbook guidelines are complete. CI workflow fixed (formatting, publish step, Dockerfile, parallelization, pre-commit hook), Test Analytics integrated, and Git history corrected multiple times. Dockerfile updated to use LTS Node. Version bumped to `0.3.16` and pushed successfully.
## 2. Recent Changes (Chronological Summary)
- Cloned `filesystem-mcp` as a base.
- Updated `package.json` (name, version, description).
- Implemented initial PDF tools using `pdf-parse`.
- Removed unused filesystem handlers.
- Added URL support to `pdf-parse` based tools.
- Consolidated tools into a single `read_pdf` handler.
- **Switched PDF Library:** Uninstalled `pdf-parse`, installed `pdfjs-dist`.
- Rewrote the `read_pdf` handler (`src/handlers/readPdf.ts`) to use `pdfjs-dist`.
- Updated `README.md` and Memory Bank files to reflect the switch to `pdfjs-dist` and the consolidated tool.
- **Added Multiple Source Support & Per-Source Pages:** Modified `read_pdf` handler and schema to accept an array of `sources`. Moved the optional `pages` parameter into each source object.
- Created `CHANGELOG.md` and `LICENSE`.
- Updated `.github/workflows/publish.yml` initially.
- **Guidelines Alignment (Initial):**
- Removed sponsorship information (`.github/FUNDING.yml`, `README.md` badges).
- Updated `package.json` scripts (`lint`, `format`, `validate`, added `test:watch`, etc.) and removed unused dependencies.
- Verified `tsconfig.json`, `eslint.config.js`, `.prettierrc.cjs`, `vitest.config.ts` alignment.
- Updated `.gitignore`.
- Refactored GitHub Actions workflow to `.github/workflows/ci.yml`.
- Added tests (~95% coverage).
- Updated Project Identity (`sylphlab` scope).
- **Guidelines Alignment (Configuration Deep Dive):**
- Updated `package.json` with missing metadata, dev dependencies (`husky`, `lint-staged`, `commitlint`, `typedoc`, `standard-version`), scripts (`start`, `typecheck`, `prepare`, `benchmark`, `release`, `clean`, `docs:api`, `prepublishOnly`), and `files` array.
- Updated `tsconfig.json` with missing compiler options and refined `exclude` array.
- Updated `eslint.config.js` to enable `stylisticTypeChecked`, enforce stricter rules (`no-unused-vars`, `no-explicit-any` to `error`), and add missing recommended rules.
- Created `.github/dependabot.yml` for automated dependency updates.
- Updated `.github/workflows/ci.yml` to use fixed Action versions and add Coveralls integration.
- Set up Git Hooks using Husky (`pre-commit` with `lint-staged`, `commit-msg` with `commitlint`) and created `commitlint.config.cjs`.
- **Benchmarking & Documentation:**
- Created initial benchmark file, fixed TS errors, and successfully ran benchmarks (`pnpm run benchmark`) after user provided `test/fixtures/sample.pdf`.
- Updated `docs/performance/index.md` with benchmark setup and initial results.
- **API Doc Generation:**
- Initially encountered persistent TypeDoc v0.28.1 initialization error with Node.js script.
- **Resolved:** Changed `docs:api` script in `package.json` to directly call TypeDoc CLI (`typedoc --entryPoints ...`). Successfully generated API docs.
- **Documentation Finalization:**
- Reviewed and updated `README.md`, `docs/guide/getting-started.md`, and VitePress config (`docs/.vitepress/config.mts`) based on guidelines.
- **Code Commit:** Committed and pushed all recent changes.
- **CI Fixes & Enhancements:**
- Fixed Prettier formatting issues identified by CI.
- Fixed ESLint errors/warnings (`no-undef`, `no-unused-vars`, `no-unsafe-call`, `require-await`, unused eslint-disable) identified by CI.
- Deleted unused `scripts/generate-api-docs.mjs` file.
- **Fixed `pnpm publish` error:** Added `--no-git-checks` flag to the publish command in `.github/workflows/ci.yml` to resolve `ERR_PNPM_GIT_UNCLEAN` error during tag-triggered publish jobs.
- **Integrated Codecov Test Analytics:** Updated `package.json` to generate JUnit XML test reports and added `codecov/test-results-action@v1` to `.github/workflows/ci.yml` to upload them.
- Added `test-report.junit.xml` to `.gitignore`.
- **Switched Coverage Tool:** Updated `.github/workflows/ci.yml` to replace Coveralls with Codecov based on user feedback. Added Codecov badge to `README.md`.
- **Version Bump & CI Saga (0.3.11 -> 0.3.16):**
- **Initial Goal (0.3.11):** Fix CI publish error (`--no-git-checks`), integrate Test Analytics, add `.gitignore` entry.
- **Problem 1:** Incorrect Git history manipulation led to pushing an incomplete `v0.3.11`.
- **Problem 2:** Force push/re-push of corrected `v0.3.11` / `v0.3.12` / `v0.3.13` / `v0.3.14` tags didn't trigger workflow or failed on CI checks.
- **Problem 3:** CI failed on `check-format` due to unformatted `ci.yml` / `CHANGELOG.md` (not caught by pre-commit hook initially).
- **Problem 4:** Further Git history confusion led to incorrect version bumps (`0.3.13`, `0.3.14`, `0.3.15`) and tag creation issues due to unstaged changes and leftover local tags.
- **Problem 5:** Docker build failed due to incorrect lockfile and missing `pnpm` install in `Dockerfile`.
- **Problem 6:** Workflow parallelization changes were not committed before attempting a release.
- **Problem 7:** `publish-npm` job failed due to missing dependencies for `prepublishOnly` script.
- **Problem 8:** `pre-commit` hook was running `pnpm test` instead of `pnpm lint-staged`.
- **Problem 9:** Docker build failed again due to `husky` command not found during `pnpm prune`.
- **Problem 10:** Dockerfile was using hardcoded `node:20-alpine` instead of `node:lts-alpine`.
- **Final Resolution:** Reset history multiple times, applied fixes sequentially (formatting `fe7eda1`, Dockerfile pnpm install `c202fd4`, parallelization `a569b62`, pre-commit/npm-publish fix `e96680c`, Dockerfile prune fix `02f3f91`, Dockerfile LTS `50f9bdd`), ensured clean working directory, ran `standard-version` successfully to create `v0.3.16` commit and tag, pushed `main` and tag `v0.3.16`.
- **Fixed `package.json` Paths:** Corrected `bin`, `files`, and `start` script paths from `build/` to `dist/` to align with `tsconfig.json` output directory and resolve executable error.
- **Committed & Pushed Fix:** Committed (`ab1100d`) and pushed the `package.json` path fix to `main`.
- **Version Bump & Push:** Bumped version to `0.3.17` using `standard-version` (commit `bb9d2e5`) and pushed the commit and tag `v0.3.17` to `main`.
## 3. Next Steps
- **Build Completed:** Project successfully built (`pnpm run build`).
- **GitHub Actions Status:**
- Pushed commit `c150022` (CI run `14298157760` **passed** format/lint/test checks, but **failed** at Codecov upload due to missing `CODECOV_TOKEN`).
- Pushed tag `v0.3.10` (Triggered publish/release workflow - status needed verification).
- **Pushed tag `v0.3.16`**. Publish/release workflow triggered. Status needs verification.
- **Runtime Testing (Blocked):** Requires user interaction with `@modelcontextprotocol/inspector` or a live agent. Skipping for now.
- **Documentation Finalization (Mostly Complete):**
- API docs generated.
- Main pages reviewed/updated.
- Codecov badge added (requires manual token update in `README.md`).
- **Remaining:** Add complex features (PWA, share buttons, roadmap page) if requested.
- **Release Preparation:**
- `CHANGELOG.md` updated for `0.3.10`.
- **Project is ready for final review. Requires Codecov token configuration and verification of the `v0.3.16` publish/release workflow.**
## 4. Active Decisions & Considerations
- **Switched to pnpm:** Changed package manager from npm to pnpm.
- **Using `pdfjs-dist` as the core PDF library.**
- Adopted the handler definition pattern from `filesystem-mcp`.
- Consolidated tools into a single `read_pdf` handler.
- Aligned project configuration with Guidelines.
- **Accepted ~95% test coverage**.
- **No Sponsorship:** Project will not include sponsorship links or files.
- **Using TypeDoc CLI for API Doc Generation:** Bypassed script initialization issues.
- **Switched to Codecov:** Replaced Coveralls with Codecov for coverage reporting. Test Analytics integration added.
- **Codecov Token Required:** CI is currently blocked on Codecov upload (coverage and test results) due to missing `CODECOV_TOKEN` secret in GitHub repository settings. This needs to be added by the user.
- **Version bumped to `0.3.17`**.
- **Publish Workflow:** Parallelized. Modified to bypass Git checks during `pnpm publish`. Docker build fixed (pnpm install, prune ignore scripts, LTS node). Dependencies installed before publish. Verification pending on the `v0.3.17` workflow run.
- **CI Workflow:** Added Codecov Test Analytics upload step. Formatting fixed. Parallelized publish steps.
- **Pre-commit Hook:** Fixed to run `lint-staged`.

View File

@@ -0,0 +1,40 @@
# Product Context: PDF Reader MCP Server
## 1. Problem Solved
AI agents often need to access information contained within PDF documents as
part of user tasks (e.g., summarizing reports, extracting data from invoices,
referencing documentation). Directly providing PDF file content to the agent is
inefficient (large token count) and often impossible due to binary format.
Executing external CLI tools for each PDF interaction can be slow, insecure, and
lack structured output.
This MCP server provides a secure, efficient, and structured way for agents to
interact with PDF files within the user's project context.
## 2. How It Should Work
- The server runs as a background process, managed by the agent's host
environment.
- The host environment ensures the server is launched with its working directory
set to the user's current project root.
- The agent uses MCP calls to invoke specific PDF reading tools provided by the
server.
- The agent provides the relative path to the target PDF file within the project
root.
- The server uses the `pdf-parse` library to process the PDF.
- The server returns structured data (text, metadata, page count) back to the
agent via MCP.
- All file access is strictly limited to the project root directory.
## 3. User Experience Goals
- **Seamless Integration:** The agent should be able to use the PDF tools
naturally as part of its workflow without complex setup for the end-user.
- **Reliability:** Tools should reliably parse standard PDF files and return
accurate information or clear error messages.
- **Security:** Users should trust that the server only accesses files within
the intended project scope.
- **Efficiency:** Reading PDF data should be reasonably fast and avoid excessive
token usage compared to sending raw file content (which isn't feasible
anyway).

View File

@@ -0,0 +1,61 @@
<!-- Version: 1.37 | Last Updated: 2025-04-07 | Updated By: Sylph -->
# Progress: PDF Reader MCP Server (Guidelines Applied)
## 1. What Works
- **Project Setup:** Cloned from `filesystem-mcp`, dependencies installed (using pnpm).
- **Core Tool Handler (Consolidated, using `pdfjs-dist`, multi-source, per-source pages):**
- `read_pdf`: Implemented and integrated.
- **MCP Server Structure:** Basic server setup working.
- **Changelog:** `CHANGELOG.md` created and updated for `1.0.0`.
- **License:** `LICENSE` file created (MIT).
- **GitHub Actions:** `.github/workflows/ci.yml` refactored for CI/CD according to guidelines. Fixed `pnpm publish` step (`--no-git-checks`), added Test Analytics upload, fixed formatting, fixed Docker build step (`Dockerfile` - pnpm install, prune, LTS node), parallelized publish jobs, fixed pre-commit hook. Git history corrected multiple times.
- **Testing Framework (Vitest):**
- Integrated, configured. All tests passing. Coverage at ~95% (accepted).
- **Linter (ESLint):**
- Integrated, configured. Codebase passes all checks.
- **Formatter (Prettier):**
- Integrated, configured. Codebase formatted.
- **TypeScript Configuration:** `tsconfig.json` updated with strictest settings.
- **Package Configuration:** `package.json` updated.
- **Git Ignore:** `.gitignore` updated (added JUnit report).
- **Sponsorship:** Removed.
- **Project Identity:** Updated scope to `@sylphlab`.
- **Git Hooks:** Configured using Husky, lint-staged, and commitlint.
- **Dependency Updates:** Configured using Dependabot.
- **Compilation:** Completed successfully (`pnpm run build`).
- **Benchmarking:**
- Created and ran initial benchmarks.
- **Documentation (Mostly Complete):**
- VitePress site setup.
- `README.md`, Guide, Design, Performance, Comparison sections reviewed/updated.
- `CONTRIBUTING.md` created.
- Performance section updated with benchmark results.
- **API documentation generated successfully using TypeDoc CLI.**
- VitePress config updated with minor additions.
- **Version Control:** All recent changes committed (incl. formatting `fe7eda1`, Dockerfile pnpm install `c202fd4`, parallelization `a569b62`, pre-commit/npm-publish fix `e96680c`, Dockerfile prune fix `02f3f91`, Dockerfile LTS `50f9bdd`, `package.json` path fix `ab1100d`, release commit for `v0.3.17` `bb9d2e5`). Tag `v0.3.17` created and pushed.
- **Package Executable Path:** Fixed incorrect paths (`build/` -> `dist/`) in `package.json` (`bin`, `files`, `start` script).
## 2. What's Left to Build/Verify
- **Runtime Testing (Blocked):** Requires user interaction.
- **Publishing Workflow Test:** Triggered by pushing tag `v0.3.17`. Needs verification.
- **Documentation (Optional Enhancements):**
- Add complex features (PWA, share buttons, roadmap page) if requested.
- **Release Preparation:**
- Final review before tagging `1.0.0`.
- Consider using `standard-version` or similar for final release tagging/publishing.
## 3. Current Status
Project configuration and core functionality are aligned with guidelines. Documentation is largely complete, including generated API docs. Codebase passes all checks and tests (~95% coverage). **Version bumped to `0.3.17` and tag pushed. Project is ready for final review and workflow verification.**
## 4. Known Issues/Risks
- **100% Coverage Goal:** Currently at **~95%**. This level is deemed acceptable.
- **`pdfjs-dist` Complexity:** API complexity, text extraction accuracy depends on PDF, potential Node.js compatibility nuances.
- **Error Handling:** Basic handling implemented; specific PDF parsing errors might need refinement.
- **Performance:** Initial benchmarks run on a single sample file. Performance on diverse PDFs needs further investigation if issues arise.
- **Per-Source Pages:** Logic handles per-source `pages`; testing combinations is important (covered partially by benchmarks).
- **TypeDoc Script Issue:** Node.js script for TypeDoc failed, but CLI workaround is effective.

View File

@@ -0,0 +1,35 @@
# Project Brief: PDF Reader MCP Server
## 1. Project Goal
To create a Model Context Protocol (MCP) server that allows AI agents (like
Cline) to securely read and extract information (text, metadata, page count)
from PDF files located within a specified project directory.
## 2. Core Requirements
- Implement an MCP server using Node.js and TypeScript.
- Base the server on the existing `@shtse8/filesystem-mcp` structure.
- Provide MCP tools for:
- Reading all text content from a PDF.
- Reading text content from specific pages of a PDF.
- Reading metadata from a PDF.
- Getting the total page count of a PDF.
- Ensure all operations are confined to the project root directory determined at
server launch.
- Use relative paths for all file operations.
- Utilize the `pdf-parse` library for PDF processing.
- Maintain clear documentation (README, Memory Bank).
- Package the server for distribution via npm and Docker Hub.
## 3. Scope
- **In Scope:** Implementing the core PDF reading tools, packaging, basic
documentation.
- **Out of Scope (Initially):** Advanced PDF features (image extraction,
annotation reading, form filling), complex error recovery beyond basic file
access/parsing errors, UI for the server.
## 4. Target User
AI agents interacting with user projects that contain PDF documents.

View File

@@ -0,0 +1,94 @@
# System Patterns: PDF Reader MCP Server
## 1. Architecture Overview
The PDF Reader MCP server is a standalone Node.js application based on the
original Filesystem MCP. It's designed to run as a child process, communicating
with its parent (the AI agent host) via standard input/output (stdio) using the
Model Context Protocol (MCP) to provide PDF reading capabilities.
```mermaid
graph LR
A[Agent Host Environment] -- MCP over Stdio --> B(PDF Reader MCP Server);
B -- Node.js fs/path/pdfjs-dist --> C[User Filesystem (Project Root)];
C -- Results/Data --> B;
B -- MCP over Stdio --> A;
```
## 2. Key Technical Decisions & Patterns
- **MCP SDK Usage:** Leverages the `@modelcontextprotocol/sdk` for handling MCP
communication (request parsing, response formatting, error handling). This
standardizes interaction and reduces boilerplate code.
- **Stdio Transport:** Uses `StdioServerTransport` from the SDK for
communication, suitable for running as a managed child process.
- **Asynchronous Operations:** All filesystem interactions and request handling
are implemented using `async/await` and Node.js's promise-based `fs` module
(`fs.promises`) for non-blocking I/O.
- **Strict Path Resolution:** A dedicated `resolvePath` function is used for
_every_ path received from the agent.
- It normalizes the path.
- It resolves the path relative to the server process's current working
directory (`process.cwd()`), which is treated as the `PROJECT_ROOT`.
**Crucially, this requires the process launching the server (e.g., the agent
host) to set the correct `cwd` for the target project.**
- It explicitly checks if the resolved absolute path still starts with the
`PROJECT_ROOT` absolute path to prevent path traversal vulnerabilities
(e.g., `../../sensitive-file`).
- It rejects absolute paths provided by the agent.
- **Zod for Schemas & Validation:** Uses `zod` library to define input schemas
for tools and perform robust validation within each handler. JSON schemas for
MCP listing are generated from Zod schemas.
- **Tool Definition Aggregation:** Tool definitions (name, description, Zod
schema, handler function) are defined in their respective handler files and
aggregated in `src/handlers/index.ts` for registration in `src/index.ts`.
- **`edit_file` Logic:**
- Processes multiple changes per file, applying them sequentially from
bottom-to-top to minimize line number conflicts.
- Handles insertion, text replacement, and deletion.
- Implements basic indentation detection (`detect-indent`) and preservation
for insertions/replacements.
- Uses `diff` library to generate unified diff output.
- **Error Handling:**
- Uses `try...catch` blocks within each tool handler.
- Catches specific Node.js filesystem errors (like `ENOENT`, `EPERM`,
`EACCES`) and maps them to appropriate MCP error codes (`InvalidRequest`).
- Uses custom `McpError` objects for standardized error reporting back to the
agent.
- Logs unexpected errors to the server's console (`stderr`) for debugging.
- **Glob for Listing/Searching:** Uses the `glob` library for flexible and
powerful file listing and searching based on glob patterns, including
recursive operations and stat retrieval. Careful handling of `glob`'s
different output types based on options (`string[]`, `Path[]`, `Path[]` with
`stats`) is implemented.
- **TypeScript:** Provides static typing for better code maintainability, early
error detection, and improved developer experience. Uses ES module syntax
(`import`/`export`).
- **PDF Parsing:** Uses Mozilla's `pdfjs-dist` library to load PDF documents and
extract text content, metadata, and page information. The `read_pdf` handler
uses its API.
## 3. Component Relationships
- **`index.ts`:** Main entry point. Sets up the MCP server instance, defines
tool schemas, registers request handlers, and starts the server connection.
- **`Server` (from SDK):** Core MCP server class handling protocol logic.
- **`StdioServerTransport` (from SDK):** Handles reading/writing MCP messages
via stdio.
- **Tool Handler Function (`handleReadPdfFunc`):** Contains the logic for the
consolidated `read_pdf` tool, including Zod argument validation, path
resolution, PDF loading/parsing via `pdfjs-dist`, and result formatting based
on input parameters.
- **`resolvePath` Helper:** Centralized security function for path validation.
- **`formatStats` Helper:** Utility to create a consistent stats object
structure.
- **Node.js Modules (`fs`, `path`):** Used for actual filesystem operations and
path manipulation.
- **`glob` Library:** Used for pattern-based file searching and listing.
- **`zod` Library:** Used for defining and validating tool input schemas.
- **`diff` Library:** (Inherited, but not used by PDF tools) Used by
`edit_file`.
- **`detect-indent` Library:** (Inherited, but not used by PDF tools) Used by
`edit_file`.
- **`pdfjs-dist` Library:** Used by the `read_pdf` handler to load and process
PDF documents.

View File

@@ -0,0 +1,67 @@
<!-- Version: 1.10 | Last Updated: 2025-04-06 | Updated By: Sylph -->
# Tech Context: PDF Reader MCP Server
## 1. Core Technologies
- **Runtime:** Node.js (>= 18.0.0 recommended)
- **Language:** TypeScript (Compiled to JavaScript for execution)
- **Package Manager:** pnpm (Switched from npm to align with guidelines)
- **Linter:** ESLint (with TypeScript support, including **strict type-aware rules**)
- **Formatter:** Prettier
- **Testing:** Vitest (with **~95% coverage achieved**)
- **Git Hooks:** Husky, lint-staged, commitlint
- **Dependency Update:** Dependabot
## 2. Key Libraries/Dependencies
- **`@modelcontextprotocol/sdk`:** The official SDK for implementing MCP servers and clients.
- **`glob`:** Library for matching files using glob patterns.
- **`pdfjs-dist`:** Mozilla's PDF rendering and parsing library.
- **`zod`:** Library for schema declaration and validation.
- **`zod-to-json-schema`:** Utility to convert Zod schemas to JSON schemas.
- **Dev Dependencies (Key):**
- **`typescript`:** TypeScript compiler (`tsc`).
- **`@types/node`:** TypeScript type definitions for Node.js.
- **`@types/glob`:** TypeScript type definitions for `glob`.
- **`vitest`:** Test runner framework.
- **`@vitest/coverage-v8`:** Coverage provider for Vitest.
- **`eslint`:** Core ESLint library.
- **`typescript-eslint`:** Tools for ESLint + TypeScript integration.
- **`prettier`:** Code formatter.
- **`eslint-config-prettier`:** Turns off ESLint rules that conflict with Prettier.
- **`husky`:** Git hooks manager.
- **`lint-staged`:** Run linters on staged files.
- **`@commitlint/cli` & `@commitlint/config-conventional`:** Commit message linting.
- **`standard-version`:** Release automation tool.
- **`typedoc` & `typedoc-plugin-markdown`:** API documentation generation.
- **`vitepress` & `vue`:** Documentation website framework.
## 3. Development Setup
- **Source Code:** Located in the `src` directory.
- **Testing Code:** Located in the `test` directory.
- **Main File:** `src/index.ts`.
- **Configuration:**
- `tsconfig.json`: TypeScript compiler options (**strictest settings enabled**, includes recommended options like `declaration` and `sourceMap`).
- `vitest.config.ts`: Vitest test runner configuration (**100% coverage thresholds set**, ~95% achieved).
- `eslint.config.js`: ESLint flat configuration (integrates Prettier, enables **strict type-aware linting** and **additional guideline rules**).
- `.prettierrc.cjs`: Prettier formatting rules.
- `.gitignore`: Specifies intentionally untracked files (`node_modules/`, `build/`, `coverage/`, etc.).
- `.github/workflows/ci.yml`: GitHub Actions workflow (validation, publishing, release, **fixed Action versions**, **Coveralls**).
- `.github/dependabot.yml`: Automated dependency update configuration.
- `package.json`: Project metadata, dependencies, and npm scripts (includes `start`, `typecheck`, `prepare`, `benchmark`, `release`, `clean`, `docs:api`, `prepublishOnly`, etc.).
- `commitlint.config.cjs`: Commitlint configuration.
- `.husky/`: Directory containing Git hook scripts.
- **Build Output:** Compiled JavaScript in the `build` directory.
- **Execution:** Run via `node build/index.js` or `npm start`.
## 4. Technical Constraints & Considerations
- **Node.js Environment:** Relies on Node.js runtime (>=18.0.0) and built-in modules.
- **Permissions:** Server process permissions affect filesystem operations.
- **Cross-Platform Compatibility:** Filesystem behaviors might differ. Code uses Node.js `path` module to mitigate.
- **Error Handling:** Relies on Node.js error codes and McpError.
- **Security Model:** Relies on `resolvePath` for path validation within `PROJECT_ROOT`.
- **Project Root Determination:** `PROJECT_ROOT` is the server's `process.cwd()`. The launching process must set this correctly.