Add PDF reader MCP
pdf-reader-mcp/memory-bank/activeContext.md
Normal file
@@ -0,0 +1,104 @@
<!-- Version: 1.36 | Last Updated: 2025-04-07 | Updated By: Sylph -->

# Active Context: PDF Reader MCP Server (Guidelines Alignment)

## 1. Current Focus

Project alignment and documentation per the Sylph Lab Playbook guidelines are complete. The CI workflow has been fixed (formatting, publish step, Dockerfile, parallelization, pre-commit hook), Test Analytics is integrated, and Git history has been corrected multiple times. The Dockerfile now uses the LTS Node image. Version bumped to `0.3.16` and pushed successfully.

## 2. Recent Changes (Chronological Summary)

- Cloned `filesystem-mcp` as a base.
- Updated `package.json` (name, version, description).
- Implemented initial PDF tools using `pdf-parse`.
- Removed unused filesystem handlers.
- Added URL support to `pdf-parse` based tools.
- Consolidated tools into a single `read_pdf` handler.
- **Switched PDF Library:** Uninstalled `pdf-parse`, installed `pdfjs-dist`.
- Rewrote the `read_pdf` handler (`src/handlers/readPdf.ts`) to use `pdfjs-dist`.
- Updated `README.md` and Memory Bank files to reflect the switch to `pdfjs-dist` and the consolidated tool.
- **Added Multiple Source Support & Per-Source Pages:** Modified `read_pdf` handler and schema to accept an array of `sources`. Moved the optional `pages` parameter into each source object.
- Created `CHANGELOG.md` and `LICENSE`.
- Updated `.github/workflows/publish.yml` initially.
- **Guidelines Alignment (Initial):**
  - Removed sponsorship information (`.github/FUNDING.yml`, `README.md` badges).
  - Updated `package.json` scripts (`lint`, `format`, `validate`, added `test:watch`, etc.) and removed unused dependencies.
  - Verified `tsconfig.json`, `eslint.config.js`, `.prettierrc.cjs`, `vitest.config.ts` alignment.
  - Updated `.gitignore`.
  - Refactored GitHub Actions workflow to `.github/workflows/ci.yml`.
  - Added tests (~95% coverage).
  - Updated Project Identity (`sylphlab` scope).
- **Guidelines Alignment (Configuration Deep Dive):**
  - Updated `package.json` with missing metadata, dev dependencies (`husky`, `lint-staged`, `commitlint`, `typedoc`, `standard-version`), scripts (`start`, `typecheck`, `prepare`, `benchmark`, `release`, `clean`, `docs:api`, `prepublishOnly`), and `files` array.
  - Updated `tsconfig.json` with missing compiler options and refined `exclude` array.
  - Updated `eslint.config.js` to enable `stylisticTypeChecked`, enforce stricter rules (`no-unused-vars`, `no-explicit-any` to `error`), and add missing recommended rules.
  - Created `.github/dependabot.yml` for automated dependency updates.
  - Updated `.github/workflows/ci.yml` to use fixed Action versions and add Coveralls integration.
  - Set up Git Hooks using Husky (`pre-commit` with `lint-staged`, `commit-msg` with `commitlint`) and created `commitlint.config.cjs`.
- **Benchmarking & Documentation:**
  - Created initial benchmark file, fixed TS errors, and successfully ran benchmarks (`pnpm run benchmark`) after user provided `test/fixtures/sample.pdf`.
  - Updated `docs/performance/index.md` with benchmark setup and initial results.
- **API Doc Generation:**
  - Initially encountered persistent TypeDoc v0.28.1 initialization error with Node.js script.
  - **Resolved:** Changed `docs:api` script in `package.json` to directly call TypeDoc CLI (`typedoc --entryPoints ...`). Successfully generated API docs.
- **Documentation Finalization:**
  - Reviewed and updated `README.md`, `docs/guide/getting-started.md`, and VitePress config (`docs/.vitepress/config.mts`) based on guidelines.
- **Code Commit:** Committed and pushed all recent changes.
- **CI Fixes & Enhancements:**
  - Fixed Prettier formatting issues identified by CI.
  - Fixed ESLint errors/warnings (`no-undef`, `no-unused-vars`, `no-unsafe-call`, `require-await`, unused eslint-disable) identified by CI.
  - Deleted unused `scripts/generate-api-docs.mjs` file.
  - **Fixed `pnpm publish` error:** Added `--no-git-checks` flag to the publish command in `.github/workflows/ci.yml` to resolve `ERR_PNPM_GIT_UNCLEAN` error during tag-triggered publish jobs.
  - **Integrated Codecov Test Analytics:** Updated `package.json` to generate JUnit XML test reports and added `codecov/test-results-action@v1` to `.github/workflows/ci.yml` to upload them.
  - Added `test-report.junit.xml` to `.gitignore`.
  - **Switched Coverage Tool:** Updated `.github/workflows/ci.yml` to replace Coveralls with Codecov based on user feedback. Added Codecov badge to `README.md`.
- **Version Bump & CI Saga (0.3.11 -> 0.3.16):**
  - **Initial Goal (0.3.11):** Fix CI publish error (`--no-git-checks`), integrate Test Analytics, add `.gitignore` entry.
  - **Problem 1:** Incorrect Git history manipulation led to pushing an incomplete `v0.3.11`.
  - **Problem 2:** Force push/re-push of corrected `v0.3.11` / `v0.3.12` / `v0.3.13` / `v0.3.14` tags didn't trigger workflow or failed on CI checks.
  - **Problem 3:** CI failed on `check-format` due to unformatted `ci.yml` / `CHANGELOG.md` (not caught by pre-commit hook initially).
  - **Problem 4:** Further Git history confusion led to incorrect version bumps (`0.3.13`, `0.3.14`, `0.3.15`) and tag creation issues due to unstaged changes and leftover local tags.
  - **Problem 5:** Docker build failed due to incorrect lockfile and missing `pnpm` install in `Dockerfile`.
  - **Problem 6:** Workflow parallelization changes were not committed before attempting a release.
  - **Problem 7:** `publish-npm` job failed due to missing dependencies for `prepublishOnly` script.
  - **Problem 8:** `pre-commit` hook was running `pnpm test` instead of `pnpm lint-staged`.
  - **Problem 9:** Docker build failed again due to `husky` command not found during `pnpm prune`.
  - **Problem 10:** Dockerfile was using hardcoded `node:20-alpine` instead of `node:lts-alpine`.
  - **Final Resolution:** Reset history multiple times, applied fixes sequentially (formatting `fe7eda1`, Dockerfile pnpm install `c202fd4`, parallelization `a569b62`, pre-commit/npm-publish fix `e96680c`, Dockerfile prune fix `02f3f91`, Dockerfile LTS `50f9bdd`), ensured clean working directory, ran `standard-version` successfully to create `v0.3.16` commit and tag, pushed `main` and tag `v0.3.16`.
- **Fixed `package.json` Paths:** Corrected `bin`, `files`, and `start` script paths from `build/` to `dist/` to align with `tsconfig.json` output directory and resolve executable error.
- **Committed & Pushed Fix:** Committed (`ab1100d`) and pushed the `package.json` path fix to `main`.
- **Version Bump & Push:** Bumped version to `0.3.17` using `standard-version` (commit `bb9d2e5`) and pushed the commit and tag `v0.3.17` to `main`.

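
For illustration, the multi-source request shape noted in the change log above might look like the following TypeScript sketch. Only `sources` and the per-source `pages` field come from these notes; the `path` and `url` field names, the `number[]` page format, and the `validateReadPdfArgs` helper are assumptions made for this example. The project itself validates input with Zod schemas rather than hand-written checks.

```typescript
// Hypothetical request shape: only `sources` and per-source `pages` come from
// the notes above; every other name here is an assumption for illustration.
interface PdfSource {
  path?: string; // assumed: relative path inside the project root
  url?: string; // assumed: remote PDF location
  pages?: number[]; // assumed format: 1-based page numbers
}

interface ReadPdfArgs {
  sources: PdfSource[];
}

// Returns a list of human-readable problems; an empty list means the
// arguments look valid under the assumptions above.
function validateReadPdfArgs(args: ReadPdfArgs): string[] {
  const problems: string[] = [];
  if (args.sources.length === 0) {
    problems.push("sources must contain at least one entry");
  }
  args.sources.forEach((source, index) => {
    const hasPath = typeof source.path === "string" && source.path.length > 0;
    const hasUrl = typeof source.url === "string" && source.url.length > 0;
    if (hasPath === hasUrl) {
      problems.push(`sources[${index}] needs exactly one of path or url`);
    }
    for (const page of source.pages ?? []) {
      if (!Number.isInteger(page) || page < 1) {
        problems.push(`sources[${index}] has invalid page number ${page}`);
      }
    }
  });
  return problems;
}

// Each source carries its own optional page selection.
const exampleArgs: ReadPdfArgs = {
  sources: [
    { path: "docs/report.pdf", pages: [1, 3] },
    { url: "https://example.com/paper.pdf" },
  ],
};
console.log(validateReadPdfArgs(exampleArgs));
```

Keeping `pages` on each source, rather than at the top level, lets a single call request different page selections from different documents.
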

## 3. Next Steps

- **Build Completed:** Project successfully built (`pnpm run build`).
- **GitHub Actions Status:**
  - Pushed commit `c150022` (CI run `14298157760` **passed** format/lint/test checks, but **failed** at Codecov upload due to missing `CODECOV_TOKEN`).
  - Pushed tag `v0.3.10` (Triggered publish/release workflow - status needed verification).
  - **Pushed tag `v0.3.16`**. Publish/release workflow triggered. Status needs verification.
- **Runtime Testing (Blocked):** Requires user interaction with `@modelcontextprotocol/inspector` or a live agent. Skipping for now.
- **Documentation Finalization (Mostly Complete):**
  - API docs generated.
  - Main pages reviewed/updated.
  - Codecov badge added (requires manual token update in `README.md`).
  - **Remaining:** Add complex features (PWA, share buttons, roadmap page) if requested.
- **Release Preparation:**
  - `CHANGELOG.md` updated for `0.3.10`.
- **Project is ready for final review. Requires Codecov token configuration and verification of the `v0.3.16` publish/release workflow.**

## 4. Active Decisions & Considerations

- **Switched to pnpm:** Changed package manager from npm to pnpm.
- **Using `pdfjs-dist` as the core PDF library.**
- Adopted the handler definition pattern from `filesystem-mcp`.
- Consolidated tools into a single `read_pdf` handler.
- Aligned project configuration with Guidelines.
- **Accepted ~95% test coverage**.
- **No Sponsorship:** Project will not include sponsorship links or files.
- **Using TypeDoc CLI for API Doc Generation:** Bypassed script initialization issues.
- **Switched to Codecov:** Replaced Coveralls with Codecov for coverage reporting. Test Analytics integration added.
- **Codecov Token Required:** CI is currently blocked on Codecov upload (coverage and test results) due to missing `CODECOV_TOKEN` secret in GitHub repository settings. This needs to be added by the user.
- **Version bumped to `0.3.17`**.
- **Publish Workflow:** Parallelized. Modified to bypass Git checks during `pnpm publish`. Docker build fixed (pnpm install, prune ignore scripts, LTS node). Dependencies installed before publish. Verification pending on the `v0.3.17` workflow run.
- **CI Workflow:** Added Codecov Test Analytics upload step. Formatting fixed. Parallelized publish steps.
- **Pre-commit Hook:** Fixed to run `lint-staged`.

pdf-reader-mcp/memory-bank/productContext.md
Normal file
@@ -0,0 +1,40 @@
# Product Context: PDF Reader MCP Server

## 1. Problem Solved

AI agents often need to access information contained within PDF documents as
part of user tasks (e.g., summarizing reports, extracting data from invoices,
referencing documentation). Directly providing PDF file content to the agent is
inefficient (large token count) and often impossible due to binary format.
Executing external CLI tools for each PDF interaction can be slow and insecure,
and it often lacks structured output.

This MCP server provides a secure, efficient, and structured way for agents to
interact with PDF files within the user's project context.

## 2. How It Should Work

- The server runs as a background process, managed by the agent's host
  environment.
- The host environment ensures the server is launched with its working directory
  set to the user's current project root.
- The agent uses MCP calls to invoke specific PDF reading tools provided by the
  server.
- The agent provides the relative path to the target PDF file within the project
  root.
- The server uses the `pdfjs-dist` library to process the PDF.
- The server returns structured data (text, metadata, page count) back to the
  agent via MCP.
- All file access is strictly limited to the project root directory.

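
As a rough illustration of the structured data mentioned in the list above, the sketch below assembles text, metadata, and page count into one result object. The `PdfReadResult` field names and the `buildResult` helper are hypothetical; the notes only state which kinds of data are returned, not their exact shape. The page texts are passed in as plain strings so the example runs without a PDF library.

```typescript
// Hypothetical result shape; the notes only say that text, metadata, and page
// count are returned, so every field name below is an assumption.
interface PdfReadResult {
  source: string;
  pageCount: number;
  metadata: Record<string, string>;
  text: string;
}

// Joins already-extracted page texts (index 0 holds page 1) into one result,
// optionally restricted to the requested 1-based page numbers.
function buildResult(
  source: string,
  pageTexts: string[],
  metadata: Record<string, string>,
  pages?: number[],
): PdfReadResult {
  const selected = pages ?? pageTexts.map((_, i) => i + 1);
  const text = selected
    .filter((page) => page >= 1 && page <= pageTexts.length) // drop out-of-range pages
    .map((page) => pageTexts[page - 1])
    .join("\n\n");
  return { source, pageCount: pageTexts.length, metadata, text };
}

const result = buildResult(
  "docs/report.pdf",
  ["Intro", "Methods", "Results"],
  { Title: "Report" },
  [1, 3],
);
console.log(JSON.stringify(result, null, 2));
```

Returning only the selected pages as text, alongside a small metadata object, is what keeps token usage low compared with passing raw file content.
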

## 3. User Experience Goals

- **Seamless Integration:** The agent should be able to use the PDF tools
  naturally as part of its workflow without complex setup for the end-user.
- **Reliability:** Tools should reliably parse standard PDF files and return
  accurate information or clear error messages.
- **Security:** Users should trust that the server only accesses files within
  the intended project scope.
- **Efficiency:** Reading PDF data should be reasonably fast and avoid excessive
  token usage compared to sending raw file content (which isn't feasible
  anyway).

pdf-reader-mcp/memory-bank/progress.md
Normal file
@@ -0,0 +1,61 @@
<!-- Version: 1.37 | Last Updated: 2025-04-07 | Updated By: Sylph -->

# Progress: PDF Reader MCP Server (Guidelines Applied)

## 1. What Works

- **Project Setup:** Cloned from `filesystem-mcp`, dependencies installed (using pnpm).
- **Core Tool Handler (Consolidated, using `pdfjs-dist`, multi-source, per-source pages):**
  - `read_pdf`: Implemented and integrated.
- **MCP Server Structure:** Basic server setup working.
- **Changelog:** `CHANGELOG.md` created and updated for `1.0.0`.
- **License:** `LICENSE` file created (MIT).
- **GitHub Actions:** `.github/workflows/ci.yml` refactored for CI/CD according to guidelines. Fixed `pnpm publish` step (`--no-git-checks`), added Test Analytics upload, fixed formatting, fixed Docker build step (`Dockerfile` - pnpm install, prune, LTS node), parallelized publish jobs, fixed pre-commit hook. Git history corrected multiple times.
- **Testing Framework (Vitest):**
  - Integrated, configured. All tests passing. Coverage at ~95% (accepted).
- **Linter (ESLint):**
  - Integrated, configured. Codebase passes all checks.
- **Formatter (Prettier):**
  - Integrated, configured. Codebase formatted.
- **TypeScript Configuration:** `tsconfig.json` updated with strictest settings.
- **Package Configuration:** `package.json` updated.
- **Git Ignore:** `.gitignore` updated (added JUnit report).
- **Sponsorship:** Removed.
- **Project Identity:** Updated scope to `@sylphlab`.
- **Git Hooks:** Configured using Husky, lint-staged, and commitlint.
- **Dependency Updates:** Configured using Dependabot.
- **Compilation:** Completed successfully (`pnpm run build`).
- **Benchmarking:**
  - Created and ran initial benchmarks.
- **Documentation (Mostly Complete):**
  - VitePress site setup.
  - `README.md`, Guide, Design, Performance, Comparison sections reviewed/updated.
  - `CONTRIBUTING.md` created.
  - Performance section updated with benchmark results.
  - **API documentation generated successfully using TypeDoc CLI.**
  - VitePress config updated with minor additions.
- **Version Control:** All recent changes committed (incl. formatting `fe7eda1`, Dockerfile pnpm install `c202fd4`, parallelization `a569b62`, pre-commit/npm-publish fix `e96680c`, Dockerfile prune fix `02f3f91`, Dockerfile LTS `50f9bdd`, `package.json` path fix `ab1100d`, release commit for `v0.3.17` `bb9d2e5`). Tag `v0.3.17` created and pushed.
- **Package Executable Path:** Fixed incorrect paths (`build/` -> `dist/`) in `package.json` (`bin`, `files`, `start` script).

## 2. What's Left to Build/Verify

- **Runtime Testing (Blocked):** Requires user interaction.
- **Publishing Workflow Test:** Triggered by pushing tag `v0.3.17`. Needs verification.
- **Documentation (Optional Enhancements):**
  - Add complex features (PWA, share buttons, roadmap page) if requested.
- **Release Preparation:**
  - Final review before tagging `1.0.0`.
  - Consider using `standard-version` or similar for final release tagging/publishing.

## 3. Current Status

Project configuration and core functionality are aligned with guidelines. Documentation is largely complete, including generated API docs. Codebase passes all checks and tests (~95% coverage). **Version bumped to `0.3.17` and tag pushed. Project is ready for final review and workflow verification.**

## 4. Known Issues/Risks

- **100% Coverage Goal:** Currently at **~95%**. This level is deemed acceptable.
- **`pdfjs-dist` Complexity:** The API is complex, text extraction accuracy depends on the PDF, and there may be Node.js compatibility nuances.
- **Error Handling:** Basic handling implemented; specific PDF parsing errors might need refinement.
- **Performance:** Initial benchmarks were run on a single sample file. Performance on diverse PDFs needs further investigation if issues arise.
- **Per-Source Pages:** Logic handles per-source `pages`; testing combinations is important (covered partially by benchmarks).
- **TypeDoc Script Issue:** The Node.js script for TypeDoc failed, but the CLI workaround is effective.

pdf-reader-mcp/memory-bank/projectbrief.md
Normal file
@@ -0,0 +1,35 @@
# Project Brief: PDF Reader MCP Server

## 1. Project Goal

To create a Model Context Protocol (MCP) server that allows AI agents (like
Cline) to securely read and extract information (text, metadata, page count)
from PDF files located within a specified project directory.

## 2. Core Requirements

- Implement an MCP server using Node.js and TypeScript.
- Base the server on the existing `@shtse8/filesystem-mcp` structure.
- Provide MCP tools for:
  - Reading all text content from a PDF.
  - Reading text content from specific pages of a PDF.
  - Reading metadata from a PDF.
  - Getting the total page count of a PDF.
- Ensure all operations are confined to the project root directory determined at
  server launch.
- Use relative paths for all file operations.
- Utilize a PDF parsing library for PDF processing (originally `pdf-parse`;
  later replaced by `pdfjs-dist`).
- Maintain clear documentation (README, Memory Bank).
- Package the server for distribution via npm and Docker Hub.

## 3. Scope

- **In Scope:** Implementing the core PDF reading tools, packaging, basic
  documentation.
- **Out of Scope (Initially):** Advanced PDF features (image extraction,
  annotation reading, form filling), complex error recovery beyond basic file
  access/parsing errors, UI for the server.

## 4. Target User

AI agents interacting with user projects that contain PDF documents.

pdf-reader-mcp/memory-bank/systemPatterns.md
Normal file
@@ -0,0 +1,94 @@
# System Patterns: PDF Reader MCP Server

## 1. Architecture Overview

The PDF Reader MCP server is a standalone Node.js application based on the
original Filesystem MCP. It's designed to run as a child process, communicating
with its parent (the AI agent host) via standard input/output (stdio) using the
Model Context Protocol (MCP) to provide PDF reading capabilities.

```mermaid
graph LR
    A[Agent Host Environment] -- MCP over Stdio --> B(PDF Reader MCP Server);
    B -- Node.js fs/path/pdfjs-dist --> C["User Filesystem (Project Root)"];
    C -- Results/Data --> B;
    B -- MCP over Stdio --> A;
```

## 2. Key Technical Decisions & Patterns

- **MCP SDK Usage:** Leverages the `@modelcontextprotocol/sdk` for handling MCP
  communication (request parsing, response formatting, error handling). This
  standardizes interaction and reduces boilerplate code.
- **Stdio Transport:** Uses `StdioServerTransport` from the SDK for
  communication, suitable for running as a managed child process.
- **Asynchronous Operations:** All filesystem interactions and request handling
  are implemented using `async/await` and Node.js's promise-based `fs` module
  (`fs.promises`) for non-blocking I/O.
- **Strict Path Resolution:** A dedicated `resolvePath` function is used for
  _every_ path received from the agent.
  - It normalizes the path.
  - It resolves the path relative to the server process's current working
    directory (`process.cwd()`), which is treated as the `PROJECT_ROOT`.
    **Crucially, this requires the process launching the server (e.g., the agent
    host) to set the correct `cwd` for the target project.**
  - It explicitly checks if the resolved absolute path still starts with the
    `PROJECT_ROOT` absolute path to prevent path traversal vulnerabilities
    (e.g., `../../sensitive-file`).
  - It rejects absolute paths provided by the agent.
- **Zod for Schemas & Validation:** Uses `zod` library to define input schemas
  for tools and perform robust validation within each handler. JSON schemas for
  MCP listing are generated from Zod schemas.
- **Tool Definition Aggregation:** Tool definitions (name, description, Zod
  schema, handler function) are defined in their respective handler files and
  aggregated in `src/handlers/index.ts` for registration in `src/index.ts`.
- **`edit_file` Logic (inherited from `filesystem-mcp`; not used by PDF tools):**
  - Processes multiple changes per file, applying them sequentially from
    bottom-to-top to minimize line number conflicts.
  - Handles insertion, text replacement, and deletion.
  - Implements basic indentation detection (`detect-indent`) and preservation
    for insertions/replacements.
  - Uses `diff` library to generate unified diff output.
- **Error Handling:**
  - Uses `try...catch` blocks within each tool handler.
  - Catches specific Node.js filesystem errors (like `ENOENT`, `EPERM`,
    `EACCES`) and maps them to appropriate MCP error codes (`InvalidRequest`).
  - Uses custom `McpError` objects for standardized error reporting back to the
    agent.
  - Logs unexpected errors to the server's console (`stderr`) for debugging.
- **Glob for Listing/Searching:** Uses the `glob` library for flexible and
  powerful file listing and searching based on glob patterns, including
  recursive operations and stat retrieval. Careful handling of `glob`'s
  different output types based on options (`string[]`, `Path[]`, `Path[]` with
  `stats`) is implemented.
- **TypeScript:** Provides static typing for better code maintainability, early
  error detection, and improved developer experience. Uses ES module syntax
  (`import`/`export`).
- **PDF Parsing:** Uses Mozilla's `pdfjs-dist` library to load PDF documents and
  extract text content, metadata, and page information. The `read_pdf` handler
  uses its API.

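
The path checks described under "Strict Path Resolution" can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual `resolvePath`: the error messages, the optional `root` parameter (added so the function is easy to exercise), and the separator-aware prefix comparison are choices made for this example. The separator check is a small hardening on top of the plain "starts with" test described above, so that a sibling directory such as `/project-evil` does not pass for `/project`.

```typescript
import * as path from "node:path";

// Sketch only: the real `resolvePath` lives in the project source and may differ.
// `PROJECT_ROOT` mirrors the notes above (the server's working directory).
const PROJECT_ROOT = process.cwd();

function resolvePath(userPath: string, root: string = PROJECT_ROOT): string {
  // Reject absolute paths outright, as described in the notes.
  if (path.isAbsolute(userPath)) {
    throw new Error(`Absolute paths are not allowed: ${userPath}`);
  }
  // Normalize, then resolve relative to the project root.
  const resolved = path.resolve(root, path.normalize(userPath));
  // Compare against root + separator so "/project-evil" does not match "/project".
  const rootWithSep = root.endsWith(path.sep) ? root : root + path.sep;
  if (resolved !== root && !resolved.startsWith(rootWithSep)) {
    throw new Error(`Path escapes the project root: ${userPath}`);
  }
  return resolved;
}

// Usage: a path inside the root resolves; a traversal attempt throws.
console.log(resolvePath("docs/report.pdf"));
try {
  resolvePath("../../sensitive-file");
} catch (error) {
  console.log((error as Error).message);
}
```

Because the root is taken from `process.cwd()`, the check is only as good as the working directory the host sets when it launches the server.
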

## 3. Component Relationships

- **`index.ts`:** Main entry point. Sets up the MCP server instance, defines
  tool schemas, registers request handlers, and starts the server connection.
- **`Server` (from SDK):** Core MCP server class handling protocol logic.
- **`StdioServerTransport` (from SDK):** Handles reading/writing MCP messages
  via stdio.
- **Tool Handler Function (`handleReadPdfFunc`):** Contains the logic for the
  consolidated `read_pdf` tool, including Zod argument validation, path
  resolution, PDF loading/parsing via `pdfjs-dist`, and result formatting based
  on input parameters.
- **`resolvePath` Helper:** Centralized security function for path validation.
- **`formatStats` Helper:** Utility to create a consistent stats object
  structure.
- **Node.js Modules (`fs`, `path`):** Used for actual filesystem operations and
  path manipulation.
- **`glob` Library:** Used for pattern-based file searching and listing.
- **`zod` Library:** Used for defining and validating tool input schemas.
- **`diff` Library:** (Inherited, but not used by PDF tools) Used by
  `edit_file`.
- **`detect-indent` Library:** (Inherited, but not used by PDF tools) Used by
  `edit_file`.
- **`pdfjs-dist` Library:** Used by the `read_pdf` handler to load and process
  PDF documents.

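
To make the relationship between handler files, the aggregation module, and the entry point more concrete, here is a dependency-free sketch of the "tool definition aggregation" pattern. All type and function names (`ToolDefinition`, `RegisteredTool`, `defineTool`, `callTool`) are hypothetical, a plain `validate` function stands in for the Zod schema, and the dispatch function stands in for what the SDK's `Server` does when a tool call arrives.

```typescript
// Hypothetical shapes: the notes say each definition carries a name, a
// description, a Zod schema, and a handler. A plain `validate` function stands
// in for the Zod schema so this sketch has no dependencies.
interface ToolDefinition<Args> {
  name: string;
  description: string;
  validate: (input: unknown) => Args;
  handler: (args: Args) => Promise<string>;
}

// Type-erased form that the entry point can store in one list.
interface RegisteredTool {
  name: string;
  description: string;
  run: (input: unknown) => Promise<string>;
}

function defineTool<Args>(definition: ToolDefinition<Args>): RegisteredTool {
  return {
    name: definition.name,
    description: definition.description,
    // Validation always runs before the handler sees the arguments.
    run: async (input) => definition.handler(definition.validate(input)),
  };
}

interface ReadPdfArgs {
  sources: { path: string }[]; // assumed minimal shape for the example
}

// Would live in a handler file such as src/handlers/readPdf.ts.
const readPdfTool = defineTool<ReadPdfArgs>({
  name: "read_pdf",
  description: "Read text, metadata, and page count from PDF sources.",
  validate: (input) => {
    const candidate = input as Partial<ReadPdfArgs> | null;
    if (!candidate || !Array.isArray(candidate.sources)) {
      throw new Error("sources must be an array");
    }
    return { sources: candidate.sources };
  },
  handler: async (args) => `would read ${args.sources.length} source(s)`,
});

// Would live in src/handlers/index.ts: one list that the entry point registers.
const allToolDefinitions: RegisteredTool[] = [readPdfTool];

// Stand-in for the dispatch the entry point performs on a tool call.
async function callTool(name: string, input: unknown): Promise<string> {
  const tool = allToolDefinitions.find((definition) => definition.name === name);
  if (!tool) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return tool.run(input);
}

callTool("read_pdf", { sources: [{ path: "docs/report.pdf" }] }).then(console.log);
```

With this layout, adding a tool means adding one handler file and one entry in the aggregated list; the entry point itself does not change.
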
pdf-reader-mcp/memory-bank/techContext.md
Normal file
@@ -0,0 +1,67 @@
<!-- Version: 1.10 | Last Updated: 2025-04-06 | Updated By: Sylph -->

# Tech Context: PDF Reader MCP Server

## 1. Core Technologies

- **Runtime:** Node.js (>= 18.0.0 recommended)
- **Language:** TypeScript (Compiled to JavaScript for execution)
- **Package Manager:** pnpm (Switched from npm to align with guidelines)
- **Linter:** ESLint (with TypeScript support, including **strict type-aware rules**)
- **Formatter:** Prettier
- **Testing:** Vitest (with **~95% coverage achieved**)
- **Git Hooks:** Husky, lint-staged, commitlint
- **Dependency Update:** Dependabot

## 2. Key Libraries/Dependencies

- **`@modelcontextprotocol/sdk`:** The official SDK for implementing MCP servers and clients.
- **`glob`:** Library for matching files using glob patterns.
- **`pdfjs-dist`:** Mozilla's PDF rendering and parsing library.
- **`zod`:** Library for schema declaration and validation.
- **`zod-to-json-schema`:** Utility to convert Zod schemas to JSON schemas.

- **Dev Dependencies (Key):**
  - **`typescript`:** TypeScript compiler (`tsc`).
  - **`@types/node`:** TypeScript type definitions for Node.js.
  - **`@types/glob`:** TypeScript type definitions for `glob`.
  - **`vitest`:** Test runner framework.
  - **`@vitest/coverage-v8`:** Coverage provider for Vitest.
  - **`eslint`:** Core ESLint library.
  - **`typescript-eslint`:** Tools for ESLint + TypeScript integration.
  - **`prettier`:** Code formatter.
  - **`eslint-config-prettier`:** Turns off ESLint rules that conflict with Prettier.
  - **`husky`:** Git hooks manager.
  - **`lint-staged`:** Run linters on staged files.
  - **`@commitlint/cli` & `@commitlint/config-conventional`:** Commit message linting.
  - **`standard-version`:** Release automation tool.
  - **`typedoc` & `typedoc-plugin-markdown`:** API documentation generation.
  - **`vitepress` & `vue`:** Documentation website framework.

## 3. Development Setup

- **Source Code:** Located in the `src` directory.
- **Testing Code:** Located in the `test` directory.
- **Main File:** `src/index.ts`.
- **Configuration:**
  - `tsconfig.json`: TypeScript compiler options (**strictest settings enabled**, includes recommended options like `declaration` and `sourceMap`).
  - `vitest.config.ts`: Vitest test runner configuration (**100% coverage thresholds set**, ~95% achieved).
  - `eslint.config.js`: ESLint flat configuration (integrates Prettier, enables **strict type-aware linting** and **additional guideline rules**).
  - `.prettierrc.cjs`: Prettier formatting rules.
  - `.gitignore`: Specifies intentionally untracked files (`node_modules/`, `build/`, `coverage/`, etc.).
  - `.github/workflows/ci.yml`: GitHub Actions workflow (validation, publishing, release, **fixed Action versions**, **Codecov**, which replaced Coveralls).
  - `.github/dependabot.yml`: Automated dependency update configuration.
  - `package.json`: Project metadata, dependencies, and scripts (includes `start`, `typecheck`, `prepare`, `benchmark`, `release`, `clean`, `docs:api`, `prepublishOnly`, etc.).
  - `commitlint.config.cjs`: Commitlint configuration.
  - `.husky/`: Directory containing Git hook scripts.
- **Build Output:** Compiled JavaScript in the `dist` directory (the `package.json` paths were corrected from `build/` to `dist/` to match the `tsconfig.json` output directory).
- **Execution:** Run via `node dist/index.js` or `pnpm start`.

## 4. Technical Constraints & Considerations

- **Node.js Environment:** Relies on Node.js runtime (>=18.0.0) and built-in modules.
- **Permissions:** Server process permissions affect filesystem operations.
- **Cross-Platform Compatibility:** Filesystem behaviors might differ. Code uses Node.js `path` module to mitigate.
- **Error Handling:** Relies on Node.js error codes and McpError.
- **Security Model:** Relies on `resolvePath` for path validation within `PROJECT_ROOT`.
- **Project Root Determination:** `PROJECT_ROOT` is the server's `process.cwd()`. The launching process must set this correctly.

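
The error-handling constraint above (Node.js error codes mapped onto `McpError`) might be sketched like this. To keep the example free of dependencies, `SketchMcpError` and `SketchErrorCode` are local stand-ins rather than the SDK's own types, and `toMcpError` is a hypothetical helper name. The numeric values are the standard JSON-RPC codes for "Invalid Request" and "Internal error"; the choice of `InternalError` for unexpected failures is an assumption.

```typescript
// Local stand-ins so the sketch runs without the MCP SDK installed.
// The numeric values are the standard JSON-RPC error codes.
const SketchErrorCode = {
  InvalidRequest: -32600,
  InternalError: -32603,
} as const;
type SketchErrorCodeValue = (typeof SketchErrorCode)[keyof typeof SketchErrorCode];

class SketchMcpError extends Error {
  readonly code: SketchErrorCodeValue;

  constructor(code: SketchErrorCodeValue, message: string) {
    super(message);
    this.name = "SketchMcpError";
    this.code = code;
  }
}

// Filesystem errors that are caused by the request itself (bad path, no access).
const CLIENT_FACING_FS_CODES = new Set(["ENOENT", "EPERM", "EACCES"]);

function toMcpError(error: unknown, relativePath: string): SketchMcpError {
  if (error instanceof SketchMcpError) {
    return error; // already in the standardized form
  }
  const fsCode =
    typeof error === "object" && error !== null && "code" in error
      ? String((error as { code: unknown }).code)
      : undefined;
  if (fsCode !== undefined && CLIENT_FACING_FS_CODES.has(fsCode)) {
    return new SketchMcpError(
      SketchErrorCode.InvalidRequest,
      `Cannot read '${relativePath}' (${fsCode})`,
    );
  }
  // Unexpected errors are logged to stderr and reported generically.
  console.error("Unexpected error:", error);
  return new SketchMcpError(SketchErrorCode.InternalError, "Unexpected error while reading PDF");
}

// Usage: simulate the error Node.js raises for a missing file.
const missingFile = Object.assign(new Error("no such file"), { code: "ENOENT" });
console.log(toMcpError(missingFile, "docs/missing.pdf").message);
```

Reporting only the relative path and the error code keeps absolute filesystem locations out of the message returned to the agent.
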