gutenberg mcp server

@cyanheads · Developer Tools · 2 · TypeScript · Apache-2.0 · Updated 6 days ago

MCP server for Project Gutenberg — 75,000+ public-domain ebooks with full plain-text retrieval.

Server endpoint · Streamable HTTP · No auth · Probed

This is the third-party server itself — Odel doesn't run it. Hitting this URL directly talks straight to the upstream server with no auth or proxying. Connect through Odel to front it with managed auth.

@cyanheads/gutenberg-mcp-server

Search, browse, and read 75,000+ public-domain books from Project Gutenberg with full plain-text retrieval and offset/limit chunking via MCP. STDIO or Streamable HTTP.

Public Hosted Server: https://gutenberg.caseyjhand.com/mcp


Tools

Four tools for searching and reading Project Gutenberg's public-domain library:

| Tool | Description |
| --- | --- |
| gutenberg_search_books | Search the Gutenberg catalog by title, author, topic, language, or author lifespan; returns popularity-ordered results with IDs ready for follow-up calls |
| gutenberg_get_book | Fetch complete metadata for a book by ID: full formats map, translators, editors, subjects, bookshelves, copyright status, and the has_plain_text flag |
| gutenberg_get_text | Retrieve the plain-text content of a book, stripped of license boilerplate, with offset/limit chunking for context-budget management |
| gutenberg_browse_popular | Browse the most-downloaded books, optionally filtered by language or topic; useful as a discovery entry point |

gutenberg_search_books

Search the Project Gutenberg catalog of 75,000+ public-domain books.

  • Keyword search against titles and author names (space-separated words, case-insensitive)
  • Topic filter matches subject headings and bookshelf categories
  • Language filter by ISO 639-1 two-character codes (e.g., ["en"], ["fr", "de"])
  • Author lifespan range filter via author_year_start / author_year_end
  • Sort by popularity (download count), or by Gutenberg ID ascending/descending
  • Batch lookup by known ID list via ids parameter
  • Paginated — up to 32 books per page; use totalCount to determine total pages
  • Each result includes has_plain_text to indicate whether gutenberg_get_text will work
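
As a small illustration of the pagination bullet, the page count can be derived from totalCount and the 32-per-page cap. This is a minimal sketch; the helper name totalPages is hypothetical and not part of the server.

```typescript
// Page size stated in the tool description: up to 32 books per page.
const PAGE_SIZE = 32;

/** Number of pages needed to cover `totalCount` results (hypothetical helper). */
function totalPages(totalCount: number, pageSize: number = PAGE_SIZE): number {
  if (!Number.isInteger(totalCount) || totalCount < 0) {
    throw new RangeError("totalCount must be a non-negative integer");
  }
  return Math.ceil(totalCount / pageSize);
}

console.log(totalPages(1000)); // 32
```

A client could stop paging once the requested page number exceeds this value.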

gutenberg_get_book

Fetch complete metadata for a single Project Gutenberg book.

  • Returns the full formats map (MIME type → download URL) including plain text, HTML, EPUB, and cover image
  • Includes translators and editors alongside authors, each with birth/death years
  • has_plain_text flag confirms whether a UTF-8 or ASCII plain-text format is available
  • media_type distinguishes readable text books from audio recordings
  • Use this before gutenberg_get_text to confirm text availability and inspect the formats map
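
The formats map can also be inspected on the client side. The sketch below picks a text source in the UTF-8, then ASCII, then HTML order this document describes for gutenberg_get_text. The function names and the MIME-matching rules are assumptions for illustration, not the server's actual code.

```typescript
// MIME type -> download URL, as returned in the formats map.
type FormatsMap = Record<string, string>;

type SourceFormat =
  | "text/plain; charset=utf-8"
  | "text/plain; charset=us-ascii"
  | "text/html";

/** Pick the best text source, or null if the book has none (hypothetical helper). */
function pickTextSource(
  formats: FormatsMap,
): { format: SourceFormat; url: string } | null {
  // Compare case-insensitively and ignore whitespace inside the MIME string.
  const entries = Object.entries(formats).map(
    ([mime, url]) => [mime.toLowerCase().replace(/\s+/g, ""), url] as const,
  );
  const matchers: Array<[SourceFormat, (mime: string) => boolean]> = [
    ["text/plain; charset=utf-8", (m) => m === "text/plain;charset=utf-8"],
    ["text/plain; charset=us-ascii", (m) => m === "text/plain;charset=us-ascii"],
    ["text/html", (m) => m.startsWith("text/html")],
  ];
  for (const [format, matches] of matchers) {
    const hit = entries.find(([mime]) => matches(mime));
    if (hit) return { format, url: hit[1] };
  }
  return null;
}

/** True only when a UTF-8 or ASCII plain-text format is listed. */
function hasPlainText(formats: FormatsMap): boolean {
  const picked = pickTextSource(formats);
  return picked !== null && picked.format !== "text/html";
}
```

HTML counts as a fallback source here but not as plain text, which matches the has_plain_text wording above.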

gutenberg_get_text

Retrieve the plain-text content of a Project Gutenberg book, stripped of license boilerplate.

  • Strips the standard Gutenberg license header and footer — response contains only the literary work
  • Offset/limit chunking for long works: novels routinely run 500 KB–2 MB; read in manageable chunks without loading the whole file
  • Response includes totalChars, offset, length, remainingChars, and hasMore for precise pagination
  • Paragraph-boundary trimming: actual returned length may be slightly less than limit — use length (not limit) to compute the next offset
  • Prefers UTF-8 plain text; falls back to ASCII plain text; converts HTML as a last resort
  • Refuses audio books (media_type "Sound") with a clear recovery hint
  • provenance field carries the Gutenberg ID, title, and license URL for attribution
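
The pagination contract above could be consumed with a loop like the following. This is a sketch under assumptions: the text field name and the fetchChunk callback are hypothetical stand-ins for a real gutenberg_get_text call, and only offset, length, totalChars, and remainingChars come from the description.

```typescript
interface TextChunk {
  text: string; // assumed field name for the chunk body
  offset: number;
  length: number; // characters actually returned; may be less than limit
  totalChars: number;
  remainingChars: number;
}

// Hypothetical callback that performs one gutenberg_get_text call.
type FetchChunk = (offset: number, limit: number) => Promise<TextChunk>;

/** Next offset to request, or null once nothing remains. */
function nextOffset(chunk: TextChunk): number | null {
  if (chunk.remainingChars <= 0) return null;
  return chunk.offset + chunk.length; // advance by length, not by limit
}

/** Read a whole book by requesting chunks sequentially. */
async function readWholeText(
  fetchChunk: FetchChunk,
  limit: number = 20000,
): Promise<string> {
  const parts: string[] = [];
  let offset: number | null = 0;
  while (offset !== null) {
    const chunk = await fetchChunk(offset, limit);
    if (chunk.length <= 0 && chunk.remainingChars > 0) {
      throw new Error("chunk made no progress"); // guard against an endless loop
    }
    parts.push(chunk.text);
    offset = nextOffset(chunk);
  }
  return parts.join("");
}
```

Advancing by limit instead of length would skip characters whenever the server trims a chunk back to a paragraph boundary.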

gutenberg_browse_popular

Browse the most-downloaded Project Gutenberg books.

  • Returns up to 32 titles ordered by download count (most popular first)
  • Optionally filter by language (ISO 639-1 codes) and/or topic keyword
  • Useful as a discovery entry point: "what are the most popular classics in French?"
  • totalInCatalog provides full context — "top 20 of 60,000"

Features

Built on @cyanheads/mcp-ts-core:

  • Declarative tool definitions — single file per tool, framework handles registration and validation
  • Unified error handling — handlers throw, framework catches, classifies, and formats with recovery hints
  • Pluggable auth: none, jwt, oauth
  • Swappable storage backends: in-memory, filesystem, Supabase, Cloudflare KV/R2/D1
  • Structured logging with optional OpenTelemetry tracing
  • STDIO and Streamable HTTP transports

Project Gutenberg integration:

  • Catalog search and metadata via Gutendex — an unofficial but stable JSON API over the Gutenberg dataset
  • Full plain-text retrieval directly from Project Gutenberg file servers with transparent UTF-8/ASCII/HTML fallback chain
  • In-session text caching: book text is fetched once per session and served from cache for subsequent chunk reads
  • No API key required — Project Gutenberg data is freely available; no registration needed
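
The in-session caching idea can be pictured as a map from book ID to a pending download. The sketch below is one possible shape, not the server's implementation; createSessionTextCache and the load callback are hypothetical names.

```typescript
// Hypothetical loader standing in for the download-and-strip step.
type LoadText = (bookId: number) => Promise<string>;

/** Returns a chunk reader that loads each book's text at most once. */
function createSessionTextCache(load: LoadText) {
  const texts = new Map<number, Promise<string>>();

  return async function getChunk(
    bookId: number,
    offset: number,
    limit: number,
  ): Promise<string> {
    let pending = texts.get(bookId);
    if (pending === undefined) {
      pending = load(bookId);
      texts.set(bookId, pending);
      // Forget failed loads so a later call can retry.
      pending.catch(() => texts.delete(bookId));
    }
    const text = await pending;
    return text.slice(offset, offset + limit);
  };
}
```

Caching the promise rather than the resolved string means two chunk reads that arrive at the same time share a single download.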

Agent-friendly output:

  • has_plain_text flag on every search/browse result so agents can pre-filter before attempting text retrieval
  • Precise chunking contract: offset, length, totalChars, remainingChars, hasMore on every gutenberg_get_text response for reliable sequential reads
  • provenance field on every text response for attribution
  • Discriminated sourceFormat field (text/plain; charset=utf-8, text/plain; charset=us-ascii, text/html) so agents know the fidelity of the text

Getting started

Public Hosted Instance

A public instance is available at https://gutenberg.caseyjhand.com/mcp — no installation required. Point any MCP client at it via Streamable HTTP:

{
  "mcpServers": {
    "gutenberg-mcp-server": {
      "type": "streamable-http",
      "url": "https://gutenberg.caseyjhand.com/mcp"
    }
  }
}
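
Once connected, a client invokes tools with the MCP tools/call method. The sketch below only builds the JSON-RPC request body; the id argument name for gutenberg_get_book is an assumption, and a real MCP client library would also handle the initialize handshake and HTTP headers.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

/** Build the JSON-RPC body for an MCP tools/call request. */
function buildToolCall(
  requestId: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: requestId,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Argument name "id" is assumed; check the tool's input schema before relying on it.
const request = buildToolCall(1, "gutenberg_get_book", { id: 1342 });
console.log(JSON.stringify(request, null, 2));
```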

Self-Hosted / Local

No API key required. Add the following to your MCP client configuration file:

{
  "mcpServers": {
    "gutenberg-mcp-server": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/gutenberg-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with npx (no Bun required):

{
  "mcpServers": {
    "gutenberg-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/gutenberg-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with Docker:

{
  "mcpServers": {
    "gutenberg-mcp-server": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "MCP_TRANSPORT_TYPE=stdio",
        "ghcr.io/cyanheads/gutenberg-mcp-server:latest"
      ]
    }
  }
}

For Streamable HTTP, set the transport and start the server:

MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp

Prerequisites

  • Bun v1.3.11 or higher (or Node.js v24+).
  • No API key required — Project Gutenberg data is freely available.

Installation

  1. Clone the repository:

    git clone https://github.com/cyanheads/gutenberg-mcp-server.git

  2. Navigate into the directory:

    cd gutenberg-mcp-server

  3. Install dependencies:

    bun install

  4. Configure environment:

    cp .env.example .env
    # edit .env if you need to override any defaults

Configuration

| Variable | Description | Default |
| --- | --- | --- |
| GUTENDEX_BASE_URL | Base URL for the Gutendex catalog API. Override for self-hosted instances. | https://gutendex.com/books/ |
| GUTENBERG_TEXT_BASE_URL | Base URL for Project Gutenberg file servers. Override for mirrors. | https://www.gutenberg.org |
| MCP_TRANSPORT_TYPE | Transport: stdio or http. | stdio |
| MCP_HTTP_PORT | Port for HTTP server. | 3010 |
| MCP_AUTH_MODE | Auth mode: none, jwt, or oauth. | none |
| MCP_LOG_LEVEL | Log level (RFC 5424). | info |
| LOGS_DIR | Directory for log files (Node.js only). | <project-root>/logs |
| STORAGE_PROVIDER_TYPE | Storage backend. | in-memory |
| OTEL_ENABLED | Enable OpenTelemetry instrumentation. | false |

See .env.example for the full list of optional overrides.
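
To make the defaults concrete, here is a minimal sketch of how these variables might be read and validated. It covers only a subset of the table and is not the parsing code in src/config/server-config.ts; loadConfig, oneOf, and the ServerConfig shape are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  gutendexBaseUrl: string;
  gutenbergTextBaseUrl: string;
  transport: "stdio" | "http";
  httpPort: number;
  authMode: "none" | "jwt" | "oauth";
  logLevel: string;
  otelEnabled: boolean;
}

/** Return `value` if it is one of `allowed`, otherwise throw. */
function oneOf<T extends string>(name: string, value: string, allowed: readonly T[]): T {
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
  }
  return match;
}

/** Apply the documented defaults to whatever the environment provides. */
function loadConfig(env: Env): ServerConfig {
  const httpPort = Number(env["MCP_HTTP_PORT"] ?? "3010");
  if (!Number.isInteger(httpPort) || httpPort < 1 || httpPort > 65535) {
    throw new Error("MCP_HTTP_PORT must be an integer between 1 and 65535");
  }
  return {
    gutendexBaseUrl: env["GUTENDEX_BASE_URL"] ?? "https://gutendex.com/books/",
    gutenbergTextBaseUrl: env["GUTENBERG_TEXT_BASE_URL"] ?? "https://www.gutenberg.org",
    transport: oneOf("MCP_TRANSPORT_TYPE", env["MCP_TRANSPORT_TYPE"] ?? "stdio", ["stdio", "http"] as const),
    httpPort,
    authMode: oneOf("MCP_AUTH_MODE", env["MCP_AUTH_MODE"] ?? "none", ["none", "jwt", "oauth"] as const),
    logLevel: env["MCP_LOG_LEVEL"] ?? "info",
    otelEnabled: (env["OTEL_ENABLED"] ?? "false") === "true",
  };
}
```

In a Node.js or Bun process this would typically be called as loadConfig(process.env).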


Running the server

Local development

  • Build and run:

    # One-time build
    bun run rebuild
    
    # Run the built server
    bun run start:stdio
    # or
    bun run start:http
    
  • Run checks and tests:

    bun run devcheck   # Lint, format, typecheck, security
    bun run test       # Vitest test suite
    bun run lint:mcp   # Validate MCP definitions against spec
    

Docker

docker build -t gutenberg-mcp-server .
docker run --rm -p 3010:3010 gutenberg-mcp-server

The Dockerfile defaults to HTTP transport and stateless session mode, and writes logs to /var/log/gutenberg-mcp-server. OpenTelemetry peer dependencies are installed by default; build with --build-arg OTEL_ENABLED=false to omit them.


Project structure

| Path | Purpose |
| --- | --- |
| src/index.ts | createApp() entry point; registers tools and inits services. |
| src/config/server-config.ts | Server-specific environment variable parsing (Gutendex and file-server URL overrides). |
| src/mcp-server/tools/definitions/ | Tool definitions (*.tool.ts). |
| src/services/gutendex/ | Gutendex catalog API client for search and book metadata. |
| src/services/gutenberg-text/ | Full plain-text retrieval, boilerplate stripping, in-session caching, and chunking. |
| tests/ | Unit and integration tests mirroring src/. |

Development guide

See CLAUDE.md / AGENTS.md for development guidelines and architectural rules. The short version:

  • Handlers throw, framework catches — no try/catch in tool logic
  • Use ctx.log for request-scoped logging, ctx.state for tenant-scoped storage
  • Register new tools via the entry arrays in src/index.ts
  • Wrap external API calls: validate raw → normalize to domain type → return output schema; never fabricate missing fields
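
The last rule (validate, normalize, never fabricate) might look like the sketch below for a single catalog record. The snake_case field names follow the Gutendex response format as I understand it; the Book and Person types and the normalizeBook helper are hypothetical and do not use the framework's API.

```typescript
interface Person {
  name: string;
  birthYear: number | null;
  deathYear: number | null;
}

interface Book {
  id: number;
  title: string;
  authors: Person[];
  languages: string[];
  downloadCount: number | null;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function toIntOrNull(value: unknown): number | null {
  return typeof value === "number" && Number.isInteger(value) ? value : null;
}

function normalizeBook(raw: unknown): Book {
  // 1. Validate: throw on a malformed payload and let the caller classify it.
  if (!isRecord(raw)) throw new Error("book payload is not an object");
  const id = raw["id"];
  const title = raw["title"];
  if (typeof id !== "number" || typeof title !== "string") {
    throw new Error("book payload is missing id or title");
  }

  // 2. Normalize: map raw fields onto the domain type.
  const authorsField = raw["authors"];
  const authors: Person[] = [];
  for (const entry of Array.isArray(authorsField) ? authorsField : []) {
    if (!isRecord(entry)) continue;
    const name = entry["name"];
    if (typeof name !== "string") continue;
    authors.push({
      name,
      birthYear: toIntOrNull(entry["birth_year"]),
      deathYear: toIntOrNull(entry["death_year"]),
    });
  }
  const languagesField = raw["languages"];
  const languages: string[] = Array.isArray(languagesField)
    ? languagesField.filter((l): l is string => typeof l === "string")
    : [];

  // 3. Return: missing optional values stay null instead of being invented.
  return { id, title, authors, languages, downloadCount: toIntOrNull(raw["download_count"]) };
}
```

A missing download count is reported as null rather than 0, so callers can tell "unknown" apart from "never downloaded".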

Contributing

Issues and pull requests are welcome. Run checks and tests before submitting:

bun run devcheck
bun run test

License

Apache-2.0 — see LICENSE for details.

Data from Project Gutenberg is in the public domain. Catalog metadata sourced from Gutendex (MIT license).