Odel
gbif biodiversity mcp server

@cyanheads • Data & Analytics • TypeScript • Apache-2.0 • Updated 6 days ago

Search GBIF species taxonomy, occurrence records, datasets, and publishers.

Server endpoint • Streamable HTTP • No auth • Probed

This is the third-party server itself — Odel doesn't run it. Hitting this URL directly talks straight to the upstream server with no auth or proxying. Connect through Odel to front it with managed auth.

@cyanheads/gbif-biodiversity-mcp-server

Search GBIF species taxonomy, occurrence records, datasets, and publishers via MCP. STDIO or Streamable HTTP.

12 Tools • 2 Resources


Tools

12 tools for working with GBIF species taxonomy, occurrence records, datasets, and publishers:

| Tool | Description |
|---|---|
| gbif_match_species | Match a species name against the GBIF backbone taxonomy; returns taxonKey, confidence score, and full classification |
| gbif_get_species | Fetch a single backbone taxon by key: full classification, authorship, synonymy, vernacular name, descendant count |
| gbif_search_species | Search or browse the GBIF backbone taxonomy by name fragment, rank, kingdom, family, or genus |
| gbif_get_species_classification | Return the complete parent chain for a taxon as a root-first ordered array from kingdom to immediate parent |
| gbif_get_species_children | List direct children of a backbone taxon, such as genera within a family or species within a genus |
| gbif_search_occurrences | Search 2.4B+ GBIF occurrence records with Darwin Core filters: country, bounding box, WKT geometry, year, month, basis of record |
| gbif_count_occurrences | Count occurrences matching a filter without fetching records; fast single-number response |
| gbif_get_occurrence | Fetch a single occurrence record by key: full Darwin Core record with GADM geography, media, and quality flags |
| gbif_occurrence_facets | Aggregate occurrence counts by a dimension: country, year, basis of record, dataset, kingdom, and more |
| gbif_search_datasets | Search GBIF datasets by keyword, type, country, or publishing organization |
| gbif_get_dataset | Fetch full dataset metadata by UUID: title, description, citation, contacts, license, DOI, coverage |
| gbif_search_publishers | Search GBIF-registered publishing organizations by name fragment or country |

gbif_match_species

Match a scientific or common name against the GBIF backbone taxonomy.

  • Fuzzy matching handles minor typos and vernacular names; set strict: true for exact-only matching
  • Returns taxonKey — the backbone key required by gbif_search_occurrences, gbif_count_occurrences, and gbif_occurrence_facets
  • Confidence score 0–100; below 80 warrants review
  • Full classification hierarchy with keys at each rank: kingdom, phylum, class, order, family, genus, species
  • matchType NONE indicates no usable match — try removing strict mode or broadening the name
  • Resolves synonyms: always returns the accepted backbone key regardless of which name form was queried
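
The triage these bullets describe can be sketched as a small helper. This is a minimal illustration, assuming the tool output exposes taxonKey, confidence, and matchType as plain fields; the README does not show the output schema, so MatchResult and assessMatch are hypothetical names.

```typescript
// Hypothetical shape of a gbif_match_species result. Only the fields
// named in the bullets above are modeled; the real output may differ.
interface MatchResult {
  taxonKey?: number;
  confidence: number; // 0-100
  matchType: string; // "NONE" means no usable match
}

type MatchVerdict = "accept" | "review" | "retry";

// Triage a match: retry on NONE, review below the threshold, else accept.
function assessMatch(result: MatchResult, reviewBelow = 80): MatchVerdict {
  if (result.matchType === "NONE" || result.taxonKey === undefined) {
    return "retry"; // e.g. drop strict mode or broaden the name
  }
  return result.confidence < reviewBelow ? "review" : "accept";
}
```

A caller would typically only pass taxonKey to the occurrence tools when the verdict is "accept", and surface "review" results to a human or a follow-up check.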

gbif_get_species

Fetch a complete taxon record by GBIF backbone key.

  • Full classification, authorship string, and vernacular (English) name when available
  • taxonomicStatus: ACCEPTED, SYNONYM, DOUBTFUL — when SYNONYM, acceptedKey and accepted identify the current name
  • numDescendants and numOccurrences for scope at a glance
  • extinct field is present only when explicitly flagged; on unlabeled taxa it is omitted rather than set to false
  • publishedIn carries the original description citation when available
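
Two of these points, synonym resolution and the optional extinct flag, are easy to get wrong in client code. The sketch below assumes a record with key, taxonomicStatus, acceptedKey, and extinct fields; the key field name and the TaxonRecord type are assumptions, not the server's documented schema.

```typescript
// Hypothetical subset of a gbif_get_species record.
interface TaxonRecord {
  key: number; // assumed field name for the record's own backbone key
  taxonomicStatus: "ACCEPTED" | "SYNONYM" | "DOUBTFUL";
  acceptedKey?: number; // expected when taxonomicStatus is SYNONYM
  extinct?: boolean; // present only when explicitly flagged
}

// Key to use downstream: the accepted key for synonyms, otherwise the
// record's own key.
function currentKey(taxon: TaxonRecord): number {
  if (taxon.taxonomicStatus === "SYNONYM" && taxon.acceptedKey !== undefined) {
    return taxon.acceptedKey;
  }
  return taxon.key;
}

// Three-valued reading of the extinct flag: an absent field means
// "unknown", not "extant".
function extinctionStatus(taxon: TaxonRecord): "extinct" | "extant" | "unknown" {
  if (taxon.extinct === undefined) return "unknown";
  return taxon.extinct ? "extinct" : "extant";
}
```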

gbif_search_species

Search or browse the GBIF backbone taxonomy.

  • Accepts name fragments matching scientific and vernacular names
  • Filter by rank, kingdom, family, or genus to scope browsing
  • isExtinct filter for extinct vs. extant taxa
  • Scope to a specific checklist dataset with datasetKey (omit for the GBIF backbone)
  • Paginated — limit up to 1000, use offset to walk through large groups

gbif_get_species_classification

Return the full parent chain for a taxon as an ordered array.

  • Root-first order: kingdom → phylum → class → order → family → genus → species, stopping at the immediate parent of the queried taxon
  • Each entry: rank, canonical name, scientific name, taxon key
  • Useful for building taxonomic trees or placing an unfamiliar taxon in context without manual backbone navigation
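
As an illustration of using the ordered array, the sketch below renders a root-first chain as an indented outline. The LineageEntry field names are assumptions based on the fields listed above, and the keys in any sample data are illustrative only.

```typescript
// Hypothetical entry shape for one rank in the parent chain.
interface LineageEntry {
  rank: string;
  canonicalName: string;
  key: number;
}

// Render a root-first chain as an outline, indenting one level per rank.
function renderLineage(chain: LineageEntry[]): string {
  return chain
    .map((entry, depth) => `${"  ".repeat(depth)}${entry.rank}: ${entry.canonicalName} (${entry.key})`)
    .join("\n");
}
```

Because the array is already root-first, no sorting is needed; the array index doubles as the tree depth.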

gbif_get_species_children

List direct children of a backbone taxon.

  • Genera within a family, species within a genus, subspecies within a species
  • Each child: key, name, rank, taxonomic status, common name, occurrence count, descendant count
  • Paginated — limit up to 1000, iterate with offset for large groups like Coleoptera
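
Walking a large group with limit and offset can be planned up front. The generator below is a minimal sketch that yields the page windows for a known total; pageWindows is a hypothetical helper, and only the 1000-per-page maximum comes from the text above.

```typescript
// Yield { offset, limit } windows covering `total` items, with the page
// size clamped to the documented maximum of 1000.
function* pageWindows(
  total: number,
  limit = 1000,
): Generator<{ offset: number; limit: number }> {
  const size = Math.min(Math.max(1, Math.floor(limit)), 1000);
  for (let offset = 0; offset < total; offset += size) {
    // The final window is trimmed to the remaining item count.
    yield { offset, limit: Math.min(size, total - offset) };
  }
}
```

For a taxon with 2,500 children this yields three windows, at offsets 0, 1000, and 2000, the last one requesting 500 items.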

gbif_search_occurrences

Search 2.4B+ GBIF occurrence records with full Darwin Core filtering.

  • Filter by the taxonKey from gbif_match_species for reliable results: taxonKey resolves synonyms automatically, while the scientificName filter does not
  • Geographic filters: country (ISO 3166-1 alpha-2), bounding box (decimalLatitude/decimalLongitude ranges as "min,max"), or WKT polygon (geometry)
  • Temporal filters: year as single year or range, month (1–12) for seasonal queries
  • basisOfRecord enum: HUMAN_OBSERVATION, PRESERVED_SPECIMEN, MACHINE_OBSERVATION, and more
  • hasCoordinate to require or exclude georeferenced records
  • Pagination capped at offset+limit ≈ 100,000 — use gbif_occurrence_facets for aggregate analysis beyond this
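
The bounding-box format and the pagination cap lend themselves to a small argument builder. This is a sketch under stated assumptions: the parameter names follow the bullets above, but the OccurrenceArgs shape, the default page size, and the exact cap value are illustrative rather than taken from the server's schema.

```typescript
interface BoundingBox {
  south: number;
  north: number;
  west: number;
  east: number;
}

// Hypothetical argument object for gbif_search_occurrences.
interface OccurrenceArgs {
  taxonKey: number;
  decimalLatitude: string; // "min,max"
  decimalLongitude: string; // "min,max"
  hasCoordinate: boolean;
  limit: number;
  offset: number;
}

const PAGINATION_CAP = 100_000; // approximate cap on offset + limit

function buildOccurrenceArgs(
  taxonKey: number,
  box: BoundingBox,
  offset = 0,
  limit = 100, // arbitrary default for the sketch
): OccurrenceArgs {
  if (box.south > box.north || box.west > box.east) {
    throw new RangeError("bounding box minimum exceeds maximum");
  }
  if (offset + limit > PAGINATION_CAP) {
    throw new RangeError("offset + limit exceeds the pagination cap; aggregate with facets instead");
  }
  return {
    taxonKey,
    decimalLatitude: `${box.south},${box.north}`,
    decimalLongitude: `${box.west},${box.east}`,
    hasCoordinate: true, // a bounding box only makes sense for georeferenced records
    limit,
    offset,
  };
}
```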

gbif_count_occurrences

Count occurrences matching a filter without fetching any records.

  • Backed by the lightweight /occurrence/count endpoint — fast single-number response
  • Supported filters: taxonKey, country, isGeoreferenced, datasetKey, year
  • Use to assess result set size before deciding whether to paginate a full search
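
One way to act on the count is to pick a retrieval strategy before fetching anything. The helper below is a hypothetical sketch; the page size and the 100,000 cap are parameters, with the cap taken from the occurrence search notes in this README.

```typescript
type Strategy = "single-page" | "paginate" | "facets";

// Decide how to retrieve a result set of `count` records.
function chooseStrategy(count: number, pageSize = 100, cap = 100_000): Strategy {
  if (count <= pageSize) return "single-page";
  // Beyond the pagination cap, full retrieval is not possible through
  // search, so fall back to aggregate analysis.
  return count <= cap ? "paginate" : "facets";
}
```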

gbif_get_occurrence

Fetch a single occurrence record by GBIF occurrence key.

  • Complete Darwin Core record — all coordinate fields, GADM administrative geography (continent, country, state/province, locality), dates
  • Collections metadata: institution code, collection code, catalog number
  • Collector and identifier names, individual count, sex, life stage
  • Associated media (images, audio, video) with URLs and license
  • GBIF data quality issue flags for provenance assessment

gbif_occurrence_facets

Aggregate occurrence counts across a dimension.

  • Facets: COUNTRY, YEAR, BASIS_OF_RECORD, DATASET_KEY, KINGDOM_KEY, PHYLUM_KEY, CLASS_KEY, ORDER_KEY, FAMILY_KEY, GENUS_KEY, SPECIES_KEY, PUBLISHING_COUNTRY, MONTH
  • Scope with taxonKey, country, year, geometry, or basisOfRecord filters
  • Returns top-N values (up to 100) ranked by count — no record payloads
  • Core tool for distribution analysis ("which countries have the most records?") and trend queries ("how has observation volume changed since 2010?")
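
For distribution questions, facet counts are often more useful as proportions. The sketch below assumes each facet value arrives as a name and a count; FacetBucket is a hypothetical shape, not the tool's documented output.

```typescript
// Hypothetical facet bucket: one value of the faceted dimension.
interface FacetBucket {
  name: string;
  count: number;
}

// Attach each bucket's share of the returned total (0 to 1).
function facetShares(buckets: FacetBucket[]): Array<FacetBucket & { share: number }> {
  const total = buckets.reduce((sum, bucket) => sum + bucket.count, 0);
  return buckets.map((bucket) => ({
    ...bucket,
    share: total === 0 ? 0 : bucket.count / total,
  }));
}
```

Since the tool returns only the top-N values, these shares are relative to the returned buckets, not to all matching records; pairing them with gbif_count_occurrences would give shares of the full total.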

gbif_search_datasets

Search GBIF datasets by keyword, type, country, or publishing organization.

  • Filters: free-text query, dataset type (OCCURRENCE, CHECKLIST, METADATA, SAMPLING_EVENT), publishing country, hosting organization UUID
  • Returns title, type, description, license, DOI, and record count
  • Use hostingOrg from gbif_search_publishers to scope to datasets from one organization
  • Paginated — limit up to 1000

gbif_get_dataset

Fetch full dataset metadata by UUID.

  • Full description, citation text (for academic reference), license, DOI
  • Contacts with role, name, organization, and email
  • numConstituents for aggregate datasets (e.g. iNaturalist, eBird)
  • Use after gbif_search_datasets or when an occurrence record's datasetKey needs provenance detail

gbif_search_publishers

Search organizations registered with GBIF.

  • Filter by name fragment or country
  • Returns organization key, title, and country — sufficient to chain into gbif_search_datasets with hostingOrg
  • Paginated — limit up to 1000
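
The chaining step can be sketched as a mapping from publisher hits to dataset search arguments. Only hostingOrg is named in this README; the PublisherHit shape and the type and limit argument names are assumptions for illustration.

```typescript
// Hypothetical publisher hit, based on the fields listed above.
interface PublisherHit {
  key: string; // organization UUID
  title: string;
  country?: string;
}

// Hypothetical argument object for gbif_search_datasets.
interface DatasetSearchArgs {
  hostingOrg: string;
  type: string;
  limit: number;
}

// Build one dataset search per publisher, scoped by hostingOrg.
function datasetSearchesFor(
  publishers: PublisherHit[],
  type = "OCCURRENCE",
  limit = 100,
): DatasetSearchArgs[] {
  return publishers.map((publisher) => ({ hostingOrg: publisher.key, type, limit }));
}
```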

Resources

TypeNameDescription
Resourcegbif://species/{taxonKey}Taxon record from the GBIF backbone — classification, authorship, synonymy status, vernacular name
Resourcegbif://dataset/{datasetKey}Dataset metadata — title, description, citation, license, contacts, coverage

Features

Built on @cyanheads/mcp-ts-core:

  • Declarative tool definitions — single file per tool, framework handles registration and validation
  • Unified error handling across all tools
  • Pluggable auth (none, jwt, oauth)
  • Swappable storage backends: in-memory, filesystem, Supabase, Cloudflare KV/R2/D1
  • Structured logging with optional OpenTelemetry tracing
  • Runs locally (stdio/HTTP) or on Cloudflare Workers from the same codebase

GBIF-specific:

  • Full GBIF REST API v1 coverage: species taxonomy, occurrences, datasets, and publishers
  • gbif_match_species as the entry point — resolves synonyms to backbone taxon keys used throughout
  • Occurrence pagination cap detection with paginationNote — redirects to facet aggregation before hitting the ~100,000 row limit
  • WKT polygon geometry support for geographic occurrence queries
  • Darwin Core field mapping with explicit provenance on sparse upstream fields

Agent-friendly output:

  • gbif_match_species is the mandatory first step — all downstream tools document which key they expect
  • Graceful sparse-field handling — optional fields absent from the API response are omitted rather than null-filled
  • Discriminated error contracts with typed reasons, structured recovery hints, and when documentation per tool

Getting started

Self-Hosted / Local

Add the following to your MCP client configuration file.

{
  "mcpServers": {
    "gbif-biodiversity-mcp-server": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/gbif-biodiversity-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with npx (no Bun required):

{
  "mcpServers": {
    "gbif-biodiversity-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/gbif-biodiversity-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with Docker:

{
  "mcpServers": {
    "gbif-biodiversity-mcp-server": {
      "type": "stdio",
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "MCP_TRANSPORT_TYPE=stdio", "ghcr.io/cyanheads/gbif-biodiversity-mcp-server:latest"]
    }
  }
}

For Streamable HTTP, set the transport and start the server:

MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp
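
Over Streamable HTTP, tool invocations travel as JSON-RPC 2.0 messages using MCP's tools/call method. The sketch below only builds the request body; a real session also needs the initialize handshake and session headers, which an MCP client SDK normally handles. The taxonKey argument name follows the tool notes in this README, while toolCallRequest is a hypothetical helper.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a JSON-RPC 2.0 request body for MCP's tools/call method.
function toolCallRequest(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example body for a count query; 12345 is a placeholder taxonKey.
const countRequestBody = JSON.stringify(
  toolCallRequest(1, "gbif_count_occurrences", { taxonKey: 12345 }),
);
```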

Prerequisites

Installation

  1. Clone the repository:

    git clone https://github.com/cyanheads/gbif-biodiversity-mcp-server.git

  2. Navigate into the directory:

    cd gbif-biodiversity-mcp-server

  3. Install dependencies:

    bun install

Configuration

All configuration is validated at startup via Zod schemas in src/config/server-config.ts. Key environment variables:

| Variable | Description | Default |
|---|---|---|
| MCP_TRANSPORT_TYPE | Transport: stdio or http | stdio |
| MCP_HTTP_PORT | HTTP server port | 3010 |
| MCP_HTTP_ENDPOINT_PATH | HTTP endpoint path where the MCP server is mounted | /mcp |
| MCP_PUBLIC_URL | Public origin override for TLS-terminating reverse-proxy deployments | none |
| MCP_AUTH_MODE | Authentication: none, jwt, or oauth | none |
| MCP_LOG_LEVEL | Log level (debug, info, warning, error, etc.) | info |
| MCP_GC_PRESSURE_INTERVAL_MS | Opt-in Bun-only forced-GC pressure loop (ms). Try 60000 if RSS grows under sustained HTTP load. | 0 (disabled) |
| LOGS_DIR | Directory for log files (Node.js only) | <project-root>/logs |
| STORAGE_PROVIDER_TYPE | Storage backend: in-memory, filesystem, supabase, cloudflare-kv/r2/d1 | in-memory |
| GBIF_BASE_URL | GBIF API base URL override | https://api.gbif.org/v1 |
| GBIF_REQUEST_TIMEOUT_MS | HTTP request timeout in milliseconds | 10000 |
| OTEL_ENABLED | Enable OpenTelemetry | false |
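
To show what startup validation with defaults amounts to, here is a dependency-free sketch covering a few of the variables above. It is not the project's Zod schema in src/config/server-config.ts; parseConfig and ServerConfig are hypothetical, and only the variable names and defaults come from the table.

```typescript
interface ServerConfig {
  transport: "stdio" | "http";
  httpPort: number;
  endpointPath: string;
  gbifBaseUrl: string;
  requestTimeoutMs: number;
}

// Parse and validate a subset of the environment, applying the
// documented defaults. Throws on the first invalid value.
function parseConfig(env: Record<string, string | undefined>): ServerConfig {
  const rawTransport = env.MCP_TRANSPORT_TYPE ?? "stdio";
  const transport =
    rawTransport === "stdio" ? "stdio" : rawTransport === "http" ? "http" : undefined;
  if (transport === undefined) {
    throw new Error(`MCP_TRANSPORT_TYPE must be stdio or http, got "${rawTransport}"`);
  }

  const httpPort = Number(env.MCP_HTTP_PORT ?? "3010");
  if (!Number.isInteger(httpPort) || httpPort < 1 || httpPort > 65535) {
    throw new Error("MCP_HTTP_PORT must be an integer between 1 and 65535");
  }

  const requestTimeoutMs = Number(env.GBIF_REQUEST_TIMEOUT_MS ?? "10000");
  if (!Number.isFinite(requestTimeoutMs) || requestTimeoutMs <= 0) {
    throw new Error("GBIF_REQUEST_TIMEOUT_MS must be a positive number");
  }

  return {
    transport,
    httpPort,
    endpointPath: env.MCP_HTTP_ENDPOINT_PATH ?? "/mcp",
    gbifBaseUrl: env.GBIF_BASE_URL ?? "https://api.gbif.org/v1",
    requestTimeoutMs,
  };
}
```

Taking the environment as a parameter, rather than reading a global, keeps the function testable and runtime-neutral; a caller on Bun or Node.js would pass process.env.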

Running the server

Local development

  • Build and run the production version:

    # One-time build
    bun run rebuild
    
    # Run the built server
    bun run start:http
    # or
    bun run start:stdio
    
  • Run checks and tests:

    bun run devcheck  # Lints, formats, type-checks, and more
    bun run test      # Runs the test suite
    

Project structure

| Directory | Purpose |
|---|---|
| src/mcp-server/tools | Tool definitions (*.tool.ts). Twelve tools across species taxonomy, occurrences, datasets, and publishers. |
| src/mcp-server/resources | Resource definitions. Species and dataset stable-URI resources. |
| src/services/gbif | GBIF REST API service layer: client, request handling, type definitions. |
| src/config | Server-specific environment variable parsing and validation with Zod. |
| tests/ | Unit and integration tests, mirroring the src/ structure. |

Development guide

See CLAUDE.md for development guidelines and architectural rules. The short version:

  • Handlers throw, framework catches — no try/catch in tool logic
  • Use ctx.log for logging, ctx.state for storage
  • Register new tools and resources in the createApp() arrays
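
The "handlers throw, framework catches" rule can be illustrated with a generic wrapper. This is a stand-in written for this example, not the @cyanheads/mcp-ts-core API: withErrorBoundary, Handler, and ToolResult are hypothetical names, and the handler is synchronous for brevity.

```typescript
type Handler<I, O> = (input: I) => O;
type ToolResult<O> = { ok: true; value: O } | { ok: false; error: string };

// Hypothetical stand-in for the framework layer: it owns the only
// try/catch and converts thrown errors into structured results.
function withErrorBoundary<I, O>(handler: Handler<I, O>): (input: I) => ToolResult<O> {
  return (input: I): ToolResult<O> => {
    try {
      return { ok: true, value: handler(input) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  };
}

// The handler itself contains no try/catch: it validates and throws.
const describeTaxon: Handler<{ taxonKey: number }, string> = ({ taxonKey }) => {
  if (!Number.isInteger(taxonKey) || taxonKey <= 0) {
    throw new Error("taxonKey must be a positive integer");
  }
  return `taxon ${taxonKey}`;
};

const safeDescribeTaxon = withErrorBoundary(describeTaxon);
```

Keeping error conversion in one place means every tool reports failures in the same shape, which is the practical benefit of the rule.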

Contributing

Issues and pull requests are welcome. Run checks and tests before submitting:

bun run devcheck
bun run test

License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.