Odel
Local Model Suitability MCP

Local Model Suitability MCP

@ojaskordCommunicationJavaScriptMITUpdated 1w ago

Check if a task runs locally vs cloud. Save money on calls that don't need cloud inference.

Server endpointStreamable HTTP

This is the third-party server itself — Odel doesn't run it. Hitting this URL directly talks straight to the upstream server with no auth or proxying. Connect through Odel to front it with managed auth.

smithery badge

Local Model Suitability MCP

Cloud inference is expensive. Everything that can run locally should.

This MCP server tells your agent — before every cloud API call — whether the task can be handled by a local model instead. Route to Ollama, LM Studio, or llama.cpp when you can. Only pay for cloud when you must.

The Tool

check_local_viability

Call this BEFORE every cloud inference call. If verdict is LOCAL, skip the cloud call entirely and route to your local model. Only use cloud when this tool returns CLOUD.

Inputs:

FieldRequiredDescription
taskThe exact task you are about to send to a cloud model
quality_thresholdOptionalPRODUCTION (default) / PROTOTYPE / BEST_EFFORT
data_sensitivityOptionalPUBLIC (default) / INTERNAL / CONFIDENTIAL

CONFIDENTIAL forces LOCAL regardless of task complexity — data never leaves the machine.

Response:

{
  "verdict": "LOCAL",
  "confidence": "HIGH",
  "reason": "Simple text summarisation — no reasoning depth required. Any 7B+ local model handles this well.",
  "estimated_cost_saving": "$0.002-0.008 saved per call at claude-sonnet pricing",
  "recommended_local_models": ["llama3.2:8b", "mistral-7b", "phi3:medium"],
  "cloud_justified_reason": null,
  "analysis_type": "AI-powered cost routing — NOT a simple lookup"
}

Data Sources

  • AI reasoning: Anthropic Claude (claude-sonnet) — cost routing analysis
  • No external data sources — pure AI reasoning

Pricing

PlanCallsPrice
Free20/month$0
Starter500-call bundle$20
Pro2,000-call bundle$70

Subscribe at kordagencies.com

Setup

{
  "mcpServers": {
    "local-model-suitability": {
      "command": "npx",
      "args": ["-y", "local-model-suitability-mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "your-key",
        "API_KEY": "your-lms-api-key-for-paid-tier"
      }
    }
  }
}

Free tier requires no API key — tracked by IP.

Harness Integration

Claude Code / Claude Desktop (.mcp.json)

{
  "mcpServers": {
    "local-model-suitability": {
      "type": "http",
      "url": "https://local-model-suitability-mcp-production.up.railway.app"
    }
  }
}

LangChain (Python)

from langchain_mcp_adapters.client import MultiServerMCPClient
client = MultiServerMCPClient({
    "local-model-suitability": {
        "url": "https://local-model-suitability-mcp-production.up.railway.app",
        "transport": "http"
    }
})
tools = await client.get_tools()

OpenAI Agents SDK (Python)

from agents import Agent, HostedMCPTool
agent = Agent(
    name="Assistant",
    tools=[HostedMCPTool(tool_config={
        "type": "mcp",
        "server_label": "local-model-suitability",
        "server_url": "https://local-model-suitability-mcp-production.up.railway.app",
        "require_approval": "never"
    })]
)

LangGraph

Same as LangChain above — langchain-mcp-adapters works with LangGraph natively.

Legal

Results are for cost-optimisation guidance only and do not constitute technical advice. Full terms: kordagencies.com/terms.html