Real advice for every tool your agent considers.

AI agents burn tokens retrying flaky, slow, or non-compliant tools. ToolRate delivers objective reliability ratings and smart recommendations from thousands of real agent executions in production.

Know before you call.

832 Tools Rated
100K Data Points
<8ms Avg Response

LLM Sources

The Problem

Agents burn cycles on failing tools

Stripe times out. LemonSqueezy rejects auth. PayPal finally works. Three attempts, wasted tokens, degraded UX — and no record of why any of it happened.

The Solution

One assessment before every call

ToolRate scores every tool in real time from the collective experience of thousands of production agents. Pick the best option first, fall back intelligently. Every decision is logged with a confidence score attached.

Global Compliance Layer

🌍

Jurisdiction, Made Visible

Exclusive to ToolRate

Every tool in ToolRate comes with clear, real-world jurisdiction data — hosting location, GDPR risk level, and confidence score. So your agents decide based on facts, not assumptions.

Reliability first — every tool is scored neutrally, with no geographic penalty.
GDPR risk made explicit — clear signals for data-residency compliance when it matters.
Region as a choice, never a default — EU, US, and global tools ranked by performance.
Your rules, your ranking — pass preferences once via the SDK and ToolRate weighs every recommendation against your policy (e.g. “prefer EU for European data” or “optimize for lowest latency worldwide”).

From San Francisco to Berlin to Singapore, every agent builder gets the same transparent view — with full control to match how you actually build.
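
One way to picture "your rules, your ranking" is as a re-ranking step applied on top of neutral reliability scores. The following is a minimal sketch under stated assumptions: the `Candidate` record, its field names, the `rank` helper, and the bonus value are all hypothetical and do not come from the ToolRate SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record; field names are illustrative, not ToolRate's schema.
    name: str
    reliability: float  # 0-100, scored without any geographic penalty
    region: str         # e.g. "EU" or "US"
    latency_ms: float


def rank(candidates, prefer_region=None, region_bonus=5.0):
    """Sort by reliability, adding a bonus for the preferred region if one is set."""
    def score(c):
        bonus = region_bonus if prefer_region and c.region == prefer_region else 0.0
        return c.reliability + bonus

    # Higher score first; ties are broken by lower latency.
    return sorted(candidates, key=lambda c: (-score(c), c.latency_ms))


tools = [
    Candidate("tool-a", 94.0, "US", 120.0),
    Candidate("tool-b", 91.0, "EU", 150.0),
]
neutral = [c.name for c in rank(tools)]
eu_first = [c.name for c in rank(tools, prefer_region="EU")]
```

With no policy the ranking is purely by reliability; passing a preference only shifts the order, it never removes a candidate.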

Install ToolRate

Beginner-friendly in two commands. Works on every platform — no PEP 668 drama, no virtualenv archaeology.

Recommended

Modern & fastest

# Install uv (one-time; this installer script targets macOS and Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add ToolRate to your project
uv add toolrate

Alternative

Without uv

python3 -m venv .venv
source .venv/bin/activate
pip install toolrate

Note: If plain pip fails with a PEP 668 “externally-managed-environment” error, your Python is managed by a system package manager (for example Homebrew, or apt on recent Debian/Ubuntu releases). Use one of the methods above instead.

For TypeScript / Node 18+: npm install toolrate.

Three lines to get started

Python

from toolrate import ToolRate, guard

client = ToolRate("nf_live_...")

# Check reliability before calling
score = client.assess("https://api.stripe.com/v1/charges")
# => { reliability_score: 94.2, failure_risk: "low", ... }

# Or use guard() for auto-fallback
result = guard(client, "https://api.stripe.com/v1/charges",
               lambda: stripe.Charge.create(...),
               fallbacks=[
                   ("https://api.lemonsqueezy.com/v1/checkouts",
                    lambda: lemon.create_checkout(...)),
               ])

TypeScript

import { ToolRate } from "toolrate";

const client = new ToolRate("nf_live_...");

// Check reliability before calling
const score = await client.assess("https://api.stripe.com/v1/charges");

// Or use guard() for auto-fallback
const result = await client.guard(
  "https://api.stripe.com/v1/charges",
  () => stripe.charges.create({...}),
  { fallbacks: [
    ["https://api.lemonsqueezy.com/v1/checkouts",
     () => lemon.createCheckout({...})],
  ]}
);

cURL

# Assess a tool
curl -X POST https://api.toolrate.ai/v1/assess \
  -H "X-Api-Key: nf_live_..." \
  -H "Content-Type: application/json" \
  -d '{"tool_identifier": "https://api.stripe.com/v1/charges"}'

# Report a result
curl -X POST https://api.toolrate.ai/v1/report \
  -H "X-Api-Key: nf_live_..." \
  -H "Content-Type: application/json" \
  -d '{"tool_identifier": "https://api.stripe.com/v1/charges",
    "success": true, "latency_ms": 420}'

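For environments without the SDK, the two curl calls above can be mirrored with the Python standard library. This is a sketch, not an official client: the endpoint paths, the X-Api-Key header, and the payload fields are copied from the curl examples, while the `build_request` helper is hypothetical and the response format is not shown on this page, so the request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.toolrate.ai/v1"  # base URL taken from the curl examples


def build_request(path, api_key, payload):
    """Build (but do not send) a POST request mirroring the curl calls."""
    return urllib.request.Request(
        url=f"{API_BASE}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


assess_req = build_request(
    "assess",
    "nf_live_...",
    {"tool_identifier": "https://api.stripe.com/v1/charges"},
)
report_req = build_request(
    "report",
    "nf_live_...",
    {
        "tool_identifier": "https://api.stripe.com/v1/charges",
        "success": True,
        "latency_ms": 420,
    },
)
# To send: urllib.request.urlopen(assess_req, timeout=10), then decode the body.
```
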
Built for production agents

Reliability intelligence for the developers, enterprises, and agents running production AI workloads.

Reliability Scoring

Real-world success rates, common failure modes, and recommended mitigations — so agents know exactly how much to trust the tool, and auditors know precisely how the score was calculated.
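
To make "success rates and common failure modes" concrete, here is a deliberately simple sketch of how reported outcomes could be aggregated. It is an illustration only: the report shape, the `summarize` helper, and the plain success-rate formula are assumptions, and the page does not describe how ToolRate actually computes its scores.

```python
from collections import Counter


def summarize(reports):
    """Aggregate reports into a 0-100 success score and ranked failure modes.

    Each report is a dict with "success" (bool) and, on failure, an
    "error" label. Both the shape and the formula are assumptions.
    """
    if not reports:
        return {"score": None, "failure_modes": []}
    successes = sum(1 for r in reports if r["success"])
    failures = Counter(
        r.get("error", "unknown") for r in reports if not r["success"]
    )
    return {
        "score": round(100.0 * successes / len(reports), 1),
        "failure_modes": failures.most_common(),
    }


reports = (
    [{"success": True}] * 17
    + [{"success": False, "error": "timeout"}] * 2
    + [{"success": False, "error": "auth"}]
)
summary = summarize(reports)
```

A raw success rate like this is easy to audit but noisy for tools with few reports; a production score would likely need some smoothing or a minimum sample size.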

One-Line Guard

result = toolrate.guard(tool="stripe/charges", context=plan)

Zero branching logic. Zero retry boilerplate. Production-ready in one line.
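
For intuition about what a guard-style wrapper saves you from writing, the control flow can be sketched in plain Python. `guard_sketch` is a hypothetical helper, not the ToolRate `guard()`: it skips the reliability lookup entirely and only shows the try-then-fall-back loop.

```python
def guard_sketch(primary, fallbacks=()):
    """Try primary, then each fallback in order; return (label, result).

    Each entry is a (label, zero-argument callable) pair.
    """
    errors = []
    for label, call in [primary, *fallbacks]:
        try:
            return label, call()
        except Exception as exc:  # a real wrapper would narrow this
            errors.append((label, exc))
    failed = [label for label, _ in errors]
    raise RuntimeError(f"all tools failed: {failed}")


def flaky():
    raise TimeoutError("primary timed out")


used, value = guard_sketch(
    ("primary", flaky),
    fallbacks=[("backup", lambda: "ok")],
)
```

Returning the label alongside the result keeps a record of which tool actually served the call, which is the kind of information a report step would need.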

Hidden Gems

The tools nobody pitches but production agents quietly rely on — surfaced from real fallback patterns across thousands of sessions and ranked by recovery rate.

Fallback Chains

When OpenAI, Stripe, or SendGrid drops, what do production agents actually switch to? Live journey data, ranked by downstream completion rate.

Reliability Webhooks

Get paged the moment a tool's reliability crosses a threshold you define. HMAC-signed, per-tool, exponential-backoff delivery — wired into PagerDuty or Slack in seconds.
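
On the receiving side, an HMAC-signed webhook is typically checked by recomputing the signature over the raw request body. The sketch below assumes HMAC-SHA256 with a hex-encoded signature; the page does not state the hash algorithm, the encoding, or the header that carries the signature, so treat those details as placeholders.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, signature_hex)


secret = b"whsec_example"  # placeholder secret
body = b'{"tool": "example", "reliability_score": 71.5}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

Verify against the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and invalidate the signature.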

MCP Server

Drop ToolRate into Claude Code, Cursor, or Zed in one line: npx -y @toolrate/mcp-server or uvx toolrate-mcp. The server exposes nine native tools, including assess, route_llm, report, and fallback chains, and is published on npm and PyPI.

NEW · LLM Router

LLM Router — one call, the right model.

The ToolRate LLM Router picks the optimal model for each task — combining real-time reliability, exact per-token pricing, and latency awareness across all major providers, plus Ollama for local and free.

reliability_first 80 / 20

balanced 55 / 45

cost_first 25 / 75

speed_first 35 / 45 / 20
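
The page lists the strategy weights without naming what each number applies to. One plausible reading, assumed here, is that the two-number strategies are reliability and cost weights in percent. The sketch below blends two scores under that assumption; `blended_score`, the `models` table, and its numbers are hypothetical, and speed_first is left out because the order of its three weights is not stated.

```python
# Assumed reading: (reliability weight, cost weight) in percent.
STRATEGIES = {
    "reliability_first": (80, 20),
    "balanced": (55, 45),
    "cost_first": (25, 75),
}


def blended_score(reliability, cost_score, strategy):
    """Weighted average of two 0-100 scores; higher cost_score means cheaper."""
    w_rel, w_cost = STRATEGIES[strategy]
    return (w_rel * reliability + w_cost * cost_score) / (w_rel + w_cost)


# Two hypothetical models: one more reliable, one cheaper.
models = {"premium": (99.0, 30.0), "budget": (80.0, 95.0)}


def pick(strategy):
    """Return the model name with the highest blended score."""
    return max(models, key=lambda m: blended_score(*models[m], strategy))
```

Under these made-up numbers, reliability_first selects the more reliable model while balanced and cost_first select the cheaper one, which is the trade-off the strategy names suggest.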

# Tell ToolRate your constraints, get the right model back.
result = client.assess(
  tool_identifier="https://api.anthropic.com/v1/messages",
  task_complexity="low",
  expected_tokens=500,
  max_price_per_call=0.01,
  budget_strategy="cost_first",
)

# → recommended_model: "claude-haiku-4-5"
# → price_per_call:    $0.00152  (exact per-token math)
# → within_budget:     true
# → reasoning:         "Anthropic Messages scored 91.7/100
#                       for reliability (low risk). Recommended
#                       model: claude-haiku-4-5. Cost: $0.0015/call.
#                       Typical latency ~500ms. Strategy: cost-first;
#                       task complexity: low. Fits within your budget."

What you get per call

Exact cost per call at your expected token volume, computed from per-million-token prices
Specific model inside each provider (e.g. Haiku for low-complexity tasks, Opus for reasoning)
Human-readable reasoning string — drop it in your logs
Over-budget tools flagged, never silently filtered
Drop-in LLMRouter class with automatic fallback cascade
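
The per-call price is ordinary arithmetic over per-million-token rates. A small sketch of that calculation follows; the prices and the input/output token split are placeholders chosen for illustration, not any provider's published rates and not the figures behind the example output above.

```python
def price_per_call(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """Cost of one call in USD, given per-million-token prices."""
    total = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return total / 1_000_000


# Placeholder prices and token split (assumptions for illustration only).
cost = price_per_call(
    input_tokens=400,
    output_tokens=100,
    usd_per_m_input=2.00,
    usd_per_m_output=6.00,
)
within_budget = cost <= 0.01  # compare against a max_price_per_call-style cap
```

Here 400 input tokens at $2.00 per million plus 100 output tokens at $6.00 per million comes to $0.0014 per call, which fits under a $0.01 cap.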

anthropic

openai

groq

together

mistral

deepseek

ollama · local · free

Pricing that scales with your agents

Start free. Scale with pay-as-you-go. Flat-rate when you need it. See all plans →

Free

$0 / forever

For testing and side projects

100 assessments / day
Public data pool
Python & TypeScript SDKs
Standard support

Pay-as-you-go

$0.008 / assessment

Best for autonomous agents and bots

First 100 / day free
$0.008 per assessment after
No monthly commitment
Webhook alerts included

Pro

$29 / month

Flat rate for heavy usage

10,000 assessments / month
Priority support
Higher rate limits
Webhook alerts

Building an AI platform? Talk to sales about Enterprise →
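
To compare the plans for a given workload, the listed prices can be turned into a quick estimate. This sketch uses the numbers from the plans above and adds two assumptions of its own: usage is steady from day to day over a 30-day month, and volume beyond the Pro quota is treated as pay-as-you-go territory, since the page does not say what happens past 10,000 assessments.

```python
FREE_PER_DAY = 100    # free assessments per day
PAYG_PRICE = 0.008    # USD per assessment beyond the free allowance
PRO_PRICE = 29.0      # USD per month
PRO_QUOTA = 10_000    # assessments per month


def payg_monthly_cost(per_day, days=30):
    """Pay-as-you-go cost in USD, assuming steady daily usage."""
    billable = max(0, per_day - FREE_PER_DAY)
    return round(billable * PAYG_PRICE * days, 2)


def cheaper_plan(per_day, days=30):
    """Pick the cheapest listed plan for a steady daily volume."""
    if per_day <= FREE_PER_DAY:
        return "free"
    fits_pro = per_day * days <= PRO_QUOTA
    if fits_pro and PRO_PRICE < payg_monthly_cost(per_day, days):
        return "pro"
    return "pay-as-you-go"
```

Under these assumptions the crossover sits at roughly 220 assessments per day: below that pay-as-you-go is cheaper, and above it Pro wins until the monthly quota is exhausted.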