Status: Systems Operational

An open marketplace for AI inference endpoints.

ForgeRouter gives developers one OpenAI-compatible API to route requests across ForgeSoftware-hosted models, community providers, and future verified European providers — with transparent pricing, visible infrastructure, and explicit control over where inference runs.

api OpenAI-Compatible API
group Community providers
verified Verified tiers
euro Euro billing
visibility Provider-level transparency
Client App Gateway Community Host ForgeSoftware Verified Provider Datacenter Provider

Choose the provider tier that matches your workload.

Community

  • • Lower-cost community hosts
  • • Explicit opt-in only
  • • Hardware/location declared
  • • Reliability measured
  • • Best effort

Observed

  • • Uptime/latency tracked
  • • Error rate monitored
  • • Better routing confidence
  • • Transparent limitations
RECOMMENDED

Platform-operated

  • • Operated by ForgeSoftware
  • • Clear infrastructure
  • • Example: Qwen3.6-27B on 2x RTX A5000 in France

Verified

  • • Identity checked
  • • Hosting/policy documented
  • • Stronger trust signals
  • • Future enterprise policies

Small providers can serve models. They just can't easily sell them.

Many people can run vLLM, llama.cpp or SGLang. Fewer can build billing, token accounting, dashboards, monitoring, and trust signals.

terminal

For Developers

  • close Managing dozens of API keys and different SDKs.
  • close Hard to compare real providers and actual hardware used.
  • close Opaque routing—not knowing exactly where user data is processed.
dns

For Providers

  • close Struggling to gain visibility and get listed on aggregators.
  • close Complex billing, invoicing, and cross-border payment collection.
  • close Idle GPU capacity that is difficult to monetise effectively.

Know the deployment behind the model.

ForgeRouter does not hide every backend behind a single model name. Users can inspect the actual deployment: provider, country, runtime, hardware, quantization, context, pricing, uptime and logging policy.

Model: forge-1/Qwen3.6-27B
Provider: ForgeSoftware
Tier: Platform-operated beta
Location: France
Runtime: vLLM
Hardware: 2x RTX A5000
Quantization: AWQ INT4 / Marlin
Context: up to 131k
Policy: No prompt logging
Status: Best effort

Route by price, trust and performance.

Exact deployment

model="forge-1/Qwen3.6-27B"

Balanced

model="qwen-27b",
strategy="balanced"

Economy community

model="qwen-27b",
strategy="cost",
allow_community=true

ForgeNode: from local endpoint to marketplace deployment.

ForgeNode is the lightweight Python agent providers install next to their inference runtime.

  • ✓ Detects GPU & VRAM
  • ✓ Detects Runtime (vLLM, SGLang, etc.)
  • ✓ Detects Model & Quantization
  • ✓ Reports Health & Metrics
$ pipx install forgenode
$ forgenode enroll --token <YOUR_TOKEN>
$ forgenode detect
$ forgenode publish

Transparent marketplace pricing.

Provider inference price + ForgeRouter platform fee = Effective customer cost.

euroEuro billing
account_balance_walletPrepaid wallet
visibilityNo hidden markup

Community providers unlock the long tail of open inference.

Better prices

Access unused capacity at lower rates.

More model variety

Find niche fine-tunes and specialized models.

Transparent risk

Opt-in only, with clear tracking of reliability.

Upgrade path

Good community nodes can become Verified.

Private beta

Join the ForgeRouter beta.

Request early access to the OpenAI-compatible gateway, provider marketplace previews, and routing policy experiments.