The Model Proliferation Problem
In 2024, there were a handful of meaningful AI models. In 2026, there are hundreds. OpenAI, Anthropic, Google, Meta, Mistral, Cohere, Groq, NVIDIA, AWS Bedrock, Azure, and dozens of open-source providers on OpenRouter: each with multiple models, each with its own strengths, pricing, speed, and context window characteristics.
Managing this landscape is a job in its own right. Knowing which model to use for which task, keeping API keys organised, updating workflows when new models release, and staying within budget across providers: these are real operational costs.
The Sovereignty Protocol's AI Model Hub abstracts all of it.
Unified Provider Support
The platform maintains active integrations with:
- OpenAI: GPT-4o, GPT-4o mini, o3, o4-mini, and the full model line
- Anthropic: Claude Sonnet, Claude Opus, and Haiku tiers
- Google: Gemini 2.0 Flash, Gemini 2.5 Pro, and Ultra
- Meta (via NVIDIA/Groq): Llama 3.x family at multiple parameter scales
- Mistral: Mistral Large, Medium, and Small; Codestral for code tasks
- Groq: ultra-fast inference for latency-sensitive workflows
- NVIDIA NIM: hosted open-weight models at enterprise scale
- OpenRouter: 300+ models through a single API endpoint
- AWS Bedrock: Titan, Llama, Mistral, and Cohere models via AWS infrastructure
- Azure OpenAI: enterprise-grade GPT deployment for compliance environments
Each provider is connected once. After that, any model from any provider is accessible through the same cascade step syntax: a step declares provider: openai and model: gpt-4o-mini, and nothing else about the workflow changes.
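For instance, a step that pins a specific provider and model directly might look like this (a sketch; the name and type fields mirror the step shown later in this section, and the step name itself is illustrative):

- name: summarize_ticket
  type: agent
  provider: openai       # any connected provider
  model: gpt-4o-mini     # any model that provider exposes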
The Model Routing System
The platform does not expose raw provider selection in most workflows. Instead, it uses a model routing layer โ a central configuration that maps abstract role names to concrete model identifiers.
chat:
  primary: "claude-3-5-sonnet-20241022"       # Best quality
  secondary: "mistralai/mistral-medium-3.5"   # Logic / reasoning
  tertiary: "meta-llama/llama-3.1-70b"        # Fast / general
  speed: "groq/llama-3.1-8b-instant"          # Latency critical
  vision: "gpt-4o"                            # Image understanding
  code: "codestral-latest"                    # Code tasks
In cascade steps, you reference the role, not the model:
- name: assessment
  type: agent
  model: secondary   # maps to mistral-medium-3.5 today
When Mistral releases a better secondary-tier model, you update one line in model_routing.yaml. Every cascade that references secondary automatically uses the new model. No individual workflow changes required.
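For example, assuming a hypothetical mistralai/mistral-medium-4 identifier, the upgrade is a single edited line in model_routing.yaml:

chat:
  secondary: "mistralai/mistral-medium-4"   # was: mistralai/mistral-medium-3.5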
Per-Task Model Selection
Some tasks have clear model requirements. The routing system accommodates this:
Vision tasks always use a vision-capable model. Cascade steps with image inputs are automatically routed to the vision role, regardless of which role the rest of the cascade uses.
Code tasks route to the code role, where Codestral typically outperforms general-purpose models on code generation and review.
Speed-sensitive tasks (real-time event reactions, live chat responses) route to the speed role. Sub-second latency from Groq-hosted models makes latency-sensitive workflows practical.
Reasoning-heavy tasks (complex analysis, multi-step logic, structured output generation) benefit from the secondary or primary role, trading slightly higher latency for better output quality.
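Putting the roles together, one cascade can mix them per step. A sketch (step names are illustrative; only the model field follows the confirmed syntax above):

- name: read_screenshot
  type: agent
  model: vision      # image input, routed to gpt-4o
- name: draft_fix
  type: agent
  model: code        # routed to codestral-latest
- name: post_reply
  type: agent
  model: speed       # routed to groq/llama-3.1-8b-instant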
BYOK and Model Costs
Every provider connection supports Bring Your Own Key: you configure your API keys in your account settings, and the platform routes requests through your keys, not ours.
This means:
- You pay your negotiated provider rates, not a platform markup
- Your usage is tracked in your provider dashboard, not hidden
- Enterprise volume discounts you have with providers are respected
- You choose which providers to connect and which to leave out
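As a rough sketch of what that looks like in practice (the file layout and field names here are assumptions, not documented platform configuration), a BYOK setup would reference your own secrets rather than embed them:

providers:
  openai:
    api_key: ${OPENAI_API_KEY}       # your key, your negotiated rates
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
  groq:
    api_key: ${GROQ_API_KEY}         # connect only the providers you want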
Model cost transparency is built into the dashboard. Every cascade run logs which model was used for which step, making cost attribution granular. You can see that your Librarian agent spent $0.003 on primary model calls last week, not a blended platform fee.
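A per-step cost record might look roughly like this (a hypothetical shape with illustrative numbers; the platform's actual log format may differ):

- cascade: librarian_weekly_digest
  step: assessment
  role: primary
  model: claude-3-5-sonnet-20241022
  tokens_in: 600
  tokens_out: 80
  cost_usd: 0.003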
Comparing Models
The Model Hub includes a built-in comparison tool. Run the same prompt against multiple models side by side, see the responses together, and evaluate which model performs best for your specific use case.
This is not a benchmark. It is your actual prompts, against your actual data, tested against the models you are considering. The results land in your session so you can review them at your pace.
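Expressed as configuration, a side-by-side run might be declared like this (a sketch only; the comparison tool's real invocation is not documented here):

comparison:
  prompt_file: prompts/ticket_triage.txt     # your actual prompt
  models:
    - claude-3-5-sonnet-20241022             # current choice
    - mistralai/mistral-medium-3.5           # mid-priced candidate
    - meta-llama/llama-3.1-70b               # cheapest candidate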
Use the comparison tool when:
- You are selecting a model for a new cascade
- You suspect a recently released model might outperform your current choice
- You want to calibrate whether a cheaper model is good enough for a specific task
348+ Models, Governed
Every model accessible through the platform is subject to the same governance layer as the rest of the system. It does not matter whether a prompt goes to GPT-4o or a Llama-3 model on OpenRouter: the active law set applies. The Nexus Report is generated. The audit trail is complete.
Switching models does not mean switching governance. The constitutional rules travel with the workflow, not with the model.
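To make that concrete, a governed run record could look the same whatever the model (a hypothetical shape; the law set and Nexus Report fields below are named after the concepts in this section, not a documented schema):

- run_id: 2026-03-14-a8f2                  # illustrative
  model: meta-llama/llama-3.1-70b          # via OpenRouter
  law_set: active                          # same constitutional rules
  nexus_report: generated
  audit_trail: complete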
This is the difference between the Model Hub and a raw multi-provider API client. You have access to the full landscape of AI models. You are not exposed to the full landscape of AI risk.