The Model Proliferation Problem
In 2024, there were a handful of meaningful AI models. In 2026, there are hundreds. OpenAI, Anthropic, Google, Meta, Mistral, Cohere, Groq, NVIDIA, AWS Bedrock, Azure, and dozens of open-source providers on OpenRouter: each with multiple models, each with its own strengths, pricing, speed, and context window characteristics.
Managing this landscape is a job in its own right. Knowing which model to use for which task, keeping API keys organised, updating workflows when new models release, and staying within budget across providers: these are real operational costs.
The Sovereignty Protocol's AI Model Hub abstracts all of it.
Unified Provider Support
The platform maintains active integrations with:
- OpenAI: GPT-4o, GPT-4o mini, o3, o4-mini, and the full model line
- Anthropic: Claude Sonnet, Claude Opus, and Haiku tiers
- Google: Gemini 2.0 Flash, Gemini 2.5 Pro, and Ultra
- Meta (via NVIDIA/Groq): Llama 3.x family at multiple parameter scales
- Mistral: Mistral Large, Medium, and Small; Codestral for code tasks
- Groq: ultra-fast inference for latency-sensitive workflows
- NVIDIA NIM: hosted open-weight models at enterprise scale
- OpenRouter: 300+ models through a single API endpoint
- AWS Bedrock: Titan, Llama, Mistral, and Cohere models via AWS infrastructure
- Azure OpenAI: enterprise-grade GPT deployment for compliance environments
Each provider is connected once. After that, any model from any provider is accessible through the same cascade step syntax: a step declares provider: openai and model: gpt-4o-mini, and nothing else about the workflow changes.
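For instance, a step that pins a specific provider and model directly might look like this (a sketch; the name and type fields mirror the step shown later in this section, and the step name itself is illustrative):

- name: summarize_ticket
  type: agent
  provider: openai       # any connected provider
  model: gpt-4o-mini     # any model that provider exposes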
The Model Routing System
The platform does not expose raw provider selection in most workflows. Instead, it uses a model routing layer โ a central configuration that maps abstract role names to concrete model identifiers.
chat:
  primary: "claude-3-5-sonnet-20241022"       # Best quality
  secondary: "mistralai/mistral-medium-3.5"   # Logic / reasoning
  tertiary: "meta-llama/llama-3.1-70b"        # Fast / general
  speed: "groq/llama-3.1-8b-instant"          # Latency critical
  vision: "gpt-4o"                            # Image understanding
  code: "codestral-latest"                    # Code tasks
In cascade steps, you reference the role, not the model:
- name: assessment
  type: agent
  model: secondary   # maps to mistral-medium-3.5 today
When Mistral releases a better secondary-tier model, you update one line in model_routing.yaml. Every cascade that references secondary automatically uses the new model. No individual workflow changes required.
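For example, assuming a hypothetical mistralai/mistral-medium-4 identifier, the upgrade is a single edited line in model_routing.yaml:

chat:
  secondary: "mistralai/mistral-medium-4"   # was: mistralai/mistral-medium-3.5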
Per-Task Model Selection
Some tasks have clear model requirements. The routing system accommodates this:
Vision tasks always use a vision-capable model. Cascade steps with image inputs are automatically routed to the vision role, regardless of which role the rest of the cascade uses.
Code tasks route to the code role, where Codestral typically outperforms general-purpose models on code generation and review.
Speed-sensitive tasks (real-time event reactions, live chat responses) route to the speed role. Sub-second latency from Groq-hosted models makes latency-sensitive workflows practical.
Reasoning-heavy tasks (complex analysis, multi-step logic, structured output generation) benefit from the secondary or primary role, trading slightly higher latency for better output quality.
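Putting the roles together, one cascade can mix them per step. A sketch (step names are illustrative; only the model field follows the confirmed syntax above):

- name: read_screenshot
  type: agent
  model: vision      # image input, routed to gpt-4o
- name: draft_fix
  type: agent
  model: code        # routed to codestral-latest
- name: post_reply
  type: agent
  model: speed       # routed to groq/llama-3.1-8b-instant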
BYOK and Model Costs
Every provider connection supports Bring Your Own Key: you configure your API keys in your account settings, and the platform routes requests through your keys, not ours.
This means:
- You pay your negotiated provider rates, not a platform markup
- Your usage is tracked in your provider dashboard, not hidden
- Enterprise volume discounts you have with providers are respected
- You choose which providers to connect and which to leave out
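As a rough sketch of what that looks like in practice (the file layout and field names here are assumptions, not documented platform configuration), a BYOK setup would reference your own secrets rather than embed them:

providers:
  openai:
    api_key: ${OPENAI_API_KEY}       # your key, your negotiated rates
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
  groq:
    api_key: ${GROQ_API_KEY}         # connect only the providers you want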
Model cost transparency is built into the dashboard. Every cascade run logs which model was used for which step, making cost attribution granular. You can see that your Librarian agent spent $0.003 on primary model calls last week, not a blended platform fee.
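A per-step cost record might look roughly like this (a hypothetical shape with illustrative numbers; the platform's actual log format may differ):

- cascade: librarian_weekly_digest
  step: assessment
  role: primary
  model: claude-3-5-sonnet-20241022
  tokens_in: 600
  tokens_out: 80
  cost_usd: 0.003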
Comparing Models
The Model Hub includes a built-in comparison tool. Run the same prompt against multiple models side by side, see the responses together, and evaluate which model performs best for your specific use case.
This is not a benchmark. It is your actual prompts, against your actual data, tested against the models you are considering. The results land in your session so you can review them at your pace.
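Expressed as configuration, a side-by-side run might be declared like this (a sketch only; the comparison tool's real invocation is not documented here):

comparison:
  prompt_file: prompts/ticket_triage.txt     # your actual prompt
  models:
    - claude-3-5-sonnet-20241022             # current choice
    - mistralai/mistral-medium-3.5           # mid-priced candidate
    - meta-llama/llama-3.1-70b               # cheapest candidate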
Use the comparison tool when:
- You are selecting a model for a new cascade
- You suspect a recently released model might outperform your current choice
- You want to calibrate whether a cheaper model is good enough for a specific task
348+ Models, Governed
Every model accessible through the platform is subject to the same governance layer as the rest of the system. It does not matter whether a prompt goes to GPT-4o or a Llama-3 model on OpenRouter: the active law set applies. The Nexus Report is generated. The audit trail is complete.
Switching models does not mean switching governance. The constitutional rules travel with the workflow, not with the model.
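To make that concrete, a governed run record could look the same whatever the model (a hypothetical shape; the law set and Nexus Report fields below are named after the concepts in this section, not a documented schema):

- run_id: 2026-03-14-a8f2                  # illustrative
  model: meta-llama/llama-3.1-70b          # via OpenRouter
  law_set: active                          # same constitutional rules
  nexus_report: generated
  audit_trail: complete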
This is the difference between the Model Hub and a raw multi-provider API client. You have access to the full landscape of AI models. You are not exposed to the full landscape of AI risk.