# Integrating Multi-LLM Support

How to work with OpenAI, Anthropic, Google Gemini, and Ollama in one platform.
CognitiveX supports multiple LLM providers out of the box. Here's how to use them effectively.
## Supported Providers
- **OpenAI**: GPT-4, GPT-3.5
- **Anthropic**: Claude 3 (Opus, Sonnet, Haiku)
- **Google**: Gemini Pro, Gemini Ultra
- **Ollama**: Local models (Llama, Mistral, etc.)
## Configuration
Set up multiple providers:
```ts
const client = new CognitiveXClient({
  providers: {
    openai: { apiKey: process.env.OPENAI_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_KEY },
    google: { apiKey: process.env.GOOGLE_KEY },
    ollama: { baseUrl: "http://localhost:11434" },
  },
});
```
## Choosing Models
Specify per request:
```ts
await client.chat({
  model: "claude-3-opus",
  messages: [...],
});
```
Or use automatic routing:
```ts
await client.chat({
  model: "auto", // Let CognitiveX choose
  task: "complex_reasoning",
  maxCost: 0.01,
});
```
## Model Selection Criteria
### By Task Type
- **Creative Writing**: GPT-4, Claude Opus
- **Code Generation**: GPT-4, Claude Sonnet
- **Analysis**: Gemini Pro, Claude Opus
- **Quick Queries**: GPT-3.5, Claude Haiku
### By Cost
- **Budget**: Ollama (free), GPT-3.5
- **Standard**: Claude Haiku, Gemini Pro
- **Premium**: GPT-4, Claude Opus
### By Latency
- **Fastest**: Ollama, GPT-3.5
- **Fast**: Claude Haiku, Gemini Pro
- **Standard**: GPT-4, Claude Opus
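One way to apply these criteria in code is a simple lookup from task type to a preferred model plus cheaper fallbacks. The mapping below is a sketch based on the lists above, not a built-in CognitiveX feature, and the exact model identifiers (for example the Sonnet and Haiku strings) depend on your provider configuration:

```ts
// Sketch: map a task type to a preferred model and fallbacks, following the
// task/cost guidance above. Not a CognitiveX API; pass the result to
// client.chat({ model, fallback, ... }).
type TaskType = "creative_writing" | "code_generation" | "analysis" | "quick_query";

const modelsByTask: Record<TaskType, { model: string; fallback: string[] }> = {
  creative_writing: { model: "gpt-4", fallback: ["claude-3-opus"] },
  code_generation: { model: "gpt-4", fallback: ["claude-3-sonnet"] },
  analysis: { model: "gemini-pro", fallback: ["claude-3-opus"] },
  quick_query: { model: "gpt-3.5-turbo", fallback: ["claude-3-haiku"] },
};

function pickModel(task: TaskType) {
  return modelsByTask[task];
}
```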
## Fallback Strategies
Handle provider outages:
```ts
await client.chat({
  model: "gpt-4",
  fallback: ["claude-3-opus", "gemini-pro"],
  retries: 3,
});
```
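If you prefer to handle fallback yourself, for example to log which provider actually served a request, a manual loop looks roughly like this. This is a sketch, not the library's internal retry logic, and it assumes `client.chat` throws when a provider fails:

```ts
// Sketch: manual fallback across providers, trying each model in order.
// Assumes client.chat({ model, messages }) throws on provider failure.
async function chatWithFallback(messages: unknown[], models: string[]) {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await client.chat({ model, messages });
    } catch (err) {
      lastError = err; // try the next model in the list
    }
  }
  throw lastError;
}

// Usage mirrors the built-in fallback example above:
// await chatWithFallback(messages, ["gpt-4", "claude-3-opus", "gemini-pro"]);
```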
## Best Practices
1. **Test Multiple Models**: Each model has different strengths; benchmark on your own prompts.
2. **Monitor Costs**: Track spending by provider (see the sketch below).
3. **Use Auto-Routing**: Let the system optimize model choice for you.
4. **Keep Ollama Warm**: A local fallback is valuable when hosted providers are unavailable.
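For the "Monitor Costs" item, even a minimal in-process tally helps catch surprises early. The field names used here (`provider`, `cost`) are hypothetical; check what your CognitiveX responses actually expose before wiring this up:

```ts
// Sketch: track spending per provider in memory.
// Assumes you can read a provider name and a cost figure from each response;
// those field names are hypothetical, not a documented CognitiveX shape.
const spendByProvider: Record<string, number> = {};

function recordSpend(provider: string, cost: number) {
  spendByProvider[provider] = (spendByProvider[provider] ?? 0) + cost;
}
```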
## Performance Comparison
| Provider | Speed | Cost | Quality |
|----------|-------|------|---------|
| GPT-4 | ⭐⭐⭐ | 💰💰💰 | ⭐⭐⭐⭐⭐ |
| Claude Opus | ⭐⭐⭐⭐ | 💰💰💰 | ⭐⭐⭐⭐⭐ |
| Gemini Pro | ⭐⭐⭐⭐ | 💰💰 | ⭐⭐⭐⭐ |
| Ollama | ⭐⭐⭐⭐⭐ | Free | ⭐⭐⭐ |
See our documentation for detailed model comparisons.