🤖 Self-Hosted AI Platform
iNdex.ai
Your data never leaves your servers.
Your intelligence. Your infrastructure. Your rules.
Llama 3.1
CodeLlama
Mistral
Phi-3
🔒 100% LOCAL — ZERO CLOUD
llama3.1:8b
Supported Models
Neural Network Ecosystem
Swap between frontier open-source models in real time. No API keys. No cloud dependencies.
Technical Design
Clean Architecture
Four-layer design. Blazor frontend, CQRS application layer, pure domain, and Ollama-powered infrastructure.
01
Presentation
Blazor Server + MudBlazor UI components and REST API endpoints
02
Application
CQRS pattern. Commands, queries, and handlers; business logic dispatched via MediatR
03
Domain
Entities, interfaces, pure domain logic. Zero external dependencies
04
Infrastructure
Ollama AI engine, PostgreSQL, file system. Repository implementations
Data Sovereignty
Zero External Calls
Air-Gapped Operation
Runs fully offline. No internet required after installation.
Your Infrastructure
Deploy on-premise, private cloud, or bare metal GPU server.
GDPR & HIPAA Ready
Compliance-ready by design. Data never crosses organizational boundaries.
No Vendor Lock-in
Open standards. Swap models freely. Own your AI stack forever.
Benchmarks
Raw Performance
Measured on consumer-grade NVIDIA GPU. Results vary by hardware configuration.
Streaming API
Real-Time Token Streaming
# POST /api/chat/stream
curl -X POST http://localhost:5000/api/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.1:8b","prompt":"Explain neural networks"}'
↓ streaming response ...
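A client can consume the stream above in a few lines of Python. This is a minimal sketch: it assumes each event arrives as a standard SSE `data: {...}` line whose JSON payload carries a `token` field. Both the framing and the field name are assumptions, so check them against the actual response shape.

```python
# Minimal Python client for the /api/chat/stream endpoint shown above.
# Assumes each SSE event is a "data: {...}" line whose JSON payload
# carries a "token" field -- both are assumptions about the wire format.
import json
import urllib.request

def parse_sse_line(line: str):
    """Return the JSON payload of a 'data:' line, or None for other lines."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):])

def stream_chat(prompt: str, model: str = "llama3.1:8b",
                url: str = "http://localhost:5000/api/chat/stream"):
    """Yield tokens from the chat stream as they arrive."""
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # HTTPResponse is iterable line by line
            event = parse_sse_line(raw.decode("utf-8"))
            if event is not None:
                yield event.get("token", "")

if __name__ == "__main__":
    for token in stream_chat("Explain neural networks"):
        print(token, end="", flush=True)
```

Printing with `flush=True` is what makes tokens appear as they are generated rather than after the full completion.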
Server-Sent Events
See each token as it's generated. No waiting for full completion. Native SSE protocol.
SSE
REST + WebSocket
Both sync and streaming endpoints. Integrate with any client in any language.
REST
Semantic Kernel
Microsoft's AI orchestration. Build agents, chains, and plugins natively in .NET.
SK
Get Running in Minutes
One Command Deploy
Three paths to production. Docker, native .NET, or GPU-accelerated bare metal.
Docker Compose
Spin up the full stack in one command. Postgres, Ollama, and the API fully orchestrated.
$ docker-compose up -d
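The stack that command brings up might look roughly like this. A hypothetical compose file, not the one that ships with the project: service names, image tags, and ports are illustrative assumptions.

```yaml
# Hypothetical docker-compose.yml sketching the orchestrated stack.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # replace with a real secret
    volumes:
      - pgdata:/var/lib/postgresql/data
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama       # model weights persist here
  api:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - postgres
      - ollama
volumes:
  pgdata:
  ollama:
```

Persisting the Ollama volume matters: without it, multi-gigabyte model weights are re-downloaded every time the container is recreated.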
Native .NET
ASP.NET Core Minimal APIs. Fast, modern, cross-platform across Linux, Windows, and macOS.
$ dotnet run --project iNdex.Api
GPU Accelerated
NVIDIA CUDA via Ollama + LLamaSharp. 16GB RAM minimum. GPU optional but recommended.
$ ollama pull llama3.1:8b
Start Building
Own Your AI.
Own Your Future.
Self-hosted. Privacy-first. Enterprise-ready. iNdex.ai puts the power of large language models entirely within your control.