MODEL INFERENCE

Deploy AI Models

One-click deployment for the latest open-source AI models. Run DeepSeek, Llama 4, and more with serverless inference or dedicated GPU infrastructure.

Full Catalog

All Available Models

Browse our complete model catalog with filtering and one-click deployment.

Model	Parameters	Category	Context	Pricing
DeepSeek V3	671B (37B active)	MoE	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens
DeepSeek R1s	671B (37B active)	Reasoning	64K tokens	In: ₹25/1M tokens Out: ₹50/1M tokens
Llama 4 Maverick	17B x 128 Experts	MoE	128K tokens	In: ₹30/1M tokens Out: ₹60/1M tokens
Llama 4 Scout	17B x 16 Experts	MoE	128K tokens	In: ₹15/1M tokens Out: ₹30/1M tokens
GPT OSS 120B	120 Billion	General Purpose	8,192 tokens	In: ₹80/1M tokens Out: ₹160/1M tokens
Hermes 3 Llama 3.1 405B	405 Billion	General Purpose	128K tokens	In: ₹60/1M tokens Out: ₹120/1M tokens
Sarvam-2B	2 Billion	Multilingual	4,096 tokens	In: ₹5/1M tokens Out: ₹10/1M tokens
Dolphin 2.9.2 Mistral 8x22B	8 x 22B MoE	MoE	64K tokens	In: ₹40/1M tokens Out: ₹80/1M tokens
DeepSeek V3 0324	671B (37B active)	General Purpose	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens

Showing 1–9 of 9 models

Deployment Options

Choose how you want to run your AI models.

Serverless API

Pay-per-token pricing with instant scaling

No GPU management
auto-scaling to zero
pay only for usage
sub-second latency.

How It Works

Platform Features

Choose how you want to run your AI models.

One-Click Deploy

Deploy any model in seconds with pre-optimized configurations.

OpenAI-Compatible API

Drop-in replacement for OpenAI API with minimal code changes.

Start Deploying AI Models

Get started with our free tier. No credit card required.

Get Free API Key Back to GPU Overview

MODEL INFERENCE

Deploy AI Models

One-click deployment for the latest open-source AI models. Run DeepSeek, Llama 4, and more with serverless inference or dedicated GPU infrastructure.

Full Catalog

All Available Models

Browse our complete model catalog with filtering and one-click deployment.

Model	Parameters	Category	Context	Pricing
DeepSeek V3	671B (37B active)	MoE	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens
DeepSeek R1s	671B (37B active)	Reasoning	64K tokens	In: ₹25/1M tokens Out: ₹50/1M tokens
Llama 4 Maverick	17B x 128 Experts	MoE	128K tokens	In: ₹30/1M tokens Out: ₹60/1M tokens
Llama 4 Scout	17B x 16 Experts	MoE	128K tokens	In: ₹15/1M tokens Out: ₹30/1M tokens
GPT OSS 120B	120 Billion	General Purpose	8,192 tokens	In: ₹80/1M tokens Out: ₹160/1M tokens
Hermes 3 Llama 3.1 405B	405 Billion	General Purpose	128K tokens	In: ₹60/1M tokens Out: ₹120/1M tokens
Sarvam-2B	2 Billion	Multilingual	4,096 tokens	In: ₹5/1M tokens Out: ₹10/1M tokens
Dolphin 2.9.2 Mistral 8x22B	8 x 22B MoE	MoE	64K tokens	In: ₹40/1M tokens Out: ₹80/1M tokens
DeepSeek V3 0324	671B (37B active)	General Purpose	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens

Showing 1–9 of 9 models

Deployment Options

Choose how you want to run your AI models.

Serverless API

Pay-per-token pricing with instant scaling

No GPU management
auto-scaling to zero
pay only for usage
sub-second latency.

How It Works

Platform Features

Choose how you want to run your AI models.

One-Click Deploy

Deploy any model in seconds with pre-optimized configurations.

OpenAI-Compatible API

Drop-in replacement for OpenAI API with minimal code changes.

Start Deploying AI Models

Get started with our free tier. No credit card required.

Get Free API Key Back to GPU Overview

DeepSeek R1

Advanced reasoning model with chain-of-thought capabilities. Excels at mathematical reasoning, logical puzzles, and complex problem solving.

Llama 4 Maverick

Meta's latest flagship MoE model with 128 specialized experts and 128K context length. Superior instruction following with efficient inference.

GPT OSS 120B

Large-scale open-source GPT model with 120 billion parameters. Enterprise-grade performance for chatbots, content generation, and code assistance under Apache 2.0 license.

DeepSeek V3 0324

March 2024 release of DeepSeek V3 with 671B parameters in MoE architecture. Enhanced reasoning, coding, and multilingual capabilities with 64K context length.

Llama 4 Scout

Meta's efficient MoE model optimized for speed and cost. 16 experts deliver competitive quality at half the cost of Maverick, with the same 128K context length for versatile applications.

DeepSeek V3

State-of-the-art 671B Mixture of Experts model delivering GPT-4 class performance at a fraction of the cost. Excellent for general purpose AI tasks with 64K context length.

1 2 Next

Deploy AI Models

All Available Models

Deployment Options

Serverless API

Platform Features

One-Click Deploy

OpenAI-Compatible API

Start Deploying AI Models

DeepSeek R1

Llama 4 Maverick

GPT OSS 120B

DeepSeek V3 0324

Llama 4 Scout

DeepSeek V3

Deploy AI Models

All Available Models

Deployment Options

Serverless API

Platform Features

One-Click Deploy

OpenAI-Compatible API

Start Deploying AI Models

DeepSeek R1

Llama 4 Maverick

GPT OSS 120B

DeepSeek V3 0324

Llama 4 Scout

DeepSeek V3

Dedicated Instance

Auto-Scaling

Fine-Tuning Ready

Private Deployment

Usage Analytics