
One API.
Text, image, and video.

QuickCasa LLM provides a unified, OpenAI-compatible API for text generation, image creation, and video synthesis — all powered by our own infrastructure.

curl https://llm.quickcasa.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qc-text-1",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
Models

Six models, one endpoint.

From a free sandbox model to cinema-quality video — pick the model that fits your use case.

Free

qc-text-hello-world

Free sandbox model for prototyping and testing. Build your integration, validate your structure, and experiment — no cost, no commitment.

Price: Free
Streaming: SSE
Best For: Testing
Text

qc-text-1

Fast, efficient text generation optimised for conversational AI, summarisation, and everyday tasks. Low latency, high throughput.

Latency: ~200 ms TTFT (time to first token)
Streaming: SSE
Best For: Speed
Text

qc-text-2

Our most capable text model. Excels at complex reasoning, code generation, structured outputs, and nuanced language tasks.

Latency: ~400 ms TTFT
Streaming: SSE
Best For: Quality
Image

qc-image-turbo

Blazing-fast image generation. Ideal for real-time applications, previews, and high-volume batch processing.

Speed: ~3 s
Max Size: 1024x1024
Format: PNG
Image

qc-image-quality

Studio-grade image generation. Delivers stunning detail, accurate text rendering, and photorealistic output.

Speed: ~15 s
Max Size: 1328x1328
Format: PNG
Video

qc-video

Text-to-video generation powered by our 14B-parameter video model. Cinematic motion, coherent scenes, smooth playback.

Speed: ~90 s
Resolution: 832x480
Duration: Up to 6 s
Why QuickCasa LLM

Built different.

Drop-in compatible with the OpenAI SDK. No rewrites, no vendor lock-in.

🔌

OpenAI-Compatible

Works with any OpenAI SDK or client library. Same endpoints, same request format, same streaming protocol.
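
As a rough illustration of that request format, the curl call from the top of this page can be reproduced with Python's standard library alone. The helper name `build_chat_request` is our own and the API key is a placeholder; treat this as a sketch under those assumptions, not official client code.

```python
import json
import urllib.request

BASE_URL = "https://llm.quickcasa.ai/v1"  # base URL taken from the curl example


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "sk-..." is a placeholder key, exactly as in the curl example.
req = build_chat_request("sk-...", "qc-text-1", "Hello!")
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```

Building the request separately from sending it keeps the example inspectable without a network call or a real key.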

Low Latency

Hosted on dedicated GPU infrastructure in North America. Optimised for throughput and availability.

🔒

Secure by Default

API key authentication, rate limiting, and TLS encryption on every request. Your data never leaves our infrastructure.
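
This page does not say how a rate limit is signalled to clients. Assuming it surfaces as an HTTP 429 response, as it does on many APIs, a client might space out its retries with exponential backoff and jitter. The function name and default values below are illustrative, not documented behaviour.

```python
import random
from typing import List, Optional


def backoff_delays(max_retries: int = 5, base: float = 0.5, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Exponential backoff with full jitter.

    Delay i is drawn uniformly from [0, min(cap, base * 2**i)] seconds.
    """
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** i)) for i in range(max_retries)]


# Assumption: sleep for each delay in turn after a 429, then retry the request.
delays = backoff_delays(max_retries=4, base=1.0, cap=5.0, rng=random.Random(42))
```

The jitter spreads retries from many clients over time so they do not all hit the API again at the same instant.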

🎥

Multimodal

Text, images, and video from a single API. No juggling multiple providers or credential sets.

📡

Streaming

Full Server-Sent Events support for chat completions. Token-by-token streaming, just like you'd expect.
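
A minimal sketch of consuming such a stream, assuming the chunks follow the OpenAI streaming format (`data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`). The sample lines are made up for illustration and are not captured from the service.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI protocol
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical stream contents for demonstration.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello!"
```

In real use, `lines` would be the decoded lines of the HTTP response body from a request sent with `"stream": true`.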

🚀

Transparent Pricing

Clear, predictable pricing. Pay for what you use — no surprises, no hidden fees.

Media Delivery

Every asset, globally delivered.

Generated images and videos are served from our global CDN — the same edge network that powers some of the largest platforms on the internet.

🌐

200+ Edge Locations

Content is cached and served from over 200 edge points of presence worldwide, ensuring sub-100ms delivery to end users regardless of geography.

🔗

Permanent URLs

Set hostMedia: true and your generated assets get a permanent, shareable URL hosted on our CDN indefinitely. No expiry, no extra storage to manage.
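
The page names the `hostMedia` flag but not where it goes in a request. The sketch below assumes it is a top-level field in the JSON body of an image request, and that `prompt` and `size` mirror OpenAI's image API. All three placements are assumptions to check against the documentation.

```python
import json


def build_image_payload(prompt: str, host_media: bool = False) -> str:
    """Return a JSON body for an image request (field layout is assumed)."""
    payload = {
        "model": "qc-image-turbo",
        "prompt": prompt,     # assumed field name, as in OpenAI's image API
        "size": "1024x1024",  # assumed field; max size listed for qc-image-turbo
    }
    if host_media:
        payload["hostMedia"] = True  # assumed to be a top-level request field
    return json.dumps(payload)


body = build_image_payload("a lighthouse at dusk", host_media=True)
```

Leaving the flag out by default matches the idea that temporary assets are the lighter-weight option for previews.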

📊

Built-In Analytics

Every media asset is tracked automatically. View counts, access patterns, and usage metrics — all available without any additional instrumentation.

🛡️

Enterprise-Grade Infrastructure

Backed by multi-region redundancy, automatic failover, and a 99.99% uptime SLA. Your assets are replicated across multiple availability zones with built-in DDoS protection.

🔄

Temp or Permanent — Your Call

Temporary assets are perfect for previews and ephemeral workflows. Permanent hosting is for production content you want available forever. Both are served from the same fast edge network.

🎯

Clean URLs

No ugly signed URLs or expiring tokens. Every asset gets a clean, human-readable URL like llm.quickcasa.ai/media/{id} that you can embed anywhere.
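
Given that URL pattern, turning an asset ID into an embeddable link takes one line of string handling. The `https` scheme and the ID `abc123` are assumptions for illustration; the page shows only the host and path.

```python
from urllib.parse import quote

# Scheme is assumed; the page gives only llm.quickcasa.ai/media/{id}.
MEDIA_BASE = "https://llm.quickcasa.ai/media"


def media_url(asset_id: str) -> str:
    """Build a media URL from an asset ID, percent-encoding unsafe characters."""
    if not asset_id:
        raise ValueError("asset_id must be non-empty")
    return f"{MEDIA_BASE}/{quote(asset_id, safe='')}"


print(media_url("abc123"))  # prints "https://llm.quickcasa.ai/media/abc123"
```

Percent-encoding the ID is a defensive choice in case IDs ever contain characters that are not URL-safe.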

Ready to build?

Already a QuickCasa customer? Add LLM access to your plan in minutes.
