QuickCasa LLM provides a unified, OpenAI-compatible API for text generation, image creation, and video synthesis — all powered by our own infrastructure.
curl https://llm.quickcasa.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qc-text-1",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
From a free sandbox model to cinema-quality video — pick the model that fits your use case.
Free sandbox model for prototyping and testing. Build your integration, validate your structure, and experiment — no cost, no commitment.
Fast, efficient text generation optimised for conversational AI, summarisation, and everyday tasks. Low latency, high throughput.
Our most capable text model. Excels at complex reasoning, code generation, structured outputs, and nuanced language tasks.
Blazing-fast image generation. Ideal for real-time applications, previews, and high-volume batch processing.
Studio-grade image generation. Delivers stunning detail, accurate text rendering, and photorealistic output.
Text-to-video generation powered by our 14B-parameter video model. Cinematic motion, coherent scenes, smooth playback.
Drop-in compatible with the OpenAI SDK. No rewrites, no vendor lock-in.
Works with any OpenAI SDK or client library. Same endpoints, same request format, same streaming protocol.
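As a rough illustration of "same endpoints, same request format", here is a minimal Python sketch that builds the same chat completion request as the curl example, using only the standard library. The base URL and the `qc-text-1` model name come from that curl example; the helper name `build_chat_request` is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://llm.quickcasa.ai/v1"


def build_chat_request(api_key: str, prompt: str,
                       model: str = "qc-text-1") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("sk-...", "Hello!")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a real API key):
    #   with urllib.request.urlopen(req) as resp:
    #       print(json.load(resp))
```

If you already use the official `openai` Python package, pointing its client at this base URL (`OpenAI(base_url=..., api_key=...)`) should be the only change needed, assuming the compatibility described above holds for the endpoints you call.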
Hosted on dedicated GPU infrastructure in North America. Optimised for throughput and availability.
API key authentication, rate limiting, and TLS encryption on every request. Your data never leaves our infrastructure.
Text, images, and video from a single API. No juggling multiple providers or credential sets.
Full Server-Sent Events support for chat completions. Token-by-token streaming, just like you'd expect.
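To make the streaming format concrete, here is a small sketch of how a client might turn a Server-Sent Events stream into text. It assumes the chunks follow the OpenAI streaming shape (`data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`), which this page implies but does not spell out. The function name `iter_stream_text` is hypothetical, and the sample lines are hand-written rather than captured from the API.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel, as in the OpenAI protocol
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    # Hand-written sample chunks for illustration only.
    sample = [
        'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        'data: {"choices": [{"delta": {"content": "lo!"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_stream_text(sample)))
```

In a real client, `lines` would be the decoded lines of the HTTP response body for a request sent with `"stream": true`.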
Clear, predictable pricing. Pay for what you use — no surprises, no hidden fees.
Generated images and videos are served from our global CDN — the same edge network that powers some of the largest platforms on the internet.
Content is cached and served from over 200 edge points of presence worldwide, ensuring sub-100ms delivery to end users regardless of geography.
Set hostMedia: true in your request and each generated asset gets a permanent, shareable URL on our CDN. No expiry, no extra storage to manage.
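A minimal sketch of opting in to permanent hosting might look like the following. Only the `hostMedia` flag comes from this page; that it sits at the top level of the JSON body, the `prompt` field, and the placeholder model ID are all assumptions for illustration, so check the API reference for the real request shape.

```python
import json


def build_media_payload(prompt: str, model: str, host_media: bool = False) -> str:
    """Return a JSON request body, adding hostMedia only when requested.

    Assumption: hostMedia is a top-level boolean field in the request body.
    """
    payload = {"model": model, "prompt": prompt}
    if host_media:
        payload["hostMedia"] = True  # opt in to a permanent CDN URL
    return json.dumps(payload)


if __name__ == "__main__":
    # "your-image-model-id" is a placeholder, not a real model name.
    print(build_media_payload("A lighthouse at dusk", "your-image-model-id",
                              host_media=True))
```

Leaving `host_media` at its default omits the flag entirely, which in this sketch stands in for the temporary-asset behaviour.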
Every media asset is tracked automatically. View counts, access patterns, and usage metrics — all available without any additional instrumentation.
Backed by multi-region redundancy, automatic failover, and a 99.99% uptime SLA. Your assets are replicated across multiple availability zones with built-in DDoS protection.
Temporary assets are perfect for previews and ephemeral workflows. Permanent hosting is for production content you want available forever. Both are served from the same fast edge network.
No ugly signed URLs or expiring tokens. Every asset gets a clean, human-readable URL like llm.quickcasa.ai/media/{id} that you can embed anywhere.
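Because the URL pattern is fixed, embedding an asset comes down to string formatting. The sketch below builds a media URL from an asset ID and wraps it in an `<img>` tag. The `https://` scheme and the helper names `media_url` and `img_tag` are assumptions; only the `llm.quickcasa.ai/media/{id}` pattern comes from this page.

```python
import html
from urllib.parse import quote

# Pattern from this page; the https scheme is an assumption.
MEDIA_BASE = "https://llm.quickcasa.ai/media"


def media_url(asset_id: str) -> str:
    """Return the public URL for an asset ID, percent-encoding unsafe characters."""
    if not asset_id:
        raise ValueError("asset_id must be non-empty")
    return f"{MEDIA_BASE}/{quote(asset_id, safe='')}"


def img_tag(asset_id: str, alt: str = "") -> str:
    """Return an HTML <img> tag for the asset, with attribute values escaped."""
    src = html.escape(media_url(asset_id), quote=True)
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'


if __name__ == "__main__":
    # "abc123" is a made-up asset ID for illustration.
    print(img_tag("abc123", "A generated image"))
```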
Already a QuickCasa customer? Add LLM access to your plan in minutes.