“Zilliz Cloud has been an important part of Exa’s journey to build and scale entity search, giving us the retrieval performance and operational simplicity we need to scale quickly and confidently.”
Zilliz Cloud named a leader in the Forrester Wave™: Vector Database Providers, Q3 2024
The Vector Lakebase for AI
Beyond vector databases — real-time serving, iterative discovery, and batch analytics on a single source of truth, each at the right cost, at hundred-billion data scale.
Built by the creators of Milvus.

Introducing Zilliz CLI and Agent Skills for Zilliz Cloud
Starting today, you can manage your entire Zilliz Cloud deployment without leaving your terminal or AI coding agent. Provision clusters, run searches, configure backups, and manage access control — through CLI commands or natural language. No console. No context switching.
Built for Reliability
Built on a deep understanding of large-scale vector database failure modes. Production-tested across 10,000+ enterprises over 8 years.


Built for Scale
Engineered to handle 100B+ entities and 10K+ QPS with consistent latency and predictable performance.


Built for Lower Cost
All data and indexes on S3, with hot cache and on-demand compute to cut costs by 90%.
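As a rough illustration of the hot-cache idea, here is a minimal sketch of a read-through cache sitting in front of object storage. The page does not describe the actual caching policy, so the LRU eviction rule, the `HotCache` class, and the `fake_object_store` function are all assumptions made for illustration only.

```python
from collections import OrderedDict


class HotCache:
    """Read-through cache with LRU eviction (an assumed policy, for illustration)."""

    def __init__(self, fetch, capacity):
        self._fetch = fetch          # callable that loads a key from cold storage
        self._capacity = capacity    # maximum number of entries kept hot
        self._items = OrderedDict()  # ordered from least to most recently used
        self.misses = 0              # number of reads that went to cold storage

    def get(self, key):
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: load from cold storage and keep the result hot.
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return value


def fake_object_store(key):
    """Hypothetical stand-in for a slow object-storage read."""
    return f"segment-bytes-for-{key}"


cache = HotCache(fake_object_store, capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)

hot_keys = list(cache._items)
print(cache.misses, hot_keys)
```

Only reads that miss the cache touch cold storage, which is the general reason a small hot tier can serve frequent queries while the bulk of the data stays on cheaper storage.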


Full-Spectrum Search
From vector and text to JSON and geospatial—combined with hybrid retrieval, filtering, and reranking for expressive multi-modal queries.
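To make the idea of hybrid retrieval concrete, the sketch below fuses a vector-search ranking and a text-search ranking with Reciprocal Rank Fusion (RRF), a common general-purpose fusion technique. This is not presented as how Zilliz Cloud combines results; the function name `rrf_fuse`, the document IDs, and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each ID scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Higher total score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from two retrievers, best hit first.
vector_hits = ["doc3", "doc1", "doc7"]
text_hits = ["doc1", "doc9", "doc3"]

fused = rrf_fuse([vector_hits, text_hits])
print(fused)
```

Documents that appear in both lists (`doc1`, `doc3`) rise above documents found by only one retriever, which is the behavior hybrid retrieval is usually after.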


Lake-Native Storage
Unified storage for serving and analytics, built on Vortex, an open, next-generation format. Up to 10× faster and cheaper random reads than Lance, with per-column format flexibility.


“With Zilliz Cloud, we have achieved a true consciousness of data, bringing the data together in the way that an individual doing their job needs to see it.”
“Zilliz Cloud has helped us create a strong foundation behind the scenes as we continue to grow and serve hundreds of thousands of clinicians.”
“Zilliz gave us real-time retrieval for our multilingual RAG system at scale with tight latency targets. It freed up engineering cycles and let us focus on improving reasoning on the model side, not managing infrastructure.”
Real-time Serving Highlights
Tiered Architecture
Optimize for diverse workloads with flexible tiers—delivering ultra-high performance, balanced efficiency, and cost-effective scaling across massive datasets.
- Performance-Optimized Solution
- Capacity-Optimized Solution
- Tiered-Storage Solution


Massive Multi-Tenancy for AI Apps
Unlimited namespaces with hybrid vector, full-text, and JSON search—plus hot-cold data serving.


Global Cluster
Multi-region deployment with replication and failover—ensuring low-latency, high-availability access worldwide, supporting rapid global expansion of AI applications.


Performance
Setup: 768-dimensional vectors, top-k = 10, cluster-size = 1 CU
On-demand Compute Highlights
On-demand Search
Pay per query, not per provisioned compute—enabling dramatically lower cost than serverless at scale.


Seamless Backfill & Schema Iteration
Backfill and evolve schemas and data models online without impacting serving. Built for continuous AI iteration.


Bring Indexes to Your Lake
An optional access mode to operate directly on your S3 data (Iceberg, Lance, Vortex, Parquet). Keep data in your bucket while indexes are built and served on Zilliz—no copies, no ETL.


Performance and Cost
Setup: 1 billion 768-dimensional vectors, top-k = 100k, cluster-size = 64 CU
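For a sense of the data volume in this setup, the arithmetic below estimates the raw size of the vectors alone. It assumes 4-byte float32 values, which the setup line does not state, and it ignores index structures, metadata, and compression. The helper name `raw_vector_bytes` is hypothetical.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of dense vectors, assuming fixed-width values (float32 by default)."""
    return num_vectors * dim * bytes_per_value


num_vectors = 1_000_000_000  # 1 billion vectors, from the setup line
dim = 768                    # dimensions per vector, from the setup line
compute_units = 64           # cluster size in CU, from the setup line

total = raw_vector_bytes(num_vectors, dim)
per_cu = total // compute_units

print(f"raw vectors: {total / 1e12:.3f} TB")
print(f"per CU if spread evenly: {per_cu / 1e9:.0f} GB")
```

Under these assumptions the raw vectors come to about 3 TB, or roughly 48 GB per CU if the data were spread evenly across the cluster.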
The CLI for Vector Lakebase
Your Vector Lakebase. Your Terminal. Full Control.
The official CLI for management, search, and analytics.


