Frequently Asked Questions
Q: What happens when the cache reaches its memory limit?
VCAL Server applies LRU eviction automatically to keep memory usage within configured limits.
You can control cache capacity using:
VCAL_CAP_MAX_BYTES=268435456 # 256 MB memory cap
VCAL_CAP_MAX_VECTORS=1000000 # optional vector count cap
Eviction removes the least recently used entries first.
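The eviction behaviour described above can be illustrated with a small Python sketch. This is not VCAL's implementation (VCAL Core is a Rust library). The `LruCache` class, its method names, and the byte accounting are assumptions made for illustration; only the two caps mirror the settings shown above.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache bounded by total payload bytes and by entry count."""

    def __init__(self, max_bytes, max_entries=None):
        self.max_bytes = max_bytes        # plays the role of VCAL_CAP_MAX_BYTES
        self.max_entries = max_entries    # plays the role of VCAL_CAP_MAX_VECTORS
        self.used_bytes = 0
        self._items = OrderedDict()       # least recently used entry comes first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)      # a hit makes the entry most recent
        return self._items[key]

    def put(self, key, value: bytes):
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        self._items[key] = value
        self.used_bytes += len(value)
        self._evict()

    def _over_limit(self):
        too_many = self.max_entries is not None and len(self._items) > self.max_entries
        return self.used_bytes > self.max_bytes or too_many

    def _evict(self):
        # Drop least recently used entries until both caps are satisfied.
        while self._items and self._over_limit():
            _, old = self._items.popitem(last=False)
            self.used_bytes -= len(old)


cache = LruCache(max_bytes=10, max_entries=3)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")            # "a" is now more recent than "b"
cache.put("c", b"1234")   # 12 bytes exceeds the 10-byte cap, so "b" is evicted
```

Reading an entry refreshes its position, so frequently requested answers survive while rarely used ones are dropped first.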
Q: Can I pre-load known Q&A pairs?
Yes. You can pre-seed data in two ways:
- Insert entries via POST /v1/upsert
- Load a snapshot before serving traffic
This is useful for:
- Domain-specific FAQs
- Product documentation
- Knowledge bases for RAG systems
Q: Can snapshots be shared between containers or nodes?
Yes, provided that:
- The same data directory is shared (e.g., volume mount)
- The servers use identical index parameters, including:
  - VCAL_DIMS
  - VCAL_M
  - VCAL_EF_SEARCH
Snapshot compatibility is guaranteed within the same major VCAL Server version.
Q: Is authentication mandatory?
No. Authentication is optional.
By default:
- App endpoints (/v1/*) are open
- Admin endpoints (/admin/*) require keys if configured
To enable authentication, configure key files:
VCAL_KEYS_APP_FILE=/path/to/app.keys
VCAL_KEYS_ADMIN_FILE=/path/to/admin.keys
For production deployments, authentication is strongly recommended.
Q: How is data persisted?
VCAL Server persists data via autosave snapshots written atomically to disk.
Relevant settings:
VCAL_AUTOSAVE_SECS=3600
VCAL_AUTOSAVE_ATOMIC=1
Each snapshot write includes:
- vcal.index (HNSW index)
- answers.json (payload store)
Writes use a temporary file + rename strategy to prevent corruption.
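The temporary file + rename strategy can be sketched in Python as follows. This shows the general technique, not VCAL's own code. The `write_atomically` helper is a hypothetical name, and the `fsync` call is an assumption about what a careful implementation would do.

```python
import os
import tempfile


def write_atomically(path, data: bytes):
    """Write data to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())     # force the bytes to disk before renaming
        os.replace(tmp_path, path)     # atomic swap on POSIX filesystems
    except BaseException:
        # A failed write leaves the previous snapshot untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as data_dir:
    snapshot = os.path.join(data_dir, "answers.json")
    write_atomically(snapshot, b'{"version": 1}')
    write_atomically(snapshot, b'{"version": 2}')
    with open(snapshot, "rb") as f:
        final_contents = f.read()
    leftover_files = sorted(os.listdir(data_dir))
```

Because the rename is the last step, a crash at any earlier point leaves at most a stray temporary file next to the previous, still valid snapshot.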
Q: How do I back up or migrate VCAL data?
Stop the server and copy the data directory, typically:
/var/lib/vcal/
Ensure the destination server uses:
- the same VCAL_DIMS
- a compatible VCAL Server version
Q: How do I verify snapshot consistency?
Snapshots are written atomically and validated at load time.
If a snapshot is incomplete or corrupted:
- VCAL logs a clear error
- The server refuses to load invalid data
No manual consistency scripts are required.
Q: What happens if my license expires?
If a license expires or becomes invalid:
- The server refuses to start after a restart
- Running instances will:
  - log license expiration warnings
  - reject protected operations
Data on disk remains intact; installing a valid license restores normal operation.
Q: Does VCAL ever send data externally?
No.
VCAL Server is fully on-prem / VPC-only:
- No outbound telemetry
- No external API calls
- No license callbacks
- No hidden network traffic
All embeddings, answers, and metrics remain inside your infrastructure.
Q: How does VCAL handle concurrent requests safely?
VCAL Server uses:
- Lock-free reads for query paths
- Guarded writes for index mutations
- Atomic snapshot writes
This ensures safe operation under high concurrency without index corruption.
Q: Can I run VCAL behind an API gateway or load balancer?
Yes.
VCAL Server is stateless except for:
- in-memory cache
- local snapshot storage
For multi-replica setups:
- Use shared volumes for snapshots, or
- Accept per-node cache warm-up
Q: How large can the index grow?
Memory usage scales linearly with vector count.
Approximate guideline:
- ~8 GB RAM per 1 million vectors (768 dimensions)
Actual usage depends on:
- VCAL_M
- payload size
- eviction settings
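For rough capacity planning, the guideline can be extrapolated linearly, as in this Python sketch. The helper names are hypothetical, and the figure applies only to 768-dimensional vectors. The source gives no scaling rule for other dimensions, VCAL_M values, or payload sizes, so treat the output as a starting estimate to confirm by measurement.

```python
GUIDELINE_GB_PER_MILLION = 8.0   # from the guideline above, 768-dimensional vectors


def estimate_ram_gb(vector_count):
    """Linear extrapolation of the ~8 GB per million vectors guideline."""
    return GUIDELINE_GB_PER_MILLION * vector_count / 1_000_000


def max_vectors_for_budget(ram_gb):
    """Inverse: roughly how many 768-dimensional vectors fit in a RAM budget."""
    return int(ram_gb / GUIDELINE_GB_PER_MILLION * 1_000_000)


quarter_million_gb = estimate_ram_gb(250_000)     # 2.0
vectors_in_32_gb = max_vectors_for_budget(32)     # 4000000
```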
Q: How do I monitor performance?
VCAL Server exposes Prometheus metrics at:
/metrics
Common metrics include:
- vcal_hits_total
- vcal_misses_total
- vcal_tokens_saved_total
- vcal_index_size_bytes
- latency histograms (vcal_latency_seconds_bucket)
A ready-to-import Grafana dashboard is provided with the release.
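If you want a quick hit ratio without Grafana, the counters can be read directly from the Prometheus text output. The Python sketch below assumes vcal_hits_total and vcal_misses_total are exposed as plain, un-labelled counters, which this FAQ does not confirm. The sample text and the helper names are made up for illustration.

```python
def parse_counters(metrics_text):
    """Parse un-labelled samples from Prometheus text exposition format."""
    values = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks, HELP and TYPE lines
            continue
        name, _, rest = line.partition(" ")
        if "{" in name or not rest:
            continue                           # labelled series are ignored here
        values[name] = float(rest.split()[0])  # drop an optional timestamp
    return values


def hit_ratio(values):
    hits = values.get("vcal_hits_total", 0.0)
    misses = values.get("vcal_misses_total", 0.0)
    total = hits + misses
    return hits / total if total else 0.0


# Made-up sample; real output comes from GET /metrics.
sample = """\
# TYPE vcal_hits_total counter
vcal_hits_total 800
# TYPE vcal_misses_total counter
vcal_misses_total 200
vcal_latency_seconds_bucket{le="0.005"} 950
"""

ratio = hit_ratio(parse_counters(sample))
```

Counters are cumulative since process start, so a ratio over a recent window requires the difference between two scrapes.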
Q: What are typical cache hit ratios?
After warm-up, most production workloads achieve:
- 70–90% cache hit ratio
Results depend on:
- similarity threshold
- TTL configuration
- query repetition patterns
Q: What happens if the server crashes during a write?
If VCAL_AUTOSAVE_ATOMIC=1 is set:
- Snapshot data is first written to a temporary file
- The file is renamed into place only after the write completes successfully
A crash mid-write therefore leaves the last valid snapshot in place.
Q: Can I use different embedding models?
Yes.
VCAL is model-agnostic. You may use any embedding model as long as:
VCAL_DIMS=<embedding dimension>
matches the model output size.
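A client-side guard can catch a dimension mismatch before vectors reach the server. In this Python sketch, `check_embedding` is a hypothetical helper. Reading VCAL_DIMS from the client's environment and falling back to 768 are assumptions for the example, not documented behaviour.

```python
import os


def check_embedding(embedding, expected_dims):
    """Raise if an embedding does not match the configured index dimension."""
    if len(embedding) != expected_dims:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, "
            f"but VCAL_DIMS is {expected_dims}"
        )
    return embedding


# 768 is only a fallback for this example, not a documented default.
expected = int(os.environ.get("VCAL_DIMS", "768"))
check_embedding([0.0] * expected, expected)   # passes silently
```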
Q: Where can I find the OpenAPI specification?
- Swagger UI: /api-docs
- OpenAPI file: openapi.yml (downloadable)
Availability may vary by license tier.
Q: What’s the difference between VCAL Core and VCAL Server?
- VCAL Core: Open-source Rust library (Apache-2.0) providing the HNSW index and eviction logic.
- VCAL Server: Production service adding:
  - HTTP API
  - persistence
  - metrics
  - authentication
  - licensing
Q: How do I contact support?
- Trial / Growth — send an email to support@vcal-project.com
- Enterprise — dedicated support channel and SLA