Frequently Asked Questions

Q: What happens when the cache reaches its memory limit?

VCAL Server applies LRU eviction automatically to keep memory usage within configured limits.

You can control cache capacity using:

VCAL_CAP_MAX_BYTES=268435456   # 256 MB memory cap
VCAL_CAP_MAX_VECTORS=1000000  # optional vector count cap

Eviction removes the least recently used entries first.
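
The eviction behaviour described above can be pictured with a toy model. This is a minimal sketch of LRU eviction under a byte cap and an optional entry-count cap, not VCAL's actual implementation; the class and method names are hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache bounded by total payload bytes and, optionally, entry count."""

    def __init__(self, max_bytes, max_entries=None):
        self.max_bytes = max_bytes      # plays the role of VCAL_CAP_MAX_BYTES
        self.max_entries = max_entries  # plays the role of VCAL_CAP_MAX_VECTORS
        self.used_bytes = 0
        self._entries = OrderedDict()   # least recently used entry comes first

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value: bytes):
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = value
        self.used_bytes += len(value)
        self._evict()

    def _evict(self):
        # Drop least recently used entries until both caps are satisfied.
        while self._entries and (
            self.used_bytes > self.max_bytes
            or (self.max_entries is not None
                and len(self._entries) > self.max_entries)
        ):
            _, old = self._entries.popitem(last=False)
            self.used_bytes -= len(old)

    def keys(self):
        return list(self._entries)
```

Reading an entry with get() refreshes it, so a frequently queried entry survives while an untouched one is evicted first.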

Q: Can I pre-load known Q&A pairs?

Yes. You can pre-seed data in two ways:

Insert entries via POST /v1/upsert
Load a snapshot before serving traffic

This is useful for:

Domain-specific FAQs
Product documentation
Knowledge bases for RAG systems
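
Pre-seeding through the API could look roughly like the sketch below. Only the POST /v1/upsert path comes from this FAQ. The base URL and the request body fields (id, vector, answer) are assumptions made for illustration, so check the OpenAPI specification for the real schema before using it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: adjust to your deployment


def build_upsert_request(entry: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/upsert request for one entry."""
    return urllib.request.Request(
        url=f"{base_url}/v1/upsert",
        data=json.dumps(entry).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def preseed(entries, base_url: str = BASE_URL) -> None:
    """Send each entry in turn; urlopen raises HTTPError on an error response."""
    for entry in entries:
        with urllib.request.urlopen(build_upsert_request(entry, base_url)) as resp:
            resp.read()


# Hypothetical body shape, used only to make the sketch concrete.
example_entry = {"id": "faq-1", "vector": [0.1, 0.2, 0.3], "answer": "Use the reset link."}
```

Calling preseed([example_entry]) before routing traffic to the server would insert the entry, assuming the body shape matches the real API.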

Q: Can snapshots be shared between containers or nodes?

Yes, provided that:

The same data directory is shared (e.g., volume mount)
The servers use identical index parameters, including:
- VCAL_DIMS
- VCAL_M
- VCAL_EF_SEARCH

Snapshot compatibility is guaranteed within the same major VCAL Server version.
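
Before pointing two nodes at the same snapshot, it can help to compare their settings. The sketch below is a hypothetical helper (not part of VCAL) that reports which of the index parameters listed above differ between two environments.

```python
INDEX_PARAMS = ("VCAL_DIMS", "VCAL_M", "VCAL_EF_SEARCH")


def index_param_mismatches(env_a: dict, env_b: dict) -> dict:
    """Return {name: (value_a, value_b)} for index parameters that differ."""
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in INDEX_PARAMS
        if env_a.get(name) != env_b.get(name)
    }


node_a = {"VCAL_DIMS": "768", "VCAL_M": "16", "VCAL_EF_SEARCH": "64"}
node_b = {"VCAL_DIMS": "768", "VCAL_M": "32", "VCAL_EF_SEARCH": "64"}
mismatches = index_param_mismatches(node_a, node_b)  # only VCAL_M differs here
```

An empty result means the two nodes agree on all three parameters. The values shown are illustrative, not recommended settings.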

Q: Is authentication mandatory?

No. Authentication is optional.

By default:

App endpoints (/v1/*) are open
Admin endpoints (/admin/*) require keys if configured

To enable authentication, configure key files:

VCAL_KEYS_APP_FILE=/path/to/app.keys
VCAL_KEYS_ADMIN_FILE=/path/to/admin.keys

For production deployments, authentication is strongly recommended.
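
One way to prepare a key file is sketched below. The file format is an assumption (one random key per line); this FAQ does not specify it, so confirm the expected format in the server documentation. The helper name is hypothetical.

```python
import os
import secrets


def write_key_file(path: str, n_keys: int = 1) -> list:
    """Write n random keys to path, one per line (assumed format)."""
    keys = [secrets.token_urlsafe(32) for _ in range(n_keys)]
    # O_EXCL refuses to overwrite an existing key file;
    # 0o600 requests owner-only read/write permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(keys) + "\n")
    return keys
```

The resulting path would then be passed through VCAL_KEYS_APP_FILE or VCAL_KEYS_ADMIN_FILE.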

Q: How is data persisted?

VCAL Server persists data via autosave snapshots written atomically to disk.

Relevant settings:

VCAL_AUTOSAVE_SECS=3600
VCAL_AUTOSAVE_ATOMIC=1

Each snapshot write includes:

vcal.index (HNSW index)
answers.json (payload store)

Writes use a temporary file + rename strategy to prevent corruption.
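
The temporary file + rename strategy is a general technique. A minimal Python sketch of it follows; VCAL's own implementation may differ in detail, and the function name is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Leave the previous file untouched and remove the partial temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before os.replace runs, the file at path still holds the previous complete contents.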

Q: How do I back up or migrate VCAL data?

Stop the server and copy the data directory, typically:

/var/lib/vcal/

Ensure the destination server uses:

the same VCAL_DIMS
a compatible VCAL Server version
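
With the server stopped, the copy step can be scripted. The sketch below archives a data directory and restores it elsewhere using only the Python standard library; the function names are hypothetical and the paths are whatever your deployment uses.

```python
import shutil
from pathlib import Path


def backup_data_dir(data_dir: str, dest_base: str) -> str:
    """Archive data_dir into <dest_base>.tar.gz and return the archive path."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"data directory not found: {src}")
    # dest_base should live outside data_dir so the archive
    # does not end up inside itself.
    return shutil.make_archive(dest_base, "gztar", root_dir=str(src))


def restore_data_dir(archive_path: str, data_dir: str) -> None:
    """Unpack an archive made by backup_data_dir into data_dir."""
    Path(data_dir).mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(archive_path, data_dir)
```

For example, backup_data_dir("/var/lib/vcal", "/backups/vcal-snapshot") would write /backups/vcal-snapshot.tar.gz, assuming /backups exists and is writable.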

Q: How do I verify snapshot consistency?

Snapshots are written atomically and validated at load time.

If a snapshot is incomplete or corrupted:

VCAL logs a clear error
The server refuses to load invalid data

No manual consistency scripts are required.

Q: What happens if my license expires?

If a license expires or becomes invalid:

The server refuses to start on restart
Running instances will:
- log license expiration warnings
- reject protected operations

Data on disk remains intact and becomes usable again once a valid license is installed.

Q: Does VCAL ever send data externally?

No.

VCAL Server is fully on-prem / VPC-only:

No outbound telemetry
No external API calls
No license callbacks
No hidden network traffic

All embeddings, answers, and metrics remain inside your infrastructure.

Q: How does VCAL handle concurrent requests safely?

VCAL Server uses:

Lock-free reads for query paths
Guarded writes for index mutations
Atomic snapshot writes

This ensures safe operation under high concurrency without index corruption.

Q: Can I run VCAL behind an API gateway or load balancer?

Yes.

VCAL Server is stateless except for:

in-memory cache
local snapshot storage

For multi-replica setups:

Use shared volumes for snapshots, or
Accept per-node cache warm-up

Q: How large can the index grow?

Memory usage scales linearly with vector count.

Approximate guideline:

~8 GB RAM per 1 million vectors (768 dimensions)

Actual usage depends on:

VCAL_M
payload size
eviction settings
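
The guideline can be turned into rough capacity arithmetic. The sketch below assumes the ~8 GB per million figure scales linearly with vector count (as stated above), treats 1 GB as 1024^3 bytes, and applies only to 768-dimensional vectors. Real usage will shift with VCAL_M and payload size, and whether VCAL_CAP_MAX_BYTES counts exactly the same memory as the guideline is also an assumption.

```python
# ~8 GB per 1 million vectors at 768 dimensions, taken from the guideline.
GUIDELINE_BYTES_PER_MILLION = 8 * 1024**3


def estimated_ram_bytes(n_vectors: int) -> int:
    """Linear RAM estimate for n_vectors at 768 dimensions."""
    return n_vectors * GUIDELINE_BYTES_PER_MILLION // 1_000_000


def vectors_within(budget_bytes: int) -> int:
    """Rough number of 768-dim vectors that fit in budget_bytes."""
    return budget_bytes * 1_000_000 // GUIDELINE_BYTES_PER_MILLION


quarter_million = estimated_ram_bytes(250_000)  # 2147483648 bytes (2 GiB)
under_256_mb = vectors_within(268_435_456)      # 31250 vectors
```

Under these assumptions, a 256 MB cap holds on the order of 31,000 vectors, so the byte cap is likely to trigger eviction long before a 1,000,000 vector count cap does.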

Q: How do I monitor performance?

VCAL Server exposes Prometheus metrics at:

/metrics

Common metrics include:

vcal_hits_total
vcal_misses_total
vcal_tokens_saved_total
vcal_index_size_bytes
latency histograms (vcal_latency_seconds_bucket)

A ready-to-import Grafana dashboard is provided with the release.
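
If you want a quick hit ratio without Grafana, the counters can be read straight from the /metrics output. The sketch below parses Prometheus text exposition format by hand; the metric names come from the list above, while the sample values are invented and the helper names are hypothetical. Whether VCAL attaches labels to these counters is not stated here, so the parser sums across labels if any are present.

```python
def sum_metric(metrics_text: str, name: str) -> float:
    """Sum all samples of `name` in Prometheus text exposition format."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Labelled sample: name{labels} value [timestamp]
            sample_name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            # Plain sample: name value [timestamp]
            sample_name, _, rest = line.partition(" ")
        if sample_name == name:
            total += float(rest.split()[0])
    return total


def hit_ratio(metrics_text: str):
    """Return hits / (hits + misses), or None if there were no lookups."""
    hits = sum_metric(metrics_text, "vcal_hits_total")
    misses = sum_metric(metrics_text, "vcal_misses_total")
    lookups = hits + misses
    return hits / lookups if lookups else None


sample = """\
# HELP vcal_hits_total Cache hits
# TYPE vcal_hits_total counter
vcal_hits_total 80
vcal_misses_total 20
vcal_index_size_bytes 1.5e+06
"""
ratio = hit_ratio(sample)
```

Because the counters are cumulative, this gives the ratio since process start; for a recent-window ratio, take the difference between two scrapes first.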

Q: What are typical cache hit ratios?

After warm-up, most production workloads achieve:

70–90% cache hit ratio

Results depend on:

similarity threshold
TTL configuration
query repetition patterns

Q: What happens if the server crashes during a write?

If VCAL_AUTOSAVE_ATOMIC=1 is enabled:

The snapshot is written to a temporary file first
The temporary file is renamed into place only after the write completes successfully

The last valid snapshot is preserved.

Q: Can I use different embedding models?

Yes.

VCAL is model-agnostic. You may use any embedding model as long as:

VCAL_DIMS=<embedding dimension>

matches the model output size.
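
A dimension mismatch is easy to catch on the client side before sending anything. This is a minimal sketch with a hypothetical helper that compares an embedding's length against VCAL_DIMS read from the environment.

```python
import os


def check_embedding_dims(embedding, env=os.environ) -> None:
    """Raise ValueError if len(embedding) differs from VCAL_DIMS."""
    expected = int(env["VCAL_DIMS"])
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dims, VCAL_DIMS={expected}"
        )
```

Running this check once at client start-up, against a single test embedding from your model, is usually enough.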

Q: Where can I find the OpenAPI specification?

Swagger UI: /api-docs
OpenAPI file: openapi.yml (downloadable)

Availability may vary by license tier.

Q: What’s the difference between VCAL Core and VCAL Server?

VCAL Core
Open-source Rust library (Apache-2.0) providing the HNSW index and eviction logic.
VCAL Server
Production service adding:
- HTTP API
- persistence
- metrics
- authentication
- licensing

Q: How do I contact support?

Trial / Growth: email support@vcal-project.com
Enterprise: dedicated support channel and SLA
