Frequently Asked Questions


Q: What happens when the cache reaches its memory limit?

VCAL Server applies LRU eviction automatically to keep memory usage within configured limits.

You can control cache capacity using:

VCAL_CAP_MAX_BYTES=268435456     # 256 MiB memory cap
VCAL_CAP_MAX_VECTORS=1000000     # optional vector count cap

Eviction removes the least recently used entries first.
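
To make the eviction order concrete, here is a minimal Python sketch of a byte-capped LRU cache. It is not VCAL Server's implementation; the class LruByteCache and its methods are hypothetical names used only for illustration, and the 8-byte cap is chosen to keep the example small.

```python
from collections import OrderedDict


class LruByteCache:
    """Toy LRU cache bounded by total payload bytes (illustrative only)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        # Insertion order doubles as recency order: oldest first, newest last.
        self._entries = OrderedDict()

    def get(self, key: str):
        value = self._entries.get(key)
        if value is not None:
            # A hit marks the entry as most recently used.
            self._entries.move_to_end(key)
        return value

    def put(self, key: str, value: bytes) -> None:
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = value
        self.used_bytes += len(value)
        # Evict from the least recently used end until back under the cap.
        while self.used_bytes > self.max_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)

    def keys(self):
        return list(self._entries)


cache = LruByteCache(max_bytes=8)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")           # "a" is now more recent than "b"
cache.put("c", b"1234")  # exceeds the cap, so "b" is evicted
print(cache.keys())      # ['a', 'c']
```

Reading "a" before inserting "c" is what saves it: the entry that has gone longest without a read or write is the first to go.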


Q: Can I pre-load known Q&A pairs?

Yes. You can pre-seed data in two ways:

  • Insert entries via POST /v1/upsert
  • Load a snapshot before serving traffic

This is useful for:

  • Domain-specific FAQs
  • Product documentation
  • Knowledge bases for RAG systems

Q: Can snapshots be shared between containers or nodes?

Yes, provided that:

  • The same data directory is shared (e.g., volume mount)
  • The servers use identical index parameters, including:
    • VCAL_DIMS
    • VCAL_M
    • VCAL_EF_SEARCH

Snapshot compatibility is guaranteed within the same major VCAL Server version.
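
Before pointing two servers at the same snapshot, it can help to compare their index parameters. The following Python sketch assumes each server's environment is available as a dictionary; the function name mismatched_index_params and the parameter values shown are hypothetical examples, not recommended settings.

```python
# Index parameters named in this FAQ as needing to be identical across servers.
INDEX_PARAMS = ("VCAL_DIMS", "VCAL_M", "VCAL_EF_SEARCH")


def mismatched_index_params(env_a: dict, env_b: dict) -> list:
    """Return the index parameter names whose values differ between two environments."""
    return [name for name in INDEX_PARAMS if env_a.get(name) != env_b.get(name)]


# Example values only; substitute each node's real configuration.
node_a = {"VCAL_DIMS": "768", "VCAL_M": "16", "VCAL_EF_SEARCH": "64"}
node_b = {"VCAL_DIMS": "768", "VCAL_M": "32", "VCAL_EF_SEARCH": "64"}
print(mismatched_index_params(node_a, node_b))  # ['VCAL_M']
```

An empty list means the two environments agree on every listed parameter.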


Q: Is authentication mandatory?

No. Authentication is optional.

By default:

  • App endpoints (/v1/*) are open
  • Admin endpoints (/admin/*) require keys if configured

To enable authentication, configure key files:

VCAL_KEYS_APP_FILE=/path/to/app.keys
VCAL_KEYS_ADMIN_FILE=/path/to/admin.keys

For production deployments, authentication is strongly recommended.


Q: How is data persisted?

VCAL Server persists data via autosave snapshots written atomically to disk.

Relevant settings:

VCAL_AUTOSAVE_SECS=3600
VCAL_AUTOSAVE_ATOMIC=1

Each snapshot write includes:

  • vcal.index (HNSW index)
  • answers.json (payload store)

Writes use a temporary file + rename strategy to prevent corruption.
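
The temporary file + rename pattern can be sketched in Python as follows. This illustrates the general technique, not VCAL Server's actual code; the function name atomic_write is hypothetical, and the fsync step is an assumption about what a careful implementation would do.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path via a temporary file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the rename
        # Atomic on POSIX when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Any existing file at `path` is untouched; remove the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as data_dir:
    target = os.path.join(data_dir, "answers.json")
    atomic_write(target, b'{"v": 1}')
    atomic_write(target, b'{"v": 2}')
    with open(target, "rb") as fh:
        print(fh.read())  # b'{"v": 2}'
```

Because the rename is the last step, a reader sees either the complete old file or the complete new file, never a half-written one.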


Q: How do I back up or migrate VCAL data?

Stop the server and copy the data directory, typically:

/var/lib/vcal/

Ensure the destination server uses:

  • the same VCAL_DIMS
  • a compatible VCAL Server version

Q: How do I verify snapshot consistency?

Snapshots are written atomically and validated at load time.

If a snapshot is incomplete or corrupted:

  • VCAL logs a clear error
  • The server refuses to load invalid data

No manual consistency scripts are required.


Q: What happens if my license expires?

If a license expires or becomes invalid:

  • A restarted server refuses to start
  • Running instances will:
    • log license expiration warnings
    • reject protected operations

Data on disk remains intact until a valid license is installed.


Q: Does VCAL ever send data externally?

No.

VCAL Server is fully on-prem / VPC-only:

  • No outbound telemetry
  • No external API calls
  • No license callbacks
  • No hidden network traffic

All embeddings, answers, and metrics remain inside your infrastructure.


Q: How does VCAL handle concurrent requests safely?

VCAL Server uses:

  • Lock-free reads for query paths
  • Guarded writes for index mutations
  • Atomic snapshot writes

This ensures safe operation under high concurrency without index corruption.


Q: Can I run VCAL behind an API gateway or load balancer?

Yes.

VCAL Server is stateless except for:

  • in-memory cache
  • local snapshot storage

For multi-replica setups:

  • Use shared volumes for snapshots, or
  • Accept per-node cache warm-up

Q: How large can the index grow?

Memory usage scales linearly with vector count.

Approximate guideline:

  • ~8 GB RAM per 1 million vectors (768 dimensions)

Actual usage depends on:

  • VCAL_M
  • payload size
  • eviction settings
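
For capacity planning, a rough lower bound can be computed from the vector count. The Python sketch below assumes 4-byte (float32) vector components and 2 × M four-byte neighbor links per vector at the HNSW base layer, which is common in HNSW implementations but not confirmed for VCAL; the function name estimate_index_bytes and M=16 are illustrative assumptions.

```python
def estimate_index_bytes(num_vectors: int, dims: int, m: int,
                         bytes_per_component: int = 4,
                         bytes_per_link: int = 4) -> int:
    """Rough lower bound: raw vectors plus base-layer HNSW links (2*M per vector)."""
    vector_bytes = num_vectors * dims * bytes_per_component
    link_bytes = num_vectors * 2 * m * bytes_per_link
    return vector_bytes + link_bytes


est = estimate_index_bytes(num_vectors=1_000_000, dims=768, m=16)
print(f"{est / 1e9:.2f} GB")  # 3.20 GB
```

This estimate covers only vectors and base-layer links. The ~8 GB guideline is higher because real usage also includes payloads and other overhead that this sketch does not model. Both terms are proportional to num_vectors, which is where the linear scaling comes from.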

Q: How do I monitor performance?

VCAL Server exposes Prometheus metrics at:

/metrics

Common metrics include:

  • vcal_hits_total
  • vcal_misses_total
  • vcal_tokens_saved_total
  • vcal_index_size_bytes
  • latency histograms (vcal_latency_seconds_bucket)

A ready-to-import Grafana dashboard is provided with the release.
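
If you want a quick hit ratio without Grafana, the two counters above are enough. The Python sketch below parses text in the Prometheus exposition format; it assumes samples carry no trailing timestamp, the helper names counter_total and hit_ratio are hypothetical, and the sample numbers are made up for illustration.

```python
def counter_total(metrics_text: str, name: str) -> float:
    """Sum all samples of a counter in Prometheus text exposition format.

    Simplification: assumes each sample line is `series value` with no timestamp.
    """
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        # Match both bare `name` and labeled `name{...}` series.
        if series == name or series.startswith(name + "{"):
            total += float(value)
    return total


def hit_ratio(metrics_text: str) -> float:
    """Hits divided by total lookups, or 0.0 if there were no lookups."""
    hits = counter_total(metrics_text, "vcal_hits_total")
    misses = counter_total(metrics_text, "vcal_misses_total")
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


# Made-up sample of what a /metrics response might contain.
sample = """\
# TYPE vcal_hits_total counter
vcal_hits_total 800
# TYPE vcal_misses_total counter
vcal_misses_total 200
"""
print(hit_ratio(sample))  # 0.8
```

Counters are cumulative since process start, so for a ratio over a recent window, subtract an earlier scrape from a later one before dividing.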


Q: What are typical cache hit ratios?

After warm-up, most production workloads achieve:

  • 70–90% cache hit ratio

Results depend on:

  • similarity threshold
  • TTL configuration
  • query repetition patterns

Q: What happens if the server crashes during a write?

If VCAL_AUTOSAVE_ATOMIC=1 is set:

  • Writes go to a temporary file
  • Files are renamed only after the write completes successfully

An interrupted write therefore never replaces the last valid snapshot, which remains on disk.


Q: Can I use different embedding models?

Yes.

VCAL is model-agnostic. You may use any embedding model as long as:

VCAL_DIMS=<embedding dimension>

matches the model output size.
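
A client-side guard can catch a dimension mismatch before a request is sent. This is a minimal Python sketch; the function name check_embedding_dims is hypothetical, and 768 and 384 are example sizes only.

```python
def check_embedding_dims(embedding: list, vcal_dims: int) -> None:
    """Raise ValueError if the embedding length differs from the configured VCAL_DIMS."""
    if len(embedding) != vcal_dims:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, but VCAL_DIMS={vcal_dims}"
        )


check_embedding_dims([0.0] * 768, vcal_dims=768)  # passes silently
try:
    check_embedding_dims([0.0] * 384, vcal_dims=768)
except ValueError as err:
    print(err)  # embedding has 384 dimensions, but VCAL_DIMS=768
```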


Q: Where can I find the OpenAPI specification?

  • Swagger UI: /api-docs
  • OpenAPI file: openapi.yml (downloadable)

Availability may vary by license tier.


Q: What’s the difference between VCAL Core and VCAL Server?

  • VCAL Core
    Open-source Rust library (Apache-2.0) providing the HNSW index and eviction logic.

  • VCAL Server
    Production service adding:

    • HTTP API
    • persistence
    • metrics
    • authentication
    • licensing

Q: How do I contact support?