Monitoring & Metrics

VCAL exposes native Prometheus metrics at /metrics.
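To check the endpoint by hand, the following is a minimal sketch of reading it from Python. It assumes VCAL serves the standard Prometheus text exposition format and that the server is reachable at the vcal-server:8080 address used in the Grafana steps further down. The helper names `parse_metrics` and `fetch_metrics` are hypothetical, and the sample values are invented for illustration.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    Simplified on purpose: comment lines (# HELP / # TYPE) are skipped,
    optional timestamps are ignored, and label values containing spaces
    are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # parts[0] is the series name (with labels, if any), parts[1] the value.
        samples[parts[0]] = float(parts[1])
    return samples


def fetch_metrics(url="http://vcal-server:8080/metrics", timeout=5.0):
    """Fetch and parse the metrics page (the URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Invented sample payload, shaped like a typical /metrics response.
SAMPLE = """\
# TYPE vcal_cache_hits_total counter
vcal_cache_hits_total 42
# TYPE vcal_answers_cached gauge
vcal_answers_cached 7
"""

parsed = parse_metrics(SAMPLE)
```

For production scraping, a Prometheus server or an official client library's parser is the safer choice; this sketch is only meant for quick inspection.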

Key Metrics

| Metric | Type | Dashboard panel |
| --- | --- | --- |
| vcal_cache_hits_total | Counter | Cache Hit Ratio (%) |
| vcal_tokens_saved_total | Counter | Tokens Saved (range) |
| vcal_tokens_saved_total | Counter | Cost Savings (range) |
| vcal_answers_cached | Gauge | Answers Cached |
| vcal_cache_hits_total | Counter | Requests (range) |
| vcal_search_errors_total, vcal_batch_search_errors_total, vcal_insert_errors_total, vcal_upsert_errors_total, vcal_delete_errors_total | Counter | Errors (range) |
| vcal_search_latency_seconds_bucket | Histogram | Server Search Latency (p50 / p95) |
| vcal_evictions_total | Counter | TTL Evictions (10m) |
| vcal_evictions_total | Counter | LRU Evictions (10m) |
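The p50 / p95 latency panel is derived from the cumulative buckets of vcal_search_latency_seconds_bucket. The sketch below shows the general technique of estimating a quantile by linear interpolation inside the bucket that contains the target rank, which is the approach PromQL's histogram_quantile takes. The function name and the bucket boundaries and counts are invented for illustration; the actual bucket layout VCAL uses is not stated here.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs and must
    include the +Inf bucket, as Prometheus histograms do.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must include a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Rank falls in the +Inf bucket (or an empty one):
                # report the highest known lower bound.
                return prev_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Invented example: upper bounds in seconds, cumulative observation counts.
example_buckets = [
    (0.005, 10),
    (0.01, 50),
    (0.05, 90),
    (0.1, 100),
    (math.inf, 100),
]
p50 = estimate_quantile(0.50, example_buckets)
p95 = estimate_quantile(0.95, example_buckets)
```

In a dashboard the same calculation is normally applied to per-second bucket rates over a time window rather than to raw cumulative counts, so the result reflects recent latency instead of the whole process lifetime.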

Example Grafana Dashboard

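  1. Add a Prometheus data source in Grafana. Point it at the Prometheus server that scrapes VCAL, not at VCAL directly. The scrape target for that Prometheus server is:

    URL: http://vcal-server:8080/metrics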
  2. Import the provided dashboard JSON from deploy/grafana/vcal-dashboard.json (request the file from the VCAL Server Team).

  3. Visualize:

    • Cache hit ratio
    • Tokens saved (range)
    • Cost savings (range)
    • Answers cached
    • Requests
    • Errors
    • Server Search Latency (p50 / p95)
    • TTL Evictions
    • LRU Evictions
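Several of these panels are marked "(range)", meaning they show how much a counter grew over the selected time window rather than its absolute value. As a rough sketch of that idea, assuming the panels use an increase-style calculation (the dashboard's actual queries are not shown here), the function below sums the growth across a series of counter samples and tolerates counter resets. The function name and sample values are invented.

```python
def counter_increase(samples):
    """Total growth across time-ordered counter samples.

    A drop between two consecutive samples is treated as a counter reset
    (for example after a server restart), so the new value is counted as
    growth from zero.
    """
    increase = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                increase += value - prev
            else:
                increase += value  # reset: counter restarted from zero
        prev = value
    return increase


# Invented samples of a counter such as vcal_tokens_saved_total,
# including one reset between 150 and 5.
tokens_saved_in_range = counter_increase([100, 120, 150, 5, 25])
```

Prometheus's own increase() additionally extrapolates to the edges of the time window, so its results can differ slightly from this simplified version.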