Runtime Metrics

Runtime metrics track anything about your models over time, from latency to error rates to prediction success rates.

The Qwak UI displays the model metrics under Health in the Overview tab.

| Metric | Description |
| --- | --- |
| Median response | The median time elapsed, in milliseconds, of model inference requests in the last 5 minutes. |
| Error percentage | The percentage of requests that returned a 5XX error, out of all requests in the last 5 minutes. |
| Prediction success rate | The average number of successful inference requests per minute. |
| Throughput | The total number of requests, in 1-minute windows. |
| Latency | The time elapsed, in milliseconds, of model inference requests. If a request fails, this is the time elapsed until the failure. |
| Error rate | The percentage of 5XX errors out of total requests, per 1-minute aggregation of metrics sent to Datadog. |
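To make the definitions above concrete, here is a minimal sketch of how a median response time, an error percentage, and per-minute throughput could be computed from raw request records. It is an illustration only, not Qwak's implementation: the record layout `(timestamp_seconds, latency_ms, http_status)` and the function names `summarize` and `throughput_per_minute` are assumptions made for this example.

```python
from statistics import median

# Hypothetical record layout (an assumption for this sketch):
# (timestamp_seconds, latency_ms, http_status)


def summarize(requests, now, window_seconds=300):
    """Median latency and 5XX percentage over the trailing window."""
    recent = [r for r in requests if now - window_seconds < r[0] <= now]
    if not recent:
        return {"median_response_ms": None, "error_percentage": None, "requests": 0}
    # Count only server-side errors (HTTP 500-599).
    errors = sum(1 for _, _, status in recent if 500 <= status <= 599)
    return {
        "median_response_ms": median(lat for _, lat, _ in recent),
        "error_percentage": 100.0 * errors / len(recent),
        "requests": len(recent),
    }


def throughput_per_minute(requests):
    """Total number of requests in each 1-minute window, keyed by minute index."""
    counts = {}
    for ts, _, _ in requests:
        bucket = int(ts // 60)
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


if __name__ == "__main__":
    sample = [(10, 100.0, 200), (70, 200.0, 500), (130, 300.0, 200)]
    print(summarize(sample, now=300))
    print(throughput_per_minute(sample))
```

The 5-minute window is passed as `window_seconds=300` to match the "last 5 minutes" wording in the table; the 1-minute buckets match the throughput definition.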

Exporting metrics to your Grafana

Qwak lets you connect to the hosted Prometheus and send metrics to your own Grafana.

If you wish to export the metrics to your Grafana, please contact [email protected].