Inference Analytics

Automatic log and inference collections on the Qwak Lake

The model Analytics tab provides an interface to the Qwak Lake, which is an automated log collection system for models.

In addition to performance data, you can also find all the predictions that were made with models deployed via Qwak, including the input and output data of each function of your models.

The data is stored as Parquet files in your object storage, so you can also load it into your preferred BI tool and analyze the model data there.


Enabling Qwak Lake analytics

Qwak Analytics collection is enabled by default when using the api decorator.

@qwak.api()
def predict(self, df):
    return pd.DataFrame(self.catboost.predict(df[self.columns]), columns=['churn'])

Note that it can be turned off by passing analytics=False to the decorator:

@qwak.api(analytics=False)
def predict(self, df):
    return pd.DataFrame(self.catboost.predict(df[self.columns]), columns=['churn'])

📘

Analytics column names are derived from the names of the input parameters of the predict() method. With the default df parameter, the input columns start with input_. If you use a custom parameter name, the columns start with that name instead.

For instance, if your predict signature is def predict(self, request) -> str, your analytics input columns will begin with request_.



Querying analytics in the UI

In the Analytics view, you can write SQL queries to analyze the model requests and predictions.


🚧

Leveraging Partitions in Queries

Model inference data is partitioned daily by the date column. Filter on this column in your analytics queries so that they do not scan all the data, which can be significantly slower and more costly.
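As a rough illustration of a partition-aware query, the sketch below builds a SQL string that restricts the scan to a range of days. The helper build_partitioned_query and the table name your_table are placeholders, not part of Qwak. It also assumes that the date column can be compared against ISO-formatted date strings, so adjust the filter if your column type differs.

```python
from datetime import date


def build_partitioned_query(table: str, start: date, end: date, columns: str = "*") -> str:
    """Build a SQL query restricted to a range of daily partitions.

    Assumptions: `table` is a placeholder for your model's analytics table,
    and the `date` partition column can be compared to ISO date strings.
    """
    if start > end:
        raise ValueError("start must not be after end")
    return (
        f"select {columns} from {table} "
        f"where date between '{start.isoformat()}' and '{end.isoformat()}'"
    )


# Only the partitions for the first week of January are scanned.
query = build_partitioned_query("your_table", date(2024, 1, 1), date(2024, 1, 7))
print(query)
```

The helper interpolates its arguments directly into the SQL text, so it should only be used with table and column names you control.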



Retrieving analytics programmatically

To load data from the Qwak Analytics Engine into a pandas DataFrame, use the run_analytics_query function of the QwakClient:

from qwak import QwakClient

client = QwakClient()
df = client.run_analytics_query("select * from your_table")

When you call the function as shown above, it waits until the result is ready (or until the query fails).

You can also limit how long to wait for the result by passing the timeout parameter to the run_analytics_query function. If the Qwak Analytics Engine does not return a response within the given time window, the client raises a TimeoutError.

from datetime import timedelta
from qwak import QwakClient

client = QwakClient()
# Wait at most 123 seconds for the result; a TimeoutError is raised after that
df = client.run_analytics_query("select * from your_table", timeout=timedelta(seconds=123))
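One possible way to handle the timeout is to catch the error and let the caller decide what to do next. The sketch below is only an illustration. The helper query_with_timeout is hypothetical. It accepts any object that exposes run_analytics_query(sql, timeout=...), and it assumes the error raised is Python's built-in TimeoutError.

```python
from datetime import timedelta


def query_with_timeout(client, sql, timeout_seconds=60):
    """Run an analytics query; return None if no result arrives in time.

    `client` is expected to expose run_analytics_query(sql, timeout=...).
    Assumption: the timeout surfaces as Python's built-in TimeoutError.
    """
    try:
        return client.run_analytics_query(sql, timeout=timedelta(seconds=timeout_seconds))
    except TimeoutError:
        # No response within the window: the caller can retry, narrow the
        # query (for example to fewer date partitions), or give up.
        return None
```

With a real client this would be called as query_with_timeout(QwakClient(), "select * from your_table", timeout_seconds=30), and a None result signals that the query did not finish in time.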


Logging custom values

A model's predict function can log custom data during the inference request. To use the custom data logger, we need to add the analytics_logger parameter to the predict function. Important: The parameter MUST be called analytics_logger!


@qwak.api(analytics=True)
def predict(self, df, analytics_logger):
    ...

Custom logging works only when analytics is enabled for the Qwak API (it is enabled by default, or we can explicitly pass the analytics=True parameter).

Now, in the predict function, we can log scalar values, lists, dictionaries, pandas DataFrames, and any other JSON-serializable objects. The analytics_logger supports two ways of logging the values:

  1. One-by-one:
analytics_logger.log(column='my_column', value=the_value)
analytics_logger.log(column='some_other_column', value=yet_another_value)
  2. Multiple values at once:
analytics_logger.log_many(
    values={'another_column': 'some_value', 'something_else': 123}
)

Note that we use a different function when we log multiple values (log_many instead of log)!

If you log different values with the same column name, only the last logged value is kept (it overwrites the earlier ones).



Retrieving custom values

The Qwak Analytics view in the Qwak UI will display all the logged values with the column prefix logger_.

If we log analytics_logger.log(column='my_column', value=the_value), Qwak Analytics displays a column logger_my_column containing the value of the variable the_value.
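To try out a predict function locally, it can help to pass in a stand-in object that accepts the same log and log_many calls. The class below is a hypothetical test double, not part of the Qwak SDK. It only mimics the behaviour described on this page: the last value logged for a column wins, and logged columns appear with the logger_ prefix.

```python
class FakeAnalyticsLogger:
    """Local stand-in for analytics_logger (hypothetical; not a Qwak class)."""

    def __init__(self):
        self._values = {}

    def log(self, column, value):
        # Logging the same column again replaces the earlier value.
        self._values[column] = value

    def log_many(self, values):
        # Log every entry of the dictionary as its own column.
        for column, value in values.items():
            self.log(column=column, value=value)

    def analytics_columns(self):
        # Mirror the documented naming: columns appear as logger_<column>.
        return {f"logger_{column}": value for column, value in self._values.items()}


logger = FakeAnalyticsLogger()
logger.log(column="my_column", value=1)
logger.log(column="my_column", value=2)  # replaces the value 1 logged above
logger.log_many(values={"another_column": "some_value", "something_else": 123})
print(logger.analytics_columns())
```

In a unit test, an instance of this class can be passed as the analytics_logger argument of predict and inspected afterwards.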