Integrating Experiment Tracking Tools with Qwak

Overview

Experiment tracking is an essential aspect of the machine learning workflow, allowing you to monitor and compare different models and runs.

Qwak integrates with leading experiment tracking platforms such as Weights & Biases (wandb) and MLflow. This guide walks you through setting up these integrations within your Qwak environment.

Weights & Biases Integration

When using Weights & Biases (wandb), you store and manage artifacts, which can include datasets, models, and other files, in a centralized database on the wandb cloud service. Here's how to retrieve and utilize model artifacts, metrics, and parameters logged with wandb.

Setting Up wandb

Before you begin, ensure you have a Weights & Biases account. Follow these steps to integrate wandb with Qwak:

  1. Generate a wandb API Key:

    • Go to your wandb profile settings and create a new API key.
  2. Save the API Key as a Qwak Secret:

qwak secrets set --name 'wandb-api-key' --value "<YOUR_WANDB_API_KEY>"

Using wandb in Your Model

To use wandb in your Qwak model, you'll need to:

  • Initialize wandb with your project and entity details.
  • Log in to wandb using the API key stored in Qwak Secrets.
  • Retrieve and log model artifacts, metrics, and parameters.

Here's a practical example of how to use wandb in your Qwak model:

from qwak import QwakModel
from qwak.clients.secret_service import SecretServiceClient
import wandb
import qwak
import os

ENTITY_VAR = 'WNB_ENTITY'
PROJECT_VAR = 'WNB_PROJECT'
RUN_ID_VAR = 'WNB_RUN_ID'

class MyCustomModel(QwakModel):

    # Initialize wandb configs from environment variables sent at build time.
    def __init__(self):
        self.entity = os.getenv(ENTITY_VAR)
        self.project = os.getenv(PROJECT_VAR)
        self.run_id = os.getenv(RUN_ID_VAR)
    

    def build(self):
        pass
      
    
    def initialize_model(self):
        """
        Invoked when a model is loaded at serving time. Called once per model instance initialization. 
        Can be used for loading and storing values that should only be available in a serving setting.
        """
        # Access the wandb secret API key from Qwak's SecretsManager
        secret_service = SecretServiceClient()
        wandb_api_key = secret_service.get_secret('wandb-api-key')
        
        # Log in to wandb
        wandb.login(key=wandb_api_key)
        
        # Initialize a wandb API object
        api = wandb.Api()
        
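        # Fetch the run identified by the entity, project, and run ID read from the environment
        run = api.run(f"{self.entity}/{self.project}/{self.run_id}")
        
        # Replace 'model' with the name of your artifact.
        # Artifacts are fetched by name through the Api object; the public Run.use_artifact
        # expects an artifact object, not a name string.
        artifact = api.artifact(f"{self.entity}/{self.project}/model:latest")
        artifact_dir = artifact.download()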
        
        """
        TODO Model initialization from artifact.
        """
        
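        # Retrieve the run's summary metrics. run.history() returns a DataFrame of per-step
        # values, so the summary is used here to get one value per metric.
        # Keys starting with '_' (such as '_step' and '_runtime') are wandb internals.
        metrics = {key: val for key, val in run.summary._json_dict.items()
                   if not key.startswith('_') and isinstance(val, (int, float))}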
        
        # Log metrics to Qwak Build (the SDK function is log_metric, which takes a dict)
        qwak.log_metric(metrics)

        # Retrieve parameters/configurations
        params = run.config
        
        # Log parameters to Qwak Build
        qwak.log_param(params)

        
    def predict(self, df):
        """
        Invoked on every API inference request.
        """
        pass
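
The metric-retrieval step in the example above can also be isolated into a small helper that is easy to test without a wandb connection. The following is a minimal sketch, assuming the run's metrics are available as a plain dictionary; numeric_metrics is a hypothetical helper name and is not part of the wandb or Qwak APIs.

```python
from numbers import Real


def numeric_metrics(summary: dict) -> dict:
    """Keep only numeric, user-logged entries from a run summary dict."""
    return {
        key: float(value)
        for key, value in summary.items()
        # Keys starting with "_" (e.g. _step, _runtime) are treated as wandb internals.
        if not key.startswith("_")
        # bool is a subclass of int, so it is excluded explicitly.
        and isinstance(value, Real)
        and not isinstance(value, bool)
    }


# Illustrative data only; a real summary comes from the wandb run.
example = {"accuracy": 0.93, "loss": 0.21, "_step": 100, "notes": "baseline"}
print(numeric_metrics(example))
```

Filtering before logging keeps strings, nested dictionaries, and internal counters out of the metrics recorded for the build.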


When you start this build with the Qwak SDK, pass the environment variables the model reads (WNB_ENTITY, WNB_PROJECT, and WNB_RUN_ID) using the -E flag, as described in the Build Configurations documentation page.
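
If one of these variables is not set, os.getenv returns None and the run path becomes a string such as "None/my-project/None", which fails only later, when wandb is queried. The following is a minimal sketch of a fail-fast check, assuming the three variable names used in the model above; wandb_run_path is a hypothetical helper, not part of the Qwak SDK or wandb.

```python
import os

# Variable names taken from the model example; adjust if yours differ.
REQUIRED_VARS = ("WNB_ENTITY", "WNB_PROJECT", "WNB_RUN_ID")


def wandb_run_path(env=None) -> str:
    """Build '<entity>/<project>/<run_id>', raising early if a variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return "/".join(env[name] for name in REQUIRED_VARS)


# Illustrative values only.
print(wandb_run_path({"WNB_ENTITY": "my-team", "WNB_PROJECT": "my-project", "WNB_RUN_ID": "abc123"}))
```

Calling a check like this before api.run(...) turns a missing -E value into a clear error message instead of a failed wandb lookup.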


Conclusion

Integrating experiment tracking tools such as wandb and MLflow with Qwak lets you load tracked model artifacts into your Qwak models and record each run's metrics and parameters alongside your Qwak builds. This setup helps you keep track of your experiments and compare results across runs.