Automating model builds and deployments helps keep models accurate in production.
This action streamlines the build and deployment workflows. It keeps your model accurate by automatically re-training and deploying it based on a cron expression, a defined time interval, or metric-based triggers.
You can also define deployment conditions to verify that the new build passes acceptance criteria before it replaces the currently deployed model.
Before configuring the automation, you must store your model's code in a Git repository. The automation fetches the model's code during the training process. When using a private repository, generate a Git access token and store it securely in the Secret Manager.
```python
from qwak.automations import (
    Automation,
    ScheduledTrigger,
    QwakBuildDeploy,
    BuildSpecifications,
    BuildMetric,
    ThresholdDirection,
    DeploymentSpecifications,
)

test_automation = Automation(
    name="retrain_my_model",
    model_id="my-model-id",
    trigger=ScheduledTrigger(cron="0 0 * * 0"),
    action=QwakBuildDeploy(
        build_spec=BuildSpecifications(
            git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
            git_access_token_secret="token_secret_name",
            git_branch="main",
            main_dir="main",
            tags=["prod"],
            env_vars=["key1=val1", "key2=val2", "key3=val3"],
        ),
        deployment_condition=BuildMetric(
            metric_name="f1_score",
            direction=ThresholdDirection.ABOVE,
            threshold="0.65",
        ),
        deployment_spec=DeploymentSpecifications(
            number_of_pods=1,
            cpu_fraction=2.0,
            memory="2Gi",
            variation_name="B",
        ),
    ),
)
```
The default timezone for the cron scheduler is UTC.
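For reference, the trigger from the example above fires every Sunday at midnight UTC. Other intervals are expressed with standard five-field cron strings; the daily example below is just an illustration:

```python
from qwak.automations import ScheduledTrigger

# "0 0 * * 0" = minute 0, hour 0, any day of month, any month, Sunday (UTC).
weekly = ScheduledTrigger(cron="0 0 * * 0")  # every Sunday at 00:00 UTC
daily = ScheduledTrigger(cron="0 3 * * *")   # every day at 03:00 UTC
```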
The QwakBuildDeploy action has three configuration parameters:
- `build_spec` defines the location of the model code to be built on the Qwak platform.
- `deployment_condition` defines the metrics used to decide whether to deploy the model after training.
- `deployment_spec` specifies the runtime environment parameters for model deployment.
Metrics used to trigger build or deploy automations must be logged during the model build phase.
To configure the automation build specification, we need a link to the Git repository.
Note that the link consists of two parts delimited by a hash (`#`):
- The repository URL
- The path within the repository
For example, given the link `https://github.com/org_id/repository_name.git#dir_1/dir_2`, the platform will clone the `https://github.com/org_id/repository_name.git` repository and change the working directory to `dir_1/dir_2` before starting the build. In this example, `dir_1/dir_2` should be the directory containing the model's code.
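The way the link is interpreted can be shown by splitting on the first `#`; this is a plain-Python illustration of the convention, not the platform's actual parsing code:

```python
def split_git_uri(git_uri: str) -> tuple[str, str]:
    """Split a 'repo_url#path' link into (repository URL, in-repo path)."""
    repo_url, _, path = git_uri.partition("#")
    return repo_url, path

repo, path = split_git_uri(
    "https://github.com/org_id/repository_name.git#dir_1/dir_2"
)
# repo == "https://github.com/org_id/repository_name.git"
# path == "dir_1/dir_2"
```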
When using private repositories, we must also specify an access token.
Because the Qwak platform doesn't allow plain-text tokens, store the access token in the Qwak Secret Manager and specify only the secret name.
When not using the default folder structure, in which `main` is the models folder, we must also specify the Git branch and the directory containing the ML model.
In the build specification, you may control the number of CPUs and the amount of memory, or use GPU instance sizes.
Defining CPU resources:
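A minimal sketch of a CPU build configuration. The `cpus` and `memory` parameter names below are assumptions for illustration, so verify them against the SDK reference:

```python
from qwak.automations import BuildSpecifications

# Hypothetical parameter names -- verify against the SDK before use.
build_spec = BuildSpecifications(
    git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
    cpus=2,        # assumed: number of CPU cores for the build
    memory="4Gi",  # assumed: memory request for the build
)
```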
Defining GPU resources:
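Similarly, a hedged sketch for a GPU build. The `instance` parameter and the `"gpu.t4.xl"` size string are assumptions; the actual GPU instance sizes are listed in the platform's instance-size documentation:

```python
from qwak.automations import BuildSpecifications

# Hypothetical parameter name and instance size -- verify against the SDK.
build_spec = BuildSpecifications(
    git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
    instance="gpu.t4.xl",  # assumed GPU instance size identifier
)
```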
It is also possible to specify the IAM role used in production (`assumed_iam_role`) or a custom Docker image.
Additionally, we can specify the environment variables to configure in the build environment. The environment variables are passed in the `env_vars` field as a list of `key=value` strings, for example: `env_vars=["key1=val1", "key2=val2"]`.
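To make the `key=value` convention concrete, here is a small plain-Python illustration (not platform code) of how such a list maps to environment variables:

```python
def env_vars_to_dict(env_vars: list[str]) -> dict[str, str]:
    """Turn ['key1=val1', ...] into {'key1': 'val1', ...}."""
    result = {}
    for entry in env_vars:
        # Split on the first '=' so values may themselves contain '='.
        key, _, value = entry.partition("=")
        result[key] = value
    return result

env = env_vars_to_dict(["key1=val1", "key2=val2", "key3=val3"])
# env == {"key1": "val1", "key2": "val2", "key3": "val3"}
```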
The model's code must log a metric that describes the model's performance; we will use that metric in the deployment condition. If you are unsure how to log metrics, see our Logging and Monitoring Guide.
During the build process, it is common to log metrics such as accuracy, F1 score, or loss. When executing the automation, these logged values may be compared against a specified threshold.
For each metric, it is possible to define whether the value should be above or below the threshold. Once this condition is met, the Qwak platform will proceed to deploy the model.
The BuildMetric object has three parameters:
- `metric_name`: The metric name logged during the build phase
- `direction`: Whether the value should be below or above the threshold; the valid values are `ThresholdDirection.ABOVE` and `ThresholdDirection.BELOW`
- `threshold`: The threshold used for comparison

The threshold must always be a string: `threshold="0.65"` is a valid threshold, while a numeric `threshold=0.65` is not.
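The comparison the platform performs can be illustrated with a few lines of plain Python (a sketch of the semantics, not Qwak's implementation):

```python
from enum import Enum

class ThresholdDirection(Enum):  # local stand-in for the SDK enum
    ABOVE = "above"
    BELOW = "below"

def should_deploy(metric_value: float,
                  direction: ThresholdDirection,
                  threshold: str) -> bool:
    """Deploy only when the logged metric clears the (string) threshold."""
    limit = float(threshold)  # thresholds are passed as strings
    if direction is ThresholdDirection.ABOVE:
        return metric_value > limit
    return metric_value < limit

# An f1_score of 0.71 clears an ABOVE threshold of "0.65".
ok = should_deploy(0.71, ThresholdDirection.ABOVE, "0.65")
# ok == True
```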
To use a dynamic threshold, we can use a SQL expression as the threshold value.
In this case, the Qwak platform will run the SQL query in Qwak Model Analytics and compare the model's metric with the threshold produced by the SQL query.
The query must return a single row containing only one column!
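To illustrate the single-row, single-column requirement, here is a local sketch using SQLite; Qwak Model Analytics runs its own query engine, and the table and query below are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE builds (f1_score REAL)")
conn.executemany("INSERT INTO builds VALUES (?)", [(0.61,), (0.64,), (0.70,)])

# A valid dynamic-threshold query: exactly one row with exactly one column.
row = conn.execute("SELECT MAX(f1_score) FROM builds").fetchone()
threshold = row[0]
# threshold == 0.7
```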
After the model is built, its performance is compared with the threshold, and it is deemed ready to deploy, the platform uses the deployment specification to configure the model's runtime environment.
We may specify:
|Parameter|Description|
|---|---|
|`number_of_http_server_workers`|The number of threads used by the HTTP server.|
|`http_request_timeout_ms`|The request timeout (in milliseconds).|
|`daemon_mode`|Whether the gunicorn process should be daemonized, making the workers run in the background.|
|`custom_iam_role_arn`|The IAM role used in production.|
|`max_batch_size`|The maximum batch size of records.|
|`deployment_process_timeout_limit`|The timeout for the deployment (in seconds).|
|`number_of_pods`|The number of instances to be deployed.|
|`cpu_fraction`|The CPU cores for Kubernetes.|
|`memory`|The amount of RAM.|
|`variation_name`|The variation name if running an A/B test.|
|`auto_scale_config`|The autoscaling configuration for Kubernetes.|
|`min_replica_count`|The minimum number of replicas to scale the resource down to.|
|`max_replica_count`|The maximum number of replicas of the target resource.|
|`polling_interval`|The interval at which each trigger is checked. 30 seconds by default.|
|`cool_down_period`|The period to wait after the last trigger reported active before scaling the resource back to 0. 5 minutes (300 seconds) by default.|
|`prometheus_trigger`|A Prometheus-based scaling trigger: `metric_type` (cpu/gpu/memory/latency), `aggregation_type` (min/max/avg/sum), `time_period` (the period the query runs over), and `threshold` (the value at which scaling starts).|
|`environments`|List of environment names to deploy to.|
When we want to define an auto-scaling policy for our deployment, we have to use the following pattern:
```python
from qwak.automations import (
    AutoScalingConfig,
    AutoScalingPrometheusTrigger,
    AutoScaleQuerySpec,
)

auto_scale_config = AutoScalingConfig(
    min_replica_count=1,
    max_replica_count=10,
    polling_interval=30,
    cool_down_period=300,
    triggers=[
        AutoScalingPrometheusTrigger(
            query_spec=AutoScaleQuerySpec(
                aggregation_type="max",
                metric_type="latency",
                time_period=4,
            ),
            threshold=60,
        )
    ],
)
```