Getting Features for Training

In Getting Started, we showed you how to retrieve a training data sample from the Feature Store.

Before proceeding, please make sure that you have the following dependencies installed:

pip install pyathena pyarrow

We can retrieve data from the OfflineClient by the entity name and the last modification timestamp.

When we retrieve features from the offline feature store, we get the data as it was ingested at a certain point in time. We also fetch it only for a specific set of entities, which we describe in a population DataFrame:

import pandas as pd
from qwak.feature_store.offline.client import OfflineClient

offline_client = OfflineClient()

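# Map each entity key to the features to fetch,
# written here as '<feature set name>.<feature name>'.
key_to_features = {
    'user': [
        'user-credit-risk-features.checking_account',
        'user-credit-risk-features.age',
        'user-credit-risk-features.job',
    ]
}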

population = pd.DataFrame(
    columns=['user', 'timestamp'],
    data=[['06cc255a-aa07-4ec9-ac69-b896ccf05322', '2021-01-01 00:00:00']]
)

data = offline_client.get_feature_values(entity_key_to_features=key_to_features,
                                         population=population)

print(data)
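
To build intuition for what a point-in-time lookup means, the following pandas-only sketch mimics it on a small in-memory table. It is only an illustration and does not describe how the Feature Store is implemented. It assumes that the lookup returns, for each entity, the most recent row ingested at or before the requested timestamp. The `ingested` table, the short user IDs and the `age` values are made up for the example.

```python
import pandas as pd

# Hypothetical ingested feature rows (made-up data, not from the Feature Store).
ingested = pd.DataFrame({
    'user': ['u1', 'u1', 'u2'],
    'timestamp': pd.to_datetime(['2020-12-01', '2021-02-01', '2020-11-15']),
    'age': [30, 31, 45],
}).sort_values('timestamp')

# A population with several entities, each with its own lookup timestamp.
population = pd.DataFrame({
    'user': ['u1', 'u2'],
    'timestamp': pd.to_datetime(['2021-01-01', '2021-01-01']),
}).sort_values('timestamp')

# For every population row, take the latest ingested row of the same user
# whose timestamp is at or before the requested timestamp (assumed semantics).
point_in_time = pd.merge_asof(
    population,
    ingested,
    on='timestamp',
    by='user',
    direction='backward',
)

print(point_in_time)
```

In this sketch, user `u1` gets the row ingested on 2020-12-01 rather than the later one from 2021-02-01, because the later row did not exist yet at the requested timestamp.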