Tracking MLFlow

If you use MLFlow, you can duplicate Databand metrics to the MLFlow store and maintain data in the MLFlow system as well.

To Integrate MLFlow with Databand

  1. Run the following command to install the integration plugin:
pip install databand[mlflow]
  2. Add the following configuration to your SDK Configuration:

[mlflow_tracking]
## Enable tracking of MLFlow metrics to Databand store.
databand_tracking=True

## Optionally, define a URI for mlflow store; mlflow.get_tracking_uri() is used by default
duplicate_tracking_to=http://mlflow-store/
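
As a quick sanity check, the section above can be read with Python's standard `configparser`. This is a minimal sketch that assumes the configuration is plain INI-style text. It does not use Databand's own configuration loader, and the `CONFIG_TEXT` variable is only an illustration.

```python
import configparser

# The section from the step above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
[mlflow_tracking]
## Enable tracking of MLFlow metrics to Databand store.
databand_tracking=True

## Optionally, define a URI for mlflow store; mlflow.get_tracking_uri() is used by default
duplicate_tracking_to=http://mlflow-store/
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

section = parser["mlflow_tracking"]
# Lines starting with "#" are treated as comments, so only the two keys remain.
tracking_enabled = section.getboolean("databand_tracking")
# The key is optional, so fall back to None when it is absent.
duplicate_to = section.get("duplicate_tracking_to", fallback=None)

print(tracking_enabled, duplicate_to)
```

If `duplicate_tracking_to` is omitted, `duplicate_to` is `None` here; in the real integration the value of `mlflow.get_tracking_uri()` is used instead, as the comment in the configuration states.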

Task Example

The following example code shows how the logging works.

from dbnd import task
from mlflow import start_run, end_run
from mlflow import log_metric, log_param
from random import random, randint

@task
def mlflow_example():
    start_run()
    # params
    log_param("param1", randint(0, 100))
    log_param("param2", randint(0, 100))
    # metrics
    log_metric("foo1", random())
    log_metric("foo2", random())
    end_run()

Execution Flow

When you run `dbnd run mlflow_example`, the following happens in the backend:

  1. Databand creates a new DBND context
  2. `dbnd_on_pre_init_context` hook from `dbnd_mlflow` is triggered
  3. A new URI is computed for mlflow to use, for example: `dbnd://localhost:8081?duplicate_tracking_to=http%253A%252F%252Fmlflow-store%253A80%252F`
  4. The new URI is set with `mlflow.set_tracking_uri()`
  5. `mlflow_example` task starts:
  6. `mlflow.start_run()`
  7. `mlflow` reads `entry_points` for each installed package and finds:
  * `dbnd = dbnd_mlflow.tracking_store:get_dbnd_store`
  * `dbnd+s = dbnd_mlflow.tracking_store:get_dbnd_store`
  * `databand = dbnd_mlflow.tracking_store:get_dbnd_store`
  * `databand+s = dbnd_mlflow.tracking_store:get_dbnd_store`

8. `mlflow` creates `TrackingStoreClient` by using the new URI
9. URI schema instructs to use `dbnd_mlflow.tracking_store:get_dbnd_store`
  * `get_dbnd_store` creates dbnd `TrackingAPIClient`
  * `get_dbnd_store` creates mlflow tracking store to duplicate tracking to
  * `get_dbnd_store` returns `DatabandStore` instance
10. `log_param()`/`log_metric()`
  * calls to `DatabandStore`
  * calls to `TrackingAPIClient`
  * calls to the mlflow tracking store that tracking is duplicated to
11. `mlflow.end_run()`
12. `mlflow_example` ends
13. `dbnd_on_exit_context` hook from `dbnd_mlflow` is triggered
14. The original mlflow tracking URI is restored
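
The URI in step 3 and the scheme-based lookup in the later steps can be illustrated with a small standard-library sketch. It is not the `dbnd_mlflow` implementation: `build_tracking_uri`, `RecordingStore`, `DuplicatingStore`, `get_sketch_store`, and `resolve_store` are hypothetical names, and the stores only record calls in memory. It assumes that the `%253A`/`%252F` sequences in the example URI come from percent-encoding the mlflow store URL twice, which is consistent with the example but is not confirmed by this page.

```python
from urllib.parse import parse_qs, quote, unquote, urlsplit


def build_tracking_uri(databand_url, duplicate_to):
    """Hypothetical helper: point mlflow at Databand and carry the mlflow store URL along."""
    netloc = urlsplit(databand_url).netloc
    # Quoting twice reproduces the %253A / %252F sequences seen in the example URI.
    encoded = quote(quote(duplicate_to, safe=""), safe="")
    return f"dbnd://{netloc}?duplicate_tracking_to={encoded}"


class RecordingStore:
    """Stand-in for a real tracking backend; it only remembers what it was sent."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def log_metric(self, key, value):
        self.records.append((key, value))


class DuplicatingStore:
    """Stand-in for DatabandStore: forwards every call to both backends."""

    def __init__(self, primary, duplicate):
        self.primary = primary
        self.duplicate = duplicate

    def log_metric(self, key, value):
        self.primary.log_metric(key, value)
        if self.duplicate is not None:
            self.duplicate.log_metric(key, value)


def get_sketch_store(uri):
    """Plays the role of get_dbnd_store in this sketch."""
    parts = urlsplit(uri)
    # parse_qs undoes one level of percent-encoding; unquote undoes the second.
    raw = parse_qs(parts.query).get("duplicate_tracking_to", [None])[0]
    duplicate = RecordingStore(unquote(raw)) if raw else None
    return DuplicatingStore(RecordingStore(parts.netloc), duplicate)


# Scheme -> builder, mirroring the four entry points listed in step 7.
STORE_BUILDERS = {
    scheme: get_sketch_store
    for scheme in ("dbnd", "dbnd+s", "databand", "databand+s")
}


def resolve_store(uri):
    """Pick the store builder from the URI scheme, as in steps 8 and 9."""
    return STORE_BUILDERS[urlsplit(uri).scheme](uri)


uri = build_tracking_uri("http://localhost:8081", "http://mlflow-store:80/")
store = resolve_store(uri)
store.log_metric("foo1", 0.5)
print(uri)
print(store.duplicate.name, store.duplicate.records)
```

In this sketch a single `log_metric` call lands in both backends, which is the duplication behavior described in step 10.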