Log and register AI agents

Article
09/27/2024

Important

This feature is in Public Preview.

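Log AI agents using Mosaic AI Agent Framework. Logging an agent is the basis of the development process. Logging captures a "point in time" snapshot of the agent's code and configuration so that you can evaluate the quality of that configuration.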

Requirements

Create an AI agent before logging it.

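Code-based logging vs. serialization-based logging

You can use code-based MLflow logging or serialization-based MLflow logging. Databricks recommends code-based logging.

Code-based MLflow logging: The chain's code is captured as a Python file. The Python environment is captured as a list of packages. When the chain is deployed, the Python environment is restored and the chain's code is executed to load the chain into memory, so that it can be invoked when the endpoint is called.

Serialization-based MLflow logging: The chain's code and its current state in the Python environment are serialized to disk, typically with libraries such as pickle or joblib. When the chain is deployed, the Python environment is restored and the serialized object is loaded into memory, so that it can be invoked when the endpoint is called.

The following table shows the advantages and disadvantages of each method.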

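Method	Advantages	Disadvantages
Code-based MLflow logging	- Overcomes the inherent limitations of serialization, which many popular GenAI libraries do not support. - Saves a copy of the original code for later reference. - No need to restructure your code into a single serializable object.	`log_model(...)` must be called from a different notebook than the chain's code (called a driver notebook).
Serialization-based MLflow logging	`log_model(...)` can be called from the same notebook where the model is defined.	- The original code is not available. - All libraries and objects used in the chain must support serialization.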
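
The serialization limitation listed in the table can be reproduced with the standard library alone. The sketch below is a toy illustration and not part of the Agent Framework: `ToyChain` and `try_pickle` are hypothetical names, and the `threading.Lock` stands in for the live clients, sessions, or locks that GenAI objects often hold.

```python
import pickle
import threading


class ToyChain:
    """Hypothetical stand-in for a chain that holds a live, unpicklable resource."""

    def __init__(self):
        # Assumption: the lock mimics a client connection or session object.
        self._lock = threading.Lock()

    def invoke(self, text):
        with self._lock:
            return text.upper()


def try_pickle(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


chain = ToyChain()
can_pickle = try_pickle(chain)
print(f"works: {chain.invoke('hi')}, picklable: {can_pickle}")
```

The chain works, yet it cannot be pickled. This is the situation that code-based logging sidesteps by re-executing the source file at load time instead of restoring a serialized object.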

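For code-based logging, the code that logs your agent or chain must be in a separate notebook from your chain code. This notebook is called the driver notebook. For an example, see Example notebooks.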

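Code-based logging with LangChain

1. Create a notebook or Python file with your code. In this example, the notebook or file is named chain.py. The notebook or file must contain a LangChain chain, referred to here as lc_chain.
2. Include mlflow.models.set_model(lc_chain) in the notebook or file.
3. Create a new notebook to serve as the driver notebook (called driver.py in this example).
4. In the driver notebook, use mlflow.langchain.log_model(lc_model="/path/to/chain.py") to run chain.py and log the results to an MLflow model.
5. Deploy the model. See Deploy an agent for generative AI application. The deployment of your agent might depend on other Databricks resources, such as a vector search index and model serving endpoints. For LangChain agents:
   - MLflow log_model infers the dependencies required by the chain and logs them to the MLmodel file in the logged model artifact.
   - During deployment, databricks.agents.deploy automatically creates the M2M OAuth tokens required to access and communicate with these inferred resource dependencies.
6. When the serving environment is loaded, chain.py is executed.
7. When a serving request comes in, lc_chain.invoke(...) is called.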
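
The load-by-execution flow in the steps above can be mimicked with a toy loader. This sketch uses only the standard library and is not how MLflow is implemented: `load_from_code`, `EchoChain`, and the injected `set_model` are hypothetical stand-ins for the roles played by the serving environment, a LangChain chain, and `mlflow.models.set_model`.

```python
import tempfile
from pathlib import Path

# Hypothetical contents of chain.py. `set_model` is a toy stand-in for
# mlflow.models.set_model and is injected by the loader below.
CHAIN_SOURCE = '''
class EchoChain:
    """Toy replacement for a LangChain chain; only invoke() is modeled."""

    def invoke(self, request):
        last_message = request["messages"][-1]["content"]
        return "echo: " + last_message


lc_chain = EchoChain()
set_model(lc_chain)
'''


def load_from_code(path):
    """Execute the file at `path` and return the single object it registered."""
    registered = []
    namespace = {"__name__": "chain", "set_model": registered.append}
    source = Path(path).read_text(encoding="utf-8")
    exec(compile(source, str(path), "exec"), namespace)
    if len(registered) != 1:
        raise ValueError("chain file must call set_model exactly once")
    return registered[0]


# Write chain.py to a temporary directory, then load it by executing the file.
with tempfile.TemporaryDirectory() as tmp_dir:
    chain_path = Path(tmp_dir) / "chain.py"
    chain_path.write_text(CHAIN_SOURCE, encoding="utf-8")
    loaded_chain = load_from_code(chain_path)

request = {
    "messages": [
        {"role": "user", "content": "What is Retrieval-augmented Generation?"}
    ]
}
response = loaded_chain.invoke(request)
print(response)
```

The object handed to `set_model` is the only thing the loader keeps, which is why the chain file needs to make that call.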


import mlflow

code_path = "/Workspace/Users/first.last/chain.py"
config_path = "/Workspace/Users/first.last/config.yml"

input_example = {
    "messages": [
        {
            "role": "user",
            "content": "What is Retrieval-augmented Generation?",
        }
    ]
}

# example using LangChain
with mlflow.start_run():
  logged_chain_info = mlflow.langchain.log_model(
    lc_model=code_path,
    model_config=config_path, # If you specify this parameter, this is the configuration that is used for training the model. The development_config is overwritten.
    artifact_path="chain", # This string is used as the path inside the MLflow model where artifacts are stored
    input_example=input_example, # Must be a valid input to your chain
    example_no_conversion=True, # Required
  )

print(f"MLflow Run: {logged_chain_info.run_id}")
print(f"Model URI: {logged_chain_info.model_uri}")

# To verify that the model has been logged correctly, load the chain and call `invoke`:
model = mlflow.langchain.load_model(logged_chain_info.model_uri)
model.invoke(input_example)

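Code-based logging with PyFunc

1. Create a notebook or Python file with your code. In this example, the notebook or file is named chain.py. The notebook or file must contain a PyFunc class, referred to here as PyFuncClass.
2. Include mlflow.models.set_model(PyFuncClass) in the notebook or file.
3. Create a new notebook to serve as the driver notebook (called driver.py in this example).
4. In the driver notebook, use mlflow.pyfunc.log_model(python_model="/path/to/chain.py", resources="/path/to/resources.yaml") to run chain.py and log the results to an MLflow model. The resources parameter declares any resources needed to serve the model, such as a vector search index or a serving endpoint that serves a foundation model. See the example resources file for PyFunc.
5. Deploy the model. See Deploy an agent for generative AI application.
6. When the serving environment is loaded, chain.py is executed.
7. When a serving request comes in, PyFuncClass.predict(...) is called.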
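
As a rough picture of the class the first step asks for, the sketch below shows a `predict` method that accepts the request format used by `input_example` in this article. To stay runnable with the standard library only, `PyFuncClass` is a plain class here; in a real chain.py it would presumably subclass `mlflow.pyfunc.PythonModel` and be passed to `mlflow.models.set_model`. The reply text is a placeholder, not real agent logic.

```python
class PyFuncClass:
    """Duck-typed stand-in for an mlflow.pyfunc.PythonModel subclass."""

    def predict(self, context, model_input):
        # `context` would carry artifacts and config at serving time; unused here.
        messages = model_input["messages"]
        user_turns = [m["content"] for m in messages if m["role"] == "user"]
        if not user_turns:
            raise ValueError("expected at least one message with role 'user'")
        # Placeholder logic: a real agent would call retrievers and LLMs here.
        return {"role": "assistant", "content": f"You asked: {user_turns[-1]}"}


input_example = {
    "messages": [
        {"role": "user", "content": "What is Retrieval-augmented Generation?"}
    ]
}

agent = PyFuncClass()
reply = agent.predict(context=None, model_input=input_example)
print(reply["content"])
```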

import mlflow

code_path = "/Workspace/Users/first.last/chain.py"
config_path = "/Workspace/Users/first.last/config.yml"

input_example = {
    "messages": [
        {
            "role": "user",
            "content": "What is Retrieval-augmented Generation?",
        }
    ]
}

# example using PyFunc model

resources_path = "/Workspace/Users/first.last/resources.yml"

with mlflow.start_run():
  logged_chain_info = mlflow.pyfunc.log_model(
    python_model=code_path,
    artifact_path="chain",
    input_example=input_example,
    resources=resources_path,
    example_no_conversion=True,
  )

print(f"MLflow Run: {logged_chain_info.run_id}")
print(f"Model URI: {logged_chain_info.model_uri}")

# To verify that the model has been logged correctly, load the model and call `predict`:
model = mlflow.pyfunc.load_model(logged_chain_info.model_uri)
model.predict(input_example)

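Specify resources for a PyFunc agent

You can specify the resources that are required to serve the model, such as a vector search index and a serving endpoint. For LangChain, resources are picked up automatically and logged along with the model.

When you deploy a pyfunc-flavored agent, you must manually add any resource dependencies of the deployed agent. An M2M OAuth token with access to all the resources specified in the resources parameter is created and provided to the deployed agent.

Note

You can override the resources that your endpoint has permission to use by manually specifying the resources when you log the chain.

The following shows how to add serving endpoint and vector search index dependencies by specifying them in the resources parameter.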

import mlflow
from mlflow.models.resources import (
    DatabricksServingEndpoint,
    DatabricksVectorSearchIndex,
)

# code_path and input_example are defined as in the PyFunc example above
with mlflow.start_run():
  logged_chain_info = mlflow.pyfunc.log_model(
    python_model=code_path,
    artifact_path="chain",
    input_example=input_example,
    example_no_conversion=True,
    resources=[
      DatabricksServingEndpoint(endpoint_name="databricks-mixtral-8x7b-instruct"),
      DatabricksServingEndpoint(endpoint_name="databricks-bge-large-en"),
      DatabricksVectorSearchIndex(index_name="rag.studio_bugbash.databricks_docs_index"),
    ],
  )

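You can also add resources by specifying them in a resources.yaml file and referencing that file path in the resources parameter. An M2M OAuth token with access to all the resources specified in resources.yaml is created and provided to the deployed agent.

The following is an example resources.yaml file that defines model serving endpoints and a vector search index.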


api_version: "1"
databricks:
  vector_search_index:
    - name: "catalog.schema.my_vs_index"
  serving_endpoint:
    - name: databricks-dbrx-instruct
    - name: databricks-bge-large-en
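
If the resource list is produced by a script, a file with the layout shown above can be generated rather than written by hand. The helper below is a hypothetical sketch that uses plain string formatting (no YAML library) and assumes the names contain no characters that need YAML escaping. `render_resources_yaml` is not an MLflow or Databricks API.

```python
def render_resources_yaml(vector_search_indexes, serving_endpoints):
    """Render a resources.yaml document with the layout of the sample above."""
    lines = ['api_version: "1"', "databricks:"]
    if vector_search_indexes:
        lines.append("  vector_search_index:")
        lines.extend(f'    - name: "{name}"' for name in vector_search_indexes)
    if serving_endpoints:
        lines.append("  serving_endpoint:")
        lines.extend(f'    - name: "{name}"' for name in serving_endpoints)
    return "\n".join(lines) + "\n"


resources_yaml = render_resources_yaml(
    vector_search_indexes=["catalog.schema.my_vs_index"],
    serving_endpoints=["databricks-dbrx-instruct", "databricks-bge-large-en"],
)
print(resources_yaml)
```

Writing the returned string to a file gives a path that could then be supplied through the resources parameter described earlier.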

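Register the chain to Unity Catalog

Before you deploy the chain, you must register it to Unity Catalog. When you register the chain, it is packaged as a model in Unity Catalog, and you can use Unity Catalog permissions to authorize access to the resources in the chain.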

import mlflow

mlflow.set_registry_uri("databricks-uc")

catalog_name = "test_catalog"
schema_name = "schema"
model_name = "chain_name"

model_name = catalog_name + "." + schema_name + "." + model_name
uc_model_info = mlflow.register_model(model_uri=logged_chain_info.model_uri, name=model_name)
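
The three-level name built by string concatenation above can also be assembled and sanity-checked in one place. The helper is a hypothetical sketch: `uc_model_name` is not an MLflow function, and the only rules it enforces (non-empty parts, no dots inside a part) are assumptions chosen to keep the `catalog.schema.model` shape intact, not the full Unity Catalog naming rules.

```python
def uc_model_name(catalog, schema, model):
    """Join three name parts into catalog.schema.model after basic checks."""
    parts = {"catalog": catalog, "schema": schema, "model": model}
    for label, value in parts.items():
        if not value or not value.strip():
            raise ValueError(f"{label} name must not be empty")
        if "." in value:
            raise ValueError(f"{label} name must not contain '.': {value!r}")
    return ".".join(parts.values())


full_name = uc_model_name("test_catalog", "schema", "chain_name")
print(full_name)
```

The result can be passed as `name=` to `mlflow.register_model`, as in the snippet above.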

Next steps

Add traces to an AI agent.
Deploy an AI agent.
