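Build a RAG bot in Teams

Article
05/21/2024

Advanced Q&A chatbots are powerful apps built with the help of large language models (LLMs). These chatbots answer questions by pulling information from specific sources using a method called Retrieval-Augmented Generation (RAG). The RAG architecture has two main flows:

Data ingestion: A pipeline that ingests data from a source and indexes it. This usually happens offline.
Retrieval and generation: The RAG chain that takes the user query at run time, retrieves the relevant data from the index, and passes it to the model.

With Microsoft Teams, you can build a conversational bot with RAG to create an enhanced experience that maximizes productivity. Teams Toolkit provides a series of ready-to-use app templates in the Chat With Your Data category. The templates combine Azure AI Search, Microsoft 365 SharePoint, and custom APIs as different data sources with LLMs to create a conversational search experience in Teams.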

Prerequisites

Install	For using...
Visual Studio Code	JavaScript, TypeScript, or Python build environments. Use the latest version.
Teams Toolkit	Microsoft Visual Studio Code extension that creates a project scaffolding for your app. Use the latest version.
Node.js	Back-end JavaScript runtime environment. For more information, see the Node.js version compatibility table for project type.
Microsoft Teams	Microsoft Teams to collaborate with everyone you work with through apps for chat, meetings, and calls, all in one place.
Azure OpenAI	First create your OpenAI API key to use OpenAI's Generative Pretrained Transformer (GPT). If you want to host your app or access resources in Azure, you must create an Azure OpenAI service.

Create a new basic AI chatbot project

Open Visual Studio Code.
Select the Teams Toolkit icon in the Visual Studio Code Activity Bar.
Select Create a New App.

Screenshot shows the location of the Create New Project link in the Teams Toolkit sidebar.
Select Custom Copilot.
Select Chat With Your Data.
Select Customize.
Select JavaScript.
Select Azure OpenAI or OpenAI.
Enter your Azure OpenAI or OpenAI credentials based on the service you selected. Select Enter.
Select Default folder.

To change the default location, follow these steps:
1. Select Browse.
2. Select the location for the project workspace.
3. Select Select Folder.
Enter a name for your app and then select the Enter key.

You've successfully created your Chat With Your Data project workspace.
Under EXPLORER, go to the env > .env.testtool.user file.
Update the following values:
- SECRET_AZURE_OPENAI_API_KEY=<your-key>
- AZURE_OPENAI_ENDPOINT=<your-endpoint>
- AZURE_OPENAI_DEPLOYMENT_NAME=<your-deployment>
To debug your app, select the F5 key, or from the left pane select Run and Debug (Ctrl+Shift+D) and then select Debug in Test Tool (Preview) from the dropdown list.

Test Tool opens the bot in a webpage.

Take a tour of the bot app source code

Folder	Contents
`.vscode`	Visual Studio Code files for debugging.
`appPackage`	Templates for the Teams app manifest.
`env`	Environment files.
`infra`	Templates for provisioning Azure resources.
`src`	The source code for the app.
`src/index.js`	Sets up the bot app server.
`src/adapter.js`	Sets up the bot adapter.
`src/config.js`	Defines the environment variables.
`src/prompts/chat/skprompt.txt`	Defines the prompt.
`src/prompts/chat/config.json`	Configures the prompt.
`src/app/app.js`	Handles business logic for the RAG bot.
`src/app/myDataSource.js`	Defines the data source.
`src/data/*.md`	Raw text data sources.
`teamsapp.yml`	The main Teams Toolkit project file. It defines the properties and the configuration stage definitions.
`teamsapp.local.yml`	Overrides `teamsapp.yml` with actions that enable local execution and debugging.
`teamsapp.testtool.yml`	Overrides `teamsapp.yml` with actions that enable local execution and debugging in Teams App Test Tool.

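RAG scenarios for Teams AI

In the AI context, vector databases are widely used as RAG storage. They store embeddings data and provide vector-similarity search. Teams AI library provides utilities that help you create embeddings for the given inputs.

Tip

Teams AI library doesn't provide a vector database implementation, so you need to add your own logic to process the created embeddings.

The JavaScript example comes first, followed by the Python example.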

// create OpenAIEmbeddings instance
const model = new OpenAIEmbeddings({ ... endpoint, apikey, model, ... });

// create embeddings for the given inputs
const embeddings = await model.createEmbeddings(model, inputs);

// your own logic to process embeddings

# create OpenAIEmbeddings instance
model = OpenAIEmbeddings(OpenAIEmbeddingsOptions(api_key, model))

# create embeddings for the given inputs
embeddings = await model.create_embeddings(inputs)

# your own logic to process embeddings
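As one possible shape for "your own logic to process embeddings", the following is a minimal sketch of an in-memory store that ranks items by cosine similarity. The `StoredEmbedding` type and the `cosineSimilarity` and `topK` helpers are hypothetical names introduced for illustration; they aren't part of Teams AI library, and a production app would normally delegate this work to a vector database.

```typescript
// Hypothetical in-memory record; a real app would keep these in a vector database.
interface StoredEmbedding {
  id: string;
  text: string;
  vector: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 if either is a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k stored items that are most similar to the query vector.
function topK(items: StoredEmbedding[], query: number[], k: number): StoredEmbedding[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(item.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.item);
}
```

A linear scan like this is only reasonable for a small number of items; approximate nearest-neighbor indexes such as HNSW exist to avoid comparing the query against every stored vector.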

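The following diagram shows how Teams AI library provides functionality that eases each step of the retrieval and generation process:

Screenshot shows the RAG scenario.

Handle input: The most straightforward way is to pass the user's input to retrieval unchanged. However, if you want to customize the input before retrieval, you can add an activity handler for certain incoming activities.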
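As an illustration of customizing the input before retrieval, the following sketch cleans up the message text. It assumes that the incoming text can contain Teams mention markup of the form `<at>...</at>`; the `normalizeQuery` function is a hypothetical helper that you would call from your own handler, not a Teams AI library API.

```typescript
// Removes <at>...</at> mention markup and collapses whitespace so that
// only the user's actual question is used as the retrieval query.
function normalizeQuery(rawText: string): string {
  return rawText
    .replace(/<at>.*?<\/at>/gi, " ") // drop mention tags and their content
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .trim();
}
```

For example, `normalizeQuery("<at>MyBot</at>  when is the   cafe open? ")` yields `"when is the cafe open?"`.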

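Retrieve DataSource: Teams AI library provides the DataSource interface so that you can add your own retrieval logic. You need to create your own DataSource instance, and Teams AI library calls it on demand.

The JavaScript example comes first, followed by the Python example.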

class MyDataSource implements DataSource {
  /**
    * Name of the data source.
    */
  public readonly name = "my-datasource";

  /**
    * Renders the data source as a string of text.
    * @param context Turn context for the current turn of conversation with the user.
    * @param memory An interface for accessing state values.
    * @param tokenizer Tokenizer to use when rendering the data source.
    * @param maxTokens Maximum number of tokens allowed to be rendered.
    * @returns The text to inject into the prompt as a `RenderedPromptSection` object.
    */
  renderData(
    context: TurnContext,
    memory: Memory,
    tokenizer: Tokenizer,
    maxTokens: number
  ): Promise<RenderedPromptSection<string>> {
    ...
  }
}

class MyDataSource(DataSource):
  def __init__(self):
    self.name = "my_datasource_name"

  def name(self):
    return self.name

  async def render_data(self, _context: TurnContext, memory: Memory, tokenizer: Tokenizer, maxTokens: int):
    # your render data logic
    ...

Call AI with prompt: In the Teams AI prompt system, you can inject a DataSource by adjusting the augmentation.data_sources configuration section. This connects the prompt with the DataSource, and the library orchestrator injects the DataSource text into the final prompt. For more information, see authorprompt. For example, in the prompt's config.json file:
```
{
    "schema": 1.1,
    ...
    "augmentation": {
        "data_sources": {
            "my-datasource": 1200
        }
    }
}
```
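To make the role of the number in `data_sources` more concrete, the following sketch shows one way an orchestrator could treat it as a per-source token budget. This is an illustrative assumption, not the library's actual implementation: `SimpleDataSource`, `countTokens`, and `buildPrompt` are hypothetical names, and the word-based token count is a crude stand-in for a real tokenizer.

```typescript
// Simplified stand-in for illustration; not the Teams AI library's DataSource type.
interface SimpleDataSource {
  name: string;
  render(maxTokens: number): string;
}

// Naive token count: whitespace-separated words (real tokenizers differ).
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Calls each configured data source with its own budget and appends its text to the prompt.
function buildPrompt(
  basePrompt: string,
  budgets: Record<string, number>,
  sources: SimpleDataSource[]
): string {
  const sections: string[] = [basePrompt];
  for (const [sourceName, maxTokens] of Object.entries(budgets)) {
    const source = sources.find((s) => s.name === sourceName);
    if (!source) {
      throw new Error(`Data source '${sourceName}' is not registered`);
    }
    const text = source.render(maxTokens);
    if (countTokens(text) > maxTokens) {
      throw new Error(`Data source '${sourceName}' exceeded its budget`);
    }
    sections.push(text);
  }
  return sections.join("\n\n");
}
```

The point of the sketch is the lookup by name: the key in `data_sources` must match the `name` of a registered data source, otherwise there is nothing to inject.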
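Build response: By default, Teams AI library sends the AI-generated response to the user as a text message. If you want to customize the response, you can override the default SAY action or explicitly call the AI model to build your replies, for example, with Adaptive Cards.

Here's the minimal set of implementations needed to add RAG to your app. In general, you implement a DataSource that injects your knowledge into the prompt so that the AI can generate a response based on that knowledge.

Create a myDataSource.ts file that implements the DataSource interface: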

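import { DataSource, Memory, RenderedPromptSection, Tokenizer } from "@microsoft/teams-ai";
import { TurnContext } from "botbuilder";

export class MyDataSource implements DataSource {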
  public readonly name = "my-datasource";
  public async renderData(
    context: TurnContext,
    memory: Memory,
    tokenizer: Tokenizer,
    maxTokens: number
  ): Promise<RenderedPromptSection<string>> {
    const input = memory.getValue('temp.input') as string;
    let knowledge = "There's no knowledge found.";

    // hard-code knowledge
    if (input?.includes("shuttle bus")) {
      knowledge = "Company's shuttle bus may be 15 minutes late on rainy days.";
    } else if (input?.includes("cafe")) {
      knowledge = "The Cafe's available time is 9:00 to 17:00 on working days and 10:00 to 16:00 on weekends and holidays."
    }

    return {
      output: knowledge,
      length: knowledge.length,
      tooLong: false
    }
  }
}

Register the DataSource in the app.ts file:

The JavaScript example comes first, followed by the Python example.

  // Register your data source to prompt manager
  planner.prompts.addDataSource(new MyDataSource());

  planner.prompts.add_data_source(MyDataSource())

Create a prompts/qa/skprompt.txt file and add the following text:

The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly to answer user's question.

Base your answer off the text below:

Create a prompts/qa/config.json file and add the following code to connect it with the data source:

{
    "schema": 1.1,
    "description": "Chat with QA Assistant",
    "type": "completion",
    "completion": {
        "model": "gpt-35-turbo",
        "completion_type": "chat",
        "include_history": true,
        "include_input": true,
        "max_input_tokens": 2800,
        "max_tokens": 1000,
        "temperature": 0.9,
        "top_p": 0.0,
        "presence_penalty": 0.6,
        "frequency_penalty": 0.0,
        "stop_sequences": []
    },
    "augmentation": {
        "data_sources": {
            "my-datasource": 1200
        }
    }
}

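Select data sources

In the Chat With Your Data or RAG scenarios, Teams Toolkit provides the following types of data sources:

Customize: Lets you fully control the data ingestion to build your own vector index and use it as a data source. For more information, see Build your own data ingestion.

You can also implement any data source instance that connects to your own data, for example by using the Azure Cosmos DB Vector Database Extension or the Azure PostgreSQL Server vector extension as a vector database, or the Bing Web Search API to get the latest web content.
Azure AI Search: Provides a sample that adds your documents to Azure AI Search Service and then uses the search index as a data source.
Custom API: Lets your chatbot invoke the API defined in an OpenAPI description document to retrieve domain data from the API service.
Microsoft Graph and SharePoint: Provides a sample that uses Microsoft 365 content from the Microsoft Graph Search API as a data source.

Build your own data ingestion

To build your data ingestion, follow these steps:

Load your source documents: Ensure that your documents contain meaningful text, because the embedding model takes only text as input.
Split into chunks: The embedding model has an input token limit, so split the documents to avoid API call failures.
Call the embedding model: Call the embedding model APIs to create embeddings for the given inputs.
Store embeddings: Store the created embeddings in a vector database. Also include useful metadata and the raw content for later reference.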
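The four steps above can be sketched end to end without any external service. In the following minimal example, `fakeEmbed` is a deliberately trivial placeholder (it counts vowels) that stands in for a real embedding model, the array stands in for a vector database, and all names (`IndexedChunk`, `chunk`, `ingest`) are hypothetical.

```typescript
// One stored record: the embedding plus the raw content and metadata for later reference.
interface IndexedChunk {
  id: string;
  content: string;
  source: string;
  vector: number[];
}

// Placeholder "embedding": vowel counts. NOT a real embedding model.
function fakeEmbed(text: string): number[] {
  const lower = text.toLowerCase();
  return ["a", "e", "i", "o", "u"].map((vowel) => lower.split(vowel).length - 1);
}

// Step 2: split into fixed-size character chunks (no overlap, for brevity).
function chunk(text: string, size: number): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

// Steps 1 to 4: take already-loaded text, split it, embed each chunk, and store the result.
function ingest(source: string, text: string, size: number, target: IndexedChunk[]): void {
  chunk(text, size).forEach((content, i) => {
    target.push({
      id: `${source}-${i + 1}`,
      content,
      source,
      vector: fakeEmbed(content),
    });
  });
}
```

Keeping `content` and `source` next to the vector is what later allows the retrieval step to return readable text instead of raw numbers.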

Sample code

loader.ts: Plain text as source input.

import * as fs from "node:fs";

export function loadTextFile(path: string): string {
  return fs.readFileSync(path, "utf-8");
}

splitter.ts: Split text into chunks, with an overlap between consecutive chunks.


// split words by delimiters.
const delimiters = [" ", "\t", "\r", "\n"];

export function split(content: string, length: number, overlap: number): Array<string> {
  const results = new Array<string>();
  let cursor = 0, curChunk = 0;
  results.push("");
  while(cursor < content.length) {
    const curChar = content[cursor];
    if (delimiters.includes(curChar)) {
      // check chunk length
      while (curChunk < results.length && results[curChunk].length >= length) {
        curChunk ++;
      }
      for (let i = curChunk; i < results.length; i++) {
        results[i] += curChar;
      }
      if (results[results.length - 1].length >= length - overlap) {
        results.push("");
      }
    } else {
      // append
      for (let i = curChunk; i < results.length; i++) {
        results[i] += curChar;
      }
    }
    cursor ++;
  }
  while (curChunk < results.length - 1) {
    results.pop();
  }
  return results;
}
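To see what overlap means in isolation, the following sketch uses a simpler strategy than splitter.ts: it counts words instead of characters and slides a fixed window over them. It's an alternative written for illustration, with the hypothetical name `chunkWords`, and it doesn't reproduce the exact output of the splitter above.

```typescript
// Splits text into chunks of at most `size` words.
// Consecutive chunks share `overlap` words so that context isn't cut off at chunk borders.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) {
      break; // this window already reached the end of the text
    }
  }
  return chunks;
}
```

With `size` 4 and `overlap` 1, the text "one two three four five six seven" becomes two chunks, "one two three four" and "four five six seven", where "four" is the shared word.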

embeddings.ts: Use the Teams AI library OpenAIEmbeddings class to create embeddings.

import { OpenAIEmbeddings } from "@microsoft/teams-ai";

const embeddingClient = new OpenAIEmbeddings({
  azureApiKey: "<your-aoai-key>",
  azureEndpoint: "<your-aoai-endpoint>",
  azureDeployment: "<your-embedding-deployment, e.g., text-embedding-ada-002>"
});

export async function createEmbeddings(content: string): Promise<number[]> {
  const response = await embeddingClient.createEmbeddings(content);
  return response.output[0];
}

searchIndex.ts: Create an Azure AI Search index.

import { SearchIndexClient, AzureKeyCredential, SearchIndex } from "@azure/search-documents";

const endpoint = "<your-search-endpoint>";
const apiKey = "<your-search-key>";
const indexName = "<your-index-name>";

const indexDef: SearchIndex = {
  name: indexName,
  fields: [
    {
      type: "Edm.String",
      name: "id",
      key: true,
    },
    {
      type: "Edm.String",
      name: "content",
      searchable: true,
    },
    {
      type: "Edm.String",
      name: "filepath",
      searchable: true,
      filterable: true,
    },
    {
      type: "Collection(Edm.Single)",
      name: "contentVector",
      searchable: true,
      vectorSearchDimensions: 1536,
      vectorSearchProfileName: "default"
    }
  ],
  vectorSearch: {
    algorithms: [{
      name: "default",
      kind: "hnsw"
    }],
    profiles: [{
      name: "default",
      algorithmConfigurationName: "default"
    }]
  },
  semanticSearch: {
    defaultConfigurationName: "default",
    configurations: [{
      name: "default",
      prioritizedFields: {
        contentFields: [{
          name: "content"
        }]
      }
    }]
  }
};

export async function createNewIndex(): Promise<void> {
  const client = new SearchIndexClient(endpoint, new AzureKeyCredential(apiKey));
  await client.createIndex(indexDef);
}
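The index above declares `contentVector` with 1536 dimensions, which matches the output size of text-embedding-ada-002. As a hedged sketch, a small check such as the following can run before upload so that a mismatch between the embedding deployment and the index surfaces early. `EXPECTED_DIMENSIONS` and `assertVectorShape` are hypothetical names, not part of any SDK.

```typescript
// Must match vectorSearchDimensions in the index definition (1536 in this article).
const EXPECTED_DIMENSIONS = 1536;

// Throws if the vector doesn't have the expected length or contains NaN or Infinity.
function assertVectorShape(vector: number[], expected: number = EXPECTED_DIMENSIONS): void {
  if (vector.length !== expected) {
    throw new Error(`Expected ${expected} dimensions but got ${vector.length}`);
  }
  if (!vector.every((value) => Number.isFinite(value))) {
    throw new Error("Vector contains a non-finite value");
  }
}
```

If you switch to an embedding model with a different output size, both the index definition and this constant need to change together.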

searchIndexer.ts: Upload the created embeddings and other fields to the Azure AI Search index.

import { AzureKeyCredential, SearchClient } from "@azure/search-documents";

export interface Doc {
  id: string,
  content: string,
  filepath: string,
  contentVector: number[]
}

const endpoint = "<your-search-endpoint>";
const apiKey = "<your-search-key>";
const indexName = "<your-index-name>";
const searchClient: SearchClient<Doc> = new SearchClient<Doc>(endpoint, indexName, new AzureKeyCredential(apiKey));

export async function indexDoc(doc: Doc): Promise<boolean> {
  const response = await searchClient.mergeOrUploadDocuments([doc]);
  return response.results.every((result) => result.succeeded);
}

index.ts: Orchestrate the above components.

import { createEmbeddings } from "./embeddings";
import { loadTextFile } from "./loader";
import { createNewIndex } from "./searchIndex";
import { indexDoc } from "./searchIndexer";
import { split } from "./splitter";

async function main() {
  // Only need to call once
  await createNewIndex();

  // local files as source input
  const files = [`${__dirname}/data/A.md`, `${__dirname}/data/B.md`];
  for (const file of files) {
    // load file
    const fullContent = loadTextFile(file);

    // split into chunks
    const contents = split(fullContent, 1000, 100);
    let partIndex = 0;
    for (const content of contents) {
      partIndex ++;
      // create embeddings
      const embeddings = await createEmbeddings(content);

      // upload to index
      await indexDoc({
        id: `${file.replace(/[^a-z0-9]/ig, "")}___${partIndex}`,
        content: content,
        filepath: file,
        contentVector: embeddings,
      });
    }
  }
}

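// Report failures instead of leaving a rejected promise unhandled
main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});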
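The `id` built in index.ts strips every character that isn't a letter or digit from the file path and appends the chunk number. The following sketch isolates that pattern so its behavior can be checked. `makeDocId` and `isValidKey` are hypothetical helpers; the validity check assumes the Azure AI Search rule that document keys may contain only letters, digits, underscores, dashes, and equal signs.

```typescript
// Builds a document key from a file path and a 1-based chunk number,
// mirroring the id pattern used in index.ts.
function makeDocId(filePath: string, partIndex: number): string {
  return `${filePath.replace(/[^a-z0-9]/gi, "")}___${partIndex}`;
}

// True when the key uses only characters assumed to be allowed in document keys.
function isValidKey(key: string): boolean {
  return /^[A-Za-z0-9_\-=]+$/.test(key);
}
```

One consequence of stripping characters is that different paths can collapse to the same key: `a/b.md` and `ab.md` both produce `abmd___1`. If your source files could collide this way, a hash of the full path might be a safer basis for the key.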

loader.py: Plain text as source input.

def load_text_file(path: str) -> str:
    with open(path, 'r', encoding='utf-8') as file:
        return file.read()

splitter.py: Split text into chunks, with an overlap between consecutive chunks.

def split(content: str, length: int, overlap: int) -> list[str]:
    delimiters = [" ", "\t", "\r", "\n"]
    results = [""]
    cursor = 0
    cur_chunk = 0
    while cursor < len(content):
        cur_char = content[cursor]
        if cur_char in delimiters:
            while cur_chunk < len(results) and len(results[cur_chunk]) >= length:
                cur_chunk += 1
            for i in range(cur_chunk, len(results)):
                results[i] += cur_char
            if len(results[-1]) >= length - overlap:
                results.append("")
        else:
            for i in range(cur_chunk, len(results)):
                results[i] += cur_char
        cursor += 1
    while cur_chunk < len(results) - 1:
        results.pop()
    return results

embeddings.py: Use the Teams AI library OpenAIEmbeddings class to create embeddings.

async def create_embeddings(text: str, embeddings):
    result = await embeddings.create_embeddings(text)

    return result.output[0]

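search_index.py: Create an Azure AI Search index.

from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    CorsOptions,
    HnswAlgorithmConfiguration,
    SearchableField,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchProfile,
)

async def create_index_if_not_exists(client: SearchIndexClient, name: str):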
    doc_index = SearchIndex(
        name=name,
        fields = [
            SimpleField(name="docId", type=SearchFieldDataType.String, key=True),
            SimpleField(name="docTitle", type=SearchFieldDataType.String),
            SearchableField(name="description", type=SearchFieldDataType.String, searchable=True),
            SearchField(name="descriptionVector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), searchable=True, vector_search_dimensions=1536, vector_search_profile_name='my-vector-config'),
        ],
        scoring_profiles=[],
        cors_options=CorsOptions(allowed_origins=["*"]),
        vector_search = VectorSearch(
            profiles=[VectorSearchProfile(name="my-vector-config", algorithm_configuration_name="my-algorithms-config")],
            algorithms=[HnswAlgorithmConfiguration(name="my-algorithms-config")],
        )
    )

    client.create_or_update_index(doc_index)

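search_indexer.py: Upload the created embeddings and other fields to the Azure AI Search index.

import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from teams.ai.embeddings import AzureOpenAIEmbeddings, AzureOpenAIEmbeddingsOptions

from embeddings import create_embeddings
from search_index import create_index_if_not_exists
from loader import load_text_file
from splitter import split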

async def get_doc_data(embeddings):
    file_path=f'{os.getcwd()}/my_file_path_1'
    raw_description1 = split(load_text_file(file_path), 1000, 100)
    doc1 = {
        "docId": "1",
        "docTitle": "my_titile_1",
        "description": raw_description1,
        "descriptionVector": await create_embeddings(raw_description1, embeddings=embeddings),
    }

    file_path=f'{os.getcwd()}/my_file_path_2'
    raw_description2 = split(load_text_file(file_path), 1000, 100)
    doc2 = {
        "docId": "2",
        "docTitle": "my_titile_2",
        "description": raw_description2,
        "descriptionVector": await create_embeddings(raw_description2, embeddings=embeddings),
    }

    return [doc1, doc2]

async def setup(search_api_key, search_api_endpoint):
    index = 'my_index_name'
    credentials = AzureKeyCredential(search_api_key)
    search_index_client = SearchIndexClient(search_api_endpoint, credentials)
    await create_index_if_not_exists(search_index_client, index)

    search_client = SearchClient(search_api_endpoint, index, credentials)
    embeddings=AzureOpenAIEmbeddings(AzureOpenAIEmbeddingsOptions(
          azure_api_key="<your-aoai-key>",
          azure_endpoint="<your-aoai-endpoint>",
          azure_deployment="<your-embedding-deployment, e.g., text-embedding-ada-002>"
    ))
    data = await get_doc_data(embeddings=embeddings)
    # Assumes the synchronous SearchClient (azure.search.documents), whose methods aren't awaitable
    search_client.merge_or_upload_documents(data)

index.py: Orchestrate the above components.

import asyncio

from search_indexer import setup

search_api_key = '<your-key>'
search_api_endpoint = '<your-endpoint>'
asyncio.run(setup(search_api_key, search_api_endpoint))

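Azure AI Search as data source

In this section, you'll learn how to:

Add your documents to Azure AI Search through Azure OpenAI Service.
Use an Azure AI Search index as a data source in the RAG app.

Add documents to Azure AI Search

Note

This approach creates an end-to-end chat API that is called as the AI model. You can also use the index created earlier as a data source, and use Teams AI library to customize the retrieval and the prompt.

You can ingest your knowledge documents into Azure AI Search Service and create a vector index with Azure OpenAI on your data. After ingestion, you can use the index as a data source.

Prepare your data in Azure Blob Storage.
In Azure OpenAI Studio, select Add a data source.
Update the required fields.
Select Next.

The Data management page appears.
Update the required fields.
Select Next.
Update the required fields. Select Next.
Select Save and close.

Use Azure AI Search index data source

After you ingest data into Azure AI Search, you can implement your own DataSource to retrieve data from the search index.

The JavaScript example comes first, followed by the Python example.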

const { AzureKeyCredential, SearchClient } = require("@azure/search-documents");
const { DataSource, Memory, OpenAIEmbeddings, Tokenizer } = require("@microsoft/teams-ai");
const { TurnContext } = require("botbuilder");

// Define the interface for document
class Doc {
  constructor(id, content, filepath) {
    this.id = id;
    this.content = content; // searchable
    this.filepath = filepath;
  }
}

// Azure OpenAI configuration
const aoaiEndpoint = "<your-aoai-endpoint>";
const aoaiApiKey = "<your-aoai-key>";
const aoaiDeployment = "<your-embedding-deployment, e.g., text-embedding-ada-002>";

// Azure AI Search configuration
const searchEndpoint = "<your-search-endpoint>";
const searchApiKey = "<your-search-apikey>";
const searchIndexName = "<your-index-name>";

// Define MyDataSource with the shape of the DataSource interface.
// DataSource is a TypeScript interface with no runtime value, so plain JavaScript can't extend it.
class MyDataSource {
  constructor() {
    this.name = "my-datasource";
    this.embeddingClient = new OpenAIEmbeddings({
      azureEndpoint: aoaiEndpoint,
      azureApiKey: aoaiApiKey,
      azureDeployment: aoaiDeployment
    });
    this.searchClient = new SearchClient(searchEndpoint, searchIndexName, new AzureKeyCredential(searchApiKey));
  }

  async renderData(context, memory, tokenizer, maxTokens) {
    // use user input as query
    const input = memory.getValue("temp.input");

    // generate embeddings
    const embeddings = (await this.embeddingClient.createEmbeddings(input)).output[0];

    // query Azure AI Search
    const response = await this.searchClient.search(input, {
      select: [ "id", "content", "filepath" ],
      // must name a searchable field that exists in the index
      searchFields: ["content"],
      vectorSearchOptions: {
        queries: [{
          kind: "vector",
          fields: [ "contentVector" ],
          vector: embeddings,
          kNearestNeighborsCount: 3
        }]
      },
      queryType: "semantic",
      top: 3,
      semanticSearchOptions: {
        // your semantic configuration name
        configurationName: "default",
      }
    });

    // Add documents until you run out of tokens
    let length = 0, output = '';
    for await (const result of response.results) {
      // Start a new doc
      let doc = `${result.document.content}\n\n`;
      let docLength = tokenizer.encode(doc).length;
      const remainingTokens = maxTokens - (length + docLength);
      if (remainingTokens <= 0) {
          break;
      }

      // Append doc to output
      output += doc;
      length += docLength;
    }
    return { output, length, tooLong: length > maxTokens };
  }
}

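from dataclasses import dataclass
from typing import List, Optional

from azure.search.documents.models import VectorizedQuery
from teams.ai.embeddings import AzureOpenAIEmbeddings, AzureOpenAIEmbeddingsOptions

async def get_embedding_vector(text: str):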
    embeddings = AzureOpenAIEmbeddings(AzureOpenAIEmbeddingsOptions(
        azure_api_key='<your-aoai-key>',
        azure_endpoint='<your-aoai-endpoint>',
        azure_deployment='<your-aoai-embedding-deployment>'
    ))
    
    result = await embeddings.create_embeddings(text)
    if (result.status != 'success' or not result.output):
        raise Exception(f"Failed to generate embeddings for description: {text}")
    
    return result.output[0]

@dataclass
class Doc:
    docId: Optional[str] = None
    docTitle: Optional[str] = None
    description: Optional[str] = None
    descriptionVector: Optional[List[float]] = None

@dataclass
class MyDataSourceOptions:
    name: str
    indexName: str
    azureAISearchApiKey: str
    azureAISearchEndpoint: str

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
import json

@dataclass
class Result:
    def __init__(self, output, length, too_long):
        self.output = output
        self.length = length
        self.too_long = too_long

class MyDataSource(DataSource):
    def __init__(self, options: MyDataSourceOptions):
        self.name = options.name
        self.options = options
        self.searchClient = SearchClient(
            options.azureAISearchEndpoint,
            options.indexName,
            AzureKeyCredential(options.azureAISearchApiKey)
        )
        
    def name(self):
        return self.name

    async def render_data(self, _context: TurnContext, memory: Memory, tokenizer: Tokenizer, maxTokens: int):
        query = memory.get('temp.input')

        # Return early, before calling the embedding model, when there is no input
        if not query:
            return Result('', 0, False)

        embedding = await get_embedding_vector(query)
        vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=2, fields="descriptionVector")

        selectedFields = [
            'docTitle',
            'description',
            'descriptionVector',
        ]

        searchResults = self.searchClient.search(
            search_text=query,
            select=selectedFields,
            vector_queries=[vector_query],
        )

        if not searchResults:
            return Result('', 0, False)

        usedTokens = 0
        doc = ''
        for result in searchResults:
            tokens = len(tokenizer.encode(json.dumps(result["description"])))

            if usedTokens + tokens > maxTokens:
                break

            doc += json.dumps(result["description"])
            usedTokens += tokens

        return Result(doc, usedTokens, usedTokens > maxTokens)
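Both implementations above end with the same idea: add search results to the output until the token budget is used up. The following sketch extracts that loop into a standalone function so that it can be tested without a search service. `PackedResult` and `packDocuments` are hypothetical names, and the token counter is passed in as a parameter because the real one comes from the tokenizer that the library supplies.

```typescript
interface PackedResult {
  output: string;
  length: number;
  tooLong: boolean;
}

// Appends documents in order and stops before the first one that would exceed maxTokens.
function packDocuments(
  docs: string[],
  maxTokens: number,
  countTokens: (text: string) => number
): PackedResult {
  let output = "";
  let usedTokens = 0;
  for (const doc of docs) {
    const section = `${doc}\n\n`;
    const sectionTokens = countTokens(section);
    if (usedTokens + sectionTokens > maxTokens) {
      break;
    }
    output += section;
    usedTokens += sectionTokens;
  }
  // The loop never exceeds the budget, so tooLong is always false here.
  return { output, length: usedTokens, tooLong: false };
}
```

Because the loop stops at the first document that doesn't fit, the order of the search results matters: the most relevant results should come first.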

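Add more APIs for Custom API as data source

To extend the custom copilot created from the Custom API template with more APIs, follow these steps:

Update ./appPackage/apiSpecificationFile/openapi.*.

Copy the part of your specification that corresponds to the API you want to add, and append it to ./appPackage/apiSpecificationFile/openapi.*.

Update ./src/prompts/chat/actions.json.

Fill in the required information and properties for the API's path, query, and body in the following object: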

{
  "name": "${{YOUR-API-NAME}}",
  "description": "${{YOUR-API-DESCRIPTION}}",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "object",
        "properties": {
          "${{YOUR-PROPERTY-NAME}}": {
            "type": "${{YOUR-PROPERTY-TYPE}}",
            "description": "${{YOUR-PROPERTY-DESCRIPTION}}",
          }
          // You can add more query properties here
        }
      },
      "path": {
        // Same as query properties
      },
      "body": {
        // Same as query properties
      }
    }
  }
}

Update ./src/adaptiveCards.

Create a new file named ${{YOUR-API-NAME}}.json and fill in the Adaptive Card for the response of your API.

Update the ./src/app/app.js file.

Add the following code before module.exports = app;:

// app.js is plain JavaScript, so the handler parameters carry no type annotations
app.ai.action(${{YOUR-API-NAME}}, async (context, state, parameter) => {
  const client = await api.getClient();

  const path = client.paths[${{YOUR-API-PATH}}];
  if (path && path.${{YOUR-API-METHOD}}) {
    const result = await path.${{YOUR-API-METHOD}}(parameter.path, parameter.body, {
      params: parameter.query,
    });
    const card = generateAdaptiveCard("../adaptiveCards/${{YOUR-API-NAME}}.json", result);
    await context.sendActivity({ attachments: [card] });
  } else {
    await context.sendActivity("no result");
  }
  return "result";
});

Microsoft 365 as data source

Learn how to use the Microsoft Graph Search API to query Microsoft 365 content as a data source for the RAG app. For more information about the Microsoft Graph Search API, see Use the Microsoft Search API to search OneDrive and SharePoint content.

Prerequisite: You must create a Graph API client and grant it the Files.Read.All permission scope to access SharePoint and OneDrive files, folders, pages, and news.

Data ingestion

The Microsoft Graph Search API can search SharePoint content. You only need to ensure that your documents are uploaded to SharePoint or OneDrive; no extra data ingestion is required.

Note

SharePoint server indexes a file only if its file extension is listed on the Manage File Types page. For a complete list of supported file extensions, see Default indexed file name extensions and parsed file types in SharePoint Server and SharePoint in Microsoft 365.

Data source implementation

The following example searches for text files in SharePoint and OneDrive:

import {
  DataSource,
  Memory,
  RenderedPromptSection,
  Tokenizer,
} from "@microsoft/teams-ai";
import { TurnContext } from "botbuilder";
import { Client, ResponseType } from "@microsoft/microsoft-graph-client";

export class GraphApiSearchDataSource implements DataSource {
  public readonly name = "my-datasource";
  public readonly description =
    "Searches the graph for documents related to the input";
  public client: Client;

  constructor(client: Client) {
    this.client = client;
  }

  public async renderData(
    context: TurnContext,
    memory: Memory,
    tokenizer: Tokenizer,
    maxTokens: number
  ): Promise<RenderedPromptSection<string>> {
    const input = memory.getValue("temp.input") as string;
    const contentResults = [];
    const response = await this.client.api("/search/query").post({
      requests: [
        {
          entityTypes: ["driveItem"],
          query: {
            // Search for text files in the user's OneDrive and SharePoint
            // The supported file types are listed here:
            // https://video2.skills-academy.com/sharepoint/technical-reference/default-crawled-file-name-extensions-and-parsed-file-types
            queryString: `${input} filetype:txt`,
          },
          // This parameter is required only when searching with application permissions
          // https://video2.skills-academy.com/graph/search-concept-searchall
          // region: "US",
        },
      ],
    });
    for (const value of response?.value ?? []) {
      for (const hitsContainer of value?.hitsContainers ?? []) {
        contentResults.push(...(hitsContainer?.hits ?? []));
      }
    }

    // Add documents until you run out of tokens
    let length = 0,
      output = "";
    for (const result of contentResults) {
      const rawContent = await this.downloadSharepointFile(
        result.resource.webUrl
      );
      if (!rawContent) {
        continue;
      }
      let doc = `${rawContent}\n\n`;
      let docLength = tokenizer.encode(doc).length;
      const remainingTokens = maxTokens - (length + docLength);
      if (remainingTokens <= 0) {
        break;
      }

      // Append doc to output
      output += doc;
      length += docLength;
    }
    return { output, length, tooLong: length > maxTokens };
  }

  // Download the file from SharePoint
  // https://docs.microsoft.com/en-us/graph/api/driveitem-get-content
  private async downloadSharepointFile(
    contentUrl: string
  ): Promise<string | undefined> {
    const encodedUrl = this.encodeSharepointContentUrl(contentUrl);
    const fileContentResponse = await this.client
      .api(`/shares/${encodedUrl}/driveItem/content`)
      .responseType(ResponseType.TEXT)
      .get();

    return fileContentResponse;
  }

  private encodeSharepointContentUrl(webUrl: string): string {
    const byteData = Buffer.from(webUrl, "utf-8");
    const base64String = byteData.toString("base64");
    // Convert to unpadded base64url: strip trailing "=", then map every "/" to "_" and every "+" to "-"
    return (
      "u!" +
      base64String.replace(/=+$/, "").replace(/\//g, "_").replace(/\+/g, "-")
    );
  }
}
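The `/shares/{id}` endpoint expects the sharing URL in an encoded form: base64-encode the URL, remove the trailing `=` padding, replace `/` with `_` and `+` with `-`, and prefix the result with `u!`. The following standalone sketch isolates that step so that it can be verified on its own. The function name `encodeSharingUrl` is hypothetical, and `Buffer` assumes a Node.js runtime.

```typescript
// Encodes a URL as "u!" followed by unpadded base64url text.
function encodeSharingUrl(webUrl: string): string {
  const base64 = Buffer.from(webUrl, "utf-8").toString("base64");
  const base64Url = base64
    .replace(/=+$/, "")   // remove trailing padding
    .replace(/\//g, "_")  // every "/" becomes "_"
    .replace(/\+/g, "-"); // every "+" becomes "-"
  return `u!${base64Url}`;
}
```

The regular expressions use the `g` flag on purpose: `String.prototype.replace` with a plain string argument replaces only the first occurrence, which would leave later `/` or `+` characters in place.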
