How to create my own Web data sources like Azure AI Studio for Azure AI Search

David Fuller 20 Reputation points
2024-07-01T23:59:45.9066667+00:00

I can rapidly create a RAG solution using a web site as a data source via Azure AI Studio from within the Chat Playground.

Here is a data source created in chat playground that indexes a web site:

User's image

When I take the next step and look for the underlying index, indexer, data source, etc. within Azure AI Search, I can see a few breadcrumbs, like the created index, but the data source, indexer, etc. are hidden.

If I try to query the index from within the Azure portal for Azure AI Search (formerly Cognitive Search), the search fails with a 401 error.

User's image

When I look at the possible data sources within Azure AI (Cognitive) Search, "Web" or "web site" is not an option at all. So it seems:

- Azure AI Studio is using the plumbing of Azure AI Search for RAG. It is creating some breadcrumbs in my account that are visible, like the index, but also creating some inaccessible resources, which I suspect include a data source for web crawling.

Does anyone have any guidance on how to "recreate" the web crawling data source that I am able to use so easily from the Chat playground? Or on how I can just reuse the index it creates and avoid the 401 error? Does Microsoft intend to productize or share the web data source plug-in from Azure AI Studio?

TIA.

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. ajkuma 24,396 Reputation points Microsoft Employee
    2024-07-02T06:03:37.7066667+00:00

    David Fuller,

    Based on my understanding of your scenario:

    1. There is no native web-crawling mechanism in Azure AI Search at this time.

    2. That error message is most likely because the vectorizer in the index has managed identity enabled, while the Azure portal needs the keys.
    You can try navigating to the index's vector configuration and switching the vectorizer to key-based authentication so that the index can be used from the portal.
    User's image

    3. On making the web data source plug-in from Azure AI Studio available for broader use: as I understand it, this functionality is currently not planned directly in Azure AI Search.
    I'll post back if I have more updates.
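
Since point 1 above says there is no native web crawler, one workaround is to fetch pages yourself and push them into an index through the Documents - Index REST operation. The following is only a minimal sketch under stated assumptions, not how Azure AI Studio does it: the field names `id`, `url` and `content`, the helper names, and the 1000-character chunk size are all hypothetical and must be adapted to your own index schema. It also leaves out embeddings, which an index with vector fields would need.

```python
import json
import re
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML page to whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts))


def chunk_text(text, max_chars=1000):
    """Split text into consecutive pieces of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def build_upload_batch(url, text, max_chars=1000):
    """Build the JSON body for the index-documents call.

    ASSUMPTION: the index has fields "id" (key), "url" and "content".
    """
    # Document keys may only contain letters, digits, '_', '-' and '='.
    key_base = re.sub(r"[^A-Za-z0-9_\-=]", "_", url)
    return {"value": [
        {"@search.action": "mergeOrUpload",
         "id": f"{key_base}-{n}", "url": url, "content": chunk}
        for n, chunk in enumerate(chunk_text(text, max_chars))
    ]}


def build_index_request(endpoint, index_name, api_key, batch,
                        api_version="2023-11-01"):
    """Prepare (but do not send) the POST request that uploads the batch."""
    return urllib.request.Request(
        f"{endpoint}/indexes/{index_name}/docs/index?api-version={api_version}",
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
```

To use it, download a page (for example with `urllib.request.urlopen`), pass the HTML through `html_to_text`, build a batch, and send the prepared request with `urllib.request.urlopen(request)` using an admin key for your search service. Following links, respecting robots.txt and scheduling re-crawls are left out on purpose.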

    I have relayed this feedback internally to our product team.
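
For the 401 in point 2, the portal toggle can presumably also be done as a read-modify-write on the index definition: GET the index, give the vectorizer an API key, and PUT the definition back. The helper below is a hedged sketch of the "modify" step only. It assumes the vectorizer appears under `vectorSearch.vectorizers` with `kind` set to `azureOpenAI` and an `azureOpenAIParameters` object, which is how I understand the REST index definition; the function name is hypothetical, and you should verify the property names against the API version you use.

```python
import copy


def use_key_auth_for_vectorizers(index_definition, openai_api_key):
    """Return a copy of the index definition in which every Azure OpenAI
    vectorizer carries an API key, leaving the input untouched.

    ASSUMPTION: property names follow the REST index definition shape
    (vectorSearch -> vectorizers -> azureOpenAIParameters -> apiKey).
    """
    updated = copy.deepcopy(index_definition)
    vectorizers = (updated.get("vectorSearch") or {}).get("vectorizers") or []
    for vectorizer in vectorizers:
        if vectorizer.get("kind") != "azureOpenAI":
            continue  # other vectorizer kinds are left as they are
        params = vectorizer.setdefault("azureOpenAIParameters", {})
        params["apiKey"] = openai_api_key
    return updated
```

The returned dictionary would then be sent back with a create-or-update (PUT) call on the index. If the vectorizer was configured with a user-assigned identity, check whether that setting also has to be cleared; this sketch does not touch it.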
     

    If you wish, you may share your feedback on UserVoice (Azure Search feedback). All of the feedback you share in these forums is monitored and reviewed by the Microsoft engineering teams responsible for building Azure. Additionally, users with a similar request can up-vote your post and add their comments.


    If the answer helped (pointed you in the right direction), please click Accept Answer.
