RAG Improvements

Hoogerwerf, G. (Gabriel) 20 Reputation points
2024-10-08T10:09:10.47+00:00

Dear all,

I am trying to improve the quality of my RAG system, which for now is composed of a Storage Account, an indexer that takes the blobs from that Storage Account, an index, and an Azure OpenAI deployment for chatting (as well as the app interface). I use the Text Split skill as well as an Azure OpenAI vectorizer among the indexer's cognitive skills.

What is not really clear to me is how to implement a few improvements that are sometimes mentioned in the documentation but that I can't trace back to something that fits in the pipeline mentioned above.

Specifically, I would like to use a different kind of chunking. If I am not mistaken, the alternatives are LangChain's recursive chunking and semantic chunking. Semantic chunking, though, looks like a Document Intelligence feature only, which is basically just a markdown converter. For the LangChain method, on the other hand, some code snippets are provided, but I was wondering how I can implement that in a skill. Until now I have always used an indexer to load the index, but I guess it must be done from the SDK?

Another tool I wanted to implement is Multiple Query Generation + RRF. I believe RRF is already implemented in the Search service, but how do I generate multiple queries? Is there a skill I can connect to the search service, or are skills only for the document ingestion phase? Of course, everything could be implemented from scratch, but how do I access the retrieval and reranking phases? It looked to me like they were all behind the search service API call.

If I am not mistaken, Azure Databricks mentions query expansion (https://video2.skills-academy.com/en-us/azure/databricks/generative-ai/tutorials/ai-cookbook/quality-rag-chain), but is it possible to integrate it into a normal Cognitive Search service?

Thank you for your time and patience.

Gabriel


1 answer

  1. Amira Bedhiafi 25,866 Reputation points
    2024-11-02T14:25:32.87+00:00

    How can I implement different types of chunking in my current pipeline?

    To implement different chunking methods, you need to consider how your existing pipeline can be adapted or extended. The LangChain recursive chunking method, for example, is highly effective for creating contextually coherent document segments. However, this method isn’t natively supported as a cognitive skill within Azure Cognitive Search (although you could wrap it in a custom Web API skill). You would need to integrate LangChain directly with your data pipeline, using Python or another language via the Azure SDK. This integration means pre-processing your data outside the standard indexer process: either by running a pre-processing script that applies recursive chunking and uploads the processed chunks back to your Storage Account, or by embedding this step in a custom Azure Function or Data Factory pipeline.
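    Since the chunking happens outside the skillset in this approach, the splitting logic itself is easy to prototype. Below is a minimal pure-Python sketch of the recursive strategy (a simplified stand-in for LangChain's `RecursiveCharacterTextSplitter`, not the library itself); the separator order and `chunk_size` values are illustrative assumptions:

```python
def recursive_split(text, separators=("\n\n", "\n", ". ", " "), chunk_size=200):
    """Split text on the coarsest separator first, recursing to finer
    separators only for pieces that still exceed chunk_size.
    Note: this simplified sketch drops the separator at chunk boundaries."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard character cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate          # keep accumulating into this chunk
        elif len(piece) > chunk_size:
            if current:
                chunks.append(current)
            # The piece alone is too big: recurse with finer separators.
            chunks.extend(recursive_split(piece, rest, chunk_size))
            current = ""
        else:
            if current:
                chunks.append(current)
            current = piece              # start a new chunk with this piece
    if current:
        chunks.append(current)
    return chunks
```

    The resulting chunks could then be embedded and pushed to the index with the `azure-search-documents` SDK (`SearchClient.upload_documents`) instead of relying on the indexer's pull model.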

    What about semantic chunking and document intelligence features?

    Semantic chunking as mentioned in Azure documentation is primarily associated with features found in document intelligence solutions. These tools often include advanced processing such as converting documents to markdown and parsing that structure. If you’re looking to leverage true semantic chunking, you might need to incorporate an external library or SDK that processes documents before ingestion into your index. Integrating Azure AI Document Intelligence (formerly Form Recognizer) capabilities may require aligning its output with your indexing workflow and using a custom pipeline or SDK to handle these enriched document structures.
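    Since the Document Intelligence layout model can emit markdown, one way to approximate semantic chunking before ingestion is to split that markdown along its headings, so each chunk follows the document's own structure. A minimal sketch (the heading regex and the `preamble` label are assumptions for illustration, not part of any Document Intelligence API):

```python
import re

def split_on_headings(markdown):
    """Split markdown into (heading, body) sections, one per heading,
    so chunk boundaries follow the document's own structure."""
    sections, heading, buf = [], "preamble", []
    for line in markdown.splitlines():
        m = re.match(r"#{1,6}\s+(.*)", line)
        if m:
            if buf:
                # Close out the previous section before starting a new one.
                sections.append((heading, "\n".join(buf).strip()))
            heading, buf = m.group(1).strip(), []
        else:
            buf.append(line)
    if buf:
        sections.append((heading, "\n".join(buf).strip()))
    return sections
```

    Each `(heading, body)` pair can then be indexed as its own document, keeping the heading as searchable context for the chunk.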

    How can I implement Multiple Query Generation and integrate it with RRF in my search pipeline?

    Multiple Query Generation (MQG) can enhance the retrieval phase by submitting several reformulations of the original query. While RRF (Reciprocal Rank Fusion) is embedded in Azure Cognitive Search's ranking process, MQG is not a native skill or out-of-the-box feature. To generate multiple queries, you might use external NLP tools or libraries (for example, OpenAI GPT models) to create variations of user queries before sending them to the Search service. This process would need to be integrated at the application level, where the modified queries are programmatically generated and sent to Azure Cognitive Search via the SDK. Note that the service applies RRF automatically only within a single hybrid (vector plus keyword) request; for separately issued query variants, you would fuse the result lists with RRF in your own code.
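    The fusion step itself is straightforward to run client-side once each generated query variant has returned its own ranked list from the search service. A minimal sketch of Reciprocal Rank Fusion (the `k = 60` constant is the value commonly used with RRF; the document-ID lists stand in for per-query Azure AI Search results):

```python
def rrf_fuse(result_lists, k=60):
    """Reciprocal Rank Fusion: each document scores sum(1 / (k + rank))
    over every ranked list it appears in; higher total ranks first."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort document IDs by fused score, best first.
    return sorted(scores, key=scores.get, reverse=True)
```

    Documents that rank well across several query variants rise to the top, which is exactly the behavior MQG is meant to exploit.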

    Can I integrate query expansion from Databricks into a Cognitive Search workflow?

    Yes, integrating query expansion methods from Databricks into a standard Azure Cognitive Search service is possible but requires custom implementation. Databricks offers advanced capabilities for generating expanded queries using generative AI models. To bring this into your Cognitive Search pipeline, you can preprocess queries through a Databricks workflow that connects to your Azure Search endpoint. This integration would involve using Azure Databricks to generate expanded or enriched queries, and then directing these queries to the search service for further retrieval and reranking. The connection between these services can be maintained through REST API calls or using the Azure SDK within Databricks notebooks.

    How can I access the retrieval and reranking phases for customization?

    The retrieval and reranking phases are managed via the Azure Cognitive Search service API. While the built-in search API handles these functions automatically, you can customize their behavior by using custom code or middleware that interfaces between your application and the search service. To access and influence the reranking phase, you could use query modifiers, post-processing scripts, or integrate with custom reranking models through the REST API. However, directly adding skills to influence retrieval and reranking would typically require an application-level approach where these stages are called and managed within your logic.
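    As a toy illustration of that application-level approach, the snippet below reorders the hits returned by a search call with a custom scoring function, standing in for a real reranking model (the `content` field name and the term-overlap heuristic are assumptions for the sketch, not a service feature):

```python
def rerank_by_overlap(query, results, text_key="content"):
    """Post-process search hits client-side: reorder by how many query
    terms each hit's text contains. A stand-in for a custom reranker
    applied after the Azure AI Search API call returns."""
    terms = set(query.lower().split())

    def overlap(hit):
        # Count distinct query terms present in the hit's text field.
        words = set(hit.get(text_key, "").lower().split())
        return len(terms & words)

    return sorted(results, key=overlap, reverse=True)
```

    In a real pipeline the same hook is where you would call a dedicated reranking model instead of this heuristic.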

