Thanks for asking the question! Azure Cognitive Search distributes each index horizontally through a sharding process, which means that portions of an index are physically separate.
By default, the score of a document is calculated based on statistical properties of the data within a shard. This approach is generally not a problem for a large corpus of data, and it provides better performance than having to calculate the score based on information across all shards.
However, it can cause two very similar (or even identical) documents to receive different relevance scores if they end up in different shards, as you mentioned.
If you would rather compute the score from the statistical properties across all shards, add scoringStatistics=global as a query parameter (or add "scoringStatistics": "global" as a body parameter of the query request).
POST https://[service name].search.windows.net/indexes/hotels/docs/search?api-version=2020-06-30
{
"search": "<query string>",
"scoringStatistics": "global"
}
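If you are calling the REST API from code, the same request could be built roughly as follows. This is a minimal sketch using only the Python standard library; the function name build_search_request and the service name, query, and key values are placeholders I made up for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_VERSION = "2020-06-30"  # same API version as the request above


def build_search_request(service_name, index_name, query, api_key):
    """Build (but do not send) a POST search request that asks for global scoring statistics."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/search?api-version={API_VERSION}"
    )
    # "scoringStatistics": "global" is the body parameter described above.
    body = {"search": query, "scoringStatistics": "global"}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own service name, index, and query key.
    req = build_search_request("my-service", "hotels", "beach view", "<api key>")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

To actually run the query, you would pass the returned object to urllib.request.urlopen with a real service name and key.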
Setting scoringStatistics to global ensures that all shards in the same replica provide the same results.
Note that results from different replicas may still differ slightly from one another, because replicas are always being updated with the latest changes to your index.
For more details, see the following documentation:
- https://video2.skills-academy.com/en-us/azure/search/index-similarity-and-scoring
- https://video2.skills-academy.com/en-us/azure/search/search-capacity-planning#concepts-search-units-replicas-partitions-shards
Let us know if you have further questions or if the issue remains.