Cosmos DB Gremlin API repeat query takes more than 15 seconds to return data

Shardul Pradhan 1 Reputation point
2020-08-24T14:50:42.907+00:00

We are using the Cosmos DB Gremlin API as a graph database. We store contact information between two people (vertices), with some metadata about each interaction saved as edge properties. A person can connect with any number of people, from a few to thousands. We are also interested in the second-level connections of each first-level connection, so a single query may easily involve 5,000 to 10,000 edges. When we query this data with a repeat query up to 2 levels deep, it takes around 15 seconds to fetch the data, even when we increase the allocated RUs or select autoscale. Production will carry a heavier load than this, around 100,000 vertices and 900,000 edges. Is there any way to improve the query, or any guidance on optimizing its performance?

Basic query to get this data:
g.V({id of a person}).repeat(inE().has({filter for edge}).outV().dedup()).times(2).emit().tree()

Azure Cosmos DB
An Azure NoSQL database service for app development.

1 answer

  1. Navtej Singh Saini 4,226 Reputation points Microsoft Employee
    2020-08-26T20:36:54.257+00:00

    @Shardul Pradhan

    We checked this further with our Product group, and they have conveyed that the inE() traversal step should be avoided and that the data should be modeled so that the query can use outE() instead.

    Our Graph Modeling document explains this in detail.

    Please go through it and get back to us with any questions, and we will look into this further.

    Thanks
    Navtej S
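
As a rough illustration of the advice above: per the Cosmos DB graph partitioning guidance, an edge is stored with its source vertex, so outE() can be served from the partition of the vertex being traversed, while inE() has to fan out across partitions. The sketch below assumes a hypothetical model in which every contact is written as two 'contacted' edges, one in each direction, so that the two-level traversal can follow outgoing edges only. The vertex ids ('person-1', 'person-2'), the edge label, and the 'type' property are placeholders, not names from the original question. On a partitioned graph the vertices would also need to be addressed together with their partition key. The comment lines are annotations only; each statement would be submitted as its own query.

```groovy
// Write path (hypothetical model): store each contact in both directions,
// so every person owns an outgoing edge for each of their contacts.
g.V('person-1').addE('contacted').to(g.V('person-2')).property('type', 'in-person')
g.V('person-2').addE('contacted').to(g.V('person-1')).property('type', 'in-person')

// Read path: the same two-level repeat as in the question, but walking
// outgoing edges (outE) to their target vertices (inV) instead of inE()/outV().
g.V('person-1').repeat(outE('contacted').has('type', 'in-person').inV().dedup()).times(2).emit().tree()
```

The trade-off in this sketch is twice as many edges and two writes per contact. Whether it brings the 15-second query down for a given dataset would have to be measured.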

