Obvious Reasons Why Distributed Mode Would Slow Down a Query?

Parker_Erickson · July 9, 2021, 8:29pm

I recently moved a prototype that I have been working on from my local Docker instance to a cluster of machines. On one of the queries, I see a ~100x slowdown when I add the “DISTRIBUTED” keyword, while on another query I see a ~30x speedup. Are there rules of thumb for why one distributed query gets faster while another does not? I am thinking that the initial size of the seed vertex set, global vs. local accumulator usage, etc. might have an impact.

Thanks!

Edit: I ran across this documentation: https://docs.tigergraph.com/dev/gsql-ref/querying/distributed-query-mode#guidelines-for-selecting-distributed-query-mode. The query that slowed down does not start from a large number of vertices, but what is considered “many hops”?

Xinyu_Chang · July 14, 2021, 7:21pm

Yes, a distributed query has higher overhead. The rules of thumb for when to use distributed query mode are:

When you have only one server, don’t use distributed mode.
When the query starts from a single node or a few nodes and traverses a small subgraph, don’t use distributed mode.
When you have a distributed cluster and the query starts from all nodes, all nodes of a type, or a large number of nodes, and traverses a very large number of vertices and edges, use distributed mode.
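
The rules of thumb above can be sketched as a small decision helper. This is only an illustration, not a TigerGraph API: `QueryProfile`, `should_use_distributed`, and both thresholds are hypothetical, since the thread gives no concrete numbers for “a few” or “very large”. Cases the rules do not cover (for example, many seeds but a small traversal) fall back to non-distributed mode here, which is an assumption as well.

```python
from dataclasses import dataclass


@dataclass
class QueryProfile:
    """Rough estimates describing one query (hypothetical helper type)."""
    cluster_size: int       # number of servers in the cluster
    seed_vertices: int      # size of the starting vertex set
    touched_elements: int   # estimated vertices + edges traversed


# Hypothetical cut-offs chosen for illustration only; tune them by measuring.
MANY_SEEDS = 10_000
LARGE_TRAVERSAL = 1_000_000


def should_use_distributed(profile: QueryProfile) -> bool:
    """Return True when the rules of thumb favor distributed query mode."""
    # Rule 1: on a single server, distributed mode only adds overhead.
    if profile.cluster_size <= 1:
        return False
    # Rule 3: many seeds AND a very large traversal favor distributed mode.
    # Rule 2 (few seeds, small subgraph) and any in-between case return False.
    return (
        profile.seed_vertices >= MANY_SEEDS
        and profile.touched_elements >= LARGE_TRAVERSAL
    )


if __name__ == "__main__":
    point_lookup = QueryProfile(cluster_size=4, seed_vertices=1, touched_elements=500)
    full_scan = QueryProfile(cluster_size=4, seed_vertices=50_000, touched_elements=5_000_000)
    print(should_use_distributed(point_lookup))
    print(should_use_distributed(full_scan))
```

In practice the estimates are hard to know up front, so timing the query both with and without the DISTRIBUTED keyword, as in the original question, remains the more reliable check.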

The number of hops doesn’t really matter here; I think we should improve the documentation.

Thanks.

Parker_Erickson · July 14, 2021, 8:03pm

Great, thanks for the clarification!