LDBC Query Performance

jlan · March 18, 2024, 11:55am

Hi guys! I’m using TigerGraph for research and have tested it with LDBC queries. I wrote GSQL queries based on

https://github.com/tigergraph/ecosys/tree/ldbc/ldbc_benchmark/tigergraph/queries_v3/queries

and

https://github.com/ldbc/ldbc_snb_interactive_v1_impls/tree/main/tigergraph/queries.

I found that some short queries take an unexpectedly long time to run. Below is my IS7 query, which takes a second to complete:

CREATE OR REPLACE QUERY IS7(INT id = 1099547741340) FOR GRAPH ldbc_graph SYNTAX v2 {
  TYPEDEF TUPLE<INT commentId, STRING commentContent, STRING commentCreationDate,
    INT replyAuthorId, STRING replyAuthorFirstName,
    STRING replyAuthorLastName, BOOL replyAuthorKnowsOriginalMessageAuthor> reply;
  OrAccum<BOOL> @knows;
  HeapAccum<reply>(100, commentCreationDate DESC, replyAuthorId ASC) @@result;

  vComment = SELECT c FROM Comment:c WHERE c.id == id;
  IF vComment.size() == 0 THEN
    vPost = SELECT p FROM Post:p WHERE p.id == id;
    P = SELECT p FROM vPost:s -(hasCreator>)- Person -(knows)- Person:p
      PER(p)
      ACCUM p.@knows += TRUE;
    C = SELECT p FROM vPost:s -(<replyOf)- Comment:c -(hasCreator>)- Person:p
      ACCUM @@result += reply(c.id, c.content, c.creationDate, p.id, p.firstName, p.lastName, p.@knows);
  ELSE
    P = SELECT p FROM vComment:s -(hasCreator>)- Person -(knows)- Person:p
      PER(p)
      ACCUM p.@knows += TRUE;
    C = SELECT p FROM vComment:s -(<replyOf)- Comment:c -(hasCreator>)- Person:p
      ACCUM @@result += reply(c.id, c.content, c.creationDate, p.id, p.firstName, p.lastName, p.@knows);
  END;

  PRINT @@result as result;
}

I found the “replyOf” steps unexpectedly slow. Do you guys have any idea what’s happening, and maybe how to improve my query? I use installed queries and run them on an SF10 dataset. I’ve created secondary indices on the id attribute of Comment and Post. Thanks in advance for your help!

markmegerian · March 21, 2024, 3:15pm

Have you tried it as a DISTRIBUTED QUERY?

jlan · March 21, 2024, 4:54pm

Not yet, but we’ll run the test on a cluster very soon. I’m not sure this will help, since most short queries take TigerGraph ~10ms to process on my PC, so there’s a huge gap (~100x) between those and IS7. I also tried cost-based optimization when installing the queries, but I didn’t see any difference. Thanks for mentioning this.

markmegerian · March 21, 2024, 5:26pm

One other random thought: is your schema defined to have reverse edges? It may not matter, but if you have a reverse edge for replyOf, you could go the other direction from the post or comment to see if that makes a difference.

jlan · April 13, 2024, 12:13pm

Hi, some updates on using DISTRIBUTED and the reverse-edge trick:

DISTRIBUTED sped up the complex queries but doesn’t seem to help the short queries much.

All short queries are now efficient with the help of reverse edges. It seems it is always desirable to expand (following an outgoing edge) from the side with the smaller cardinality. Interesting.

Thanks for the help!
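
To illustrate the cardinality observation above, here is a toy Python model. It is not TigerGraph code and makes no claim about how the engine stores or traverses edges; the graph sizes and the function names (`build_graph`, `replies_via_forward_scan`, `replies_via_reverse_edge`) are made up for illustration. It only counts how many edges each strategy touches when looking up the replies to a single message.

```python
from collections import defaultdict

def build_graph(num_messages, replies_per_message):
    """Build a synthetic forum: every message gets the same number of replies.

    reply_of:         comment id -> parent message id (forward edge)
    reply_of_reverse: parent message id -> list of comment ids (reverse edge)
    """
    reply_of = {}
    reply_of_reverse = defaultdict(list)
    next_comment = num_messages  # comment ids start after the message ids
    for parent in range(num_messages):
        for _ in range(replies_per_message):
            reply_of[next_comment] = parent
            reply_of_reverse[parent].append(next_comment)
            next_comment += 1
    return reply_of, reply_of_reverse

def replies_via_forward_scan(reply_of, target):
    """Start from the large side: inspect every comment's outgoing edge."""
    touched = 0
    found = []
    for comment, parent in reply_of.items():
        touched += 1
        if parent == target:
            found.append(comment)
    return found, touched

def replies_via_reverse_edge(reply_of_reverse, target):
    """Start from the small side: follow only the target's reverse edges."""
    found = list(reply_of_reverse.get(target, []))
    return found, len(found)

reply_of, reply_of_reverse = build_graph(num_messages=1000, replies_per_message=5)
slow, slow_touched = replies_via_forward_scan(reply_of, target=42)
fast, fast_touched = replies_via_reverse_edge(reply_of_reverse, target=42)

# Same answer, very different amount of work.
print(sorted(slow) == sorted(fast), slow_touched, fast_touched)  # True 5000 5
```

In this toy setup both strategies return the same five replies, but the scan from the Comment side touches all 5000 edges while the reverse lookup touches only 5. The gap grows with the size of the larger side.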

accumulator · April 13, 2024, 9:16pm

Hi,

Can I understand more? I led the benchmark effort at TigerGraph.

How large is your test data?
How much memory do you have for this benchmark?
What’s the query performance?

jlan · April 14, 2024, 7:17am

Hi!

We’re using SF10 on a standalone instance and SF10/SF30 on a cluster of up to 16 machines.
Not sure what you mean, but we deployed the single instance on a server with 756G of memory. The cluster has 128G on each machine. It seems we do not have SSDs, though.
The single instance gives decent performance (with some tuning effort) in terms of both latency and throughput. All IS queries have ~30ms latency, and most IC queries are below 1 second. We ran a mixed workload of 99% IS and 1% IC and got about 250-300 qps. I couldn’t reproduce the IS latency on a TigerGraph cluster (most go up to 300-400ms), and the throughput goes down. IC seems ok.

We may need help optimizing the short queries in a distributed setting, if possible.
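
As a back-of-the-envelope check of the single-instance figures above, the sketch below combines them using Little's law (requests in flight = throughput × mean latency). It assumes every IS query takes exactly 30 ms and treats 1 second as the IC latency, which is only an upper bound in the thread, so the results are rough estimates rather than measurements. The function names are made up for illustration.

```python
def mean_latency_ms(is_share, is_latency_ms, ic_latency_ms):
    """Weighted mean latency of a two-class workload mix."""
    return is_share * is_latency_ms + (1.0 - is_share) * ic_latency_ms

def in_flight(throughput_qps, latency_ms):
    """Little's law: average number of requests in the system."""
    return throughput_qps * latency_ms / 1000.0

# Assumed inputs: 99% IS at 30 ms, 1% IC at 1000 ms (upper bound from the thread).
mix = mean_latency_ms(0.99, 30.0, 1000.0)
low = in_flight(250, mix)
high = in_flight(300, mix)

print(f"{mix:.1f} ms, {low:.1f} to {high:.1f} in flight")  # 39.7 ms, 9.9 to 11.9 in flight
```

Under these assumptions the 1% of IC queries contribute about a quarter of the mean latency, and 250-300 qps corresponds to roughly 10 to 12 queries in flight at a time.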

accumulator · April 17, 2024, 12:04am

Cluster setup has two modes: distributed and non-distributed.

If you use non-distributed mode (remove the DISTRIBUTED keyword), the performance may be better.
IS is a point query, touching few elements.

IC is more like a neighborhood query, touching more data elements.