Complex Query Help

Hi,

I am attempting to write a query with a rather complicated series of hops and I can’t seem to figure out how to make it happen properly. In a nutshell, I would like to join multiple times against the same vertex set. Here’s a diagram of an example schema, query, and data set that will help illustrate my question:

In the above system I would like to write a query that will find vertex “A” while enforcing all of the required patterns specified in the query. To state the query in natural language terms:

  • Find all vertices that each satisfy all of:
    • Source of an edge of type e1 that targets a vertex with all of:
      • An edge of type e1 that targets any vertex
      • An edge of type e2 that targets any vertex
      • An edge of type e3 that targets any vertex
    • Source of an edge of type e2 that targets a vertex with all of:
      • An edge of type e2 that targets any vertex
      • An edge of type e3 that targets any vertex
      • An edge of type e4 that targets any vertex
    • Source of an edge of type e3 that targets a vertex with all of:
      • An edge of type e3 that targets any vertex
      • An edge of type e4 that targets any vertex
      • An edge of type e5 that targets any vertex

The trouble I run into is how to chain together the multi-hop patterns necessary to satisfy the requirements. Is a query like this even possible in TigerGraph? In SQL I would write something like:

SELECT DISTINCT id

FROM vertex_t1

JOIN edge_e1 AS e1a1 ON vertex_t1.id = e1a1.source

JOIN edge_e1 AS e1a2 ON e1a1.target = e1a2.source

JOIN edge_e2 AS e2a ON e1a1.target = e2a.source

JOIN edge_e3 AS e3a ON e1a1.target = e3a.source

JOIN edge_e2 AS e2b1 ON vertex_t1.id = e2b1.source

JOIN edge_e2 AS e2b2 ON e2b1.target = e2b2.source

JOIN edge_e3 AS e3b ON e2b1.target = e3a.source

JOIN edge_e4 AS e4b ON e2b1.target = e4a.source

JOIN edge_e3 AS e3c1 ON vertex_t1.id = e3c1.source

JOIN edge_e3 AS e3c2 ON e3c1.target = e3c2.source

JOIN edge_e4 AS e4c ON e3c1.target = e4c.source

JOIN edge_e5 AS e5c ON e3c1.target = e5c.source

Thanks in advance for any light you can shed!

Hi Robert,

This is doable please refer to the GSQL example below.

Start = {ANY};

// select vertexes having all 3 edge types
Start = select s from Start:s where s.outdegree(“edge_e1”) > 0 and s.outdegree(“edge_e2”) > 0 and s.outdegree(“edge_e3”) > 0;

// select vertexes having neighbors connected by edge_e1 , and the neighbors must have edge_e1 , edge_e2 and edge_e3 edges starting from them
Start = select s from Start:s-(edge_e1)-:t where t.outdegree(“edge_e1”) > 0 and t.outdegree(“edge_e2”) > 0 and t.outdegree(“edge_e3”) > 0

// select vertexes having neighbors connected by edge_e2 , and the neighbors must have edge_e2 , edge_e3 and edge_e4 edges starting from them

Start = select s from Start:s-(edge_e2)-:t where t.outdegree(“edge_e2”) > 0 and t.outdegree(“edge_e3”) > 0 and t.outdegree(“edge_e4”) > 0

// select vertexes having neighbors connected by edge_e3 , and the neighbors must have edge_e3 , edge_e4 and edge_e5 edges starting from them

Start = select s from Start:s-(edge_e3)-:t where t.outdegree(“edge_e3”) > 0 and t.outdegree(“edge_e4”) > 0 and t.outdegree(“edge_e5”) > 0

print Start;

Please let me know if you have any questions.

Thanks.

Hi Xinyu,

Outdegree is a neat way to solve this problem. Unfortunately, I think I may be oversimplified my toy example and therefore lost the key point with this solution. In the real problem I need to be able to filter on attributes of all the vertices in the query (and the vertices are all different types). Is there a way to combine two different vertex sets? For example, if the syntax worked I think this query would satisfy the problem:

Start = {ANY};

Middle1 = {ANY};
Middle2 = {ANY};
Middle3 = {ANY};

Middle1 = select m from Middle1:m-(edge_e1)-:t where m.attr1 == val1 and t.attrA == valA

Middle1 = select m from Middle1:m-(edge_e2)-:t where m.attr2 == val2 and t.attrB == valB
Middle1 = select m from Middle1:m-(edge_e3)-:t where m.attr3 == val3 and t.attrC == valC

Middle2 = select m from Middle2:m-(edge_e2)-:t where m.attr1 == val1 and t.attrA == valA

Middle2 = select m from Middle2:m-(edge_e3)-:t where m.attr2 == val2 and t.attrB == valB
Middle2 = select m from Middle2:m-(edge_e4)-:t where m.attr3 == val3 and t.attrC == valC

Middle3 = select m from Middle3:m-(edge_e3)-:t where m.attr1 == val1 and t.attrA == valA

Middle3 = select m from Middle3:m-(edge_e4)-:t where m.attr2 == val2 and t.attrB == valB
Middle3 = select m from Middle3:m-(edge_e5)-:t where m.attr3 == val3 and t.attrC == valC

Start = select s from Start:s-(edge_e1)-Middle1:m where s.attr1 == val1

Start = select s from Start:s-(edge_e2)-Middle2:m
Start = select s from Start:s-(edge_e3)-Middle3:m

print Start

Is there a syntactically valid way to achieve the above result?

Thanks!

Xinyu’s suggestion should still be good.

To use the same pattern in yours:

Start = {ANY};

Middle1 = select m from Start:m-(edge_e1)-:t where m.attr1 == val1 and t.attrA == valA

Middle1 = select m from Middle1:m-(edge_e2)-:t where m.attr2 == val2 and t.attrB == valB
Middle1 = select m from Middle1:m-(edge_e3)-:t where m.attr3 == val3 and t.attrC == valC

Middle2 = select m from Middle1:m-(edge_e2)-:t where m.attr1 == val1 and t.attrA == valA

Middle2 = select m from Middle2:m-(edge_e3)-:t where m.attr2 == val2 and t.attrB == valB
Middle2 = select m from Middle2:m-(edge_e4)-:t where m.attr3 == val3 and t.attrC == valC

Middle3 = select m from Middle2:m-(edge_e3)-:t where m.attr1 == val1 and t.attrA == valA

Middle3 = select m from Middle3:m-(edge_e4)-:t where m.attr2 == val2 and t.attrB == valB
Middle3 = select m from Middle3:m-(edge_e5)-:t where m.attr3 == val3 and t.attrC == valC

But clearly, we could do this throughout giving:

Start = {ANY};

Start = select m from Start:m-(edge_e1)-:t where m.attr1 == val1 and t.attrA == valA

Start = select m from Start:m-(edge_e2)-:t where m.attr2 == val2 and t.attrB == valB
Start = select m from Start:m-(edge_e3)-:t where m.attr3 == val3 and t.attrC == valC

Start = select m from Start:m-(edge_e2)-:t where m.attr1 == val1 and t.attrA == valA

Start = select m from Start:m-(edge_e3)-:t where m.attr2 == val2 and t.attrB == valB
Start = select m from Start:m-(edge_e4)-:t where m.attr3 == val3 and t.attrC == valC

Start = select m from Start:m-(edge_e3)-:t where m.attr1 == val1 and t.attrA == valA

Start = select m from Start:m-(edge_e4)-:t where m.attr2 == val2 and t.attrB == valB
Start = select m from Start:m-(edge_e5)-:t where m.attr3 == val3 and t.attrC == valC

print Start;

This could probably be further simplified (e.g. collecting the common edge types) but hopefully you get the idea.

Sorry, I may be being a bit dense here but I’m not sure that query validly solves the problem. I think using Start as the main vertex set in the 4th statement conflates vertex sets that are not required to be the same.

In the Middle1/2/3 plan it is not required that there be any overlap at all between the vertices in M1/2/3, they could be completely disjoint and still satisfy the requirement that Start be connected to at least one of each of them. However, when you start each set off with the result of the previous set, you introduce a requirement that they overlap completely.

I think my toy example is very misleading here. The real problem has a different vertex type in every position, and all the edges are of unique types that are specific to the vertices they connect. I will try to come up with one that illustrates the problem more clearly.

Edit: hopefully this example is a bit more clear about what I’m after:

Aha.

Okay, how about this:

OrAccum @m1;

OrAccum @m2;

OrAccum @m3;

Start = {ANY};

Middle1 = {ANY};

Middle2 = {ANY};

Middle3 = {ANY};

Middle1 = select m from Middle1:m-(edge_e1)-:t where m.attr1 == val1 and t.attrA == valA;

Middle1 = select m from Middle1:m-(edge_e2)-:t where m.attr2 == val2 and t.attrB == valB;

Middle1 = select m from Middle1:m-(edge_e3)-:t where m.attr3 == val3 and t.attrC == valC post-accum m.@m1=true;

Middle2 = select m from Middle2:m-(edge_e2)-:t where m.attr1 == val1 and t.attrA == valA;

Middle2 = select m from Middle2:m-(edge_e3)-:t where m.attr2 == val2 and t.attrB == valB;

Middle2 = select m from Middle2:m-(edge_e4)-:t where m.attr3 == val3 and t.attrC == valC post-accum m.@m2=true;

Middle3 = select m from Middle3:m-(edge_e3)-:t where m.attr1 == val1 and t.attrA == valA;

Middle3 = select m from Middle3:m-(edge_e4)-:t where m.attr2 == val2 and t.attrB == valB;

Middle3 = select m from Middle3:m-(edge_e5)-:t where m.attr3 == val3 and t.attrC == valC post-accum m.@m3=true;

Start = select s from Start:s-(edge_e1)-t where s.attr1 == val1 and t.@m1==true;

Start = select s from Start:s-(edge_e2)-t where s.attr2 == val2 and t.@m2==true;

Start = select s from Start:s-(edge_e3)-t where s.attr3 == val3 and t.@m3==true;

print Start

Yes! The OrAccum strategy worked perfectly and generalized fine into the generic algorithm I’m working on. I had no idea you could annotate vertices across select statements within a query. Thank you!

Out of curiosity, do you think the intersect or OrAccum strategy would perform better in general? To use the intersect strategy I think an additional pass would be required since we’d want to order and limit the intersected output.

The intersect approach is going to be more efficient because it doesn’t need additional operations on ACCUM. The intersection is faster.