Tip #2 - Checking existence of of a vertex without slowing your server

So, you have a vertex id string. And you’re not sure if it exists.

You could do the SQL like thing like this (hint, this is WRONG (until we fix it anyway), don’t do this!):

STRING thing_string = "the key I'm not sure about";
Result = select s from Thing:s where s.id=thing_string; //// oh noes!

The problem here is that TigerGraph will scan every Thing until it finds a match. Do this a lot and you’ll use up your CPU scanning vertices. It will give you the right result, but we want to be fast and efficient, right?

So here is the way.
We need a vertex id lookup.
To do that we have three options. One is to use a VERTEX parameter. This is the best way if we know the vertex exists.

There is also an internal function to_vertex(). This is also good within a function, but throws a runtime error if the vertex doesn’t exist. So no good for us.

What we need is the to_vertex_set() function. This is perfect. Given a set of vertex id’s, and the type of the vertex, it will find all the id’s that exist in the set, and return a set of VERTEX that correspond to those id’s. More to the point, it doesn’t throw an error if there are no vertices that correspond to the id’s.

All we need to do is some gymnastics so we can feed the function with what it needs both in terms of parameters and return types.

The signature of the function is:

<SET < VERTEX> > to_vertex_set( <SET < STRING > >, < STRING > )

The second parameter is the name of the type e.g. "Thing"

The first parameter has to be defined as a global SetAccum because that is how we define sets in function bodies e.g. SetAccum <STRING> @@my_thing_strings

The return type has to be defined as a global SetAccum because that is how we define sets in function bodies e.g. SetAccum <VERTEX> @@existing_things

After that, if we have anything in the return set, we have found the id! And even though the documentation says this is slow, it is a lot faster than scanning every vertex if you have a lot of vertices.

Giving us the following code:

 STRING thing_string = "the key I'm not sure about";

 SetAccum <STRING> @@my_thing_strings; 

SetAccum <VERTEX> @@existing_things;

@@my_thing_strings += thing_string; // set with single entry

@@existing_things = to_vertex_set(@@my_thing_strings, "Thing");

IF @@existing_things.size() > 0 THEN

// Do all the things when it exists

ELSE

// Do all the things if it doesn't exist

END;

We have a feature request in so we can automagically spot when someone uses the SQL like form, and do all this for you, but until then, this is the best way to retain the scorching performance we know and love.

2 Likes

Love this tip! @danmccreary Did you see this?

If you have tons of vertices (like 100 thousands, it will not easy to pass the parameters) need to check, SelectVertex() will really help you out
https://docs.tigergraph.com/dev/gsql-ref/querying/operators-functions-and-expressions#selectvertex

1 Like

And some query like this

CREATE QUERY test_vertex_parameter(VERTEX test_vertex_id) FOR GRAPH MyGraph { 
   PRINT test_vertex_id;
}

Will get 200 OK with two difficult results
Vertex exist {error:false}
Vertex not exist {error:true, message:“Failed to convert user vertex id for parameter test_vertex_id”}

1 Like