Recommended query organization

jimwu · November 30, 2020, 8:13am

What is the recommended organization of queries?

I have a .gsql file that drops/creates/installs 10 queries. Each time, even if I only update 1 query, all 10 queries have to be recompiled. Is there a better way to do this? I could put each query in a separate file, but then the .gsql files would become unmanageable quickly.

Szilard_Barany · November 30, 2020, 1:59pm

Maybe just remove the INSTALL QUERY statements, and run an INSTALL QUERY ALL in a separate step?

Personally, I prefer modularity. In fact, I believe that large, monolithic code/script files get unmanageable, especially if a team is working on the code (collaboration is difficult if you have to worry all the time that your version of the code might overwrite other people’s additions/changes). Also, smaller code pieces allow more flexible and dynamic code changes. But this is my opinion only.

jimwu · November 30, 2020, 10:58pm

Ok. I will see how painful this becomes and then decide what the best approach is. Looks like a Makefile is in order at some point. Thanks a lot for your suggestion!
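For what it's worth, the Makefile idea boils down to reinstalling only the files that changed since the last run. Here is a minimal sketch in Python, assuming one query per .gsql file in a single directory. The function names and the JSON manifest file are hypothetical, and the actual install step is left out because it depends on the setup.

```python
import hashlib
import json
from pathlib import Path


def changed_query_files(query_dir, manifest_path):
    """Return (changed, hashes): the .gsql files whose content differs from
    the hashes recorded in the manifest, plus the current hash of every file."""
    manifest_file = Path(manifest_path)
    # No manifest yet means nothing was installed, so every file counts as changed.
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    hashes = {}
    changed = []
    for path in sorted(Path(query_dir).glob("*.gsql")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        hashes[path.name] = digest
        if old.get(path.name) != digest:
            changed.append(path.name)
    return changed, hashes


def save_manifest(manifest_path, hashes):
    """Record the current hashes; call this only after a successful install."""
    Path(manifest_path).write_text(json.dumps(hashes, indent=2, sort_keys=True))
```

The idea is to call changed_query_files before each test run, install only the files it returns, and then call save_manifest so the next run starts from the new state.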

Szilard_Barany · December 1, 2020, 11:59am

I would be interested in your use case.

Generally speaking, a database schema (including a graph database schema = graph) does not change frequently. I am not saying it will never change, but it’s not likely to change on a daily basis, so I do not think it should become painful to manage the code.

And in most cases, designing and updating a schema is a human activity, as it requires thorough thinking. Certain parts can be automated, but I would be concerned about fully automated schema building. Not from a technical point of view: TigerGraph can easily handle agile schema changes (e.g. incorporating new data sources), and scripting it is fairly simple; it’s the efficiency of the schema that would concern me. TigerGraph is highly optimised and can store and process/analyse data very fast, but poor schema design negatively impacts query performance. To avoid that, someone has to put some thinking into the design process.

jimwu · December 1, 2020, 6:17pm

Here is my use case: I am developing a TigerGraph plugin consisting of UDFs. My test consists of ~15 queries defined in two .gsql files. What I am seeing is that I spend a lot of time waiting for all 15 queries to install during testing, while most of the time I only modify 1 or 2 of them. I am OK with having each query in its own .gsql file, but the drop dependency is in the way (see my other post). I am interested in learning how other people manage this. The way it’s currently enforced, once a project has lots of queries, “drop query all” is inevitable because managing the drop dependencies can be difficult.
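One way the drop dependencies might be kept manageable is to record which query calls which and derive the order from that. The sketch below is in Python and rests on two assumptions: a query cannot be dropped while another query still calls it, and a caller has to be reinstalled when one of its callees changes. The query names and the `calls` map are hypothetical and would have to be maintained by hand or extracted from the .gsql files.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def affected_queries(changed, calls):
    """Return the changed queries plus every query that calls them,
    directly or indirectly. `calls` maps a query to the queries it calls."""
    affected = set(changed)
    grew = True
    while grew:
        grew = False
        for query, callees in calls.items():
            if query not in affected and affected & set(callees):
                affected.add(query)
                grew = True
    return affected


def install_and_drop_order(calls, only=None):
    """Return (install_order, drop_order). Callees are installed before
    their callers; dropping uses the reverse order, callers first."""
    order = list(TopologicalSorter(calls).static_order())
    if only is not None:
        order = [q for q in order if q in only]
    return order, list(reversed(order))


# Hypothetical example: "report" calls "helper", which calls "util".
calls = {"report": {"helper"}, "helper": {"util"}, "util": set(), "other": set()}
todo = affected_queries({"util"}, calls)
install_order, drop_order = install_and_drop_order(calls, only=todo)
```

With something like this, changing one leaf query only touches that query and its callers, and unrelated queries such as "other" stay installed.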

Mingxi_Wu · December 2, 2020, 4:56am

Did you try the interpret query mode? Or do you want to install these queries as a service?

We are considering adding a package concept to organize queries, but we are waiting for the ISO GQL standard so that we can be compatible with it.