I’m noticing that when I run a query on a vertex I haven’t accessed yet, it can sometimes be quite slow (30-40 seconds). When I run the same query again it’s fast (less than 10 seconds). Is this due to caching behavior? Was the required data not in memory the first time? Do I not have enough memory to fit my entire graph?
When you first start TigerGraph, it takes some time to warm up (loading the topology into memory, etc.).
If you are using 3.0, you can use gadmin status to see the status of each component.
After warm-up, performance should be stable. A little variance can occur due to CPU cache warm-up.
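One way to confirm it is warm-up rather than a steady-state problem is to time the same query several times and see whether only the first run stands out. A minimal sketch; the run_query callable is a placeholder for however you invoke your installed query (e.g. an HTTP call to the REST endpoint):

```python
import time

def time_runs(run_query, runs=5):
    """Time repeated executions of a query callable.

    A large gap between the first duration and the rest suggests
    warm-up (topology load, OS page cache, CPU caches) rather than
    a steady-state performance problem.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()  # placeholder: call your installed query here
        durations.append(time.perf_counter() - start)
    return durations

# Example with a dummy workload standing in for a real query:
times = time_runs(lambda: sum(range(10**5)), runs=3)
print(times)
```

After a cold start you would expect the first element to be noticeably larger than the rest; once warmed up, all runs should be close together.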
Per your question re RAM, run a free command on the host to see if you are swapping.
Any non-zero values there will slow things down and indicate you need more RAM.
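The same check can be scripted by reading /proc/meminfo directly (Linux only); anything above zero means the kernel has pushed pages out to disk. A small sketch:

```python
def swap_used_kib():
    """Return swap currently in use, in KiB, from /proc/meminfo (Linux)."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            fields[key] = int(value.split()[0])  # values are reported in kB
    return fields["SwapTotal"] - fields["SwapFree"]

print(f"swap in use: {swap_used_kib()} KiB")
```

This is handy if you want to poll for swapping while a slow query is actually running, rather than checking once after the fact.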
I have swapping off.
Something I noticed is that switching from standard drives to SSDs increases performance quite a lot. Is TigerGraph reading some things from disk, or is it keeping everything in memory?
Normally it does not need to read from disk. It always writes to disk, of course. There are settings that can keep some information on disk, but they are unusual. Certainly not the default.
Initial reading is done using mmap, which can occasionally page, but you say you have enough memory and swapping is off, so that shouldn't be a concern. This is quite mysterious. Can you tell us a little more about your setup? Is it on-prem? Bare metal or virtual? How much data versus memory?
Generally, it is a memory-based database for reads.
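If you want to rule mmap paging in or out directly, you can watch the major page fault counter for the TigerGraph process: a major fault is a memory access that had to be served from disk, so a rising count during a slow first query points at paging. A sketch reading the majflt field of /proc/&lt;pid&gt;/stat (Linux); substitute the pid of the GPE process for the demo pid used here:

```python
import os

def major_faults(pid):
    """Return the cumulative major page fault count for a process.

    Reads /proc/<pid>/stat (Linux). Major faults are memory accesses
    satisfied from disk, e.g. mmap'd pages not yet resident.
    """
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name (field 2) may contain spaces; split after its ')'.
    after_comm = stat.rsplit(")", 1)[1].split()
    return int(after_comm[9])  # majflt is the 12th field overall

# Demo on our own pid; point this at the TigerGraph process instead.
print(major_faults(os.getpid()))
```

Sampling this before and after the first slow query, and again around a fast repeat, would show whether disk reads via page faults explain the difference.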
This is TigerGraph 2.5.1 Enterprise Edition running on Google Cloud Platform on a VM with 60 GB RAM; the TigerGraph license caps memory usage at 50 GB.
The size of the data in CSV files on disk is 71 GB; however, that's fairly inflated because I'm using integers as IDs, and a lot of the data is stored as STRING COMPRESS attributes with only a few hundred possible values.
```
$ free -h
              total        used        free      shared  buff/cache   available
Mem:            58G         41G        6.4G        1.1M         11G         17G
Swap:            0B          0B          0B
```
```
$ gadmin status -v graph
verbose is ON
=== graph ===
[m1     ][GRAPH][MSG ] Graph was loaded (/home/tigergraph/tigergraph/gstore/0/part/):
  partition size is 33.89GiB, IDS size is 18.61GiB, SchemaVersion: 0,
  VertexCount: 587225687, NumOfSkippedVertices: 0, NumOfDeletedVertices: 0,
  EdgeCount: 1509633144
[m1     ][GRAPH][INIT] True
[INFO   ][GRAPH][MSG ] Above vertex and edge counts are for internal use which show
  approximate topology size of the local graph partition. Use DML to get the correct
  graph topology information
[SUMMARY][GRAPH] graph is ready
```
Let me know if you’d like to see anything else. In general we get satisfactory performance out of TigerGraph, but I’m looking into improving it (or at least understanding which factors affect it).
To be clear, I’m only talking about read workloads here. This database is read-only at this point.
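As a rough sanity check on whether the graph fits in memory, the numbers in the gadmin output above can be added up: the partition plus the ID store is about 52.5 GiB, against the 58 G of RAM that free reports (note the GiB/GB mixing here is approximate; this says nothing about how much of the gstore is actually resident at any moment):

```python
# Figures taken from the gadmin status output above.
partition_gib = 33.89  # "partition size"
ids_gib = 18.61        # "IDS size"
total_gib = partition_gib + ids_gib

ram_gib = 58           # total RAM as reported by free -h
print(f"gstore footprint: {total_gib:.2f} GiB")  # → 52.50 GiB
print(f"headroom: {ram_gib - total_gib:.2f} GiB")
```

So the topology is a tight but plausible fit in memory, which is consistent with the free output showing 41 G used and no swap.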
OK, that looks fine. I suggest you temporarily increase the size of the instance and see whether it displays the same behavior. If this happens immediately after boot, the OS may shuffle memory between main memory and disk cache until it reaches a steady state, but I would expect that to finish relatively quickly, and I can’t remember the last time I saw it as a problem.
If it is a significant problem then we can open a ticket with support and see if engineering have any further insight. Let me know what you’d like to do.