TEMP_TABLE cleanup

Hi, I am using TEMP_TABLE to prepare data and then load it, and it works great. After the data is loaded, should I drop the TEMP_TABLE to clean up memory? If so, is there any special syntax, and when should I drop it?

Also, I noticed that after using TEMP_TABLE, loading sometimes gets stuck, and I have to run gadmin restart all -y rather frequently. When it is stuck, I see this in the log:

[INFO] Start loading /home/tigergraph/mydata/mydata.csv, LineBatch = 8192, LineOffset = 1, ByteOffset = 140
0223 17:02:59.458 I event/command_consumer.go:56] Received cmd. Name: ReportKafkaStatus, Id: ab96bc427c284357a5894348766d074e, CreatedTimeStamp: 1614099778798289634, DeadlineTimeStamp: 1614099793798289634  
0223   17:02:59.570 D event/command_consumer.go:58] Cmd Id: ab96bc427c284357a5894348766d074e, Parameter: type_url:"type.googleapis.com/tigergraph.tutopia.common.pb.Bytes" value:"\n\014\010\001\020\001(:0\302\352\324\201\006"  
E0223 17:03:23.047664 36894 heartbeat_client.cpp:211] Issue client session write failed. rc:kNoResource round:39  
E0223 17:03:26.408869 36894 heartbeat_client.cpp:211] Issue client session write failed. rc:kNoResource round:42  
E0223 17:03:30.693869 36912 gdict.cpp:584] RPC timeout reached, no retry, timeout: 5000
E0223 17:03:30.698330 36912 gbrain_service_manager.cpp:366] Can not list service root: /__services, rc: kTimeout
0223 17:03:30.969 I event/command_consumer.go:56] Received cmd. Name: ReportKafkaStatus, Id: 54a2a517bc804fe89b2832125afedf60, CreatedTimeStamp: 1614099810719235860, DeadlineTimeStamp: 1614099825719235860

Thanks,
John

Is this on a distributed cluster or a single machine?

Single node running on Docker.

We are trying to reproduce it. Do you have a simple test case that shows the usage pattern, such as a sample JSON file and the schema?

Do you have a list in one object that you want to flatten at the same time?

Hi Mingxi (@mingxiwu), here is the simplified info:

schema

CREATE SCHEMA_CHANGE JOB change_schema FOR GRAPH MyGraph {
    ADD VERTEX Application(PRIMARY_ID id STRING, name STRING, name2 STRING, name3 STRING, note STRING, tag STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true";
    ADD VERTEX CapabilityDomain(PRIMARY_ID id STRING, name STRING, note STRING, tag STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true";
    ADD VERTEX Capability(PRIMARY_ID id STRING, name STRING, note STRING, tag STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true";

    ADD DIRECTED EDGE serves(FROM Application, TO Capability) WITH REVERSE_EDGE="rev_serves";
    ADD DIRECTED EDGE has_a(FROM CapabilityDomain, TO Capability) WITH REVERSE_EDGE="rev_has_a";
}

loader

# (the CREATE LOADING JOB line was cut from the original snippet; the job name below is illustrative)
CREATE LOADING JOB load_app_data FOR GRAPH MyGraph {
  DEFINE FILENAME MyDataSource = "myinput.csv";

  # trim all data and give them more friendly names
  LOAD MyDataSource TO TEMP_TABLE ttTrim2 
      (appName, domain, capability, Y2021, Y2022, Y2023, Y2024, Y2025, Y2026, Y2027, ID, infoID) 
    VALUES 
      (gsql_trim($0), gsql_trim($1), gsql_trim($2), gsql_trim($3), gsql_trim($4), gsql_trim($5), gsql_trim($6), gsql_trim($7), gsql_trim($8), gsql_trim($9), gsql_trim($10), gsql_concat("Info: ", gsql_trim($0), ":", gsql_trim($1), ":", gsql_trim($2)))
    USING SEPARATOR=",", HEADER="true", EOL="\n", QUOTE="double";

  LOAD TEMP_TABLE ttTrim2 
    TO VERTEX Application VALUES($"ID", $"appName", _, _, _, "tag"),
    TO VERTEX CapabilityDomain VALUES($"domain", $"domain", _, "tag"),
    TO VERTEX Capability VALUES($"capability", $"capability", _, "tag"),
    TO EDGE serves VALUES($"ID", $"capability"),
    TO EDGE has_a VALUES($"domain", $"capability");
}
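For completeness, a job like this would be installed and run along these lines. The job name and absolute file path are assumptions here, since the CREATE LOADING JOB header isn't shown in the snippet above:

```gsql
INSTALL QUERY ALL
RUN LOADING JOB load_app_data USING MyDataSource="/home/tigergraph/mydata/myinput.csv"
```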
myinput.csv
Application Name,Domain,Capability,2021,2022,2023,2024,2025,2026,2027,ID
App1,Domain1,Cap1,A,,,,,,,1111
App2,Domain2,Cap1,B,,,,,,,2222
App2,Domain2,Cap1,B,,,,,,,2222
App2,Domain2,Cap2,B,,,,,,,2222
App3,Domain3,Cap3,B,B,B,B,B,B,,3333
App3,Domain4,Cap3,B,B,B,B,B,B,,3333
App3,Domain5,Cap3,B,B,B,B,B,B,,3333
App3,Domain6,Cap4,B,B,B,B,B,B,,3333
App3,Domain7,Cap5,B,B,B,B,B,B,,3333
App4,Domain8,Cap5,A,,,,,,,4444

The data is a couple of hundred rows of the same type of info.
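For anyone reading along, the two-phase TEMP_TABLE load above can be sketched in plain Python. This is an illustration only: in TigerGraph the deduplication of repeated rows (e.g. the duplicate App2 lines) happens through upsert-by-primary-ID semantics, which the dictionaries and sets below stand in for.

```python
import csv
import io

# Two data rows mirroring myinput.csv (stray spaces added to show the trimming).
SAMPLE = """\
Application Name,Domain,Capability,2021,2022,2023,2024,2025,2026,2027,ID
App1 , Domain1 ,Cap1,A,,,,,,,1111
App2,Domain2,Cap1,B,,,,,,,2222
"""

def load(text):
    """Mimic the two-phase load: trim every field (gsql_trim), derive infoID
    (gsql_concat), then upsert vertices/edges keyed by their primary IDs so
    duplicate input rows collapse into a single vertex/edge."""
    apps, domains, caps = {}, {}, {}
    serves, has_a = set(), set()
    reader = csv.reader(io.StringIO(text))
    next(reader)                                 # HEADER="true"
    for raw in reader:
        row = [field.strip() for field in raw]   # gsql_trim($0) .. gsql_trim($10)
        app, domain, cap, app_id = row[0], row[1], row[2], row[10]
        # Built in the TEMP_TABLE but not referenced by the second LOAD above.
        info_id = f"Info: {app}:{domain}:{cap}"  # gsql_concat("Info: ", ...)
        apps[app_id] = app                       # TO VERTEX Application
        domains[domain] = domain                 # TO VERTEX CapabilityDomain
        caps[cap] = cap                          # TO VERTEX Capability
        serves.add((app_id, cap))                # TO EDGE serves
        has_a.add((domain, cap))                 # TO EDGE has_a
    return apps, domains, caps, serves, has_a
```

Running `load(SAMPLE)` yields two Application vertices, two CapabilityDomain vertices, one shared Capability vertex, and two serves edges pointing at it.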