Substring operations

Seems like I might be missing something obvious, but I don’t see any substring, left, right, mid, or any operation to get a substring. In my specific case, I want the first 3 characters of a string attribute that can be of varying lengths.

I don’t see a way to make TRIM work for this.

    string b = "aa ABC aaa";
    string c = " a A   ";

    PRINT trim(BOTH trim(BOTH " " FROM c) FROM b);
    # prints "BC"

@markmegerian can you give me an example. I’ve never written this before, but instinctually I’m thinking of using the LIKE operator to do string matching or something like (above) or a combo of both.

TRIM
https://docs.tigergraph.com/dev/gsql-ref/querying/operators-functions-and-expressions#string-functions

LIKE
https://docs.tigergraph.com/dev/gsql-ref/querying/operators-functions-and-expressions#like

Sure, here is an example. I don’t think I can combine LIKE and TRIM to get this to work.

Vertex Part - STRING partNum, STRING partDesc

The first 3 characters of any partNum is the category of part.
For instance, partNum J4566, “Front right body panel”
partNum J45 “Body panel”

I need to be able to go from J4566 to the first 3 characters to get to the category.

Note, I have changed the specifics, but this is the general idea. I don’t own the schema so I cannot change the data, I need to work with this. There is no EDGE between them, I just need to get to it using a simple substring(partNum,3) or something to that effect. While I realize that it would be ideal to have an EDGE, I still am interested in a general solution for substring.

I just verified GSQL doesn’t support substring. If you create a quick UDF like (not tested and not a C++ expert so you might need to tweak):

int main () {
inline string substring(string str, int b, int e)  
std::str.substr (b,e); 
return str;
}

source (C++ function http://www.cplusplus.com/reference/string/string/substr/)

Then to call:

partNum = J4566
b = 0
e = 2

substring(partNum,b,e)

# returns J45
 
# index 0 = J
# index 1 = 4
# index 2 = 5
1 Like

Got this working, good suggestion to use a UDF. Here is the simple code, since others are likely to want a substring function:

inline string substring(string str, int b, int e) {
return str.substr (b,e);
}

3 Likes

@markmegerian Ok we should just have a library… Just Initialized one and added here ecosys/UDF at master · tigergraph/ecosys · GitHub

UDF Library

String BASED UDF

substring

Given a string of text, return a substring from index begin to index end

Variable Description Example
str input string of text The Apple is Red
b index of string you would like to begin with index 4 = A
e index of string you would like to end with index 8 = e

UDF Code

inline string substring(string str, int b, int e) {
return str.substr (b,e);
}

Example
str = “The Apple is Red”
substring(s, 4, 8)

return = Apple

Now adding more:

1 Like

@Jon_Herke

I would love a simple datetime_to_datestring UDF which accepts a DATETIME and returns a sortable string YYYY-MM-DD format. Doesn’t have to be fancy.

Just thinking off the top of my head.

What if you create a local variable like @epochINT then do the sort ASC or DECS on @epochINT and return datetime?

Frank
Birthdate: 1989-10-10

Jill
Birthdate: 1999-12-09

Frank
Birthdate: 2000-03-20

Run QUERY with DESC on @epochINT

Frank
Birthdate: 2000-03-20
@epochINT: 953580840

Jill
Birthdate: 1999-12-09
@epochINT: 944768040

Frank
Birthdate: 1989-10-10
@epochINT: 624051240

@markmegerian

True, but the issue I have encountered dozens of times, if not more, is that I want to use the sortable string representation inside my query, in an accumulator, and also use it to return human-readable output.

So using the epoch achieves the sorting part, but not the output part. Say I want a Person Vertex to have a MapAccum<STRING, INT> where the STRING is human readable date and the INT is a count of something. I have to construct the date (like in my Fun with Date Strings post) but having a UDF to accomplish this would be much cleaner.

One other point, even though there is conversion using datetime_format at print time, this doesn’t solve the problem either. I need the date string at query time, inside ACCUM, not just at PRINT time.