Use of duplicate edge name

I am working on a health care provider graph model and trying to model the following schema, but I am presented with the error message “Vertex or Edge name cannot be duplicated in on graph”. I understand the restriction for vertices but don’t agree with the same restriction on edges. In real life, the same relationship can exist between different nodes, and I would like the flexibility to use the same relationship name between different vertices.

Ex.

Practitioner -(PARTICIPATE_IN)- ProviderNetwork

ProviderGroup -(PARTICIPATE_IN)- ProviderNetwork

Facility -(PARTICIPATE_IN)- ProviderNetwork

Here three different entities (individual Practitioner, ProviderGroup, and Facility) that can participate in a provider network to provide services are represented. All three have the same relationship (PARTICIPATE_IN) with the ProviderNetwork, but the TigerGraph restriction forces me to write each relationship with a unique name, which I feel is counterintuitive for a graph model.
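For illustration, the unique-name workaround described above might look roughly like the sketch below. It derives one edge type name per vertex pair from the shared relationship name and prints a GSQL statement for each. The helper names and the `Source_RELATIONSHIP_Target` naming convention are assumptions made up for this example, not anything TigerGraph prescribes.

```python
def unique_edge_name(source: str, relationship: str, target: str) -> str:
    """Hypothetical naming convention: prefix and suffix the shared
    relationship name with the vertex types it connects."""
    return f"{source}_{relationship}_{target}"


def create_edge_statements(relationship: str, sources: list, target: str) -> list:
    """Return one CREATE statement per source vertex type."""
    return [
        f"CREATE UNDIRECTED EDGE {unique_edge_name(s, relationship, target)} "
        f"(FROM {s}, TO {target})"
        for s in sources
    ]


statements = create_edge_statements(
    "PARTICIPATE_IN",
    ["Practitioner", "ProviderGroup", "Facility"],
    "ProviderNetwork",
)
for stmt in statements:
    print(stmt)
```

The cost of this approach is the one raised in the question: every query that means "participates in" has to know all three edge type names.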

Hi Ashok,

If you are defining the edge from the command line, you can write your schema script this way:

CREATE UNDIRECTED EDGE PARTICIPATE_IN (FROM Practitioner|ProviderGroup|Facility, TO ProviderNetwork)

Then, during data loading, you need to specify the vertex type like this:

However, defining edges like this makes the GraphStudio schema visualization messy.

If you are defining the schema in GraphStudio, you can set the starting vertex type to Any Vertex.

Thanks.

@Jon_Herke

I want to throw in a few cents here, cause I’ve been trying to migrate to Tiger for a few days now and also bumped up against this concern. I’m excited by so much of the graph performance and features, but some design decisions like this one have frustrated my adoption process to the point of giving up after a few days of effort in several directions (trying cloud, then building out EE, building schemas in Graph Studio then GSQL, then trying to operate on graph with built-in endpoints).

OO languages scope field names to classes. Since classes and fields are used to represent data and relationships in code, it holds that we’d often reuse the same values to represent ‘edges’ between different sets of models - for instance, every codebase I’ve ever touched has used the field ‘user’ or similar to represent an edge relationship between many models and type User. But not all models, which also matters for robustness and security.

A lack of similar syntactic scope on the graph is pretty unintuitive, and it’s a hard block because we lose either natural JSON de/serialization or schematic clarity. Wildcard edges as a substitute violate our natural data scopes and principles of least privilege, which hurts our ability to construct robust data systems at scale.

I’ve found similar issues elsewhere - for instance, I find it really confusing that there’s no natural way to represent one-to-one relationships, enums, or unique values. And that the query/upsert schema is so far from a natural serialization of models that it feels pretty intractable to save or retrieve application data in a generified way…I think it basically takes custom codegen since upserts require inserting a ‘value’ object for each attribute (equally complex to handle GraphQL endpoint with native models since edges are represented as types, when they are most often attribute-less fields in code).

In addition to these schema and query design issues, there are other significant practical concerns that drive up adoption efforts for app developers like me, some of which I noted elsewhere: no JWT claims support (gotta write custom middleware), no subscriptions for reactive design (gotta build an intermediate service that is expensive in time, cost, and network traffic), GSQL features out of sync with GraphStudio (had to try/fail to match edges to models in both because of reverse_edge limitations), and $11k minimum yearly cost for cloud persistence (!!! - gotta spin up EE Free, though EE setup is awesomely easy).

I think it’s important to share that, as a developer focused on building scalable, unified data contexts for modern web/app businesses, I can’t justify the overhead and schematic fuzziness of interoperating with TG in its current form even though so much of the feature set makes me want to commit. It’s clear you’ve had a ton of success designing around analytics and data ingestion for existing datasets, but it really feels like the programmatic needs and data pipelines of modern app developers were kinda ignored…despite them being the core driver of growth in your existing tangible markets.

I hope these designs are refined with time - particularly those necessary to support our existing models with standard serialized object queries - because the product otherwise looks so powerful and primed for our rapidly growing data system needs as well. I’m really excited to get it working…but it’s a shame that it took so much effort digging through docs and trying things out to even grok the extent of these limitations vs. other relational/graph databases for app development.

Thanks for the feedback. I would love to invite you to give some examples, so that we can improve our product.

Let me try to get clarity on the pain points.

  • “OO languages scope field names to classes…”

Do you suggest we should allow an attribute/field name to be identical to its vertex/edge type name? E.g.

create vertex person (ssn int primary key, person string)

  • “A lack of similar syntactic scope on the graph is pretty unintuitive, and it’s a hard block because we lose either natural JSON de/serialization or schematic clarity. Wildcard edges as a substitute violate our natural data scopes and principles of least privilege, which hurts our ability to construct robust data systems at scale.”

Can you give an example that blocks you on the above point? I have a guess, but I want to make sure I understand your blockers.

  • “I’ve found similar issues elsewhere - for instance, I find it really confusing that there’s no natural way to represent one-to-one relationships, enums, or unique values. And that the query/upsert schema is so far from a natural serialization of models that it feels pretty intractable to save or retrieve application data in a generified way…I think it basically takes custom codegen since upserts require inserting a ‘value’ object for each attribute (equally complex to handle GraphQL endpoint with native models since edges are represented as types, when they are most often attribute-less fields in code).”

I need an example to see the pain point.

@Mingxi_Wu

Of course! Consider the following pseudo-code typical for application data models:

interface Entity {
     String id;
     DateTime createdTime;
     DateTime updatedTime;
}

class User extends Entity {
     Set<Photo> photos;
     Set<Photo> taggedPhotos;
     Photo profilePhoto;
     Set<Video> videos;
}

class Photo extends Entity {
     User creator;
     Set<User> taggedUsers;
}

class Video extends Entity {
     User creator;
}

function setEntity<T extends Entity>(T entity) {
     // This generic function uses serializers to save any entity to TigerGraph
}

  1. I don’t mean equivalence between a vertex name and an attribute/edge name, but rather two distinct ideas:

    • Freedom to define the reverse edge name. Here we have the relationship (User) <> (Photo) represented by edge ‘photos’ + reverse ‘creator’. If I can’t freely define the edge/reverse names (e.g., in GraphStudio), I can’t maintain congruence between my graph model and native data models. For instance, I won’t be able to easily serialize the data for storage using standard libraries without custom code or annotations to translate the field names. This is because serialization libraries often use codegen to introspect on data models and generate code that maps JSON keys to the identically named Object fields. Thus I must instead build workarounds to match the arbitrary ‘reverse_xyz’ TigerGraph edge names to their natural fields. That’s actually a lot of manual labor and cognitive load when we’re rapidly prototyping models and otherwise have out-of-the-box serialization solutions.

    • Logical scoping of edge names to unique vertex pairs. (User) <> (Photo) and (User) <> (Video) both have edge ‘creator,’ but each has a different reverse edge (‘photos’, ‘videos’). GSQL doesn’t let me use two distinct edge+reverse statements to define these two relationships, since ‘creator’ is scoped globally when the first statement is executed. I see that in GSQL I can create a wildcard edge ‘creator’ for vertices Photo|Video, but this syntax only allows me to create one reverse edge name. Same schema disconnect + de/serialization problem. And for our purposes, it makes much more sense to define edge+reverse names uniquely for each relationship than in a single wildcard statement that has to be searched and updated anytime a developer happens to use the same field name.

  2. Consider a serializer that takes a User with a profilePhoto to construct this JSON:

     {
        "type": "User",
        "id": "abc",
        "createdTime": 123,
        "updatedTime": 123,
        "profilePhoto": {
           "id": "bcd",
           "createdTime": 123,
           "updatedTime": 123
        }
     }
    
    • My claim is that it’s solvable but onerous to intercept this JSON and convert it into the format that TigerGraph expects, where each attribute has a map "attribute": {"value": "xyz", "op": code}. While it might seem nice to have sophisticated upsert operations per attribute, it’s also a very complicated API when application needs are often basic CRUD operations. It concerns me how, as a primary API, this violates the KISS principle - it seems terribly over-engineered.

    • I would love a simplified endpoint such that I can call setEntity() to send this JSON to create these vertices + reverse/edge accordingly, maybe even using a syntax where I can explicitly define one op code for the entire JSON payload.

  3. It is important to the structure and robustness of my application that only one profilePhoto exists for each User at any time, as this matches both my mental model and natural data structures.

  4. Given all of the above, imagine the challenges of taking the TigerGraph output via REST or GraphQL, which is nowhere near a standard JSON representation of the data model (example in #2 above), and reconstructing an object through deserialization.
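To make the conversion in item 2 concrete, here is a minimal Python sketch of the kind of glue code involved. It takes the plain entity JSON and wraps each attribute in a {"value": ...} map, as described above. The outer "vertices"/"edges" nesting reflects my understanding of the REST upsert payload and should be treated as an assumption, as should the `EDGE_FIELDS` table and both function names, which are invented for this example. The optional "op" code is left out.

```python
import json

# Assumption: maps an entity field to (edge type, target vertex type).
# These names are illustrative and not taken from any real schema.
EDGE_FIELDS = {"profilePhoto": ("profilePhoto", "Photo")}


def wrap_attributes(entity: dict, skip: set) -> dict:
    """Wrap each plain attribute as {"value": ...}, skipping non-attribute keys."""
    return {k: {"value": v} for k, v in entity.items() if k not in skip}


def to_upsert_payload(entity: dict) -> dict:
    """Convert a nested entity dict into a vertices/edges payload."""
    vertices, edges = {}, {}
    v_type, v_id = entity["type"], entity["id"]
    skip = {"type", "id", *EDGE_FIELDS}
    vertices.setdefault(v_type, {})[v_id] = wrap_attributes(entity, skip)

    for field, (edge_type, target_type) in EDGE_FIELDS.items():
        child = entity.get(field)
        if child is None:
            continue
        # The nested object becomes its own vertex...
        child_attrs = wrap_attributes(child, {"type", "id"})
        vertices.setdefault(target_type, {})[child["id"]] = child_attrs
        # ...plus an attribute-less edge from the parent to it.
        targets = (
            edges.setdefault(v_type, {})
            .setdefault(v_id, {})
            .setdefault(edge_type, {})
            .setdefault(target_type, {})
        )
        targets[child["id"]] = {}
    return {"vertices": vertices, "edges": edges}


user = {
    "type": "User",
    "id": "abc",
    "createdTime": 123,
    "updatedTime": 123,
    "profilePhoto": {"id": "bcd", "createdTime": 123, "updatedTime": 123},
}
payload = to_upsert_payload(user)
print(json.dumps(payload, indent=2))
```

Note that the field-to-edge table has to be maintained by hand or generated, which is exactly the custom codegen burden being described.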

The general point is that every statically typed app development language - Java, Swift, Dart, etc - uses an industry standard way of representing data on the wire. TigerGraph ignores this standard completely, which requires us to write a whole mess of custom code to construct/deconstruct the representation that TG understands (worse, two distinct messes for REST and GraphQL). Further, standard edge relationships (one-to-one) aren’t supported and there’s a disconnect in edge definition, which leaves a ton of cognitive burden on the user to juggle two disjoint data schemas. Thus high startup cost, poor scalability, and very low robustness.
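As a rough illustration of the deserialization side, the sketch below flattens a vertex record back into the plain shape a standard deserializer expects. It assumes the response represents each vertex as a map with "v_id", "v_type", and a nested "attributes" map; that shape and the function name are assumptions for this example, and reattaching related vertices (such as profilePhoto) via edges is left out entirely.

```python
def flatten_vertex(record: dict) -> dict:
    """Merge the vertex id, type, and attributes into one flat dict."""
    flat = {"type": record["v_type"], "id": record["v_id"]}
    flat.update(record.get("attributes", {}))
    return flat


record = {
    "v_id": "abc",
    "v_type": "User",
    "attributes": {"createdTime": 123, "updatedTime": 123},
}
user = flatten_vertex(record)
print(user)
```

Even this small step has to run before any generated fromJson-style code can be used, and it says nothing yet about stitching edges back into nested fields.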

Some example references on serializing data:

https://developer.apple.com/documentation/foundation/archives_and_serialization/encoding_and_decoding_custom_types