Understanding inter-play between parallel processing and accumulator?

Maatdeamon · July 20, 2020, 5:18pm

Looking at the simplest shortpath algorithm:

CREATE DISTRIBUTED QUERY shortest_ss_no_wt (VERTEX source, BOOL display) FOR GRAPH social {
/* This query is Single-Source Shortest Path without weights on edges. It calculates the shortest distance from the given vertex source to all other connected vertices, and shows one shortest path between them.   
The JSON version also show visualization of the network. 
The attribute version only store the distance into attribute, not the path.
*/

        MinAccum<int> @dis;
        OrAccum @visited;
        ListAccum<vertex> @path;
        SetAccum<edge> @@edgeSet;

  ##### Initialization  #####
        Source = {source};
        Source = SELECT s 
                 FROM Source:s
                 ACCUM s.@visited += true, 
                       s.@dis = 0,
                       s.@path = s; 
        ResultSet = {source};

  ##### Calculate distances and paths #####
        WHILE(Source.size()>0) DO
                Source = SELECT t
                         FROM Source:s -(Friend:e)-> :t
                         WHERE t.@visited == false
                         ACCUM t.@dis += s.@dis + 1,
                               t.@path = s.@path + [t],
                               t.@visited += true;
                ResultSet = ResultSet UNION Source;
        END;

  ##### Print the results #####

        PRINT ResultSet[ResultSet.@dis, ResultSet.@path];
        IF display THEN
                ResultSet = SELECT s
                            FROM ResultSet:s -(Friend:e)-> :t
                            ACCUM @@edgeSet += e;
                PRINT @@edgeSet;
        END;


}

Focused on this particular piece of code:

FROM Source:s -(Friend:e)-> :t
                         WHERE t.@visited == false
                         ACCUM t.@dis += s.@dis + 1,
                               t.@path = s.@path + [t],
                               t.@visited += true;

I wonder what does happens when two nodes in the shortest path lead to a same node i.e.
x-> A and y->A

A will be selected twice in the same iteration as part of the edge set right ?

Considering that t is A here, how is t updated during the aggregation phase from those two edges ? The last update wins ? in particular how is t.@path = s.@path + [t] updated, which path get added to A. The path from x, or the path y ?

Richard_Henderson · July 21, 2020, 8:58am

Could be either.

Given two paths of equal length, and given that we need to return a single path, it returns whichever one updates last as you thought.

If you run it multiple times with equal paths, you may see different paths returned.