Example that uses parallel_do to do parallel preorder traversal of a sparse graph.

Each vertex in the graph is called a "cell". Each cell has a value. The value is a matrix. Some of the cells have operators that compute the cell's value, using other cell's values as input. A cell that uses the value of cell x is called a successor of x.

The algorithm works as follows.

  1. Compute the set of cells that have no inputs. This set is called root_set.
  2. Each cell has an associated field ref_count that is an atomic integer. Initialize ref_count to the number of inputs for the Cell.
  3. Update each cell in root_set, by applying a parallel_do to a root_set
  4. After updating a cell, for each of its successors
    1. Atomically decrement the successor's ref_count
    2. If the count became zero, add the cell to the set of cells to be updated, by calling parallel_do_feeder_impl::add.
The times printed are for the traversal and update, and do not include time for computing the root_set.

The example is using custom synchronization via ref_count atomic variable. Correctness checking tools might not take this into account, and report data races between different tasks that are actually synchronized.

Note: It is important to understand that this example is unlikely to show speedup if the cell values are changed to type "float". The reason is twofold.

System Requirements

For the most up to date system requirements, see the release notes.

Files
main.cpp
Main program which parses command line options and runs the algorithm with different numbers of threads.
parallel_preorder.cpp
Implementation of the parallel preorder traversal algorithm.
Graph.h
Interfaces of the Graph and Cell classes.
Graph.cpp
Implementations of the Graph and Cell classes.
Matrix.h
The Matrix class definition.
Makefile
Makefile for building the example.
Directories
msvs
Contains Microsoft* Visual Studio* workspace for building and running the example (Windows* systems only).
xcode
Contains Xcode* IDE workspace for building and running the example (macOS* systems only).

For information about the minimum supported version of IDE, see release notes.

Build instructions

General build directions can be found here.

Usage
parallel_preorder -h
Prints the help for command line options
parallel_preorder [n-of-threads=value] [n-of-nodes=value] [n-of-traversals=value] [silent]
parallel_preorder [n-of-threads [n-of-nodes [n-of-traversals]]] [silent]
n-of-threads is the number of threads to use; a range of the form low[:high], where low and optional high are non-negative integers or 'auto' for a platform-specific default number.
n-of-nodes is a number of nodes in the graph. Default value is 1000.
n-of-traversals is the number of times to evaluate the graph. Default value is 500.
silent - no output except elapsed time.
To run a short version of this example, e.g., for use with Intel® Parallel Inspector:
Build a debug version of the example (see the build instructions).
Run it with the desired number of threads and smaller number of traversals, e.g., parallel_preorder 4 1000 5.

Up to parent directory
Legal Information

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.
© 2019, Intel Corporation