Merge sort algorithm is a comparison-based sorting algorithm. In this sample, we use top-down implementation, which recursively splits list into two halves (called sublists) until size of list is 1. Then merge these two sublists and produce a sorted list. This sample could run in serial, or in parallel with OpenMP* Tasking #pragma omp task and #pragma omp taskwait.

For more details about merge sort algorithm and top-down implementation, please refer to http://en.wikipedia.org/wiki/Merge_sort.

System Requirements:

Code Change Highlights:

This code is optimized with #pragma omp task, which allows to merge sublists in parallel.
#pragma omp task
serial version:
void merge_sort(int a[], int tmp_a[], int first, int last)
{
    if (first < last) {
        int middle = (first + last + 1) / 2;
        merge_sort(a, tmp_a, first, middle - 1);
        merge_sort(a, tmp_a, middle,last);
        merge(a, tmp_a, first, middle, last);
    }
}
#pragma omp task version:
void merge_sort_openmp(int a[], int tmp_a[], int first, int last)
{
    if (first < last) {
        int middle = (first + last + 1) / 2; // = first + (last - first + 1) / 2;
        // Splits list a[first:last] into two halves (called sublists).
        // One is [first:middle-1], and another is [middle:last].
        // For sake of performance, only when the list is big enough,
        // we create tasks with #pragma omp task.
        if (last - first < 5000) {
            merge_sort(a, tmp_a, first, middle - 1);
            merge_sort(a, tmp_a, middle, last);
        }
        else {
			#pragma omp task
            merge_sort_openmp(a, tmp_a, first, middle - 1);
			#pragma omp task
            merge_sort_openmp(a, tmp_a, middle, last);
			#pragma omp taskwait
        }
        merge(a, tmp_a, first, middle, last);
}

Performance Data:

Modified time Compiler (Intel® 64) Compiler options System specifications
3.4x Intel® C++ Compiler 19.0 for Windows /O2 /Qipo /MD /Qopenmp /Qstd=c++11
Microsoft Windows Server* 2017 (x64)
Intel(R) Core(TM) i7-4770R CPU @ 3.2GHz
8GB memory
3.8x Intel® C++ Compiler 19.0 for Linux -O2 -ipo -qopenmp -std=c++11
Red Hat Enterprise Linux Server 7
3nd Generation Intel® Core™ i7-4790 CPU @ 3.60GHz
32GB memory

Build Instructions:

For Visual Studio* users:
For Windows Command Line users:
For Linux*/macOS* users: