Merge sort algorithm is a comparison-based sorting algorithm. In this sample, we use top-down implementation, which recursively splits list into two halves (called sublists) until size of list is 1. Then merge these two sublists and produce a sorted list. This sample could run in serial, or in parallel with OpenMP* Tasking #pragma omp task and #pragma omp taskwait.
For more details about merge sort algorithm and top-down implementation, please refer to http://en.wikipedia.org/wiki/Merge_sort.
#pragma omp task, which allows to merge sublists in parallel.
#pragma omp task
void merge_sort(int a[], int tmp_a[], int first, int last)
{
if (first < last) {
int middle = (first + last + 1) / 2;
merge_sort(a, tmp_a, first, middle - 1);
merge_sort(a, tmp_a, middle,last);
merge(a, tmp_a, first, middle, last);
}
}
#pragma omp task version:
void merge_sort_openmp(int a[], int tmp_a[], int first, int last)
{
if (first < last) {
int middle = (first + last + 1) / 2; // = first + (last - first + 1) / 2;
// Splits list a[first:last] into two halves (called sublists).
// One is [first:middle-1], and another is [middle:last].
// For sake of performance, only when the list is big enough,
// we create tasks with #pragma omp task.
if (last - first < 5000) {
merge_sort(a, tmp_a, first, middle - 1);
merge_sort(a, tmp_a, middle, last);
}
else {
#pragma omp task
merge_sort_openmp(a, tmp_a, first, middle - 1);
#pragma omp task
merge_sort_openmp(a, tmp_a, middle, last);
#pragma omp taskwait
}
merge(a, tmp_a, first, middle, last);
}
| Modified time | Compiler (Intel® 64) | Compiler options | System specifications | |||
|---|---|---|---|---|---|---|
| 3.4x | Intel® C++ Compiler 19.0 for Windows | /O2 /Qipo /MD /Qopenmp /Qstd=c++11 |
|
|||
| 3.8x | Intel® C++ Compiler 19.0 for Linux | -O2 -ipo -qopenmp -std=c++11 |
|
.sln filePERF_NUMBuild.bat [perf_num] [vc|vs|cl]
perf_num: collect performance numbers (will run example 5 times and take average time)vc|vs|cl: Use the Visual Studio compilerBuild.bat run [help|0|1|2]Build.bat [perf_num]
perf_num: collect performance numbers (will run example 5 times and take average time)Build.bat runsource <icc-install-dir>/bin/compilervars.sh {ia32|intel64}make [icpc] [perf_num=1]
perf_num=1: collect performance numbers (will run example 5 times and take average time)make run [option=help|0|1|2]