The matrix sample allows you to explore the various capabilities of the Intel® VTune™ Amplifier for analyzing threaded applications.
By default, the sample uses a square matrix of size 2048. To increase the matrix size, redefine the matrix-size macro in multiply.h. For simplicity, the sample requires the matrix size to be a multiple of the number of threads. By default, the number of threads equals the number of CPU cores available in the system, which is determined at run time. To limit the maximum number of threads created, modify the MAXTHREADS macro.
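The relationship between matrix size, core count, and the thread limit can be sketched as follows. This is only an illustration, not the sample's actual code: the names SKETCH_MSIZE, SKETCH_MAXTHREADS, pick_thread_count, and row_range are hypothetical, and the limit of 16 is an assumed value. The sketch clamps a detected core count to the limit and gives each thread an equal band of rows, which is why the size must divide evenly.

```c
#include <assert.h>

/* Hypothetical stand-ins for the macros in multiply.h; the limit of 16
   is an assumed value, not taken from the sample. */
enum { SKETCH_MSIZE = 2048, SKETCH_MAXTHREADS = 16 };

/* Clamp the core count detected at run time to the configured maximum. */
int pick_thread_count(int cores_detected, int max_threads)
{
    if (cores_detected < 1)
        return 1;
    return cores_detected > max_threads ? max_threads : cores_detected;
}

/* Compute the half-open row range [first, last) owned by thread tid.
   The matrix size must be a multiple of the thread count so that every
   thread gets the same number of rows. */
void row_range(int msize, int nthreads, int tid, int *first, int *last)
{
    assert(nthreads > 0 && msize % nthreads == 0);
    assert(tid >= 0 && tid < nthreads);
    int rows_per_thread = msize / nthreads;
    *first = tid * rows_per_thread;
    *last = *first + rows_per_thread;
}
```

With a size of 2048 and 8 threads, each thread would own 256 consecutive rows. A size that is not a multiple of the thread count would need extra remainder handling, which the sample avoids by requiring divisibility.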
By default, the sample uses the native Win32 threading model. An OpenMP* threading model is also available. For the Intel® Math Kernel Library (Intel® MKL)-based kernel, you can use either the multithreaded or the single-threaded Intel MKL implementation; in the single-threaded case, you need to parallelize the kernel yourself. To change the threading model, select the Release (default), Release_OMP, or Release_MKL project configuration.
You can select one of several kernels available in the sample. Specify the kernel as the definition of the MULTIPLY macro in the multiply.h header file and rebuild the project.
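To give a feel for what an OpenMP-based threading model looks like in a multiply kernel, here is a minimal sketch. It is an assumption about the general shape of such code, not the sample's implementation; the function name multiply_omp_sketch and the flat row-major layout are hypothetical. The pragma splits the outer row loop across threads when the code is compiled with OpenMP enabled; without it, the loop simply runs serially.

```c
#include <assert.h>

/* Computes c = a * b for n x n matrices stored in row-major order.
   Each iteration of the outer loop writes a distinct row of c, so the
   rows can be distributed across threads without synchronization. */
void multiply_omp_sketch(int n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

A native-threads version would do the same row partitioning by hand, creating the threads and passing each one its row range.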
multiply0 - Basic serial implementation
multiply1 - Basic multithreaded implementation
multiply2 - Optimized implementation with loop interchange and vectorization (use the compiler vectorization options)
multiply3 - Optimized implementation that adds cache blocking and data alignment (add the ALIGNED macro to the compiler preprocessor definitions)
multiply4 - Optimized implementation with matrix transposition and loop unrolling
multiply5 - Fastest version, based on Intel MKL. Link the Intel MKL library to the project and add the USE_MKL macro to the compiler preprocessor definitions.
The availability of multiply kernels depends on the threading model you choose.
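The difference between a basic kernel and a loop-interchanged one, and the idea of selecting a kernel through the MULTIPLY macro, can be sketched as below. The kernel names sketch_basic and sketch_interchanged, their signatures, and the flat row-major layout are hypothetical and chosen for illustration; the real kernels in the sample may differ.

```c
#include <assert.h>

/* i-j-k order: the innermost loop walks down a column of b, so
   consecutive accesses to b are n elements apart in memory.
   The caller must zero c before the call. */
void sketch_basic(int n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* i-k-j order (loop interchange): the innermost loop walks along rows
   of b and c with unit stride, which is friendlier to the cache and
   easier for a compiler to vectorize. Same result as sketch_basic.
   The caller must zero c before the call. */
void sketch_interchanged(int n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Selecting a kernel at compile time through a macro; changing this
   definition and rebuilding switches the kernel every caller uses. */
#define MULTIPLY sketch_interchanged
```

Both variants perform the same arithmetic; only the memory access order changes, which is the kind of difference a hardware-level profile is meant to expose.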
Hardware:
For the most up-to-date hardware requirements, see the Intel® VTune™ Amplifier release notes.
Software:
Use the matrix.sln solution in the vc12 subdirectory. If necessary, convert the solution and project for a later version of the Microsoft Visual Studio* IDE. The solution is configured to use the Intel compiler by default.
Detailed build instructions are available in the tutorial listed below.
The matrix sample for the Intel® VTune™ Amplifier is accompanied by the following tutorial:
Identifying Hardware Issues: Identify the hardware-related issues in your application such as data sharing, cache misses, branch misprediction, and others.
Intel, the Intel logo, VTune and Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.
OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission from Khronos.
© 2017, Intel Corporation