Addons/math/mt/Benchmarking
Objective
Benchmark the GEMM, TRSM and GESV methods provided by J primitives, the mt addon, and the BLIS and OpenBLAS library wrappers, in both single-threaded and multi-threaded environments.
Preparation
Prepare the external libraries that provide the methods to compare:
user@host:~/lib> ls -1
libblis_threads=1.so
libblis_threads=n.so
libopenblas_threads=1.so
libopenblas_threads=n.so
By hand
To estimate performance, the raw data from a test log can be used, e.g.:
load 'math/mt'
mkmat=. _1 1 0 3 _6 4&gemat_mt_
log=. mkmat testbasicmm_mt_ 2 # 1000
'sts tms'=. 0 4 { log
In the snippet above, various matrix-multiply methods are tested on random floating-point 1000×1000 matrices in a single-threaded environment. The executed sentences are saved in the rank-2 character array sts (one sentence per row), and the estimated execution durations are saved in the vector tms (one atom per sentence):
sts ; ,. tms
+-----------------+--------+
|(+/ .*)          |0.000345|
|mp               |0.000346|
|dgemmnn_mtbla_   |0.003074|
|...              |...     |
+-----------------+--------+
See the mt.ijs file for the log format. The execution duration of each sentence is estimated as proposed in [1]: "the minimum run-time of 3-5 executions of the program when the machine is lightly loaded".
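The estimation rule quoted from [1] can be sketched directly in J. This is a minimal illustration, not the code the mt addon itself uses: mintime and A are hypothetical names introduced here, and the sketch relies on the timing foreign 6!:2, which executes a sentence given as a string and returns the elapsed time in seconds.

```j
NB. mintime is a hypothetical helper, not part of the mt addon.
NB. x: number of executions (e.g. 3 to 5)
NB. y: sentence to time, given as a string
NB. 6!:2 executes one sentence and returns its elapsed time in seconds;
NB. the sentence is repeated x times and the smallest timing is kept.
mintime=: 4 : '<./ 6!:2"1 (x # ,: y)'

A=: ? 200 200 $ 0             NB. random float matrix (global, so the timed sentence can see it)
t=. 5 mintime 'A +/ .* A'     NB. minimum of 5 timings of a matrix product
```

Global assignment (=:) is used for A because the timed sentence is executed inside mintime, where names private to the caller are not visible.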
Given the problem sizes and the measured execution durations, other indicators can be computed, e.g. FLOPS or "duration per value".
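As a worked illustration of such derived indicators, the sketch below converts durations into GFLOPS and into seconds per matrix element. It assumes the conventional count of about 2*n^3 floating-point operations for a product of two n*n matrices. The durations are hypothetical placeholders, not measured values, and the names gflops and perval are introduced here only for illustration.

```j
n=. 1000                          NB. matrix order
tms=. 0.25 0.5 2                  NB. hypothetical durations in seconds, NOT measured values

NB. GEMM on n*n operands: about 2*n^3 floating-point operations (assumed count)
gflops=. (2 * n ^ 3) % 1e9 * tms  NB. 8 4 1

NB. "duration per value": seconds spent per element of the n*n result
perval=. tms % n * n
```

For TRSM and GESV the operation count differs, so the numerator has to be replaced accordingly.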
By customized script
A specialized script, however, makes benchmarking much simpler and more convenient. Save the script File:Bmk.ijs as ~temp/bmk.ijs and run it:
user@host:~/j9.6> ./jconsole.sh
load '~temp/bmk.ijs'
nn=. 100 liso4dhs_mt_ 100 60 NB. repeat for n=100..6000 with step 100
bmk_mttmp_ nn
... (output is skipped)
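Going by the comment in the session above, nn holds the matrix sizes 100, 200, ..., 6000. For readers unfamiliar with liso4dhs_mt_, the same vector can be sketched with J primitives alone. That the two results are identical is an assumption based on that comment, not on the addon's source.

```j
NB. 60 sizes starting at 100 with step 100, built from primitives only
NB. (assumed to match  100 liso4dhs_mt_ 100 60  per the comment above)
nn=. 100 * >: i. 60
(# , {. , {:) nn        NB. count, first and last size: 60 100 6000
```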
Running the script creates 6 text files with numeric data (3 matrix methods × 2 thread modes, single and multi) and 6 corresponding graph files (.pdf when run within jconsole, or .png when run within Qt Jconsole):
user@host:~/j-user/temp> ls -1 bmk_*
bmk_GEMM_threads=1.dat
bmk_GEMM_threads=1.pdf
bmk_GEMM_threads=n.dat
bmk_GEMM_threads=n.pdf
bmk_GESV_threads=1.dat
bmk_GESV_threads=1.pdf
bmk_GESV_threads=n.dat
bmk_GESV_threads=n.pdf
bmk_TRSM_threads=1.dat
bmk_TRSM_threads=1.pdf
bmk_TRSM_threads=n.dat
bmk_TRSM_threads=n.pdf
References
- [1] Magne Haveraaen, Hogne Hundvebakke. Some Statistical Performance Estimation Techniques for Dynamic Machines. In: Weihai Yu et al. (eds.): Norsk Informatikk-konferanse 2001, Tapir, Trondheim, Norway, 2001, pp. 176-185. URL: https://www.ii.uib.no/saga/papers/perfor-5d.pdf