==========
Benchmarks
==========
System benchmarks.
Top500
======
In what year would this be the world's fastest computer?
`<https://top500.org>`_
Of the 500 fastest computers in the world, 500 of them run the Linux kernel.
The Top500 site lists the distributions these clusters use, but does not
really summarize them. For instance, in the "Operating System" category for
the November 2023 list at the URL below, RHEL, Ubuntu, etc. appear in many
different versions. Plain "Linux" is listed for 45% of the clusters.
`<https://top500.org/statistics/list/>`_
By my rough summary, of the 500 machines on the November 2023 list,
272 have a known distro. Broken down into major distro categories,
that is roughly:

* RHEL: 56%, 152 systems.
* SLES: 29%, 79 systems.
* Ubuntu: 15%, 41 systems.

Although Ubuntu is a Debian derivative, none of the systems listed Debian.
There were no Arch Linux, Gentoo, or similar distros listed.
Of the RHEL clones, Rocky Linux appears to be ascendant.

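
The percentages above follow directly from the system counts. A minimal
sketch of that arithmetic, assuming the rough tallies quoted above; the
function name ``distro_share`` is only illustrative:

.. code-block:: sh

   # Hypothetical helper: share of each distro among systems with a known distro.
   distro_share() {
       # Counts are the rough tallies quoted above (November 2023 list).
       printf '%s\n' 'RHEL 152' 'SLES 79' 'Ubuntu 41' |
       awk '{ name[NR] = $1; count[NR] = $2; total += $2 }
            END {
                for (i = 1; i <= NR; i++)
                    printf "%s: %d systems, %.0f%%\n", name[i], count[i], 100 * count[i] / total
                printf "known distro total: %d\n", total
            }'
   }
   distro_share
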
Linpack
-------
The Linpack TPP benchmark "measures the floating point rate of execution for solving a linear
system of equations."
`<https://www.netlib.org/benchmark/hpl>`_
rocHPL
^^^^^^
There is a ROCm-optimized version of HPL.

`<https://github.com/ROCm/rocHPL>`_

* It does not appear to have been updated for ROCm release 6.0.2; the ``gfx1100`` target isn't listed.
* Depends on ``roctracer`` and ``roctx``.
* May need MPI recompiled for GPU support.
* OpenMP may be needed too (if not here, elsewhere).

DGEMM
-----
DGEMM "measures the floating point rate of execution of double precision real matrix-matrix multiplication."
STREAM
------
STREAM is "a simple synthetic benchmark program that measures sustainable memory bandwidth (in GB/s)
and the corresponding computation rate for simple vector kernel."
`<https://www.cs.virginia.edu/stream>`_
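
As a rough illustration of how a bandwidth figure is derived: the Triad
kernel (``a[i] = b[i] + q*c[i]``) touches three 8-byte doubles per element,
so it is counted as moving 24 bytes per iteration. The sketch below applies
that to a made-up timing; the function name, array length, and time are
illustrative assumptions, not measurements.

.. code-block:: sh

   # Hypothetical helper: Triad bandwidth in GB/s from array length and seconds.
   # Assumes 24 bytes moved per element (two 8-byte reads, one 8-byte write).
   triad_gbs() {
       awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f GB/s\n", 24 * n / t / 1e9 }'
   }
   # 10,000,000 elements in 0.02 s (made-up timing, not a measurement)
   triad_gbs 10000000 0.02
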
PTRANS
------
PTRANS (parallel matrix transpose) "exercises the communications where pairs of processors
communicate with each other simultaneously. It is a useful test of the total communications
capacity of the network."
`<https://www.netlib.org/parkbench/html/matrix-kernels.html>`_
RandomAccess
------------
"RandomAccess measures the rate of integer random updates of memory (GUPS)."
`<https://hpcchallenge.org/projectsfiles/hpcc/RandomAccess.html>`_
FFT
---
"FFT measures the floating point rate of execution of double precision complex
one-dimensional Discrete Fourier Transform (DFT)."
`<http://www.ffte.jp>`_
Communication Bandwidth and Latency
-----------------------------------
Communication bandwidth and latency is "a set of tests to measure latency and bandwidth of a
number of simultaneous communication patterns; based on b_eff (effective bandwidth benchmark)."
`<https://fs.hlrs.de/projects/par/mpi/b_eff>`_
``hpcc``
--------
HPC Challenge benchmarks.
`<https://hpcchallenge.org/hpcc>`_
The HPC Challenge benchmarks are in the Debian ``hpcc`` package.

.. code-block:: sh

   # Copy the packaged example input to hpccinf.txt, the name hpcc reads.
   cp -p /usr/share/doc/hpcc/examples/_hpccinf.txt hpccinf.txt
   hpcc

See the Output section of this documentation for benchmark results.

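
A minimal sketch for pulling a few headline numbers out of a finished run.
It assumes that ``hpcc`` wrote its results to ``hpccoutf.txt`` in the working
directory and that the file contains summary lines of the form ``key=value``;
the function name is illustrative and the key names are assumptions to check
against your own output file.

.. code-block:: sh

   # Hypothetical helper: print selected key=value lines from an hpcc output file.
   # The key names below are assumptions; verify them against your own file.
   hpcc_summary() {
       grep -E '^(HPL_Tflops|StarDGEMM_Gflops|StarSTREAM_Triad|PTRANS_GBs|MPIRandomAccess_GUPs|MPIFFT_Gflops)=' "${1:-hpccoutf.txt}"
   }
   # Only run when an output file is present.
   if [ -f hpccoutf.txt ]; then
       hpcc_summary hpccoutf.txt
   fi
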
tinygrad
========
Benchmarks in tinygrad.
mlnotcommons
------------
Proprietary with a few libre datasets and benchmarks available.
Don't let "Commons" in the name lead you to think this is available to the mere public.
Lots of proprietary bits involved, closed lists, corporate signups and signatures, etc.
Their use of "Commons" in their name perhaps causes confusion in the marketplace
with Wikimedia Commons (and other groups that serve the public).
This isn't like Wikimedia Commons at all.
The upstream tinycorp is working on implementing some of their benchmarks using
``tinygrad`` and AMD GPUs.
`<https://mlcommons.org/datasets>`_
`<https://mlcommons.org/benchmarks>`_
`<https://github.com/mlcommons>`_
Phoronix Test Suite
===================
Phoronix test suite:
`<https://github.com/phoronix-test-suite/phoronix-test-suite/>`_
`<https://www.phoronix-test-suite.com/>`_

.. code-block:: sh

   git clone https://github.com/phoronix-test-suite/phoronix-test-suite/
   cd phoronix-test-suite/
   apt install php-cli php-xml
   ./phoronix-test-suite list-missing-dependencies
   ./phoronix-test-suite list-tests

Meh, this automatically installs dependencies and builds, but doesn't use ROCm.

ROCm
====
Benchmarks optimized for ROCm.
HPL
---
HPL for ROCm from AMD.

.. code-block:: sh

   git clone https://github.com/ROCm/rocHPL
   cd rocHPL/
   # git checkout v6.0.0   # build fails on Ubuntu
   ./install.sh
   # ./build/bin/rochpl --input ./build/rocHPL/HPL.dat

   # 1 GPU (works, then fails on subsequent runs)
   ./mpirun_rochpl -P 1 -Q 1 -N 45056 --NB 384
   # Output excerpt:
   #   Node Binding: Process 0 [(p,q)=(0,0)] CPU Cores: 64 - {0-63}
   #   GPU Binding: Process 0 [(p,q)=(0,0)] GPU: 0, pciBusID c3
   #   Local matrix size = 15.1361 GBs

   # Larger process grids (P x Q = 2, 4, and 8)
   ./mpirun_rochpl -P 1 -Q 2 -N 64000 --NB 384
   ./mpirun_rochpl -P 2 -Q 2 -N 90112 --NB 384
   ./mpirun_rochpl -P 2 -Q 4 -N 128000 --NB 384

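
The problem sizes above appear to grow with the process count so that each
process holds about the same share of the matrix. A rough check, assuming the
N x N matrix of 8-byte doubles is split evenly across the P x Q grid. It
ignores any extra workspace, so it comes out near, but not identical to, the
15.1361 figure reported above. The function name is illustrative.

.. code-block:: sh

   # Hypothetical helper: approximate per-process matrix size in GiB for an HPL run.
   # Assumes N*N doubles (8 bytes each) split evenly over P*Q processes.
   hpl_local_gib() {
       awk -v p="$1" -v q="$2" -v n="$3" \
           'BEGIN { printf "P=%d Q=%d N=%d: %.3f GiB per process\n", p, q, n, n * n * 8 / (p * q) / 2^30 }'
   }
   hpl_local_gib 1 1 45056
   hpl_local_gib 1 2 64000
   hpl_local_gib 2 2 90112
   hpl_local_gib 2 4 128000
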
HPCG
----
HPCG for ROCm.

.. code-block:: sh

   git clone https://github.com/ROCm/rocHPCG
   cd rocHPCG/
   ./install.sh