ROCm 6.1.1 build scripts for Debian

Jeff Moe 2024-05-29 12:22:28 -06:00
parent ca21ec983b
commit 0523fe705b
19 changed files with 2280 additions and 8004 deletions

View file

@ -20,5 +20,5 @@ cmake -B build -G Ninja \
-DROCM_DIR=/opt/rocm
ninja -C build package
sudo dpkg -i build/amd-smi-lib_23.4.2.99999-local_amd64.deb \
build/amd-smi-lib-tests_23.4.2.99999-local_amd64.deb
sudo dpkg -i build/amd-smi-lib_24.5.1.99999-local_amd64.deb \
build/amd-smi-lib-tests_24.5.1.99999-local_amd64.deb

View file

@ -27,7 +27,7 @@ cmake -B build -G Ninja \
-DROCM_PATH=/opt/rocm \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DHIPCC_BIN_DIR=/opt/rocm/bin \
-DHIPCC_BIN_DIR=/usr/bin \
-DCLR_BUILD_HIP=ON \
-DUSE_PROF_API=OFF \
-DBUILD_TESTING=OFF \

View file

@ -1,6 +1,4 @@
git clone https://github.com/ROCm/ROCm-CompilerSupport
cd ROCm-CompilerSupport/lib/comgr
git checkout rocm-6.1.1
cd llvm-project/amd/comgr
rm -rf build
cmake -B build -G Ninja \
-DAMDDeviceLibs_DIR=/usr/lib/cmake/AMDDeviceLibs \
@ -15,11 +13,11 @@ cmake -B build -G Ninja \
-DCPACK_GENERATOR=DEB \
-DCPACK_SOURCE_TBZ2=OFF \
-DCPACK_SOURCE_TGZ=OFF \
-DCPACK_SOURCE_TXZ=OFF \
-DCPACK_SOURCE_TZ=OFF \
-DROCM_CCACHE_BUILD=ON \
-DROCM_DIR=/opt/rocm \
-Dhip_DIR=/opt/rocm/share/rocm/cmake
-Dhip_DIR=/home/jebba/devel/ROCm/hip
#-Dhip_DIR=/opt/rocm/share/rocm/cmake
ninja -C build package
sudo dpkg -i build/comgr_2.6.0.99999-local_amd64.deb
sudo dpkg -i build/comgr_2.7.0.99999-local_amd64.deb

View file

@ -1,25 +1,21 @@
git clone https://github.com/ROCm/HIPCC hipcc
cd hipcc/
git checkout rocm-6.1.1
cd llvm-project/amd/hipcc
sed -i -e 's/, hip-dev, rocm-core, rocm-llvm//g' CMakeLists.txt
rm -rf build
cmake -B build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_INSTALL_PREFIX=/opt/rocm \
-DCMAKE_PREFIX_PATH=/opt/rocm/ \
-DCPACK_BINARY_DEB=ON \
-DCPACK_BINARY_STGZ=OFF \
-DCPACK_BINARY_TGZ=OFF \
-DCPACK_BINARY_TZ=OFF \
-DCPACK_GENERATOR=DEB \
-DCPACK_PACKAGING_INSTALL_PREFIX=/opt/rocm \
-DCPACK_SOURCE_TBZ2=OFF \
-DCPACK_SOURCE_TGZ=OFF \
-DCPACK_SOURCE_TXZ=OFF \
-DCPACK_SOURCE_TZ=OFF \
-DHIPCC_BACKWARD_COMPATIBILITY=ON
-DROCM_DIR=/opt/rocm
ninja -C build package
sudo dpkg -i build/hipcc_1.0.0.99999-local_amd64.deb
# It won't honor install paths, so cruft:
sudo ln -s /usr/hip/bin/hip* /opt/rocm/bin/
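
Review note: the `ln -s` workaround above fails with "File exists" if the script is run a second time. A minimal sketch of a re-runnable variant follows; the function name `link_hip_tools` is hypothetical and not part of this commit, and it assumes the tools really land in `/usr/hip/bin` as the script says.

```shell
# Hypothetical helper (not part of this commit): link every hip* entry
# from src_dir into dst_dir, replacing links left by an earlier run.
link_hip_tools() {
    src_dir=$1
    dst_dir=$2
    for tool in "$src_dir"/hip*; do
        # If the glob matched nothing, the literal pattern is left; skip it.
        [ -e "$tool" ] || continue
        # -f replaces an existing link instead of failing.
        ln -sf "$tool" "$dst_dir/$(basename "$tool")"
    done
}
```

Run as root, the equivalent of the line in the script would be `link_hip_tools /usr/hip/bin /opt/rocm/bin`.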

View file

@ -43,8 +43,3 @@ cmake -S llvm -B build -G Ninja \
ninja -C build package
sudo dpkg -i build/llvm_17.0.0git_amd64.deb
sudo apt-mark hold llvm
# LLVM_BUILD_LLVM_DYLIB
# LLVM_BUILD_EXTERNAL_COMPILER_RT
# LLVM_ENABLE_RTTI
# LLVM_OPTIMIZED_TABLEGEN
# LLVM_TOOL_LLVM_DRIVER_BUILD

View file

@ -1,6 +1,11 @@
git clone https://github.com/ROCm/rocBLAS
cd rocBLAS/
git checkout rocm-6.1.1
# Broken Tensile:
#git checkout rocm-6.1.1
# Broken Tensile:
#git checkout remotes/origin/release/rocm-rel-6.1
# test
git checkout develop
deactivate
rm -rf venv
virtualenv venv
@ -26,8 +31,25 @@ cmake -B build -G Ninja \
-DCPACK_SOURCE_TGZ=OFF \
-DCPACK_SOURCE_TXZ=OFF \
-DCPACK_SOURCE_TZ=OFF \
-DROCM_DEP_ROCMCORE=OFF
-DROCM_DEP_ROCMCORE=OFF \
-DTENSILE_USE_LLVM=ON \
-DTensile_COMPILER=hipcc
ninja -C build package
sudo dpkg -i build/rocblas-dev_4.0.0-88df9726~dirty_amd64.deb \
build/rocblas_4.0.0-88df9726~dirty_amd64.deb
# worked in 6.0.2, tensile fails in 6.1.1
-DTENSILE_USE_LLVM=ON \
-DTensile_COMPILER=hipcc
# XXX fails, with, without
-DAMDGPU_TARGETS=gfx1100 \
-DTENSILE_GPU_ARCHS=gfx1100
-DTENSILE_USE_HIP=OFF \
-DTENSILE_USE_LLVM=ON
# last resort, if it won't build:
-DBUILD_WITH_TENSILE=OFF
-DTensile_COMPILER= ?

View file

@ -16,5 +16,5 @@ cmake -B build -G Ninja \
-DROCM_DEP_ROCMCORE=OFF
ninja -C build package
sudo dpkg -i build/rocm-cmake_0.11.0-5a34e72_amd64.deb
sudo dpkg -i build/rocm-cmake_0.12.0-f6fcfe7_amd64.deb
sudo apt-mark hold rocm-cmake

View file

@ -19,4 +19,4 @@ cmake -B build -G Ninja \
-DROCM_VERSION=6.1.1
ninja -C build package
sudo dpkg -i build/rocm-core_6.1.1.6.1.1-local_amd64.deb
sudo dpkg -i build/rocm-core_6.1.1.60101-local_amd64.deb

View file

@ -3,6 +3,8 @@ rm -rf rocm_smi_lib
git clone https://github.com/ROCm/rocm_smi_lib
cd rocm_smi_lib/
git checkout rocm-6.1.1
# XXX, looks for wrong version of library. Kludge.
sed -i -e 's/librocm_smi64.so.7/librocm_smi64.so.2/g' python_smi_tools/rsmiBindings.py
rm -rf build
cmake -B build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
@ -19,3 +21,6 @@ cmake -B build -G Ninja \
ninja -C build package
sudo dpkg -i build/rocm-smi-lib_2.8.0.99999-local_amd64.deb
exit
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER=clang \
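
Review note: the `sed` kludge above hardcodes `librocm_smi64.so.2`. A sketch of how the major soname could be looked up instead of assumed is below; `find_smi_sonames` is a hypothetical name, and the directory to scan (for example `/opt/rocm/lib`) is an assumption, since this commit does not say where the library ends up.

```shell
# Hypothetical helper (not part of this commit): print the major-version
# sonames (librocm_smi64.so.N) found in a directory, one per line.
find_smi_sonames() {
    lib_dir=$1
    for f in "$lib_dir"/librocm_smi64.so.[0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        basename "$f"
    done | grep -E '^librocm_smi64\.so\.[0-9]+$' | sort -u
}
```

The printed name could then be substituted into the `sed` expression in place of the literal `.so.2`.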

View file

@ -20,4 +20,4 @@ cmake -B build -G Ninja \
-DCPACK_SOURCE_TZ=OFF
ninja -C build package
sudo dpkg -i build/rocprim-dev_3.0.0-c8297d68~dirty_amd64.deb
sudo dpkg -i build/rocprim-dev_3.1.0-85253f87~dirty_amd64.deb

View file

@ -21,5 +21,5 @@ cmake -B build -G Ninja \
-DROCM_CCACHE_BUILD=ON
ninja -C build package
sudo dpkg -i build/hsa-rocr_1.12.0-local_amd64.deb \
build/hsa-rocr-dev_1.12.0-local_amd64.deb
sudo dpkg -i build/hsa-rocr_1.13.0-local_amd64.deb \
build/hsa-rocr-dev_1.13.0-local_amd64.deb

View file

@ -21,5 +21,5 @@ cmake -B build -G Ninja \
-DCPACK_SOURCE_TZ=OFF
ninja -C build package
sudo dpkg -i build/rocsparse-dev_3.0.2-1c5d839c~dirty_amd64.deb \
build/rocsparse_3.0.2-1c5d839c~dirty_amd64.deb
sudo dpkg -i build/rocsparse-dev_3.1.2-edb27708~dirty_amd64.deb \
build/rocsparse_3.1.2-edb27708~dirty_amd64.deb
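
Review note: most hunks in this commit only bump the version embedded in the `.deb` file name passed to `dpkg -i`. A sketch of selecting the packages by name instead is below. `built_debs` is a hypothetical helper, and it assumes `build/` holds a single version of each package (several of these scripts run `rm -rf build` first).

```shell
# Hypothetical helper (not part of this commit): print the .deb files in
# build_dir whose names start with one of the given package names followed
# by "_", matching Debian's name_version_arch.deb layout.
built_debs() {
    build_dir=$1
    shift
    for pkg in "$@"; do
        for deb in "$build_dir/${pkg}"_*.deb; do
            # Skip the unexpanded pattern when no package was built.
            [ -e "$deb" ] || continue
            printf '%s\n' "$deb"
        done
    done
}
```

Usage would look like `sudo dpkg -i $(built_debs build rocsparse-dev rocsparse)`. The trailing `_` in the pattern keeps `rocsparse` from also matching `rocsparse-dev`.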

View file

@ -0,0 +1,361 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2023, 2024 Jeff Moe
# This file is distributed under the same license as the tinyrocs: Direct to
# Chip Liquid Cooled GPU AI Cluster package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: tinyrocs: Direct to Chip Liquid Cooled GPU AI Cluster"
" 0\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-29 09:58-0600\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: en\n"
"Language-Team: en <LL@li.org>\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.15.0\n"
#: ../../../_source/toolchain-6.1.1.rst:3
msgid "Debian ROCm Toolchain"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:4
msgid ""
"HOWTO rebuld all of ROCm for the ``6.1.1`` release on Debian stable "
"(12/bookworm)."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:7
msgid "The main GPU toolchain is built from AMD upstream ROCm sources."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:9
msgid ""
"A build of all the sources requires building LLVM (``clang``) and the "
"various ROCm libraries and applications."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:12
msgid ""
"There is a bit of a chicken and egg problem with getting the libraries "
"and compilers built."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:15
msgid ""
"AMD's ROCm source repositories contain most of the toolchain software "
"that needs to be built. Note, there are many equivalent packages in "
"Debian's main free repository, but they are older versions than needed "
"(including in unstable sid)."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:20
msgid ""
"The ``amdgpu`` module is in the Linux kernel source, not part of this "
"toolchain. See \"Operating System\" about building the ``amdgpu`` Linux "
"kernel module. This toolchain isn't used to build the Linux kernel "
"module. The stock Debian gcc is used for the kernel module."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:28
msgid "Uninstall"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:29
msgid ""
"Starting from a \"clean slate\" Perhaps blow out all the old files and "
"start from scratch..."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:41
msgid "LLVM"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:42
msgid ""
"First pass of LLVM build. Doesn't include \"everything\" because it needs"
" other dependencies built with itself first."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:50
msgid "hip"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:51
msgid ""
"This just sets up the HIP headers for clr to use later, it doesn't "
"actually build anything."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:59
msgid "rocm-core"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:60
msgid "Build ``rocm-core``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:67
msgid "rocm-cmake"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:68
msgid "Build ``rocm-cmake``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:75
msgid "amd-smi"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:76
msgid "Build ``amd-smi``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:83
msgid "roct-thunk-interface"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:84
msgid ""
"This needs a patchlet or other applications (e.g. ``rocminfo``) won't be "
"able to build. Just needs a one-liner:"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:90
msgid "Build ``roct-thunk-interface``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:97
msgid "rocm-device-libs"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:98
msgid "Build ``rocm-device-libs``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:100
msgid ""
"Using the deprecated rocm-device-libs repository, as it is what is used "
"for release ``6.1.1``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:106
msgid ""
"In later releases, this package may be built under the ``llvm-"
"project/amd`` directory."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:111
msgid "rocr-runtime"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:112
msgid "Build ``rocr-runtime``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:114
msgid ""
"This has an option for ``TARGET_DEVICES``. By default all targets are "
"built. This adds a *lot* of time to the build for devices that won't be "
"used. But if they aren't included, other packages further down the "
"toolchain may complain, so include them all for now."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:119
msgid "List of possible targets:"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:121
msgid ""
"``gfx700;gfx701;gfx702;gfx801;gfx802;gfx803;gfx805;gfx810;gfx900;gfx902;gfx904;gfx906;gfx908;gfx909;gfx90a;gfx90c;gfx940;gfx941;gfx942;"
" "
"gfx1010;gfx1011;gfx1012;gfx1013;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1100;gfx1101;gfx1102;gfx1103``"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:124
msgid "The AMD Radeon 7900 XTX target is ``gfx1100``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:129
msgid ""
"For some reason, this is installing headers to ``/usr/hsa`` instead of "
"``/opt/rocm``. It is ignoring the ``PREFIX``. Workaround..."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:134
msgid "hipcc"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:135
msgid "hipcc per upstream:"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:137
msgid ""
"HIPCC has moved! This project is now located in the AMD Fork of the LLVM "
"Project, under the \"amd/hipcc\" directory. This repository is now read-"
"only."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:142
msgid "So use the llvm-project repo built above to build hipcc..."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:149
msgid "rocminfo"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:150
msgid "``rocminfo``"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:152
msgid ""
"This ignores ``CMAKE_INSTALL_PREFIX=/opt/rocm`` and installs to "
"``/usr/bin`` which causes other applications to fail."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:155
msgid ""
"So cruftily add some symlinks at the end to appease these applications "
"looking in the wrong spot."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:163
msgid "rocm-bandwidth-test"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:164
msgid "``rocm-bandwidth-test``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:171
msgid "comgr"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:172
msgid "AKA ``ROCm-CompilerSupport``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:174
msgid "Build ``comgr``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:176
msgid ""
"This is another that in latest HEAD uses ``llvm-project/amd/`` directory,"
" similar to ``hipcc``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:184
msgid "clr"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:185
msgid "OpenCL and more."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:192
msgid "rocBLAS"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:193
msgid "rocBLAS."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:195
msgid ""
"This builds for many targets, which takes ~155 minutes to build, but only"
" ``gfx1100`` is really needed."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:198
msgid ""
"The cmake build scripts download Tensile from git. Not sure if it uses a "
"specific tag, or is a moving build..."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:204
msgid "Other options, maybe:"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:215
msgid ""
"XXX ? Fortran can't be built because flang can't be built in the LLVM "
"build first pass. If LLVM is rebuilt with flang, then maybe Fortran can "
"be enabled."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:221
msgid "rocprim"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:222
msgid "``rocprim``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:229
msgid "rocsparse"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:230
msgid "``rocsparse``."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:232
msgid "Has ``GPU_TARGETS`` option, all are built."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:239
msgid "rocsolver"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:240
msgid "``rocsolver`` for hipBLAS etc. This takes hour++ to build."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:243
msgid "Has option: ``-DAMDGPU_TARGETS=gfx1100``, but builds for all targets."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:250
msgid "hipBLAS"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:251
msgid "``hipBLAS`` plz."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:268
msgid "OpenMP"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:269
msgid ""
"OpenMP can be built as a part of LLVM, but it fails in a first pass "
"build. It can be built (perhaps) a rebuild of LLVM with LLVm."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:272
msgid "OpenMP repos, check rebuild."
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:276
msgid "OpenMPI"
msgstr ""
#: ../../../_source/toolchain-6.1.1.rst:277
msgid ""
"OpenMPI, not to be confused with OpenMP. They can be used independently "
"of each other, or used together. Briefly, OpenMP parallelizes across one "
"machine (GPUs/CPUs), and OpenMPI parallelizes across multiple machines "
"(network). Both rebuilt for ROCm is ideal. Then applications need to be "
"built against them."
msgstr ""

View file

@ -93,17 +93,19 @@ Build ``roct-thunk-interface``.
:language: bash
device-libs
-----------
Build ``device-libs``.
rocm-device-libs
----------------
Build ``rocm-device-libs``.
Using the deprecated device-libs repository, as it is what is used
for release ``6.1.1``. In later releases, this package is built
under the ``llvm-project/amd`` directory.
Using the deprecated rocm-device-libs repository, as it is what is used
for release ``6.1.1``.
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-device-libs.sh
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-rocm-device-libs.sh
:language: bash
In later releases, this package may be built under the
``llvm-project/amd`` directory.
rocr-runtime
------------
@ -130,7 +132,14 @@ For some reason, this is installing headers to ``/usr/hsa`` instead of
hipcc
-----
hipcc built under clr. This seems better.
hipcc per upstream:
HIPCC has moved!
This project is now located in the AMD Fork of the LLVM Project, under the "amd/hipcc" directory.
This repository is now read-only.
So use the llvm-project repo built above to build hipcc...
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-hipcc.sh
:language: bash
@ -165,14 +174,7 @@ AKA ``ROCm-CompilerSupport``.
Build ``comgr``.
This is another that in latest HEAD uses ``llvm-project/amd/`` directory,
but in ``6.1.1`` this isn't available. So use the ``rocm-6.1.1`` tag,
like the others in this build.
The first time this is built, it has a non-fatal ``hip_DIR-NOTFOUND``
in cmake. This is because clr needs to be built. But
clr can't be build without comgr first. So build comgr, then
clr, then rebuild comgr so it finds the HIP directory in the
second build...
similar to ``hipcc``.
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-comgr.sh
:language: bash
@ -182,35 +184,16 @@ clr
---
OpenCL and more.
Note HSAIL needs to be disabled or the build fails.
``-DROCCLR_ENABLE_HSAIL=OFF``
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-clr.sh
:language: bash
comgr Second Pass
-----------------
Re-build ``comgr`` now that clr is built, so the HIP directory is found.
The build commands are the same in both builds.
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-comgr.sh
:language: bash
comgr installs to ``/usr/lib``. Some apps expect it elsewhere.
Cruft XXX workaround:
.. code-block:: sh
sudo ln -s /usr/lib/libamd_comgr.so* /opt/rocm/lib/
rocBLAS
-------
rocBLAS.
Fails to build for rocm-6.1.1. Tensile error.
This builds for many targets, which takes ~155 minutes to build, but only
``gfx1100`` is really needed.
@ -257,6 +240,10 @@ Has ``GPU_TARGETS`` option, all are built.
rocsolver
---------
``rocsolver`` for hipBLAS etc.
Fails because rocBLAS fails above.
This takes hour++ to build.
Has option: ``-DAMDGPU_TARGETS=gfx1100``, but builds for all targets.
@ -300,3 +287,11 @@ and OpenMPI parallelizes across multiple machines (network).
Both rebuilt for ROCm is ideal. Then applications need to be
built against them.
rocm_smi_lib
------------
``rocm_smi_lib`` plz.
.. literalinclude:: _static/toolchain/rocm-6.1.1/build-rocm_smi_lib.sh
:language: bash

File diff suppressed because it is too large Load diff