======
Ubuntu
======
Ubuntu is a GNU/Linux distribution downstream from Debian
with proprietary bits added. It is used in many Top500 clusters.
It is used by tinygrad on the tinybox.

``_

Documentation
=============
Ubuntu server docs.

``_

Install
=======
Get Ubuntu and install.
The upstream tinybox runs Ubuntu 22.04 LTS. Run that, perhaps.

``_

``_

Write the image to a USB drive. Make sure the device is correct,
since ``dd`` overwrites whatever it is pointed at.

.. code-block:: sh

   sudo dd if=ubuntu-22.04.3-live-server-amd64.iso of=/dev/sdXX bs=16M status=progress oflag=sync

Configuration
=============
Set up, perhaps like so:

* ssh keys.

Packages
========
Update and install new packages from Ubuntu repos.

.. code-block:: sh

   # Use IPv4 for apt
   echo 'Acquire::ForceIPv4 "true";' | sudo tee /etc/apt/apt.conf.d/99force-ipv4
   # Set up apt-cache
   echo 'Acquire::http::Proxy "http://192.168.1.1:3142";' | sudo tee /etc/apt/apt.conf.d/90cache
   # Switch https sources to http so they go through the cache
   sudo sed -i -e 's/https:/http:/g' /etc/apt/sources.list.d/*.list

   sudo apt update
   sudo apt dist-upgrade
   sudo apt install bc bison build-essential ccache cmake-curses-gui colordiff \
       cpufrequtils devscripts dpkg-dev equivs flex gfortran git haveged host \
       libbz2-dev libdrm-dev libedit-dev libegl1-mesa-dev libelf-dev libffi-dev \
       libhdf5-openmpi-dev liblzma-dev libncurses-dev libnuma-dev \
       libopenmpi-dev libpomp2-dev libsqlite3-dev libssl-dev libsystemd-dev \
       libudev-dev libxml2-dev libxml2-utils libz3-dev libzstd-dev lshw \
       lzma-dev mesa-common-dev net-tools ninja-build nlohmann-json3-dev \
       ntpsec-ntpdate nvme-cli ocl-icd-opencl-dev openmpi-bin pahole pkg-config \
       portaudio19-dev python3-argcomplete python3-pip python3-pygments \
       python3-venv python3-virtualenv python3-yaml quilt rsync rsyslog sshfs \
       sudo swig traceroute vim xxd python3-sphinx git-lfs hwdata \
       lua5.3 liblua5.3-dev libmpfr-dev libmsgpack-dev libfmt-dev \
       environment-modules python3-numpy pybind11-dev libopengl-dev zip zsh \
       hpcc gawk googletest libdw-dev libgtest-dev libsigsegv2 \
       libbabeltrace-dev libbabeltrace1 libbison-dev libncurses5-dev \
       libtext-unidecode-perl tex-common texinfo ucx-utils libucx-dev \
       librdmacm-dev

OS Configuration
----------------
Operating system configuration.

.. code-block:: sh

   # Lazy: passwordless sudo for the sudo group
   sudo sed -i -e 's/%sudo\tALL=(ALL:ALL) ALL/%sudo ALL=(ALL) NOPASSWD: ALL/g' /etc/sudoers

   # After all packages are installed, add the user to groups:
   sudo adduser debian audio
   sudo adduser debian dialout
   sudo adduser debian kvm
   sudo adduser debian render
   sudo adduser debian video

   # Disable various startup packages
   sudo systemctl disable XXX

User Configuration
==================
Set up the user account.
Configure it to use the caching services already available in the cluster.

ccache
------
There is a ``redis`` ``ccache`` server on the tinyrocs network.
Edit ``~/.config/ccache/ccache.conf`` as follows:

.. code-block::

   remote_storage = redis://192.168.1.2
   remote_only = true
   reshare = true

PATH
----
Add the ROCm binary path and ccache (XXX) to ``~/.bashrc``:

.. code-block:: sh

   PATH=/usr/lib/ccache:/opt/rocm/bin:$PATH

Python pip cache
----------------
Set up to use the LAN ``pip`` cache (``devpi``), if available,
by editing ``~/.config/pip/pip.conf``, for example:

.. code-block:: ini

   [global]
   trusted-host = 192.168.1.3
   index-url = http://192.168.1.3:4040/root/pypi/+simple/

   [search]
   index = http://192.168.1.3:4040/root/pypi/

ROCm
====
ROCm for Ubuntu.

.. code-block:: sh

   sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
   wget https://repo.radeon.com/amdgpu-install/6.0.2/ubuntu/jammy/amdgpu-install_6.0.60002-1_all.deb
   sudo apt install ./amdgpu-install_6.0.60002-1_all.deb
   sudo apt update
   sudo apt install amdgpu-dkms
   sudo apt install rocm-hip-libraries
   # sudo reboot
   sudo apt install rocm-hip-sdk rocm-ml-sdk rocm-opencl-sdk rocm-openmp-sdk \
       rocm-bandwidth-test rocm-clang-ocl amdgpu-dkms-headers rocm \
       llvm-amdgpu llvm-amdgpu-runtime rocm-dkms rocm-dev rocm-libs \
       rocm-khronos-cts rocm-ocltst rocm-validation-suite \
       smi-lib-amdgpu smi-lib-amdgpu-dev \
       libstdc++-12-dev python-is-python3 \
       vulkan-amdgpu libvulkan-dev libvulkan-volk-dev vulkan-tools \
       vulkan-validationlayers-dev glslang-dev glslang-tools

   # sudo apt purge --autoremove libc6-dev-i386 libc6-dev-x32
   sudo apt install gcc-multilib

Misc
====
More.

.. code-block:: sh

   sudo systemctl disable ModemManager.service nvmefc-boot-connections.service \
       nvmf-autoconnect.service open-iscsi.service ubuntu-advantage.service \
       ufw.service unattended-upgrades.service update-notifier-download.timer \
       update-notifier-motd.timer \
       apport-autoreport.path apport-autoreport.timer apport-forward.socket \
       apt-daily.timer apt-daily-upgrade.timer fwupd-refresh.timer \
       remote-fs.target iscsid.socket motd-news.timer \
       ua-reboot-cmds.service ua-timer.timer

   sudo snap install nvtop

   # Kernel command line, set in /etc/default/grub
   GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 selinux=0 apparmor=0"

   sudo lvresize --resizefs -L 500G /dev/ubuntu-vg/ubuntu-lv

XXX Disable sound card.

XXX long time to wait for network to be configured ...

XXX
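
The ``GRUB_CMDLINE_LINUX_DEFAULT`` value listed above is a setting, not a
command. On a stock Ubuntu install it normally lives in ``/etc/default/grub``
and takes effect after ``sudo update-grub`` and a reboot. A minimal sketch of
scripting that edit follows. The helper name ``set_grub_cmdline`` is
hypothetical (it is not part of Ubuntu or GRUB), and the demo deliberately
works on a scratch file rather than the real one.

.. code-block:: sh

   # Hypothetical helper: set GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults file.
   # Usage: set_grub_cmdline FILE "kernel parameters"
   # Assumes GNU sed (as shipped on Ubuntu) and parameters without '|' or '&'.
   set_grub_cmdline() {
       file=$1
       params=$2
       if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file"; then
           # An existing setting is replaced in place.
           sed -i -e "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$params\"|" "$file"
       else
           # No existing setting, so append one.
           printf 'GRUB_CMDLINE_LINUX_DEFAULT="%s"\n' "$params" >> "$file"
       fi
   }

   # Demo on a scratch file that mimics /etc/default/grub.
   tmp=$(mktemp)
   printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$tmp"
   set_grub_cmdline "$tmp" "ipv6.disable=1 selinux=0 apparmor=0"
   cat "$tmp"
   rm -f "$tmp"

To apply it for real, back up ``/etc/default/grub``, run the helper on it as
root, then run ``sudo update-grub`` and reboot.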