# PyTorch
A forklet of PyTorch for my own quirky needs.

* Build notes and scripts for different machines I admin.
* CPU builds.
* Meh ROCm AMD GPU builds.
* Proprietary Nvidia builds.
* Flailing at getting larger GPUs + PyTorch on ppc64le going.
* Kludges to work around disabled IPv6.


# Install
Install thusly, on Debian stable (bookworm/12).


## Dependencies
Perhaps this and more:

### OS
```
sudo apt install git build-essential libssl-dev zlib1g-dev \
  libbz2-dev libreadline-dev libsqlite3-dev curl ccache \
  libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev \
  libffi-dev liblzma-dev gcc-11 g++-11 libblis64-dev ninja-build \
  libblis-dev libfftw3-dev libmpfr-dev protobuf-compiler protobuf-c-compiler \
  libasmjit-dev python3-virtualenv python3-pip
```

### Python
At present, the latest Python that seems to work happily with most PyTorch
applications is Python 3.10.

Use pyenv to manage versions, install something like:

```
# :)
curl https://pyenv.run | bash
```

Add to `~/.bashrc`, then re-source it (logout/in or whatever):

```
export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
```
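
Re-running setup notes like these tends to pile duplicate lines into `~/.bashrc`. A minimal sketch of an idempotent append follows; `append_once` is a made-up helper name, not part of pyenv or this repo.

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already present, so repeated runs do not duplicate it.
#   $1 = the line to add
#   $2 = the file to add it to
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match whole lines only, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Usage would be something like `append_once 'eval "$(pyenv init -)"' ~/.bashrc`, once per line in the block above.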

Get Python version 3.10:

```
pyenv install 3.10
```

Perhaps other versions, à la:

```
pyenv install 3.9
pyenv install 3.11.6
pyenv install 3.12
pyenv install 3.13-dev
```

#### Setup Pip Download Cache
To avoid downloading the same files from the Internet on every
machine in the cluster, a pip download cache (devpi) can be set up
thusly, assuming server IP 192.168.100.101 and username debian:

```
mkdir -p ~/devel/devpi
cd ~/devel/devpi
virtualenv env
source env/bin/activate
pip install -U setuptools wheel pip
# devpi-client provides the `devpi` command used further down
pip install devpi-server devpi-web devpi-client

sudo mkdir -p /srv/devpi
sudo chown debian:debian /srv/devpi

devpi-init \
  --serverdir /srv/devpi

# Writes example configs into ./gen-config/
devpi-gen-config \
  --host=0.0.0.0 \
  --port 4040 \
  --serverdir /srv/devpi \
  --absolute-urls

sudo apt install nginx

sudo cp ~debian/devel/devpi/gen-config/nginx-devpi.conf /etc/nginx/sites-available/

cd /etc/nginx/sites-enabled
sudo ln -s ../sites-available/nginx-devpi.conf .

sudo apt install supervisor
sudo cp ~debian/devel/devpi/gen-config/supervisor-devpi.conf /etc/supervisor/conf.d/

# In the editor that `crontab -e` opens, add the @reboot line below
# (it is a crontab entry, not a shell command):
crontab -e
@reboot /usr/local/sbin/supervisord -c /home/debian/etc/supervisor-devpi.conf

supervisord -c gen-config/supervisord.conf

sudo reboot

# After the reboot, point the client at the server:
devpi use http://192.168.100.101:4040

devpi login root --password ''
devpi user -m root password=FOO
devpi user -l
devpi logoff
devpi user -c debian password=BAR email=devpi@localhost
devpi login debian --password=BAR

devpi index -c dev bases=root/pypi
devpi use debian/dev
devpi install pytest
```


#### Add Pip Download Cache
Add this to clients to use the cache:

```
mkdir -p ~/.config/pip
cat > ~/.config/pip/pip.conf <<EOF
[global]
trusted-host = 192.168.100.101
index-url = http://192.168.100.101:4040/root/pypi/+simple/

[search]
index = http://192.168.100.101:4040/root/pypi/
EOF
```
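
For a one-off shell, or a box where `pip.conf` should stay untouched, the same two `[global]` settings can be given as environment variables, since pip reads `PIP_<OPTION>` variables. A rough sketch; `DEVPI_HOST` and `DEVPI_PORT` are names made up here, defaulting to the server assumed above.

```shell
# Assumed mirror location; DEVPI_HOST/DEVPI_PORT are illustrative names only.
DEVPI_HOST="${DEVPI_HOST:-192.168.100.101}"
DEVPI_PORT="${DEVPI_PORT:-4040}"

# Equivalent to index-url and trusted-host in the [global] section of pip.conf,
# but only for this shell and its children.
export PIP_INDEX_URL="http://${DEVPI_HOST}:${DEVPI_PORT}/root/pypi/+simple/"
export PIP_TRUSTED_HOST="${DEVPI_HOST}"

echo "pip index: ${PIP_INDEX_URL}"
```

Unset both variables to fall back to whatever `pip.conf` says.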

## Other caches
Also set up a `ccache` cluster with Redis as the remote storage.
Note, Redis needs `systemctl edit redis-server` to set timeouts
to infinity or it may just keep restarting itself. Thx systemd.

And an npm cache (verdaccio), a Rust cache (panamax), and an
apt cache (apt-cacher-ng). And `sccache` for Rust
compiles.
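
A sketch of what that might look like, with loud assumptions: Redis is taken to live on the same 192.168.100.101 host on the default port 6379, ccache is 4.8 or newer (where the setting is called remote storage), and `TimeoutStartSec`/`TimeoutStopSec` are a guess at which systemd timeouts are meant. `REDIS_HOST`, `REDIS_PORT`, and `redis_override` are made-up names.

```shell
# Assumed Redis location; adjust for the actual cluster.
REDIS_HOST="${REDIS_HOST:-192.168.100.101}"
REDIS_PORT="${REDIS_PORT:-6379}"

# ccache reads its remote storage URL from this variable
# (same as `remote_storage` in ccache.conf).
export CCACHE_REMOTE_STORAGE="redis://${REDIS_HOST}:${REDIS_PORT}"

# Prints a drop-in to paste into the editor opened by
# `sudo systemctl edit redis-server`. Which timeouts matter is an assumption.
redis_override() {
    cat <<'EOF'
[Service]
TimeoutStartSec=infinity
TimeoutStopSec=infinity
EOF
}

redis_override
```

After saving the drop-in, `sudo systemctl restart redis-server` picks it up.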


## Get Deepcrayon PyTorch Repo
```
git clone https://spacecruft.org/deepcrayon/pytorch
cd pytorch/
```

## Compile
Now actually build.

### git make more
From System76, but came with A5000.
Has non-free Debian nvidia junk installed.

This rebuilds from scratch.

```
export PYTHONVER=3.10
export TORCHVER=deepcrayon-v2.1
export GCCVER=11
# The ccache wrappers, so rebuilds hit the compiler cache
export CMAKE_C_COMPILER=/usr/lib/ccache/gcc-${GCCVER}
export CMAKE_CXX_COMPILER=/usr/lib/ccache/g++-${GCCVER}

cd ~/devel/deepcrayon/pytorch # or wherever repo is
# Leave any active virtualenv (deactivate is a shell function, not a script)
deactivate 2>/dev/null || true
rm -rf venv
rm -rf build
mkdir -p build

git checkout ${TORCHVER}
git clean -ff
git reset --hard HEAD
git clean -ff
git pull
git submodule update --init --recursive

virtualenv -p ${PYTHONVER} venv
source venv/bin/activate
pip install -U setuptools wheel pip
pip install -r requirements.txt

# huh
cd third_party/python-peachpy
python setup.py generate
cd ../..

# will barf, but sets up some dirs:
python setup.py build --cmake-only

# Use `ccmake` instead of `cmake` if you want to configure further.
# Run only ONE of the cmake commands below, whichever matches the machine.
#
# For amd64 CPU:
cmake build -DBLAS=BLIS -DTP_BUILD_PYTHON=ON

# For amd64 nvidia A6000 GPU (`sm_86`) XXX NON-FREE:
cmake build -DCUDAToolkit_INCLUDE_DIR=/usr/include -DBLAS=BLIS \
  -DCUDA_SDK_ROOT_DIR=/usr -DENABLE_CUDA=ON -DTP_BUILD_PYTHON=ON

# For ppc64le CPU:
cmake build -DUSE_NCCL=OFF -DBLAS=BLIS -DTP_BUILD_PYTHON=ON \
  -DUSE_FBGEMM=OFF

# For ppc64le testing nvidia A5000 GPU (`sm_86`):
cmake build -DCUDAToolkit_INCLUDE_DIR=/usr/include -DBLAS=BLIS \
  -DCUDA_SDK_ROOT_DIR=/usr -DENABLE_CUDA=ON -DTP_BUILD_PYTHON=ON \
  -DUSE_FBGEMM=OFF

# Make a wheel:
python setup.py bdist_wheel
# or:
python setup.py install
```
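
The four `cmake` variants differ only by architecture and by whether CUDA is wanted, so the choice can be scripted. A rough sketch, not something in the repo: `pytorch_cmake_flags` is a made-up helper that prints the same flag sets listed in the build notes.

```shell
# Hypothetical helper: print the cmake flags from the build notes for a target.
#   $1 = machine architecture (defaults to `uname -m`)
#   $2 = "cuda" for an Nvidia GPU build, anything else means CPU only
pytorch_cmake_flags() {
    arch="${1:-$(uname -m)}"
    gpu="${2:-cpu}"

    # Common to every variant
    flags="-DBLAS=BLIS -DTP_BUILD_PYTHON=ON"

    if [ "$gpu" = "cuda" ]; then
        flags="$flags -DCUDAToolkit_INCLUDE_DIR=/usr/include -DCUDA_SDK_ROOT_DIR=/usr -DENABLE_CUDA=ON"
    elif [ "$arch" = "ppc64le" ]; then
        # The ppc64le CPU-only build also switches NCCL off
        flags="$flags -DUSE_NCCL=OFF"
    fi

    if [ "$arch" = "ppc64le" ]; then
        # FBGEMM is disabled on ppc64le in both variants
        flags="$flags -DUSE_FBGEMM=OFF"
    fi

    printf '%s\n' "$flags"
}
```

Then something like `cmake build $(pytorch_cmake_flags)` for a CPU build on the current machine, or `cmake build $(pytorch_cmake_flags ppc64le cuda)` for the GPU variant; the command substitution is left unquoted on purpose so the flags split into separate arguments.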

Other CMake options to consider, such as:

```
# -DNNL_GPU_VENDOR -DUSE_NATIVE_ARCH -DBUILD_CAFFE2=ON
# -DUSE_OPENCL=ON -DUSE_REDIS=ON -DUSE_ROCKSDB=ON -DUSE_ZMQ=ON
# -DUSE_LMDB=ON -DUSE_GLOG=ON
# -DUSE_FFMPEG=ON -DCUPTI_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
# -DUSE_NVRTC=ON -DUSE_OPENCV=ON -DUSE_ZSTD=ON
# -DCMAKE_CUDA_ARCHITECTURES=native
```

The resulting wheel will be something like:

```
./dist/torch-2.1.0a0+git83f7fe3-cp310-cp310-linux_x86_64.whl
```
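
The wheel filename encodes what was built: name, version, Python tag, ABI tag, and platform, separated by hyphens. A small sketch for pulling those apart when sorting wheels from several machines; `wheel_fields` is a made-up helper and assumes the five-field form shown above (no optional build tag).

```shell
# Hypothetical helper: split a wheel filename into its hyphen-separated fields.
# Assumes name-version-python-abi-platform.whl with no optional build tag.
wheel_fields() {
    base=$(basename "$1" .whl)
    old_ifs=$IFS
    IFS=-
    # Unquoted on purpose: split $base on hyphens into $1..$5
    set -- $base
    IFS=$old_ifs
    echo "name=$1 version=$2 python=$3 abi=$4 platform=$5"
}

wheel_fields ./dist/torch-2.1.0a0+git83f7fe3-cp310-cp310-linux_x86_64.whl
# prints: name=torch version=2.1.0a0+git83f7fe3 python=cp310 abi=cp310 platform=linux_x86_64
```

Here `cp310` confirms the wheel was built against the Python 3.10 virtualenv, and `linux_x86_64` that it came from an amd64 box rather than ppc64le.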

# Upstream
Main upstream:

* https://github.com/pytorch/pytorch

See also: `README-upstream.md`.


# Disclaimer
I am not a programmer.


# Copyright
Unofficial project, not related to upstream projects.

Upstream sources under their respective copyrights.


# License
MIT.

*Copyright © 2023, Jeff Moe.*