Pytorch

A forklet of Pytorch for my own quirky needs.

  • Build notes and scripts for different machines I admin.
  • CPU builds.
  • Meh ROCm AMD GPU builds.
  • Proprietary Nvidia builds.
  • Flailing at getting larger GPUs + Pytorch on ppc64le going.
  • Kludges to work around disabled IPv6.

Install

Install thusly on Debian stable (bookworm/12).

Dependencies

Perhaps this and more:

OS

sudo apt install git build-essential libssl-dev zlib1g-dev \
    libbz2-dev libreadline-dev libsqlite3-dev curl ccache \
    libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev \
    libffi-dev liblzma-dev \
    python3-virtualenv python3-pip
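
Before a long build it can help to see which of these packages are still missing. A minimal sketch, assuming a Debian system with `dpkg-query`; the helper names `is_installed` and `missing_packages` are made up for illustration and are not part of this repo:

```shell
#!/bin/bash
# Hypothetical helpers, not part of this repo.

# Succeeds if dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

# Prints each named package that is not installed, one per line.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        is_installed "$pkg" || echo "$pkg"
    done
}

# Example: missing_packages git build-essential ccache libssl-dev
```

Anything it prints can be passed straight back to `sudo apt install`.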

Python

At present, the latest Python that seems to work happily with most Pytorch applications is Python 3.10.

Use pyenv to manage versions. Install it with something like:

# :)
curl https://pyenv.run | bash

Add to ~/.bashrc, then re-source it (logout/in or whatever):

export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"

Get Python version 3.10:

pyenv install 3.10
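
To confirm that the pyenv-managed interpreter is the one actually on the PATH, something like the following may do. It is only a sketch: `python_minor` is a hypothetical helper, and it assumes `python --version` prints the usual `Python major.minor.patch` form.

```shell
#!/bin/bash
# Hypothetical helper: reduce "Python 3.10.13" to "3.10".
# Assumes a three-part version string (major.minor.patch).
python_minor() {
    local ver="${1#Python }"   # drop the leading "Python "
    echo "${ver%.*}"           # drop the trailing ".patch" component
}

# Example, after `pyenv local 3.10` or `pyenv shell 3.10`:
#   [ "$(python_minor "$(python --version)")" = "3.10" ] || echo "wrong Python on PATH"
```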

Perhaps other versions, à la:

pyenv install 3.9
pyenv install 3.11.6
pyenv install 3.12
pyenv install 3.13-dev

Setup Pip Download Cache

So as to not download the same files from the Internet separately on each machine in the cluster, a Pip download cache can be set up thusly, assuming server IP 192.168.100.101 and username debian:

mkdir -p ~/devel/devpi
cd ~/devel/devpi
virtualenv env
source env/bin/activate
pip install -U setuptools wheel pip
pip install devpi-server devpi-web

sudo mkdir /srv/devpi
sudo chown debian:debian /srv/devpi

devpi-init \
    --serverdir /srv/devpi

devpi-gen-config \
    --host=0.0.0.0 \
    --port 4040 \
    --serverdir /srv/devpi \
    --absolute-urls

sudo apt install nginx

sudo cp ~debian/devel/devpi/gen-config/nginx-devpi.conf /etc/nginx/sites-available/

cd /etc/nginx/sites-enabled
sudo ln -s ../sites-available/nginx-devpi.conf .

sudo apt install supervisor
sudo cp ~debian/devel/devpi/gen-config/supervisor-devpi.conf /etc/supervisor/conf.d/

crontab -e
# Then add this line to the crontab:
@reboot /usr/local/sbin/supervisord -c /home/debian/etc/supervisor-devpi.conf

supervisord -c gen-config/supervisord.conf

sudo reboot

devpi use http://192.168.100.101:4040

devpi login root --password ''
devpi user -m root password=FOO
devpi user -l
devpi logoff
devpi user -c debian password=BAR email=devpi@localhost
devpi login debian --password=BAR

devpi index -c dev bases=root/pypi
devpi use debian/dev
devpi install pytest
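
A quick way to check the server from another machine, sketched under assumptions: the helper names are hypothetical, and `devpi_is_up` assumes the server answers on the `+api` path that the devpi client queries. The URL builder just follows the `http://host:port/user/index/+simple/` pattern used for the pip configuration in this README.

```shell
#!/bin/bash
# Hypothetical helpers, not part of devpi or this repo.

# Builds the pip "simple" index URL for a devpi index such as debian/dev.
devpi_simple_url() {
    local host="$1" port="$2" index="$3"
    echo "http://${host}:${port}/${index}/+simple/"
}

# Succeeds if the server answers within 5 seconds (assumes a +api endpoint).
devpi_is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "http://$1:$2/+api"
}

# Example:
#   devpi_is_up 192.168.100.101 4040 && devpi_simple_url 192.168.100.101 4040 debian/dev
```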

Add Pip Download Cache

Add this to clients to use the cache:

mkdir -p ~/.config/pip
cat > ~/.config/pip/pip.conf <<EOF
[global]
trusted-host = 192.168.100.101
index-url = http://192.168.100.101:4040/root/pypi/+simple/

[search]
index = http://192.168.100.101:4040/root/pypi/
EOF
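
If the cache server address differs between clusters, the same file can be generated from arguments. A minimal sketch; `write_pip_conf` is a hypothetical helper that reproduces the layout above with the host and port filled in:

```shell
#!/bin/bash
# Hypothetical helper: write a pip.conf pointing at a devpi cache.
# Arguments: host, port, destination file.
write_pip_conf() {
    local host="$1" port="$2" dest="$3"
    mkdir -p "$(dirname "$dest")"
    cat > "$dest" <<EOF
[global]
trusted-host = ${host}
index-url = http://${host}:${port}/root/pypi/+simple/

[search]
index = http://${host}:${port}/root/pypi/
EOF
}

# Example: write_pip_conf 192.168.100.101 4040 ~/.config/pip/pip.conf
```

Afterwards, `pip config list` should show the `index-url` that pip picked up.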

Get Deepcrayon Pytorch Repo

git clone https://spacecruft.org/deepcrayon/pytorch
cd pytorch/

Compile

Now actually build.

amd64 with Nvidia A6000

From System76, but came with A5000. Has non-free Debian nvidia junk installed.

#!/bin/bash

export PYTHONVER=3.10
export TORCHVER=2.1.0
export GCCVER=11
export CMAKE_C_COMPILER=/usr/lib/ccache/gcc-${GCCVER}
export CMAKE_CXX_COMPILER=/usr/lib/ccache/g++-${GCCVER}

cd ~/devel/pytorch/pytorch
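# Leave any active virtualenv (deactivate is a shell function, not a file to source):
deactivate 2>/dev/null || true
rm -rf venv
rm -f .python-version
rm -rf build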
mkdir -p build

git checkout main
git clean -ff
git reset --hard HEAD
git clean -ff
git pull
git submodule update
git checkout v${TORCHVER}
git submodule update --init --recursive

pyenv local ${PYTHONVER}
virtualenv -p ${PYTHONVER} venv
source venv/bin/activate
pip install -U setuptools wheel pip
pip install -r requirements.txt

# will barf, but sets up some dirs:
python setup.py build --cmake-only

# Use ccmake if you want to configure further:
cmake build -DCUDAToolkit_INCLUDE_DIR=/usr/include -DBLAS=BLIS \
    -DCUDA_SDK_ROOT_DIR=/usr -DENABLE_CUDA=ON -DTP_BUILD_PYTHON=ON

python setup.py install
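
Once the install finishes, it is worth confirming that the built package imports and sees the GPU. A sketch only: `check_torch` is a hypothetical helper, and it matches the version by prefix on the assumption that a source build may report a suffixed version string rather than the bare tag.

```shell
#!/bin/bash
# Hypothetical helper, not part of this repo.
# Arguments: expected version prefix, optional Python command (default: python).
check_torch() {
    local want="$1" py="${2:-python}" got
    # Run from / so the installed package is imported, not the source tree.
    got="$(cd / && "$py" -c 'import torch; print(torch.__version__, torch.cuda.is_available())')" || return 1
    echo "torch reports: ${got}"
    # Match on the prefix only, since the build may append a local suffix.
    case "$got" in
        "${want}"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, with the venv still active:
#   check_torch "${TORCHVER}" || echo "build did not produce the expected torch"
```

The second word of the reported line is `True` when CUDA is usable. Pass the Python command by name or absolute path, since the helper changes directory before running it.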

Also consider options such as:

# -DNNL_GPU_VENDOR -DUSE_NATIVE_ARCH -DBUILD_CAFFE2=ON
# -DUSE_OPENCL=ON -DUSE_REDIS=ON -DUSE_ROCKSDB=ON -DUSE_ZMQ=ON
# -DUSE_LMDB=ON -DUSE_GLOG=ON
# -DUSE_FFMPEG=ON -DCUPTI_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
# -DUSE_NVRTC=ON -DUSE_OPENCV=ON -DUSE_ZSTD=ON
# -DCMAKE_CUDA_ARCHITECTURES=native
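
To see whether an option actually took effect, the value can be read back from the CMake cache in the build directory. A minimal sketch; `cmake_cache_value` is a hypothetical helper that relies on cache entries having the form `NAME:TYPE=value`:

```shell
#!/bin/bash
# Hypothetical helper: print the value of one variable from a CMake cache.
# Arguments: build directory, variable name.
cmake_cache_value() {
    local build_dir="$1" var="$2"
    # Cache entries look like NAME:TYPE=value; strip everything up to the "=".
    sed -n "s/^${var}:[A-Z]*=//p" "${build_dir}/CMakeCache.txt"
}

# Example: cmake_cache_value build BLAS
```

`cmake -N -LA build` lists every cached variable in the same form, without re-running the configure step.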

Upstream

Main upstream:

See also: README-upstream.md.

Disclaimer

I am not a programmer.

Copyright

Unofficial project, not related to upstream projects.

Upstream sources under their respective copyrights.

License

MIT.

Copyright © 2023, Jeff Moe.