CUDA

Resources

Installation

UJ machines are controlled by Puppet. The CUDA class is here on the group GitHub account. Ubuntu 22.04 machines use the developer.download.nvidia.com deb repo. The psi-cluster machine should act as an HTTP proxy for apt, which speeds up downloads. Puppet installs the cuda, cudnn and datacenter-gpu-manager packages via apt.

Examples

Keras distributed notebook

JupyterLab can run Keras notebooks using only PyPI packages (plus cuda-drivers). The notebook for this example shows how to train and evaluate an ML model across multiple GPUs in a single host machine. See the online version here.

To set up with pip:

    pip install tensorflow[and-cuda] keras jupyterlab matplotlib tensorflow-addons tensorflow_datasets
    wget https://raw.githubusercontent.com/tensorflow/docs/master/site/en/tutorials/distribute/keras.ipynb
    jupyter lab --ip=0.0.0.0

This starts JupyterLab listening on all network interfaces. If you need access from outside the machine's network, take a look at SSHTunnels, in particular how to run a Dynamic SOCKS proxy.

Note: While evaluating the notebook, you should see some errors about CUDA plugins already having been registered. This is expected and seems to have no impact on execution. Hopefully this will be fixed when TensorFlow 2.16 is released, which will be the first version to support Keras V3.
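
As a rough illustration of what the notebook does, the sketch below trains a small model on every GPU in one host using tf.distribute.MirroredStrategy. It is not the notebook's code: the function names global_batch_size and train_on_all_gpus are made up for this example, the model is arbitrary, and MNIST is loaded through tf.keras.datasets rather than tensorflow_datasets. It assumes the pip packages listed above are installed and that the machine can download the dataset.

```python
def global_batch_size(per_replica_batch: int, num_replicas: int) -> int:
    """Batch size to give the dataset so each replica sees per_replica_batch samples."""
    if per_replica_batch < 1 or num_replicas < 1:
        raise ValueError("batch size and replica count must be positive")
    return per_replica_batch * num_replicas


def train_on_all_gpus(per_replica_batch: int = 64, epochs: int = 1):
    """Hypothetical helper: train and evaluate a small MNIST model on all local GPUs."""
    # Imported lazily so the helper above can be used without TensorFlow installed.
    import tensorflow as tf

    # MirroredStrategy keeps one model replica per visible GPU on this host
    # and averages gradients across them. With no GPU it uses a single replica.
    strategy = tf.distribute.MirroredStrategy()
    batch = global_batch_size(per_replica_batch, strategy.num_replicas_in_sync)

    # Assumption: MNIST from tf.keras.datasets stands in for the notebook's data.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.astype("float32")[..., None] / 255.0
    x_test = x_test.astype("float32")[..., None] / 255.0

    train_ds = (
        tf.data.Dataset.from_tensor_slices((x_train, y_train))
        .shuffle(10_000)
        .batch(batch)
    )
    test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(batch)

    # Variables must be created inside the strategy scope to be mirrored.
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(28, 28, 1)),
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(10),
        ])
        model.compile(
            optimizer=tf.keras.optimizers.Adam(),
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"],
        )

    model.fit(train_ds, epochs=epochs)
    # Returns [loss, accuracy] on the test set.
    return model.evaluate(test_ds)
```

Calling train_on_all_gpus() from a notebook cell runs one epoch. The global batch is scaled by the replica count so that each GPU still processes per_replica_batch samples per step, which is the same convention the notebook follows.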