{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Setting up environment" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Installation\n", "\n", "Install Ablator via pip:\n", "```\n", "pip install ablator\n", "```\n", "For development version of Ablator:\n", "```\n", "pip install git+https://github.com/fostiropoulos/ablator.git@v0.0.1-mp\n", "```\n", "\n", "Note: Python version is should be 3.10 or newer" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Prerequisites" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Setting up ray cluster\n", "\n", "Ablator leverages `ray` to scale its ablation studies, enabling a cluster of parallel trials (each corresponds to a configuration variation). With this, later when launching the experiment, Ablator will populate trials to the cluster based on the resources available there.\n", "\n", "This section provides steps to manually set up a ray cluster on your machines. However, there are other ways to set up a ray cluster (on AWS, GCP, or Kubernetes). Refer to [ray clusters docs](https://docs.ray.io/en/latest/cluster/getting-started.html#cluster-index) to learn more.\n", "\n", "##### Start the Head node\n", "Choose any node to be the head node and run the following shell command. Note that Ray will choose port 6379 by default:\n", "```shell\n", "ray start --head\n", "```\n", "The command will print out the Ray cluster address, which can be passed to `ray start` on other machines to start and attach worker nodes to the cluster (see below). If you receive a `ConnectionError`, check your firewall settings and network configuration.\n", "\n", "##### Start Worker nodes\n", "On each of the other nodes, run the following command to connect to the head node:\n", "```\n", "ray start --address=\n", "```\n", "`head-node-address:port` should be the value printed by the command on the head node (e.g., `123.45.67.89:6379`). \n", "\n", "##### Launch experiment to the cluster\n", "Once the cluster is set up, you can launch an experiment to the cluster by specifying the cluster head address. A preview of the command is as follows:\n", "```python\n", "trainer.launch(working_directory=\"\", ray_head_address=\"\")\n", "```\n", "\n", "##### Setup ray nodes in Python\n", "Alternative to the CLI commands above, you can also set up ray nodes in Python:\n", "\n", "```python\n", "import ray\n", "if not ray.is_initialized():\n", " ray.init(address=\"\")\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "Note\n", "\n", "- An ablation experiment can also be launched on a cluster that lives in a single machine (in fact, most of the tutorials run ablation experiments this way). And since Ablator supports setting up a ray cluster, you won't have to do that manually in this case.\n", "- This tutorial for setting up ray cluster is adapted from [this ray doc](https://docs.ray.io/en/latest/cluster/vms/user-guides/launching-clusters/on-premises.html).\n", "- Here you can find instructions to launch a ray cluster using `cluster-launcher`, which sets up all nodes at once instead of manually going over each of the node.\n", "\n", "
" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### GPU-Accelerated Computing (Optional)\n", "If GPUs are available, you can run your experiments wth the power of GPU acceleration, which can significantly speed up the training process. To run CUDA Python, you’ll need the [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads) installed on your system with CUDA-capable GPUs.\n", "\n", "After setting up CUDA, you can install the cuda-enabled torch packages. Refer to [this tutorial](https://pytorch.org/get-started/locally/) for instructions on how to install these. A sample command to install `torch` and `torchvision` cuda package is shown below:\n", "```\n", "pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117 --force-reinstall\n", "```" ] } ], "metadata": { "language_info": { "name": "python" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }