We mentioned in the previous post, No-Code AI Model Training with TAO Toolkit, that TAO Toolkit offers a faster, easier way to create highly accurate, customized, and enterprise-ready AI models without writing code. In this post, we cover how to install TAO Toolkit and get started with model training.
TAO Toolkit System Requirements
To achieve satisfactory training performance with TAO Toolkit and its accompanying models, a specific system configuration is advised: at least 16 GB of system RAM, 16 GB of GPU RAM, an 8-core CPU, at least one NVIDIA GPU, and 100 GB of SSD storage.
TAO Toolkit is compatible with discrete GPUs such as the H100, A100, A40, A30, A2, A16, A100x, A30x, V100, T4, Titan-RTX, and Quadro-RTX. However, it does not support GPUs that predate the Pascal generation. Ubuntu 20.04 LTS is also required.
How to Use TAO Toolkit?
TAO Launcher CLI
The TAO Launcher offers a convenient and streamlined approach to accessing TAO’s functionalities. Rooted in Python, this Command Line Interface (CLI) serves as a front-end to the various TAO Toolkit containers that have been built on the renowned deep learning platforms PyTorch and TensorFlow.
Its primary function is automation; depending on the model you wish to employ, whether for computer vision tasks or conversational AI purposes, the Launcher intuitively decides which container is suitable and runs it automatically.
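As a rough sketch of this workflow (the spec file path, results directory, and environment variable are placeholders, and the flags follow the conventions of recent launcher releases; check the output of the launcher's help for your version):

```shell
# List the tasks the launcher knows about; each task maps to a container.
tao --help

# Train an object detection model. The launcher selects and runs the
# matching TAO container behind the scenes -- no manual container choice.
# Paths and the NGC key variable below are illustrative placeholders.
tao detectnet_v2 train \
  -e /workspace/specs/detectnet_v2_train.txt \
  -r /workspace/results \
  -k $NGC_API_KEY
```

The same pattern applies to other computer vision and conversational AI tasks; only the task name and spec file change.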
Directly from Container
For those who seek a hands-on approach, TAO provides the option to run its capabilities directly from a docker container. To do this, users should have a clear idea about which specific container they need, as TAO encompasses a variety of them.
The choice of container is closely linked to the type of model one wishes to train. While this method might appear more complicated, it offers more granular control. However, it’s worth noting that the intricacies of container selection are conveniently bypassed when utilizing the Launcher CLI.
TAO Toolkit API
Designed with scalability and integration in mind, the TAO Toolkit API presents itself as a Kubernetes service. This service facilitates the creation of comprehensive AI models via RESTful APIs. Installation of the API service is seamless, requiring only a Kubernetes cluster (be it local or hosted on AWS EKS) and a Helm chart paired with a few dependencies.
The true strength of this method lies in its scalability, accommodating GPUs within the cluster and allowing for expansion across multiple nodes. For interaction, users can either employ the TAO client CLI for remote access or seamlessly weave it into their applications and services by directly calling the REST APIs.
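As a minimal sketch of the Helm-based deployment (the chart URL, name, and version below are assumptions for illustration; consult the NGC catalog for the current chart):

```shell
# Fetch the TAO Toolkit API Helm chart from NGC and install it into the
# cluster. Chart URL and version are illustrative placeholders.
helm fetch https://helm.ngc.nvidia.com/nvidia/tao/charts/tao-toolkit-api-5.0.0.tgz
helm install tao-toolkit-api tao-toolkit-api-5.0.0.tgz --namespace default

# Verify that the API service pods have started.
kubectl get pods
```

Once the service is up, the TAO client CLI or direct REST calls can be pointed at the cluster's service endpoint.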
Directly on Bare-Metal
For users who desire a more straightforward or unencumbered setup, TAO offers the option to run directly on bare-metal, effectively sidestepping the requirements of Docker or Kubernetes (K8s). An appealing aspect of this method is its compatibility with platforms like Google Colab, where TAO notebooks can be deployed directly. This eliminates the need for intricate infrastructure configurations and is especially beneficial for those keen on leveraging readily accessible platforms.
Install TAO Toolkit Python Package
conda create -n tao
conda activate tao
These commands use the conda package and environment manager to create a new virtual environment named “tao” and then activate it. Virtual environments allow developers to maintain isolated spaces with specific package versions, ensuring that there are no conflicts between packages used in different projects. Once the “tao” environment is activated, any package you install or any Python script you run will use the Python interpreter and packages specific to that environment.
pip install nvidia-tao
With the “tao” environment active, this command uses pip to install the “nvidia-tao” package.
tao info
Running this command invokes the TAO Toolkit CLI to retrieve and display information about the current installation. Depending on the toolkit’s design, it provides details such as the version number, installed modules, or other relevant data.
Install TAO Toolkit Docker Container
The TAO Toolkit offers users the flexibility to run its functionalities directly from a container. This approach leverages containerization technologies, such as Docker, to encapsulate the toolkit and all its dependencies into a self-sufficient package. You can use PyTorch or TensorFlow backends. Take a look at the TAO Toolkit NGC Catalog page for the details.
docker pull nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt
For example, you can pull TAO Toolkit 5.0 with the PyTorch backend using this Docker command. After pulling the image, you can create a new container and run TAO commands inside it directly.
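A minimal sketch of starting such a container (the host directory path is a placeholder for your own workspace):

```shell
# Start an interactive TAO container with all GPUs visible and a host
# directory mounted for specs, data, and results. The host path below
# is an example -- substitute your own working directory.
docker run -it --rm --gpus all \
  -v /home/user/tao-experiments:/workspace/tao-experiments \
  nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt /bin/bash
```

Inside the container, the task entry points can be invoked directly, without the launcher prefix used on the host.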
Conclusion
TAO provides multiple avenues to access its features, be it through the Python-based Launcher CLI, direct container interaction, API integration on Kubernetes, or the Python wheel for more straightforward deployments. Each method is designed to cater to varying levels of expertise and project requirements. Using the toolkit directly from a container, for instance, encapsulates the benefit of reproducibility and consistency, thus eliminating potential system discrepancies.
In sum, TAO Toolkit’s multifaceted accessibility, combined with its robust features, positions it as a compelling choice for AI enthusiasts and professionals seeking efficient and flexible solutions.