Before You Begin

Driverless AI can run on machines with CPUs only or on machines with both CPUs and GPUs. For the best experience, and the one Driverless AI is designed for, install it on modern data center hardware with GPUs and CUDA support. Feature engineering runs primarily on the CPU and model building runs primarily on the GPU, so Driverless AI benefits from multi-core CPUs with sufficient system memory and from GPUs with sufficient RAM. For best results, we recommend GPUs that use the Pascal or Volta architectures. (The older K80 and M60 GPUs available in EC2 are supported and convenient, but they are not as fast.) Image and NLP use cases in particular benefit significantly from GPU usage. For more information, see GPUs in Driverless AI.

Driverless AI supports local, LDAP, and PAM authentication. Authentication can be configured by setting environment variables or via a config.toml file. Refer to the Authentication Methods section for more information. Note that the default authentication method is “unvalidated.”

Driverless AI also supports HDFS, S3, Google Cloud Storage, Google BigQuery, KDB, MinIO, and Snowflake access. Support for these data sources can be configured by setting environment variables for the data connectors or via a config.toml file. Refer to the Data Connectors section for more information.

Sizing Requirements

Sizing Requirements for Native Installs

Driverless AI requires a minimum of 5 GB of system memory to start experiments and a minimum of 5 GB of disk space to run a small experiment. Note that these limits can be changed in the config.toml file. We recommend that you have sufficient system CPU memory (64 GB or more) and 1 TB of free disk space available.

Sizing Requirements for Docker Installs

For Docker installs, we recommend 1 TB of free disk space. Driverless AI itself uses approximately 38 GB. In addition, the unpacking and temporary files require space on the same Linux mount as /var during installation. Once Driverless AI is running, the mounts from the Docker container can point to other file system mount points.

GPU Sizing Requirements

If you are running Driverless AI with GPUs, ensure that your GPU has a compute capability of 3.5 or higher and at least 4 GB of RAM. If these requirements are not met, Driverless AI switches to CPU-only mode.

Sizing Requirements for Storing Experiments

We recommend that your Driverless AI tmp directory has between 500 GB and 1 TB of space. The Driverless AI tmp directory holds all experiments and all datasets. We also recommend that you use SSDs (preferably NVMe).

Virtual Memory Settings in Linux

If you are running Driverless AI on a Linux machine, we recommend setting the kernel's overcommit memory setting (vm.overcommit_memory) to 0. The setting can be changed with the following command:

sudo sh -c "/bin/echo 0 > /proc/sys/vm/overcommit_memory"

A value of 0 is the kernel default and lets the Linux kernel overcommit memory using its built-in heuristic. If the value is set to 2, the Linux kernel does not overcommit memory. In that case, the memory requirements of Driverless AI may exceed the memory allocation limit and prevent the experiment from completing.
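Before changing the setting, it can help to inspect the current value. The following is a minimal sketch rather than an official tool: the function name describe_overcommit and its optional file argument are illustrative and are not part of Driverless AI. It reads the value from /proc/sys/vm/overcommit_memory and prints how the kernel interprets it.

```shell
# Illustrative helper (not part of Driverless AI): report the overcommit mode.
# The optional first argument points at a different file, which is handy
# for trying the function without touching /proc.
describe_overcommit() {
    file="${1:-/proc/sys/vm/overcommit_memory}"
    mode="$(cat "$file")" || return 1
    case "$mode" in
        0) echo "0: heuristic overcommit (the value recommended above)" ;;
        1) echo "1: always overcommit" ;;
        2) echo "2: strict accounting, no overcommit" ;;
        *) echo "unrecognized value: $mode"; return 1 ;;
    esac
}
```

Calling describe_overcommit with no arguments reports on the running kernel. Note that a value written to /proc does not survive a reboot; a persistent setting is normally made through the system's sysctl configuration.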

Memory Requirements per Experiment

As a rule of thumb, the memory requirement per experiment is approximately 5 to 10 times the size of the dataset. Dataset size can be estimated as the number of rows x columns x 4 bytes; if text is present in the data, then more bytes per element are needed.
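The rule of thumb above can be sketched as a small shell function. This is only an illustration: the function name estimate_experiment_memory is hypothetical, and the calculation assumes exactly 4 bytes per element, so it underestimates datasets that contain text.

```shell
# Illustrative helper (not part of Driverless AI): estimate the memory
# needed for one experiment from the row and column counts.
# Assumes 4 bytes per element; text columns need more.
estimate_experiment_memory() {
    rows="$1"
    cols="$2"
    dataset_bytes=$(( rows * cols * 4 ))
    low_bytes=$(( dataset_bytes * 5 ))     # lower bound: 5x the dataset size
    high_bytes=$(( dataset_bytes * 10 ))   # upper bound: 10x the dataset size
    echo "dataset_bytes=$dataset_bytes"
    echo "memory_low_bytes=$low_bytes"
    echo "memory_high_bytes=$high_bytes"
}
```

For example, estimate_experiment_memory 10000000 50 reports a dataset of 2000000000 bytes (about 2 GB) and a memory range of 10000000000 to 20000000000 bytes (about 10 to 20 GB).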

Backup Strategy

The Driverless AI tmp directory stores all experiment artifacts, such as deployment artifacts and MLI results. It also stores the master.db database, which maps users to their Driverless AI artifacts. Do not add or delete files in the tmp folder yourself; only Driverless AI should modify its contents.

We recommend periodically stopping Driverless AI and backing up the Driverless AI tmp directory to ensure that a copy of the Driverless AI state is available for instances where you may need to revert to a prior state.
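One way to take such a backup is to archive the tmp directory into a timestamped tarball. The following is a sketch under stated assumptions: the function name backup_dai_tmp is hypothetical, both paths are site-specific and must be supplied by you, and Driverless AI should already be stopped by whatever method matches your install type.

```shell
# Illustrative helper (not part of Driverless AI): archive a directory
# into a timestamped tarball and print the archive path.
# Stop Driverless AI before running it.
backup_dai_tmp() {
    src_dir="$1"     # the Driverless AI tmp directory (site-specific path)
    dest_dir="$2"    # where to write the archive; keep it outside src_dir
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="$dest_dir/dai-tmp-$stamp.tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C makes paths inside the archive relative to the parent of src_dir.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    echo "$archive"
}
```

The destination should live on a different directory tree (ideally a different disk) than the tmp directory, and it should have enough free space for a compressed copy of all experiments and datasets.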

Upgrade Strategy

When upgrading Driverless AI, note that:

- This release deprecates experiments and MLI models from Driverless versions 1.7.0 and earlier.
- We recommend following these steps before upgrading:
  - Build MLI models: Before upgrading, run MLI jobs on models that you want to continue to interpret in future Driverless AI releases. If an MLI job appears in the list of Interpreted Models in your current version, then it is retained after upgrading.
  - Build MOJO pipelines: Before upgrading, build MOJO pipelines on all desired models.
  - Stop Driverless AI and make a backup (copy) of the Driverless AI tmp directory.

The upgrade process inherits the service user and group from /etc/dai/User.conf and /etc/dai/Group.conf. You do not need to manually specify the DAI_USER or DAI_GROUP environment variables during an upgrade.

Note: Driverless AI does not support data migration from a newer version to an older version. If you roll back to an older version of Driverless AI after upgrading, newer versions of the master.db file will not work with the older Driverless AI version. For this reason, we recommend saving a copy of the older ‘tmp’ directory so that you can fully restore the older Driverless AI version’s state.

Other Notes

Supported Browsers

Driverless AI is tested most extensively on Chrome and Firefox. For the best user experience, we recommend using the latest version of Chrome. You may encounter issues if you use other browsers or earlier versions of Chrome and/or Firefox.

To `sudo` or Not to `sudo`

Driverless AI RPM and DEB installs require sudo access. The TAR SH install can be done without sudo access.

Some of the installation steps in this document show sudo prepended to commands. Note that using sudo may not always be required.

Note about Docker Configuration (`ulimit`)

When running Driverless AI with Docker, it is recommended to configure ulimit options by using the --ulimit argument to docker run. The following is an example of how to configure these options:

--ulimit nproc=65535:65535 \
--ulimit nofile=4096:8192 \

Refer to https://docs.docker.com/engine/reference/commandline/run/#set-ulimits-in-container---ulimit for more information on these options.

Note about nvidia-docker 1.0

If you have nvidia-docker 1.0 installed, you need to remove it and all existing GPU containers. Refer to https://github.com/NVIDIA/nvidia-docker/blob/master/README.md for more information.

Deprecation of `nvidia-smi`

The nvidia-smi command has been deprecated by NVIDIA. Refer to https://github.com/nvidia/nvidia-docker#upgrading-with-nvidia-docker2-deprecated for more information. The installation steps have been updated for enabling persistence mode for GPUs.

New `nvidia-container-runtime-hook` Requirement for PowerPC Users

PowerPC users are now required to install the nvidia-container-runtime-hook when running in Docker. Refer to https://github.com/nvidia/nvidia-docker#rhel-docker for more information. The IBM Docker installation steps have been updated to reflect this information.

Note About CUDA Versions

Your host environment must have CUDA 10.0 or later with NVIDIA drivers >= 440.82 installed (GPU only). Driverless AI ships with its own CUDA libraries, but the driver must exist in the host environment. Go to https://www.nvidia.com/Download/index.aspx to get the latest NVIDIA Tesla V/P/K series driver.
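The driver requirement above is a version comparison, which can be sketched in shell as follows. The function name version_ge is hypothetical, the sketch relies on the -V (version sort) option of GNU sort, and it assumes you have already obtained the installed driver version string from your host.

```shell
# Illustrative helper (not part of Driverless AI): succeed when version $1
# is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}
```

For instance, if the host reports driver version 450.51.06, then version_ge 450.51.06 440.82 succeeds, whereas version_ge 418.40.04 440.82 fails and indicates that the driver needs to be updated.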

Note About Authentication

The default authentication setting in Driverless AI is “unvalidated.” In this case, Driverless AI accepts any login and password combination. It does not validate whether the password is correct for the specified login ID, and it connects to the system as the user specified in the login ID. This is true for all instances, including Cloud, Docker, and native instances.

We recommend that you configure authentication. Driverless AI provides a number of authentication options, including LDAP, PAM, Local, and None. Refer to Authentication Methods for information on how to enable a different authentication method.

Note: Driverless AI is also integrated with IBM Spectrum Conductor and supports authentication from Conductor. Contact sales@h2o.ai for more information about using IBM Spectrum Conductor authentication.

Note About Shared File Systems

If your environment uses a shared file system, then you must set the following configuration option:

datatable_strategy='write'

This option can be specified in the config.toml file (for native installs) or as an environment variable (for Docker image installs).

This configuration is required because, in some cases, Driverless AI can fail to read files on a shared file system during an experiment. The write option allows Driverless AI to properly read and write data from shared file systems to disk.

Note About the Master Database File

The master.db file maps users to their Driverless AI artifacts in the DAI tmp directory. If you are running two versions of Driverless AI, keep in mind that newer versions of the master.db file will not work with older versions of Driverless AI.