This guide explains how to set up your machine to run the OpenCL™ version of TensorFlow™ using ComputeCpp, a SYCL™ implementation. It describes how to build and run TensorFlow 1.9 on any device supporting SPIR or SPIR-V.
These instructions were tested on Ubuntu 16.04 with an AMD R9 Nano Fury GPU. For Arm platforms, please read our other guide; for other platforms, please adapt the instructions below.
Environment Setup
These instructions relate to the following versions:
- TensorFlow : ee7c7ba
- ComputeCpp : 1.0.0
- CPU : 64-bit CPU
- GPU : AMD R9 Nano Fury
Notes
- For older or newer versions of TensorFlow, please contact Codeplay for build documentation
- OpenCL devices other than those listed above may work, but Codeplay does not support them at this time
Pre-Requisites
Ubuntu 16.04.1 (more recent versions of Ubuntu, or other Linux distributions, may work but are not supported)
Driver amdgpu-pro version 17.50-511655
Note: this version is required as more recent versions have not been verified to work, and may break OpenCL support for features such as SPIR
wget --referer http://support.amd.com/ https://www2.ati.com/drivers/linux/ubuntu/amdgpu-pro-17.50-511655.tar.xz
tar xf amdgpu-pro-17.50-511655.tar.xz
./amdgpu-pro-17.50-511655/amdgpu-pro-install --opencl=legacy --headless
These options install the driver for compute only, so please look at the help if you want to use this driver for graphics too.
Verify your OpenCL installation with clinfo
sudo apt-get update
sudo apt-get install clinfo
clinfo
The output should list at least one platform and one device. The Extensions field of the device properties should include "cl_khr_spir" and/or "cl_khr_il_program".
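The extension check described above can also be scripted. The following is a minimal sketch rather than part of the tested instructions: the function name check_il_extensions is hypothetical, and it only searches the text it receives on standard input for the two extension names mentioned above.

```shell
# Hypothetical helper: report which intermediate-language extensions
# appear in clinfo-style text read from standard input.
# Returns 0 if at least one of the two extensions is present, 1 otherwise.
check_il_extensions() {
    local text ext status=1
    text="$(cat)"
    for ext in cl_khr_spir cl_khr_il_program; do
        # -w matches whole words only, so cl_khr_spir does not match
        # longer names that merely start with the same characters.
        if printf '%s\n' "$text" | grep -qw -- "$ext"; then
            echo "found: $ext"
            status=0
        fi
    done
    return "$status"
}

# Only query the real driver when clinfo is actually installed.
if command -v clinfo >/dev/null 2>&1; then
    clinfo | check_il_extensions || echo "no SPIR or SPIR-V extension reported"
fi
```

Because the function reads standard input, it can also be run against clinfo output saved from another machine.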
Build TensorFlow with SYCL
Install dependency packages:
sudo apt-get update
sudo apt-get install -y git gcc build-essential libpython-all-dev opencl-headers openjdk-8-jdk python python-dev python-pip zlib1g-dev
sudo pip install numpy==1.14.5 wheel==0.31.1 six==1.11.0 mock==2.0.0 enum34==1.1.6 scipy==0.18.1 sklearn
Specific Python package versions are listed here for reference. Version 1.14.5 of numpy is required, as newer versions are known to break.
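Since the numpy version matters, it may be worth confirming what pip actually installed. The sketch below assumes the "name==version" line format printed by pip freeze; check_pin is a hypothetical helper name, not a pip feature.

```shell
# Hypothetical helper: check_pin NAME VERSION
# Reads "name==version" lines (the pip freeze format) from standard input
# and succeeds only if the exact pin NAME==VERSION is present.
check_pin() {
    local name="$1" version="$2"
    # -F: fixed string, -x: whole line, -i: ignore case in package names
    if grep -qixF -- "${name}==${version}"; then
        echo "${name} ${version}: ok"
    else
        echo "${name}: expected ${version}" >&2
        return 1
    fi
}

# Only run against the real environment when pip is available.
if command -v pip >/dev/null 2>&1; then
    pip freeze 2>/dev/null | check_pin numpy 1.14.5 \
        || echo "numpy is missing or not at the pinned version"
fi
```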
Install toolchains
- Register for an account on Codeplay’s developer website: https://developer.codeplay.com/computecppce/latest/download
- From that Downloads page, download the following version: Ubuntu 16.04 > 64bit > computecpp-ce-1.0.0-ubuntu.16.04-64bit.tar.gz
tar -xf ComputeCpp-CE-1.0.0-Ubuntu.16.04-64bit.tar.gz
sudo mv ComputeCpp-CE-1.0.0-Ubuntu-16.04-x86_64 /usr/local/computecpp
export COMPUTECPP_TOOLKIT_PATH=/usr/local/computecpp
export LD_LIBRARY_PATH+=:/usr/local/computecpp/lib
/usr/local/computecpp/bin/computecpp_info
The "computecpp_info" tool should now list your supported devices, similar to the output below.
********************************************************************************
ComputeCpp Info (CE 1.0.0)
********************************************************************************
Toolchain information:
GLIBCXX: 20150426
This version of libstdc++ is supported.
********************************************************************************
Device Info:
Discovered 1 devices matching:
platform :
device type :
--------------------------------------------------------------------------------
Device 0:
Device is supported : YES - Tested internally by Codeplay Software Ltd.
CL_DEVICE_NAME : Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
CL_DEVICE_VENDOR : Intel(R) Corporation
CL_DRIVER_VERSION : 1.2.0.25
CL_DEVICE_TYPE : CL_DEVICE_TYPE_CPU
********************************************************************************
Notes:
- If you see "error while loading shared libraries: libOpenCL.so" then you have not installed the OpenCL drivers needed to run ComputeCpp.
- If the device is listed as "untested" by the tool, Codeplay does not test or support that specific device, but it would still be expected to work correctly.
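If the libOpenCL.so error appears, one way to investigate is to look for the library in the dynamic linker cache. This is a sketch under the assumption that the driver registers the library with ldconfig; a library reachable only through LD_LIBRARY_PATH will not show up this way. The function name has_opencl_loader is hypothetical.

```shell
# Hypothetical helper: succeed if the text on standard input
# (for example the output of "ldconfig -p") mentions libOpenCL.so.
has_opencl_loader() {
    grep -q 'libOpenCL\.so'
}

# Only query the linker cache when ldconfig is on the PATH.
if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | has_opencl_loader; then
        echo "libOpenCL.so is in the linker cache"
    else
        echo "libOpenCL.so is not in the linker cache"
    fi
fi
```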
Install Bazel
wget https://github.com/bazelbuild/bazel/releases/download/0.16.0/bazel_0.16.0-linux-x86_64.deb
sudo apt install -y ./bazel_0.16.0-linux-x86_64.deb
bazel version
Check that the bazel version output from the above command is 0.16.0. More recent versions may work but are not supported.
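The version check can be automated in a setup script. The sketch below assumes that the output of bazel version contains a line starting with "Build label:"; check_bazel_version is a hypothetical helper name.

```shell
# Hypothetical helper: check_bazel_version EXPECTED
# Reads the output of "bazel version" from standard input and compares
# the "Build label:" value with EXPECTED.
check_bazel_version() {
    local expected="$1" actual
    actual="$(sed -n 's/^Build label: *//p' | head -n 1)"
    if [ "$actual" = "$expected" ]; then
        echo "bazel $actual: ok"
    else
        echo "bazel ${actual:-unknown}: expected $expected" >&2
        return 1
    fi
}

# Only call bazel when it is installed.
if command -v bazel >/dev/null 2>&1; then
    bazel version 2>/dev/null | check_bazel_version 0.16.0 \
        || echo "bazel is not at the supported version"
fi
```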
Build and Test TensorFlow
git clone http://github.com/codeplaysoftware/tensorflow
cd tensorflow
git checkout ee7c7ba6aea710f09ce09af45885922a55577496
export PYTHON_BIN_PATH=/usr/bin/python
export USE_DEFAULT_PYTHON_LIB_PATH=1
export TF_NEED_JEMALLOC=1
export TF_NEED_MKL=0
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_ENABLE_XLA=0
export TF_NEED_CUDA=0
export TF_NEED_VERBS=0
export TF_NEED_MPI=0
export TF_NEED_GDR=0
export TF_NEED_AWS=0
export TF_NEED_S3=0
export TF_NEED_KAFKA=0
export TF_DOWNLOAD_CLANG=0
export TF_SET_ANDROID_WORKSPACE=0
export TF_NEED_OPENCL_SYCL=1
export TF_NEED_COMPUTECPP=1
export TF_USE_DOUBLE_SYCL=1
Set this to 0 if double-precision floating-point operations are not needed, or are not supported by your device (you can verify this by checking for "cl_khr_fp64" in the Extensions field of the device properties output by the clinfo command).
export TF_USE_HALF_SYCL=0
Half-precision floating-point operations are not supported yet.
export TF_SYCL_BITCODE_TARGET=spir64
- The possible values for this option are 'spir32', 'spir64', 'spirv32', 'spirv64' or 'ptx64', depending on which intermediate language your OpenCL library supports:
- 'ptx64' is for Nvidia GPUs.
- On other platforms, check the device properties output by the clinfo command:
- In the Extensions field, if "cl_khr_spir" is present, use 'spirXX', or if "cl_khr_il_program" is present, use 'spirvXX'.
- Replace XX above with the value of the "Address bits" field. Note that issues can arise if the device’s "Address bits" value doesn’t match that of the host CPU, e.g. a 64-bit CPU and a 32-bit GPU.
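The selection rules above can be written down as a small function. This is an illustrative sketch only: suggest_bitcode_target is a hypothetical name, the two arguments are values you copy from the clinfo output yourself, and preferring SPIR when both extensions are present is an assumption that follows the order in which the rules are stated. The 'ptx64' case for Nvidia GPUs is not handled.

```shell
# Hypothetical helper: suggest_bitcode_target "EXTENSIONS" ADDRESS_BITS
# EXTENSIONS   : contents of the Extensions field reported by clinfo
# ADDRESS_BITS : value of the "Address bits" field (32 or 64)
suggest_bitcode_target() {
    local extensions="$1" bits="$2"
    if printf '%s\n' "$extensions" | grep -qw -- cl_khr_spir; then
        # Assumption: SPIR is preferred when both extensions are listed.
        echo "spir${bits}"
    elif printf '%s\n' "$extensions" | grep -qw -- cl_khr_il_program; then
        echo "spirv${bits}"
    else
        echo "no SPIR or SPIR-V extension found" >&2
        return 1
    fi
}

# Example with made-up field values:
suggest_bitcode_target "cl_khr_fp64 cl_khr_spir" 64
```

The printed value can then be assigned to TF_SYCL_BITCODE_TARGET before running ./configure.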
./configure
bazel build --verbose_failures -c opt --config=sycl --cxxopt=-no-serial-memop //tensorflow/tools/pip_package:build_pip_package
- --cxxopt=-no-serial-memop is recommended for better performance; however, not all devices support it. If all the operations using the GPU throw an exception, please remove this option.
Build and install the wheel
bazel-bin/tensorflow/tools/pip_package/build_pip_package <path/to/output/folder>
sudo pip install <path/to/output/folder>/tensorflow-1.9.0-cp27-cp27mu-linux_x86_64.whl
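The wheel file name depends on the TensorFlow version and the Python ABI tag, so it may differ from the one shown above. The following sketch locates whatever TensorFlow wheel was written to the output folder; find_wheel is a hypothetical helper name, and the folder is passed as its only argument.

```shell
# Hypothetical helper: find_wheel DIR
# Prints the path of the first tensorflow-*.whl file found in DIR.
find_wheel() {
    local dir="$1" wheel
    for wheel in "$dir"/tensorflow-*.whl; do
        # If the glob matched nothing, the pattern stays literal
        # and this existence test fails.
        if [ -e "$wheel" ]; then
            echo "$wheel"
            return 0
        fi
    done
    echo "no TensorFlow wheel found in $dir" >&2
    return 1
}

# Usage, with the same placeholder folder as above:
#   sudo pip install "$(find_wheel <path/to/output/folder>)"
```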
Running the Benchmarks
To verify the installation, you can execute some of the standard TensorFlow benchmarks. The example below shows how to run AlexNet:
git clone http://github.com/tensorflow/benchmarks
cd benchmarks
git checkout ae48d7118ac17e5069820bf3105a766317d72f3a
More recent versions should work but require some changes to the options to enable SYCL.
python scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_batches=10 --local_parameter_device=sycl --device=sycl --batch_size=1 --forward_only=true --model=alexnet --data_format=NHWC
You may see warnings about deprecated functions, but they can be safely ignored.
Run the Tests
Running the tests is a good way to check which operations are currently supported with a particular device. You can do so with the command:
bazel test --test_lang_filters=cc,py --test_timeout 1500 --verbose_failures -c opt --config=sycl --cxxopt=-no-serial-memop -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/contrib/distributions/... -//tensorflow/contrib/lite/... -//tensorflow/contrib/session_bundle/... -//tensorflow/contrib/slim/... -//tensorflow/contrib/verbs/... -//tensorflow/core/distributed_runtime/... -//tensorflow/core/kernels/hexagon/... -//tensorflow/go/... -//tensorflow/java/... -//tensorflow/python/debug/... -//tensorflow/stream_executor/...
Make sure to use the same compiler options as during the build step, in particular regarding --cxxopt=-no-serial-memop.
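With this many targets, a short summary of the results can make it easier to see which tests did not pass on your device. The sketch below assumes the bazel result lines have the form "//target  PASSED in 1.2s", "//target  FAILED in 1.2s" or "//target  TIMEOUT in 1.2s", and that the bazel output was saved to a log file; summarize_tests and the file name test.log are hypothetical.

```shell
# Hypothetical helper: read a saved bazel test log from standard input,
# list the targets that failed or timed out, and print overall counts.
summarize_tests() {
    awk '
        $1 ~ /^\/\// && / PASSED /           { passed++ }
        $1 ~ /^\/\// && / (FAILED|TIMEOUT) / { print "not passing: " $1; failed++ }
        END { printf "passed=%d failed=%d\n", passed, failed }
    '
}

# Usage: append "2>&1 | tee test.log" to the bazel test command above, then:
#   summarize_tests < test.log
```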