TensorFlow™ Cross Compilation Guide
Introduction
This guide explains how to set up your machine to run the SYCL™ version of TensorFlow™ using ComputeCpp. It describes how to cross-compile TensorFlow 1.9 and run it on any device supporting SPIR or SPIR-V. To compile natively, please read our other guide.
Configuration management
These instructions relate to the following configuration:
Host platform | Ubuntu 16.04, x86_64 architecture (more recent versions of Ubuntu, or other Linux distributions, may work but are not supported) |
Target platform | HiKey 960 development board with an Arm Mali G71 MP8 GPU, running Debian 9, aarch64 architecture; Chromebook with a PowerVR Rogue GX6250 GPU, running Ubuntu 16.04, aarch64 architecture |
TensorFlow | master |
ComputeCpp | Latest |
Python | 3.5 (more recent versions of Python may work but are not supported) |
Notes:
- For older or newer versions of TensorFlow, please contact Codeplay for build documentation.
- If you are interested in the latest features you may try our experimental branch.
- OpenCL devices other than those listed above may work, but Codeplay does not support them at this time.
- It is strongly recommended to make sure you have a working OpenCL installation before building TensorFlow, see here.
- It is strongly recommended to test your ComputeCpp installation using the Eigen tests before building TensorFlow.
Build TensorFlow with SYCL
Install dependency packages
sudo dpkg --add-architecture arm64
echo "deb [arch=arm64] http://ports.ubuntu.com/ xenial main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/arm.list
echo "deb [arch=arm64] http://ports.ubuntu.com/ xenial-updates main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/arm.list
echo "deb [arch=arm64] http://ports.ubuntu.com/ xenial-security main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/arm.list
echo "deb [arch=arm64] http://ports.ubuntu.com/ xenial-backports main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/arm.list
sudo sed -i 's#deb http://gb.archive.ubuntu.com/ubuntu#deb [arch=amd64] http://gb.archive.ubuntu.com/ubuntu#g ' /etc/apt/sources.list
sudo sed -i 's#deb http://security.ubuntu.com/ubuntu#deb [arch=amd64] http://security.ubuntu.com/ubuntu#g ' /etc/apt/sources.list
sudo apt update
sudo apt install -y git cmake libpython3-all-dev:arm64 opencl-headers openjdk-8-jdk python3 python3-pip zlib1g-dev:arm64 ocl-icd-opencl-dev:arm64
pip install -U --user numpy==1.14.5 wheel==0.31.1 six==1.11.0 mock==2.0.0 enum34==1.1.6
Specific Python package versions are listed here for reference. More recent versions of numpy are known to break some tests in this version of TensorFlow.
The rest of the guide will assume Python 3.5 is used by default. This can be done for example with:
alias python=python3.5
alias pip=pip3.5
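In bash, aliases are not expanded in non-interactive scripts by default, so a script that calls python may still pick up a different interpreter. As a minimal sketch, the hypothetical helper below (is_python35 is an illustrative name, not part of TensorFlow or ComputeCpp) checks a version string of the form printed by python --version:

```shell
# Hypothetical helper: succeeds only if the given version string is Python 3.5.x.
# Expects text in the form printed by `python --version`, e.g. "Python 3.5.2".
is_python35() {
    case "$1" in
        "Python 3.5."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (commented out so that sourcing this snippet has no side effects):
# is_python35 "$(python --version 2>&1)" || echo "python is not Python 3.5" >&2
```

Passing the version string as an argument keeps the check independent of how the interpreter was selected.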
Install toolchains
- Register for an account on Codeplay's developer website: https://developer.codeplay.com/computecppce/latest/download
- From that page, download the following version: Latest > linux-gnu > x86_64
computecpp-ce-*-x86_64-linux-gnu.tar.gz
- From the same page, download the following version: Latest > linux-gnu > aarch64
computecpp-ce-*-aarch64-linux-gnu.tar.gz
Set up an environment variable with the ComputeCpp version so that you can copy and paste the commands below.
For example:
export CCPP_VERSION=2.0.0
tar -xf computecpp-ce-${CCPP_VERSION}-x86_64-linux-gnu.tar.gz
tar -xf computecpp-ce-${CCPP_VERSION}-aarch64-linux-gnu.tar.gz
cp ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-x86_64/bin/* ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-ARM_64/bin/
wget https://releases.linaro.org/components/toolchain/binaries/6.3-2017.05/aarch64-linux-gnu/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu.tar.xz
tar -xf gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu.tar.xz
mkdir -p $HOME/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/libc/usr/include/aarch64-linux-gnu
ln -s /usr/include/aarch64-linux-gnu/python3.5m/ $HOME/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/libc/usr/include/aarch64-linux-gnu/
ln -s /usr/lib/aarch64-linux-gnu/libOpenCL.so $HOME/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/libc/usr/lib/libOpenCL.so
export COMPUTECPP_TOOLKIT_PATH=$HOME/ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-ARM_64
export TF_SYCL_CROSS_TOOLCHAIN=$HOME/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu
export TF_SYCL_CROSS_TOOLCHAIN_NAME=aarch64-linux-gnu
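Before configuring TensorFlow, it can help to confirm that the three variables exported above are set and that the two paths exist. The following is only a sketch; check_cross_env is a hypothetical helper name, and the checks are limited to what this guide states about the variables:

```shell
# Hypothetical helper: verify that the three variables exported above are set
# and that the two path variables point at existing directories.
# The body runs in a subshell so that no helper variables leak to the caller.
check_cross_env() (
    status=0
    for var in COMPUTECPP_TOOLKIT_PATH TF_SYCL_CROSS_TOOLCHAIN TF_SYCL_CROSS_TOOLCHAIN_NAME; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set" >&2
            status=1
        fi
    done
    for var in COMPUTECPP_TOOLKIT_PATH TF_SYCL_CROSS_TOOLCHAIN; do
        eval "value=\${$var:-}"
        if [ -n "$value" ] && [ ! -d "$value" ]; then
            echo "$var does not point to a directory: $value" >&2
            status=1
        fi
    done
    return "$status"
)

# Example use:
# check_cross_env || echo "fix the variables above before running ./configure" >&2
```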
Install Bazel
wget https://github.com/bazelbuild/bazel/releases/download/0.16.0/bazel_0.16.0-linux-x86_64.deb
sudo apt install -y ./bazel_0.16.0-linux-x86_64.deb
bazel version
Check that the bazel version output from the above command is 0.16.0.
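The version check can also be scripted. The sketch below assumes that the output of bazel version contains a line starting with "Build label:"; bazel_label_is is a hypothetical helper that reads that output on standard input:

```shell
# Hypothetical helper: read `bazel version` output on stdin and succeed only if
# the first "Build label:" line matches the expected release.
bazel_label_is() {
    expected="$1"
    label=$(sed -n 's/^Build label: *//p' | head -n 1)
    [ "$label" = "$expected" ]
}

# Example use:
# bazel version | bazel_label_is 0.16.0 || echo "unexpected Bazel version" >&2
```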
Configure TensorFlow
Configure TensorFlow as described in the native guide.
CC_OPT_FLAGS is a set of flags to provide when compiling with --config=opt. It will vary depending on your architecture.
The flag -march=native cannot be used when cross-compiling; you can set, for instance:
export CC_OPT_FLAGS="-march=armv8-a"
If you are unsure which flags to use, do not use --config=opt.
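Because -march=native would target the host CPU rather than the board, a small guard can catch it before the build starts. This is a sketch only; cc_opt_flags_ok is a hypothetical helper, and it checks for nothing beyond the one flag named above:

```shell
# Hypothetical helper: fail if the given flag string contains -march=native,
# which cannot be used when cross-compiling.
cc_opt_flags_ok() {
    # Pad with spaces so the flag is matched as a whole word.
    case " $1 " in
        *" -march=native "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example use:
# cc_opt_flags_ok "$CC_OPT_FLAGS" || echo "CC_OPT_FLAGS contains -march=native" >&2
```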
Build TensorFlow
./configure
bazel build --verbose_failures --jobs=6 --config=sycl --config=opt //tensorflow/tools/pip_package:build_pip_package
The .whl file produced when bundling the wheel (see "Bundle and install the wheel" below) may have a different name depending on your architecture or the Python version provided by PYTHON_BIN_PATH. Here Python 3.5 is assumed to be the default version used.
Notes:
- Make sure to re-run ./configure if you change any of the environment variables described in this guide.
- It is recommended to keep the exact same environment variables when building again with bazel to avoid re-building from scratch.
- It is recommended to provide --jobs=X to bazel with X strictly smaller than your number of threads to reduce RAM usage.
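One way to pick a value for --jobs is to take one less than the thread count. The sketch below does that; bazel_jobs is a hypothetical helper, the "one less" rule is just one choice that satisfies the recommendation, and nproc (GNU coreutils) is assumed to be available on the host:

```shell
# Hypothetical helper: print a --jobs value one below the given thread count,
# never going below 1. With no argument, the count is taken from nproc.
bazel_jobs() {
    threads="${1:-$(nproc)}"
    if [ "$threads" -gt 1 ]; then
        echo $((threads - 1))
    else
        echo 1
    fi
}

# Example use:
# bazel build --jobs="$(bazel_jobs)" --config=sycl --config=opt //tensorflow/tools/pip_package:build_pip_package
```

On a single-threaded host the helper still returns 1, since bazel needs at least one job. A lower value may be needed if RAM is tight.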
Bundle and install the wheel
Choose an existing folder to output the TensorFlow wheel, for example:
export TF_WHL_DIR=$HOME
bazel-bin/tensorflow/tools/pip_package/build_pip_package $TF_WHL_DIR
mv $TF_WHL_DIR/tensorflow-1.9.0-cp35-cp35m-linux_x86_64.whl $TF_WHL_DIR/tensorflow-1.9.0-cp35-cp35m-linux_aarch64.whl
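The mv command above hard-codes the TensorFlow and Python versions in the file name. As a hedged alternative, the sketch below renames whichever x86_64-tagged TensorFlow wheel is present; retag_wheels is a hypothetical helper and assumes the wheel name ends in -linux_x86_64.whl as in the example above:

```shell
# Hypothetical helper: rename every tensorflow-*-linux_x86_64.whl in a directory
# to *-linux_aarch64.whl, whatever TensorFlow or Python version is in the name.
retag_wheels() {
    dir="$1"
    for whl in "$dir"/tensorflow-*-linux_x86_64.whl; do
        [ -e "$whl" ] || continue   # the glob matched nothing
        mv "$whl" "${whl%-linux_x86_64.whl}-linux_aarch64.whl"
    done
}

# Example use:
# retag_wheels "$TF_WHL_DIR"
```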
Set up the development board
Copy ComputeCpp-CE-${CCPP_VERSION}-Ubuntu.16.04-ARM64.tar.gz and $TF_WHL_DIR/tensorflow-1.9.0-cp35-cp35m-linux_aarch64.whl to your device, e.g. using the scp command.
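As an illustration of the copy step, the sketch below prints the scp command instead of running it, so it can be reviewed first. print_scp_cmd is a hypothetical helper, and the user@board target is a placeholder for your own login and board address; a target ending in a colon copies into the remote home directory.

```shell
# Hypothetical helper: print (rather than run) the scp command that copies the
# given files to the home directory of TARGET, e.g. "user@board" (placeholder).
print_scp_cmd() {
    target="$1"
    shift
    printf 'scp'
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf ' %s:\n' "$target"
}

# Example use:
# print_scp_cmd user@board "ComputeCpp-CE-${CCPP_VERSION}-Ubuntu.16.04-ARM64.tar.gz" "$TF_WHL_DIR/tensorflow-1.9.0-cp35-cp35m-linux_aarch64.whl"
```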
All of the following commands should be run on the development board. Depending on how your development board's disk space has been partitioned, you may have to manage the available space carefully: the following steps require at least 1.2 GB free.
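The free-space requirement above can be checked before installing anything. The sketch below relies on the POSIX output format of df -Pk, where the fourth column of the second line is the available space in 1024-byte blocks; has_free_kb is a hypothetical helper, and treating 1.2 GB as 1.2 GiB (1258292 KiB, rounded up) is an assumption about the unit:

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# REQUIRED_KB kibibytes available, according to `df -Pk`.
has_free_kb() {
    dir="$1"
    required_kb="$2"
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$required_kb" ]
}

# Example use (1258292 KiB is 1.2 GiB rounded up, an assumed reading of 1.2 GB):
# has_free_kb "$HOME" 1258292 || echo "less than 1.2 GB free in $HOME" >&2
```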
Install an OpenCL driver
See our other guide to install an OpenCL driver supporting SPIR or SPIR-V.
Install dependency packages
apt -y install git python3 python3-pip
# The following apt packages are required to build scipy from source but can be removed later
apt -y install gcc gfortran python3-dev libopenblas-dev liblapack-dev cython
Make Python 3.5 the default, for example with:
alias python=python3.5
alias pip=pip3.5
Many tests and benchmarks require more pip packages than the minimal set of packages listed in the pre-requisites. The versions listed below are known to work with this build of TensorFlow:
pip install -U --user numpy==1.14.5 wheel==0.31.1 six==1.11.0 mock==2.0.0 enum34==1.1.6 portpicker==1.2.0
# Cython is required to build the next packages from source but can be removed later
pip install -U --user cython==0.29.1
pip install -U --user scipy==1.1.0
pip install -U --user scikit-learn==0.20.2
pip install -U --user --no-deps sklearn
Install TensorFlow
pip install --user tensorflow-1.9.0-cp35-cp35m-linux_aarch64.whl
Set up ComputeCpp
tar -xf ComputeCpp-CE-${CCPP_VERSION}-Ubuntu.16.04-ARM64.tar.gz
export LD_LIBRARY_PATH+=:$HOME/ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-ARM_64/lib
$HOME/ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-ARM_64/bin/computecpp_info
The output should show that at least one OpenCL driver has been found. That driver may not support SPIR, which is fine.
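If LD_LIBRARY_PATH is unset or empty beforehand, the += form above leaves a leading colon, that is, an empty entry, which the dynamic linker treats as the current directory. A minimal sketch of an append that avoids the empty entry (append_ld_library_path is a hypothetical helper name):

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH without creating
# an empty entry when the variable is unset or empty.
append_ld_library_path() {
    # ${VAR:+text} expands to text only when VAR is set and non-empty.
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1"
    export LD_LIBRARY_PATH
}

# Example use:
# append_ld_library_path "$HOME/ComputeCpp-CE-${CCPP_VERSION}-Ubuntu-16.04-ARM_64/lib"
```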
Run benchmarks
To verify the installation, you can execute some of the standard TensorFlow benchmarks. You can find an example in our other guide.