To build our local voice satellite on a Debian system rather than using the ATOM Echo device we need something that can handle the wake word component; the piece that means we only send audio to the Home Assistant server for processing by whisper.cpp when we’ve detected someone is trying to talk to us.

openWakeWord seems to be one of the better ways to do this, and is well supported. However. It relies on TensorFlow Lite (now LiteRT) which is a complicated mess of machine learning code. tflite-runtime is available from PyPI, but that’s prebuilt and we’re trying to avoid that.

Despite, on initial impressions, it looking quite complicated to deal with building TensorFlow - Bazel is an immediate warning - it turns out to be incredibly simple to build your own .deb:

$ wget -O tensorflow-v2.15.1.tar.gz https://github.com/tensorflow/tensorflow/archive/refs/tags/v2.15.1.tar.gz
…
$ tar -axf tensorflow-v2.15.1.tar.gz
$ cd tensorflow-2.15.1/
$ BUILD_NUM_JOBS=$(nproc) BUILD_DEB=y tensorflow/lite/tools/pip_package/build_pip_package_with_cmake.sh
…
$ find . -name *.deb
./tensorflow/lite/tools/pip_package/gen/tflite_pip/python3-tflite-runtime-dbgsym_2.15.1-1_amd64.deb
./tensorflow/lite/tools/pip_package/gen/tflite_pip/python3-tflite-runtime_2.15.1-1_amd64.deb

This is hiding an awful lot of complexity, however. In particular the number of 3rd party projects that are being downloaded in the background (and compiled, to be fair, rather than using binary artefacts).

We can build the main C++ wrapper .so directly with cmake, allowing us to investigate a bit more:

mkdir tf-build
cd tf-build/
cmake \
    -DCMAKE_C_FLAGS="-I/usr/include/python3.11" \
    -DCMAKE_CXX_FLAGS="-I/usr/include/python3.11" \
    ../tensorflow-2.15.1/tensorflow/lite/
cmake --build . -t _pywrap_tensorflow_interpreter_wrapper
…
[100%] Built target _pywrap_tensorflow_interpreter_wrapper
$ ldd _pywrap_tensorflow_interpreter_wrapper.so
    linux-vdso.so.1 (0x00007ffec9588000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f22d00d0000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f22cf600000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f22d00b0000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f22cf81f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f22d01d1000)

Looking at the output we can see that pthreadpool, FXdiv, FP16 + PSimd are all downloaded, and seem to have ways to point to a local copy. That seems positive.

However, there are even more hidden dependencies, which we can see if we look in the _deps/ subdirectory of the build tree. These don’t appear to be as easy to override, and not all of them have packages already in Debian.

First, the ones that seem to be available: abseil-cpp, cpuinfo, eigen, farmhash, flatbuffers, gemmlowp, ruy + xnnpack

(lots of credit to the Debian Deep Learning Team for these, and in particular Mo Zhou)

Dependencies I couldn’t see existing packages for are: OouraFFT, ml_dtypes & neon2sse.

At this point I just used the package I built with the initial steps above. I live in hope someone will eventually package this properly for Debian, or that I’ll find the time to try and help out, but that’s not going to be today.

I wish upstream developers made it easier to use system copies of their library dependencies. I wish library developers made it easier to build and install system copies of their work. pkgconf is not new tech these days (pkg-config appears to date back to 2000), and has decent support in CMake. I get that there can be issues with incompatibilities even in minor releases, or awkwardness in doing builds of multiple connected projects, but at least give me the option to do so.