Local Voice Assistant Step 3: A Detour into TensorFlow

May 14, 2025

To build our local voice satellite on a Debian system rather than using the ATOM Echo device we need something that can handle the wake word component; the piece that means we only send audio to the Home Assistant server for processing by whisper.cpp when we’ve detected someone is trying to talk to us.

openWakeWord seems to be one of the better ways to do this, and is well supported. However, it relies on TensorFlow Lite (now LiteRT), which is a complicated mess of machine learning code. tflite-runtime is available from PyPI, but that’s prebuilt and we’re trying to avoid that.
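To make the role of the wake word stage concrete, here is a minimal Python sketch of the gating idea. It is not openWakeWord's API: `score_chunk` is a hypothetical callable standing in for whatever model produces a per-chunk probability, and the cutoff and window size are illustrative values only.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


def gate_on_wake_word(
    chunks: Iterable[bytes],
    score_chunk: Callable[[bytes], float],  # hypothetical: P(wake word) for one chunk
    cutoff: float = 0.5,                    # illustrative threshold
    window: int = 5,                        # illustrative smoothing window
) -> Iterator[bytes]:
    """Yield audio chunks only after the wake word has been detected.

    Detection here means the mean score over the last `window` chunks
    exceeds `cutoff`. Everything up to and including the chunk that
    triggers detection is dropped locally and never leaves the machine.
    """
    recent = deque(maxlen=window)
    triggered = False
    for chunk in chunks:
        if triggered:
            # After detection, forward audio for speech-to-text.
            yield chunk
            continue
        recent.append(score_chunk(chunk))
        if len(recent) == window and sum(recent) / window > cutoff:
            triggered = True
```

A real satellite would also need to stop forwarding once the speaker finishes; that is left out to keep the sketch short.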

Building TensorFlow looks quite complicated on first impressions (Bazel is an immediate warning), but it turns out to be incredibly simple to build your own .deb:

$ wget -O tensorflow-v2.15.1.tar.gz https://github.com/tensorflow/tensorflow/archive/refs/tags/v2.15.1.tar.gz
…
$ tar -axf tensorflow-v2.15.1.tar.gz
$ cd tensorflow-2.15.1/
$ BUILD_NUM_JOBS=$(nproc) BUILD_DEB=y tensorflow/lite/tools/pip_package/build_pip_package_with_cmake.sh
…
$ find . -name '*.deb'
./tensorflow/lite/tools/pip_package/gen/tflite_pip/python3-tflite-runtime-dbgsym_2.15.1-1_amd64.deb
./tensorflow/lite/tools/pip_package/gen/tflite_pip/python3-tflite-runtime_2.15.1-1_amd64.deb

This is hiding an awful lot of complexity, however. In particular the number of 3rd party projects that are being downloaded in the background (and compiled, to be fair, rather than using binary artefacts).

We can build the main C++ wrapper .so directly with cmake, allowing us to investigate a bit more:

$ mkdir tf-build
$ cd tf-build/
$ cmake \
    -DCMAKE_C_FLAGS="-I/usr/include/python3.11" \
    -DCMAKE_CXX_FLAGS="-I/usr/include/python3.11" \
    ../tensorflow-2.15.1/tensorflow/lite/
$ cmake --build . -t _pywrap_tensorflow_interpreter_wrapper
…
[100%] Built target _pywrap_tensorflow_interpreter_wrapper
$ ldd _pywrap_tensorflow_interpreter_wrapper.so
    linux-vdso.so.1 (0x00007ffec9588000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f22d00d0000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f22cf600000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f22d00b0000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f22cf81f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f22d01d1000)

Looking at the output we can see that pthreadpool, FXdiv, FP16 + PSimd are all downloaded, and seem to have ways to point to a local copy. That seems positive.

However, there are even more hidden dependencies, which we can see if we look in the _deps/ subdirectory of the build tree. These don’t appear to be as easy to override, and not all of them have packages already in Debian.
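A quick way to enumerate what was fetched is to list that directory. The sketch below assumes CMake's FetchContent default layout, where each dependency's sources land in a `<name>-src` directory under `_deps/`. The `tf-build` path matches the build directory used above; the function name is made up for illustration.

```python
from pathlib import Path


def fetched_dependencies(build_dir: str) -> list[str]:
    """Return the names of dependencies fetched into <build_dir>/_deps.

    Assumes FetchContent's default naming, where the sources for a
    dependency live in a directory called <name>-src.
    """
    deps_dir = Path(build_dir) / "_deps"
    if not deps_dir.is_dir():
        return []
    return sorted(
        entry.name.removesuffix("-src")
        for entry in deps_dir.iterdir()
        if entry.is_dir() and entry.name.endswith("-src")
    )


if __name__ == "__main__":
    for name in fetched_dependencies("tf-build"):
        print(name)
```

Each name can then be checked against the Debian archive by hand to see which ones already have packages.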

First, the ones that seem to be available: abseil-cpp, cpuinfo, eigen, farmhash, flatbuffers, gemmlowp, ruy + xnnpack

(lots of credit to the Debian Deep Learning Team for these, and in particular Mo Zhou)

Dependencies I couldn’t see existing packages for are: OouraFFT, ml_dtypes & neon2sse.

At this point I just used the package I built with the initial steps above. I live in hope someone will eventually package this properly for Debian, or that I’ll find the time to try and help out, but that’s not going to be today.

I wish upstream developers made it easier to use system copies of their library dependencies. I wish library developers made it easier to build and install system copies of their work. pkgconf is not new tech these days (pkg-config appears to date back to 2000), and has decent support in CMake. I get that there can be issues with incompatibilities even in minor releases, or awkwardness in doing builds of multiple connected projects, but at least give me the option to do so.

Local Voice Assistant Step 2: Speech to Text and back

May 1, 2025

Having set up an ATOM Echo Voice Satellite and hooked it up to Home Assistant we now need to actually do something with the captured audio. Home Assistant largely deals with voice assistants using the Wyoming Protocol, which describes itself as essentially JSONL + PCM audio. It works nicely in terms of meaning everything can exist as separate modules that then just communicate over network sockets, and there are a whole bunch of Python implementations of the pieces necessary.
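To give a feel for what "JSONL + PCM" means on the wire, here is a rough Python sketch of the framing using only the standard library. It follows my reading of the Wyoming README: a one-line JSON header, optionally followed by extra JSON data and a binary payload whose lengths are given in the header. The field names (`type`, `data`, `data_length`, `payload_length`) and the `audio-chunk` event are taken from that reading and should be treated as assumptions. The function names are illustrative and this is not the python3-wyoming API.

```python
import io
import json
from typing import BinaryIO, Optional


def write_event(stream: BinaryIO, event_type: str,
                data: Optional[dict] = None, payload: bytes = b"") -> None:
    """Write one event: a JSON header line, then the raw payload (if any)."""
    header = {"type": event_type, "data": data or {}}
    if payload:
        header["payload_length"] = len(payload)
    stream.write(json.dumps(header).encode("utf-8") + b"\n")
    stream.write(payload)


def read_event(stream: BinaryIO) -> tuple[str, dict, bytes]:
    """Read one event and return (type, data, payload)."""
    line = stream.readline()
    if not line:
        raise EOFError("stream closed before a header line arrived")
    header = json.loads(line)
    data = header.get("data") or {}
    data_length = header.get("data_length") or 0
    if data_length:
        # Optional extra JSON data, merged on top of the header's data.
        data.update(json.loads(stream.read(data_length)))
    payload = stream.read(header.get("payload_length") or 0)
    return header["type"], data, payload


if __name__ == "__main__":
    buf = io.BytesIO()
    # Four samples of 16-bit mono silence as a stand-in for microphone audio.
    write_event(buf, "audio-chunk",
                {"rate": 16000, "width": 2, "channels": 1}, b"\x00\x00" * 4)
    buf.seek(0)
    print(read_event(buf))
```

The same two functions would work over a socket wrapped with `makefile("rwb")`, which is what makes it easy to run each piece as its own network service.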

The first bit I looked at was speech to text; how do I get what I say to the voice satellite into something that Home Assistant can try and parse? There is a nice self contained speech recognition tool called whisper.cpp, which is a low dependency implementation of inference using OpenAI’s Whisper model. This is wrapped up for Wyoming as part of wyoming-whisper-cpp. Here we get into something that unfortunately seems common in this space; the repo contains a forked copy of whisper.cpp with enough differences that I couldn’t trivially make it work with regular whisper.cpp. That means missing out on new development, and potential improvements (the fork appears to be at v1.5.4, upstream is up to v1.7.5 at the time of writing). However it was possible to get up and running easily enough.

[I note there is a Wyoming Whisper API client that can use the whisper.cpp server, and that might be a cleaner way to go in the future, especially if whisper.cpp ends up in Debian.]

I stated previously I wanted all of this to be as clean an install on Debian stable as possible. Given most of this isn’t packaged, that’s meant I’ve packaged things up as I go. I’m not at the stage anything is suitable for upload to Debian proper, but equally I’ve tried to make them a reasonable starting point. No pre-built binaries available, just Salsa git repos. https://salsa.debian.org/noodles/wyoming-whisper-cpp in this case. You need python3-wyoming from trixie if you’re building for bookworm, but it doesn’t need rebuilt.

You need a Whisper model that’s been converted to ggml format; they can be found on Hugging Face. I’ve ended up using the base.en model. In random testing I found small.en gave more accurate results but took a little longer; it doesn’t seem to make much of a difference for voice control rather than plain transcribing.

[One of the open questions about uploading this to Debian is around the use of a prebuilt AI model. I don’t know what the right answer is here, and whether the voice infrastructure could ever be part of Debian proper, but the current discussion on the interpretation of the DFSG on AI models is very relevant.]

I run this in the same container as my Home Assistant install, using a systemd unit file dropped in /etc/systemd/system/wyoming-whisper-cpp.service:

[Unit]
Description=Wyoming whisper.cpp server
After=network.target

[Service]
Type=simple
DynamicUser=yes
ExecStart=wyoming-whisper-cpp --uri tcp://localhost:10030 --model base.en

MemoryDenyWriteExecute=false
ProtectControlGroups=true
PrivateDevices=false
ProtectKernelTunables=true
ProtectSystem=true
RestrictRealtime=true
RestrictNamespaces=true

[Install]
WantedBy=multi-user.target

It needs the Wyoming Protocol integration enabled in Home Assistant; you can “Add Entry” and enter localhost + 10030 for host + port and it’ll get added. Then in the Voice Assistant configuration there’ll be a whisper.cpp option available.

Text to speech turns out to be weirdly harder. The right answer is something like Wyoming Piper, but that turns out to be hard on bookworm. I’ll come back to that in a future post. For now I took the easy option and used the built in “Google Translate” option in Home Assistant. That needed an extra stanza in configuration.yaml that wasn’t entirely obvious:

media_source:

With this, and the ATOM voice satellite, I could now do basic voice control of my Home Assistant setup, with everything except the text-to-speech piece happening locally! Things such as “Hey Jarvis, turn on the study light” work out of the box. I haven’t yet got into defining my own phrases, partly because I know some of the things I want (“What time is it?”) are already added in later Home Assistant versions than the one I’m running.

Overall I found this initially complicated to set up given my self-imposed constraints about actually understanding the building blocks and compiling them myself, but I’ve been pretty impressed with the work that’s gone into it all. Next step, running a voice satellite on a Debian box.

Local Voice Assistant Step 1: An ATOM Echo voice satellite

Apr 24, 2025

Back when I set up my home automation I ended up with one piece that used an external service: Amazon Alexa. I’d rather not have done this, but voice control is extremely convenient, both for us, and guests. Since then Home Assistant has done a lot of work in developing the capability of a local voice assistant - 2023 was their Year of Voice. I’ve had brief looks at this in the past, but never quite had the time to dig into setting it up, and was put off by the fact a lot of the setup instructions were just “Download our prebuilt components”. While I admire the efforts to get Home Assistant fully packaged for Debian I accept that’s a tricky proposition, and settle for running it in a venv on a Debian stable container. Voice requires a lot more binary components, and I want to have “voice satellites” in more than one location, so I set about trying to understand a bit better what I was deploying, and actually building the binary bits myself.

This is the start of a write-up of that. I’ll break it into a bunch of posts, trying to cover one bit in each, because otherwise this will get massive. Let’s start with some requirements:

All local processing; no call-outs to external services
Ability to have multiple voice satellites in the house
A desire to do wake word detection on the satellites, to avoid lots of network audio traffic all the time
As clean an install on a Debian stable based system as possible
Binaries built locally
No need for a GPU

My house server is an AMD Ryzen 7 5700G, so my expectation was that I’d have enough local processing power to be able to do this. That turned out to be a valid assumption - speech to text really has come a long way in recent years. I’m still running Home Assistant 2024.3.3 - the last one that supports (but complains about) Python 3.11. Trixie has started the freeze process, so once it releases I’ll look at updating the HA install. For now what I have has turned out to be Good Enough, but I know there have been improvements upstream I’m missing.

Finally, before I get into the details, I should point out that if you just want to get started with a voice assistant on Home Assistant and don’t care about what’s under the hood, there are a bunch of more user friendly details on Home Assistant’s site itself, and they have pre-built images you can just deploy.

My first step was sorting out a “voice satellite”. This is the device that actually has a microphone and speaker and communicates with the main Home Assistant setup. I’d seen the post about a $13 voice assistant, and as a result had an ATOM Echo sitting on my desk I hadn’t got around to setting up.

Here, we ignore a bit about delving into exactly what’s going on under the hood, even if we’re compiling locally. This is a constrained embedded device and while I’m familiar with the ESP32 IDF build system I just accepted that using ESPHome and letting it do its thing was the quickest way to get up and running. It is possible to do this all via the web with a pre-built image, but I wanted to change the wake word to “Hey Jarvis” rather than the default “Okay Nabu”, and that was a good reason to bother doing a local build. We’ll get into actually building a voice satellite on Debian in later posts.

I started with the default upstream assistant config and tweaked it a little for my setup:

diff of my configuration tweaks

$ diff -u m5stack-atom-echo.yaml assistant.yaml
--- m5stack-atom-echo.yaml    2025-04-18 13:41:21.812766112 +0100
+++ assistant.yaml  2025-01-20 17:33:24.918585244 +0000
@@ -1,7 +1,7 @@
 substitutions:
-  name: m5stack-atom-echo
+  name: study-atom-echo
   friendly_name: M5Stack Atom Echo
-  micro_wake_word_model: okay_nabu  # alexa, hey_jarvis, hey_mycroft are also supported
+  micro_wake_word_model: hey_jarvis  # alexa, hey_jarvis, hey_mycroft are also supported
 
 esphome:
   name: ${name}
@@ -16,15 +16,26 @@
     version: 4.4.8
     platform_version: 5.4.0
 
+# Enable logging
 logger:
+
+# Enable Home Assistant API
 api:
+  encryption:
+    key: "TGlrZVRoaXNJc1JlYWxseUl0Rm9vbGlzaFBlb3BsZSE="
 
 ota:
   - platform: esphome
-    id: ota_esphome
+    password: "itsnotarealthing"
 
 wifi:
+  ssid: "My Wifi Goes Here"
+  password: "AndThePasswordGoesHere"
+
+  # Enable fallback hotspot (captive portal) in case wifi connection fails
   ap:
+    ssid: "Study-Atom-Echo Fallback Hotspot"
+    password: "ThisIsRandom"
 
 captive_portal:

(I note that the current upstream config has moved on a bit since I first did this, but I double checked the above instructions still work at the time of writing. I end up pinning ESPHome to the right version below due to that.)

It turns out to be fairly easy to set up ESPHome in a venv and get it to build + flash the image for you:

Instructions for building + flashing ESPHome to ATOM Echo

noodles@sevai:~$ python3 -m venv esphome-atom-echo
noodles@sevai:~$ . esphome-atom-echo/bin/activate
(esphome-atom-echo) noodles@sevai:~$ cd esphome-atom-echo/
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ pip install esphome==2024.12.4
Collecting esphome==2024.12.4
  Using cached esphome-2024.12.4-py3-none-any.whl (4.1 MB)
…
Successfully installed FontTools-4.57.0 PyYAML-6.0.2 appdirs-1.4.4 attrs-25.3.0 bottle-0.13.2 defcon-0.12.1 esphome-2024.12.4 esphome-dashboard-20241217.1 freetype-py-2.5.1 fs-2.4.16 gflanguages-0.7.3 glyphsLib-6.10.1 glyphsets-1.0.0 openstep-plist-0.5.0 pillow-10.4.0 platformio-6.1.16 protobuf-3.20.3 puremagic-1.27 ufoLib2-0.17.1 unicodedata2-16.0.0
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ esphome compile assistant.yaml 
INFO ESPHome 2024.12.4
INFO Reading configuration assistant.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Updating https://github.com/jesserockz/esphome-components.git@None
…
Linking .pioenvs/study-atom-echo/firmware.elf
/home/noodles/.platformio/packages/toolchain-xtensa-esp32@8.4.0+2021r2-patch5/bin/../lib/gcc/xtensa-esp32-elf/8.4.0/../../../../xtensa-esp32-elf/bin/ld: missing --end-group; added as last command line option
RAM:   [=         ]  10.6% (used 34632 bytes from 327680 bytes)
Flash: [========  ]  79.8% (used 1463813 bytes from 1835008 bytes)
Building .pioenvs/study-atom-echo/firmware.bin
Creating esp32 image...
Successfully created esp32 image.
esp32_create_combined_bin([".pioenvs/study-atom-echo/firmware.bin"], [".pioenvs/study-atom-echo/firmware.elf"])
Wrote 0x176fb0 bytes to file /home/noodles/esphome-atom-echo/.esphome/build/study-atom-echo/.pioenvs/study-atom-echo/firmware.factory.bin, ready to flash to offset 0x0
esp32_copy_ota_bin([".pioenvs/study-atom-echo/firmware.bin"], [".pioenvs/study-atom-echo/firmware.elf"])
==================================================================================== [SUCCESS] Took 130.57 seconds ====================================================================================
INFO Successfully compiled program.
(esphome-atom-echo) noodles@sevai:~/esphome-atom-echo$ esphome upload --device /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0 assistant.yaml 
INFO ESPHome 2024.12.4
INFO Reading configuration assistant.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Updating https://github.com/jesserockz/esphome-components.git@None
…
INFO Upload with baud rate 460800 failed. Trying again with baud rate 115200.
esptool.py v4.7.0
Serial port /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0
Connecting....
Chip is ESP32-PICO-D4 (revision v1.1)
Features: WiFi, BT, Dual Core, 240MHz, Embedded Flash, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: 64:b7:08:8a:1b:c0
Uploading stub...
Running stub...
Stub running...
Configuring flash size...
Auto-detected Flash size: 4MB
Flash will be erased from 0x00010000 to 0x00176fff...
Flash will be erased from 0x00001000 to 0x00007fff...
Flash will be erased from 0x00008000 to 0x00008fff...
Flash will be erased from 0x00009000 to 0x0000afff...
Compressed 1470384 bytes to 914252...
Wrote 1470384 bytes (914252 compressed) at 0x00010000 in 82.0 seconds (effective 143.5 kbit/s)...
Hash of data verified.
Compressed 25632 bytes to 16088...
Wrote 25632 bytes (16088 compressed) at 0x00001000 in 1.8 seconds (effective 113.1 kbit/s)...
Hash of data verified.
Compressed 3072 bytes to 134...
Wrote 3072 bytes (134 compressed) at 0x00008000 in 0.1 seconds (effective 383.7 kbit/s)...
Hash of data verified.
Compressed 8192 bytes to 31...
Wrote 8192 bytes (31 compressed) at 0x00009000 in 0.1 seconds (effective 813.5 kbit/s)...
Hash of data verified.

Leaving...
Hard resetting via RTS pin...
INFO Successfully uploaded program.

And then you can watch it boot (this is mine already configured up in Home Assistant):

Watching the ATOM Echo boot

$ picocom --quiet --imap lfcrlf --baud 115200 /dev/serial/by-id/usb-Hades2001_M5stack_9552AF8367-if00-port0
I (29) boot: ESP-IDF 4.4.8 2nd stage bootloader
I (29) boot: compile time 17:31:08
I (29) boot: Multicore bootloader
I (32) boot: chip revision: v1.1
I (36) boot.esp32: SPI Speed      : 40MHz
I (40) boot.esp32: SPI Mode       : DIO
I (45) boot.esp32: SPI Flash Size : 4MB
I (49) boot: Enabling RNG early entropy source...
I (55) boot: Partition Table:
I (58) boot: ## Label            Usage          Type ST Offset   Length
I (66) boot:  0 otadata          OTA data         01 00 00009000 00002000
I (73) boot:  1 phy_init         RF data          01 01 0000b000 00001000
I (81) boot:  2 app0             OTA app          00 10 00010000 001c0000
I (88) boot:  3 app1             OTA app          00 11 001d0000 001c0000
I (96) boot:  4 nvs              WiFi data        01 02 00390000 0006d000
I (103) boot: End of partition table
I (107) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=58974h (362868) map
I (247) esp_image: segment 1: paddr=0006899c vaddr=3ffb0000 size=03400h ( 13312) load
I (253) esp_image: segment 2: paddr=0006bda4 vaddr=40080000 size=04274h ( 17012) load
I (260) esp_image: segment 3: paddr=00070020 vaddr=400d0020 size=f5cb8h (1006776) map
I (626) esp_image: segment 4: paddr=00165ce0 vaddr=40084274 size=112ach ( 70316) load
I (665) boot: Loaded app from partition at offset 0x10000
I (665) boot: Disabling RNG early entropy source...
I (677) cpu_start: Multicore app
I (677) cpu_start: Pro cpu up.
I (677) cpu_start: Starting app cpu, entry point is 0x400825c8
I (0) cpu_start: App cpu up.
I (695) cpu_start: Pro cpu start user code
I (695) cpu_start: cpu freq: 160000000
I (695) cpu_start: Application information:
I (700) cpu_start: Project name:     study-atom-echo
I (705) cpu_start: App version:      2024.12.4
I (710) cpu_start: Compile time:     Apr 18 2025 17:29:39
I (716) cpu_start: ELF file SHA256:  1db4989a56c6c930...
I (722) cpu_start: ESP-IDF:          4.4.8
I (727) cpu_start: Min chip rev:     v0.0
I (732) cpu_start: Max chip rev:     v3.99 
I (737) cpu_start: Chip rev:         v1.1
I (742) heap_init: Initializing. RAM available for dynamic allocation:
I (749) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (755) heap_init: At 3FFB8748 len 000278B8 (158 KiB): DRAM
I (761) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (767) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (774) heap_init: At 40095520 len 0000AAE0 (42 KiB): IRAM
I (781) spi_flash: detected chip: gd
I (784) spi_flash: flash io: dio
I (790) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
[I][logger:171]: Log initialized
[C][safe_mode:079]: There have been 0 suspected unsuccessful boot attempts
[D][esp32.preferences:114]: Saving 1 preferences to flash...
[D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[I][app:029]: Running through setup()...
[C][esp32_rmt_led_strip:021]: Setting up ESP32 LED Strip...
[D][template.select:014]: Setting up Template Select
[D][template.select:023]: State from initial (could not load stored index): On device
[D][select:015]: 'Wake word engine location': Sending state On device (index 1)
[D][esp-idf:000]: I (100) gpio: GPIO[39]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 

[D][binary_sensor:034]: 'Button': Sending initial state OFF
[C][light:021]: Setting up light 'M5Stack Atom Echo 8a1bc0'...
[D][light:036]: 'M5Stack Atom Echo 8a1bc0' Setting:
[D][light:041]:   Color mode: RGB
[D][template.switch:046]:   Restored state ON
[D][switch:012]: 'Use listen light' Turning ON.
[D][switch:055]: 'Use listen light': Sending state ON
[D][light:036]: 'M5Stack Atom Echo 8a1bc0' Setting:
[D][light:047]:   State: ON
[D][light:051]:   Brightness: 60%
[D][light:059]:   Red: 100%, Green: 89%, Blue: 71%
[D][template.switch:046]:   Restored state OFF
[D][switch:016]: 'timer_ringing' Turning OFF.
[D][switch:055]: 'timer_ringing': Sending state OFF
[C][i2s_audio:028]: Setting up I2S Audio...
[C][i2s_audio.microphone:018]: Setting up I2S Audio Microphone...
[C][i2s_audio.speaker:096]: Setting up I2S Audio Speaker...
[C][wifi:048]: Setting up WiFi...
[D][esp-idf:000]: I (206) wifi:
[D][esp-idf:000]: wifi driver task: 3ffc8544, prio:23, stack:6656, core=0
[D][esp-idf:000]: 

[D][esp-idf:000][wifi]: I (1238) system_api: Base MAC address is not set

[D][esp-idf:000][wifi]: I (1239) system_api: read default base MAC address from EFUSE

[D][esp-idf:000][wifi]: I (1274) wifi:
[D][esp-idf:000][wifi]: wifi firmware version: ff661c3
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1274) wifi:
[D][esp-idf:000][wifi]: wifi certification version: v7.0
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1286) wifi:
[D][esp-idf:000][wifi]: config NVS flash: enabled
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1297) wifi:
[D][esp-idf:000][wifi]: config nano formating: disabled
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1317) wifi:
[D][esp-idf:000][wifi]: Init data frame dynamic rx buffer num: 32
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1338) wifi:
[D][esp-idf:000][wifi]: Init static rx mgmt buffer num: 5
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1348) wifi:
[D][esp-idf:000][wifi]: Init management short buffer num: 32
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1368) wifi:
[D][esp-idf:000][wifi]: Init dynamic tx buffer num: 32
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1389) wifi:
[D][esp-idf:000][wifi]: Init static rx buffer size: 1600
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1399) wifi:
[D][esp-idf:000][wifi]: Init static rx buffer num: 10
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1419) wifi:
[D][esp-idf:000][wifi]: Init dynamic rx buffer num: 32
[D][esp-idf:000][wifi]: 

[D][esp-idf:000]: I (1441) wifi_init: rx ba win: 6

[D][esp-idf:000]: I (1441) wifi_init: tcpip mbox: 32

[D][esp-idf:000]: I (1450) wifi_init: udp mbox: 6

[D][esp-idf:000]: I (1450) wifi_init: tcp mbox: 6

[D][esp-idf:000]: I (1460) wifi_init: tcp tx win: 5760

[D][esp-idf:000]: I (1471) wifi_init: tcp rx win: 5760

[D][esp-idf:000]: I (1481) wifi_init: tcp mss: 1440

[D][esp-idf:000]: I (1481) wifi_init: WiFi IRAM OP enabled

[D][esp-idf:000]: I (1491) wifi_init: WiFi RX IRAM OP enabled

[C][wifi:061]: Starting WiFi...
[C][wifi:062]:   Local MAC: 64:B7:08:8A:1B:C0
[D][esp-idf:000][wifi]: I (1513) phy_init: phy_version 4791,2c4672b,Dec 20 2023,16:06:06

[D][esp-idf:000][wifi]: I (1599) wifi:
[D][esp-idf:000][wifi]: mode : sta (64:b7:08:8a:1b:c0)
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1600) wifi:
[D][esp-idf:000][wifi]: enable tsf
[D][esp-idf:000][wifi]: 

[D][esp-idf:000][wifi]: I (1605) wifi:
[D][esp-idf:000][wifi]: Set ps type: 1

[D][esp-idf:000][wifi]: 

[D][wifi:482]: Starting scan...
[D][esp32.preferences:114]: Saving 1 preferences to flash...
[D][esp32.preferences:143]: Saving 1 preferences to flash: 1 cached, 0 written, 0 failed
[W][micro_wake_word:151]: Wake word detection can't start as the component hasn't been setup yet
[D][esp-idf:000][wifi]: I (1646) wifi:
[D][esp-idf:000][wifi]: Set ps type: 1

[D][esp-idf:000][wifi]: 

[W][component:157]: Component wifi set Warning flag: scanning for networks
…
[I][wifi:617]: WiFi Connected!
…
[D][wifi:626]: Disabling AP...
[C][api:026]: Setting up Home Assistant API server...
[C][micro_wake_word:062]: Setting up microWakeWord...
[C][micro_wake_word:069]: Micro Wake Word initialized
[I][app:062]: setup() finished successfully!
[W][component:170]: Component wifi cleared Warning flag
[W][component:157]: Component api set Warning flag: unspecified
[I][app:100]: ESPHome version 2024.12.4 compiled on Apr 18 2025, 17:29:39
…
[C][logger:185]: Logger:
[C][logger:186]:   Level: DEBUG
[C][logger:188]:   Log Baud Rate: 115200
[C][logger:189]:   Hardware UART: UART0
[C][esp32_rmt_led_strip:187]: ESP32 RMT LED Strip:
[C][esp32_rmt_led_strip:188]:   Pin: 27
[C][esp32_rmt_led_strip:189]:   Channel: 0
[C][esp32_rmt_led_strip:214]:   RGB Order: GRB
[C][esp32_rmt_led_strip:215]:   Max refresh rate: 0
[C][esp32_rmt_led_strip:216]:   Number of LEDs: 1
[C][template.select:065]: Template Select 'Wake word engine location'
[C][template.select:066]:   Update Interval: 60.0s
[C][template.select:069]:   Optimistic: YES
[C][template.select:070]:   Initial Option: On device
[C][template.select:071]:   Restore Value: YES
[C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Button'
[C][gpio.binary_sensor:016]:   Pin: GPIO39
[C][light:092]: Light 'M5Stack Atom Echo 8a1bc0'
[C][light:094]:   Default Transition Length: 0.0s
[C][light:095]:   Gamma Correct: 2.80
[C][template.switch:068]: Template Switch 'Use listen light'
[C][template.switch:091]:   Restore Mode: restore defaults to ON
[C][template.switch:057]:   Optimistic: YES
[C][template.switch:068]: Template Switch 'timer_ringing'
[C][template.switch:091]:   Restore Mode: always OFF
[C][template.switch:057]:   Optimistic: YES
[C][factory_reset.button:011]: Factory Reset Button 'Factory reset'
[C][factory_reset.button:011]:   Icon: 'mdi:restart-alert'
[C][captive_portal:089]: Captive Portal:
[C][mdns:116]: mDNS:
[C][mdns:117]:   Hostname: study-atom-echo-8a1bc0
[C][esphome.ota:073]: Over-The-Air updates:
[C][esphome.ota:074]:   Address: study-atom-echo.local:3232
[C][esphome.ota:075]:   Version: 2
[C][esphome.ota:078]:   Password configured
[C][safe_mode:018]: Safe Mode:
[C][safe_mode:020]:   Boot considered successful after 60 seconds
[C][safe_mode:021]:   Invoke after 10 boot attempts
[C][safe_mode:023]:   Remain in safe mode for 300 seconds
[C][api:140]: API Server:
[C][api:141]:   Address: study-atom-echo.local:6053
[C][api:143]:   Using noise encryption: YES
[C][micro_wake_word:051]: microWakeWord:
[C][micro_wake_word:052]:   models:
[C][micro_wake_word:015]:     - Wake Word: Hey Jarvis
[C][micro_wake_word:016]:       Probability cutoff: 0.970
[C][micro_wake_word:017]:       Sliding window size: 5
[C][micro_wake_word:021]:     - VAD Model
[C][micro_wake_word:022]:       Probability cutoff: 0.500
[C][micro_wake_word:023]:       Sliding window size: 5

[D][api:103]: Accepted 192.168.39.6
[W][component:170]: Component api cleared Warning flag
[W][component:237]: Component api took a long time for an operation (58 ms).
[W][component:238]: Components should block for at most 30 ms.
[D][api.connection:1446]: Home Assistant 2024.3.3 (192.168.39.6): Connected successfully
[D][ring_buffer:034]: Created ring buffer with size 2048
[D][micro_wake_word:399]: Resetting buffers and probabilities
[D][micro_wake_word:195]: State changed from IDLE to START_MICROPHONE
[D][micro_wake_word:107]: Starting Microphone
[D][micro_wake_word:195]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[D][esp-idf:000]: I (11279) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4

[D][micro_wake_word:195]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD

That’s enough to get a voice satellite that can be configured up in Home Assistant; you’ll need the ESPHome Integration added, then for the noise_psk key you use the same string as I have under api/encryption/key in my diff above (obviously do your own, I used dd if=/dev/urandom bs=32 count=1 | base64 to generate mine).
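If you would rather not shell out to dd, the same kind of key can be produced from Python's standard library. This is a small sketch equivalent in spirit to the dd + base64 pipeline above (32 random bytes, base64 encoded); the function name is illustrative.

```python
import base64
import secrets


def generate_api_key() -> str:
    """Return 32 random bytes, base64 encoded (a 44 character string)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


if __name__ == "__main__":
    print(generate_api_key())
```

The output goes under api/encryption/key in the ESPHome config, and the same string is what Home Assistant asks for when adding the device.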

If you’re like me and a compulsive VLANer and firewaller even within your own network then you need to allow Home Assistant to connect on TCP port 6053 to the ATOM Echo, and also allow access to/from UDP port 6055 on the Echo (it’ll send audio from that port to Home Assistant, then receive back audio to the same port).
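When debugging firewall rules it can help to check the TCP side from the Home Assistant host. The sketch below only tests whether a TCP connection can be opened; it says nothing about the UDP audio path on port 6055, which cannot be probed this way. The function name is illustrative.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `tcp_port_open("study-atom-echo.local", 6053)` run from the Home Assistant container should return True once the firewall allows the API connection, assuming the mDNS name from the boot log resolves from there.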

At this point you can now shout “Hey Jarvis, what time is it?” at the Echo, and the white light will turn flashing blue (indicating it’s heard the wake word). Which means we’re ready to teach Home Assistant how to do something with the incoming audio.

Who pays the cost of progress in software?

Mar 24, 2025

I am told, by friends who have spent time at Google, about the reason Google Reader finally disappeared. Apparently it had become a 20% Project for those who still cared about it internally, and there was some major change happening to one of its upstream dependencies that was either going to cause a significant amount of work rearchitecting Reader to cope, or create additional ongoing maintenance burden. It was no longer viable to support it as a side project, so it had to go. This was a consequence of an internal culture at Google where service owners are able to make changes that can break downstream users, and the downstream users are the ones who have to adapt.

My experience at Meta goes the other way. If you own a service or other dependency and you want to make a change that will break things for the users, it’s on you to do the migration, or at the very least provide significant assistance to those who own the code. You don’t just get to drop your new release and expect others to clean up; doing that tends to lead to changes being reverted. The culture flows the other way; if you break it, you fix it (nothing is someone else’s problem).

There are pluses and minuses to both approaches. Users having to drive the changes to things they own stops them from blocking progress. Service/code owners having to drive the changes avoids the situation where a widely used component drops a new release that causes a lot of high priority work for folk in order to adapt.

I started thinking about this in the context of Debian a while back, and a few incidents since have resulted in my feeling that we’re closer to the Google model than the Meta model. Anyone can upload a new version of their package to unstable, and that might end up breaking all the users of it. It’s not quite as extreme as rolling out a new service, because it’s unstable that gets affected (the clue is in the name, I really wish more people would realise that), but it can still result in release critical bugs for lots of other Debian contributors.

A good example of this is toolchain changes. Major updates to GCC and friends regularly result in FTBFS issues in lots of packages. Now in this instance the maintainer is usually diligent about a heads up before the default changes, but it’s still a whole bunch of work for other maintainers to adapt (see the list of FTBFS bugs for GCC 15 for instance - these are important, but not serious yet). Worse is when a dependency changes and hasn’t managed to catch everyone who might be affected, so by the time it’s discovered it’s release critical, because at least one package no longer builds in unstable.

Commercial organisations try to avoid this with a decent CI/CD setup that either vendors all dependencies, or tracks changes to them and tries rebuilds before allowing things to land. This is one of the instances where a monorepo can really shine; if everything you need is in there, it’s easier to track the interconnections between different components. Debian doesn’t have a CI/CD system that runs for every upload, allowing us to track exact causes of regressions. Instead we have Lucas, who does a tremendous job of running archive wide rebuilds to make sure we can still build everything. Unfortunately that means I am often unfairly grumpy at him; my heart sinks when I see a bug come in with his name attached, because it often means one of my packages has a new RC bug where I’m going to have to figure out what changed elsewhere to cause it. However he’s just (very usefully) surfacing an issue someone else created, rather than actually being the cause of the problem.

I don’t know if I have a point to this post. I think it’s probably that I wish folk in Free Software would try and be mindful of the incompatible changes they might be introducing, and the toil they create for other volunteer developers, often not directly visible to the person making the change. The approach taken by the Debian toolchain maintainers strikes me as a good balance; they do a bunch of work up front to try and flag all the places that might need to make changes, far enough in advance of the breaking change actually landing. However they don’t then allow a tardy developer to block progress.

RIP: Steve Langasek

Mar 2, 2025

[I’d like to stop writing posts like this. I’ve been trying to work out what to say now for nearly 2 months (writing the mail to -private to tell the Debian project about his death is one of the hardest things I’ve had to write, and I bottled out and wrote something that was mostly just factual, because it wasn’t the place), and I’ve decided I just have to accept this won’t be the post I want it to be, but posted is better than languishing in drafts.]

Last weekend I was in Portland, for the Celebration of Life of my friend Steve, who sadly passed away at the start of the year. It wasn’t entirely unexpected, but that doesn’t make it any easier.

I’ve struggled to work out what to say about Steve. I’ve seen many touching comments from others in Debian about their work with him, but what that’s mostly brought home to me is that while I met Steve through Debian, he was first and foremost my friend rather than someone I worked with in Debian. And so everything I have to say is more about that friendship (and thus feels a bit self-centred).

My first memory of Steve is getting lost with him in Porto Alegre, Brazil, during DebConf4. We’d decided to walk to a local mall to meet up with some other folk (I can’t recall how they were getting there, but it wasn’t walking), ended up deep in conversation (ISTR it was about shared library transitions), and then it took a bit longer than we expected. I don’t know how that managed to cement a friendship (neither of us saw it as the near death experience others feared we’d had), but it did.

Unlike others I never texted Steve much; we’d occasionally chat on IRC, but nothing major. That didn’t seem to matter when we actually saw each other in person though, we just picked up like we’d seen each other the previous week. DebConf became a recurring theme of when we’d see each other. Even outside DebConf we went places together. The first time I went somewhere in the US that wasn’t the Bay Area, it was to Portland to see Steve. He, and his family, came to visit me in Belfast a couple of times, and I did a road trip from Dublin to Cork with him. He took me to a volcano.

Steve saw injustice in the world and actually tried to do something about it. I still have a copy of the US constitution sitting on my desk that he gave me. He made me want to be a better person.

The world is a worse place without him in it, and while I am better for having known him, I am sadder for the fact he’s gone.
