Thoughts on the acquisition of GitHub by Microsoft

Back at the start of 2010, I attended linux.conf.au in Wellington. One of the events there was sponsored by GitHub, who bought me beer in a fine Wellington bar (one very proud of having an almost complete collection of BrewDog beers, including some Tactical Nuclear Penguin). I proceeded to tell them that I really didn’t understand their business model and that one of the great things about git was the very fact it was decentralised and we didn’t need to host things in one place any more. I don’t think they were offended, and the announcement that Microsoft are acquiring GitHub for $7.5 billion proves they had a much better idea about this stuff than me.

The acquisition announcement seems to have caused an exodus. GitLab reported over 13,000 projects being migrated in a single hour. IRC and Twitter were full of people throwing up their hands and saying it was terrible. Why is this? The fear factor seemed to come from who was doing the acquiring: Microsoft. The big, bad “Linux is a cancer” folk. I saw a similar, though more muted, reaction when LinkedIn were acquired.

This extremely negative reaction to Microsoft seems bizarre to me these days. I’m well aware of their past, and their anti-competitive practices (dating back to MS-DOS vs DR-DOS). I’ve no doubt their current embrace of Free Software is ultimately driven by business decisions rather than a sudden fit of altruism. But I do think their current behaviour is something we could never have foreseen 15+ years ago. Did you ever think Microsoft would be a contributor to the Linux kernel? Is it fair to maintain such animosity? Not for me to say, I guess, but I think that some of it is that both GitHub and LinkedIn were services that people were already uneasy about using, and the acquisition was the straw that broke the camel’s back.

What are the issues with GitHub? I previously wrote about the GitHub TOS changes, stating I didn’t think they were something to fear, but that the centralised nature of the service was potentially something to be wary of. joeyh talked about this as long ago as 2011, discussing the aspects of the service other than the source code hosting that were only API accessible, or in some other way more restricted than being a git clone away. It’s fair criticism; the extra features offered by GitHub are very much tied to their service. And yet I don’t recall the same complaints about SourceForge, long the home of choice for Free Software projects. Its problems seemed to be more around a dated interface, being slow to enable distributed VCSes and the addition of advertising. People left because there were much better options, not because of ideological differences.

Let’s look at the advantages GitHub had (and still has) to offer. I held off on setting up a GitHub account for a long time. I didn’t see the need; I self-hosted my Git repositories, and I had the ability to set up mailing lists if I needed them (my projects generally aren’t popular enough that they did). But I succumbed in 2015. Why? I think it was probably as part of helping to run an OpenHatch workshop, trying to get people involved in Free Software. That may sound ironic, but those workshops showed me the benefit of the workflow GitHub offers. The whole fork / branch / work / submit a pull request approach really helps lower the barrier to entry for people just getting started. Suddenly fixing an annoying spelling mistake isn’t a huge thing; it’s easy to work in your own private playground and then make that work available to upstream and to anyone else who might be interested.

For small projects without active mailing lists that’s huge. Even for big projects it can be a huge win. And it’s not just useful to new contributors; it lowers the barrier for me to be a patch ’n run contributor. That won’t appeal to every project, because many would rather build up community involvement, and I get that, but I just don’t have the time to be active in all the projects I feel I could offer something to. Part of that ease is the power of git, the fact that a clone is a first class repo, capable of standing alone or being merged back into the parent. But another part is the interface GitHub created, and they should get some credit for that. It’s one of those things that makes sense as soon as you’re presented with it, but no one had done it quite as slickly up to that point. Submissions via mailing lists are much more likely to get lost in the archives than the list of outstanding pull requests on GitHub with their associated discussions, and you can subscribe to just the discussion you care about rather than everything.

GitHub also appeared at the right time. Like SourceForge, it enabled easy discovery of projects. Crucially it did this at a point when web frameworks were taking off and a whole range of developers who had not previously pulled large chunks of code from other projects were suddenly doing so - and writing frameworks or plugins themselves and feeling in the mood to share them. GitHub has somehow managed to hit critical mass such that lots of code that I’m sure would otherwise never have seen the light of day is available to all. Perhaps the key was that repos were lightweight setups under usernames, unlike the heavier SourceForge approach of needing a complete project setup per codebase you wanted to push. Although it’s not my primary platform I engage with GitHub for my own code because the barrier is low; it’s a couple of clicks on the website and then I just push to it like my other remote repos.

I seem to be coming across as a bit of a GitHub apologist here, which isn’t my intention. I just think the knee-jerk anti GitHub reaction has been fascinating to observe. I signed up to GitLab around the same time as GitHub, but I’m not under any illusions that their hosted service is significantly different from GitHub in terms of having my data hosted by a third party. Nothing that’s up on either site is only up there, and everything that is is publicly available anyway. I understand that as third parties they can change my access at any point in time, and so I haven’t built any infrastructure that assumes their continued existence. That said, why would I not take advantage of their facilities when they happen to be of use to me?

I don’t expect my use of GitHub to significantly change now they’ve been acquired.

Hooking up Home Assistant to Alexa + Google Assistant

I have an Echo Dot. Actually I have two; one in my study and one in the dining room. Mostly we yell at Alexa to play us music; occasionally I ask her to set a timer, tell me what time it is or tell me the news. Having set up Home Assistant it seemed reasonable to try and enable control of the light in the dining room via Alexa.

Perversely I started with Google Assistant, even though I only have access to it via my phone. Why? Because the setup process was a lot easier. There are a bunch of hoops to jump through that are documented on the Google Assistant component page, but essentially you create a new home automation component in the Actions on Google interface, connect it with the Google OAuth stuff for account linking, and open up your Home Assistant instance to the big bad internet so Google can connect.

This final step is where I differed from the provided setup. My instance is accessible internally at home, but I haven’t wanted to expose it externally yet (and I suspect I never will, preferring to keep the ability to VPN back in to access it or similar). The default instructions have you open up API access publicly and configure Google with your API password, which allows access to everything. I’d rather not.

So, firstly I configured up my external host with an Apache instance and a Let’s Encrypt cert (luckily I have a static IP, so this was actually the base host that the Home Assistant container runs on). Rather than using this to proxy the entire Home Assistant setup I created a unique /external/google/randomstring proxy just for the Google Assistant API endpoint. It looks a bit like this:

<VirtualHost *:443>
  ServerName my.external.host

  ProxyPreserveHost On
  ProxyRequests off

  RewriteEngine on

  # External access for Google Assistant
  ProxyPassReverse /external/google/randomstring http://hass-host:8123/api/google_assistant
  RewriteRule ^/external/google/randomstring$ http://hass-host:8123/api/google_assistant?api_password=myapipassword [P]
  RewriteRule ^/external/google/randomstring/auth$ http://hass-host:8123/api/google_assistant/auth?%{QUERY_STRING}&api_password=myapipassword [P]

  SSLEngine on
  SSLCertificateFile /etc/ssl/my.external.host.crt
  SSLCertificateKeyFile /etc/ssl/private/my.external.host.key
  SSLCertificateChainFile /etc/ssl/lets-encrypt-x3-cross-signed.crt
</VirtualHost>

This locks down the external access to just the Google Assistant endpoint, and means that Google have a specific shared secret rather than the full API password (there’s a quick way to verify this below). I needed to configure up Home Assistant as well, so configuration.yaml gained:

google_assistant:
  project_id: homeautomation-8fdab
  client_id: oFqHKdawWAOkeiy13rtr5BBstIzN1B7DLhCPok1a6Jtp7rOI2KQwRLZUxSg00rIEib2NG8rWZpH1cW6N
  access_token: l2FrtQyyiJGo8uxPio0hE5KE9ZElAw7JGcWRiWUZYwBhLUpH3VH8cJBk4Ct3OzLwN1Fnw39SR9YArfKq
  agent_user_id: noodles@earth.li
  api_key: nyAxuFoLcqNIFNXexwe7nfjTu2jmeBbAP8mWvNea
  exposed_domains:
    - light
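
As a quick sanity check that the proxy behaves as intended, a couple of test requests from outside should show the secret path reaching Home Assistant and any other path being refused. A rough sketch with Python’s requests module, using the placeholder host and path from the Apache config above:

# Only the secret path should make it through to Home Assistant;
# my.external.host and randomstring are the placeholder values from above.
import requests

r = requests.post('https://my.external.host/external/google/randomstring',
                  json={'requestId': 'test'})
print(r.status_code)  # anything but 404 means we reached Home Assistant

r = requests.post('https://my.external.host/external/google/wrongstring')
print(r.status_code)  # expect 404, handled by Apache itself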

Setting up Alexa access is more complicated. Amazon Smart Home skills must call an AWS Lambda - the code that services the request is essentially a small service run within Lambda. Home Assistant supports all the appropriate requests, so the Lambda code is a very simple proxy these days. I used Haaska, which has a complete setup guide. You must do all three steps - the OAuth provider, the AWS Lambda and the Alexa Skill. Again, I wanted to avoid exposing the full API or the API password, so I forked Haaska to remove the use of a password and instead use a custom URL. I then added the following additional lines to the Apache config above:

# External access for Amazon Alexa
ProxyPassReverse /external/amazon/stringrandom http://hass-host:8123/api/alexa/smart_home
RewriteRule /external/amazon/stringrandom http://hass-host:8123/api/alexa/smart_home?api_password=myapipassword [P]
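
For reference, the Lambda’s job is tiny: take the Alexa Smart Home directive it is invoked with, forward it to the Home Assistant endpoint, and hand the JSON response back to Alexa. A minimal sketch of the idea (not Haaska’s actual code; the URL is the custom one configured below):

# Sketch of the Lambda proxy: forward the Alexa directive JSON to Home
# Assistant and return its response. Not Haaska's real implementation.
import json
import urllib.request

def lambda_handler(event, context):
    req = urllib.request.Request(
        'https://my.external.host/external/amazon/stringrandom',
        data=json.dumps(event).encode('utf-8'),
        headers={'Content-Type': 'application/json'})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)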

In the config.json I left the password field blank and set url to https://my.external.host/external/amazon/stringrandom. configuration.yaml required less configuration than the Google equivalent:

alexa:
  smart_home:
    filter:
      include_entities:
        - light.dining_room_lights
        - light.living_room_lights
        - light.kitchen
        - light.snug

(I’ve added a few more lights, but more on the exact hardware details of those at another point.)

To enable in Alexa I went to the app on my phone, selected the “Smart Home” menu option, enabled my Home Assistant skill and was able to search for the available devices. I can then yell “Alexa, turn on the snug” and magically the light turns on.

Aside from being more useful (due to the use of the Dot rather than pulling out a phone) the Alexa interface is a bit smoother - the command detection is more reliable (possibly due to the more limited range of options it has to work out?) and adding new devices is a simple rescan. Adding new devices with Google Assistant seems to require unlinking and relinking the whole setup.

The only problem with this setup so far is that it’s only really useful for the room with the Alexa in it. Shouting from the living room in the hope the Dot will hear is a bit hit and miss, and I haven’t yet figured out a good alternative method for controlling the lights there that doesn’t mean using a phone or a tablet device.

Getting started with Home Assistant

Having set up some MQTT sensors and controllable lights, the next step was to start tying things together with a nicer interface than mosquitto_pub and mosquitto_sub. I don’t yet have enough devices set up to be able to do some useful scripting (turning on the snug light when the study is cold is not helpful), but a web control interface makes things easier to work with, as well as providing a suitable platform for expansion as I add devices.

There are various home automation projects out there to help with this. I’d previously poked openHAB and found it quite complex, and I saw reference to Domoticz which looked viable, but in the end I settled on Home Assistant, which is written in Python and has a good range of integrations available out of the box.

I shoved the install into a systemd-nspawn container (I have an Ansible setup which makes spinning one of these up with a basic Debian install simple, and it makes it easy to cleanly tear things down as well). One downside of Home Assistant is that it decides it’s going to install various Python modules once you actually configure up some of its integrations. This makes me a little uncomfortable, but I set it up with its own virtualenv to make it easy to see what had been pulled in. Additionally I separated out the logs, config and state database, all of which normally go in ~/.homeassistant/. My systemd service file went in /etc/systemd/system/home-assistant.service and looks like:

[Unit]
Description=Home Assistant
After=network-online.target

[Service]
Type=simple
User=hass
ExecStart=/srv/hass/bin/hass -c /etc/homeassistant --log-file /var/log/homeassistant/homeassistant.log

MemoryDenyWriteExecute=true
ProtectControlGroups=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectSystem=true
RestrictRealtime=true
RestrictNamespaces=true

[Install]
WantedBy=multi-user.target
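
For reference, the /srv/hass virtualenv referenced in ExecStart is just the standard venv + pip approach; a sketch of the equivalent in Python:

# Create the isolated environment Home Assistant runs from, so any
# modules it pulls in at runtime stay contained under /srv/hass.
import subprocess
import venv

venv.create('/srv/hass', with_pip=True)
subprocess.run(['/srv/hass/bin/pip', 'install', 'homeassistant'], check=True)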

Moving the state database needs an edit to /etc/homeassistant/configuration.yaml (a default will be created on first startup; I’ll only mention the changes I made here):

recorder:
  db_url: sqlite:////var/lib/homeassistant/home-assistant_v2.db

I disabled the Home Assistant cloud piece, as I’m not planning on using it:

# cloud:

And the introduction card:

# introduction:

The existing MQTT broker was easily plumbed in:

mqtt:
  broker: mqtt-host
  username: hass
  password: !secret mqtt_password
  port: 8883
  certificate: /etc/ssl/certs/ca-certificates.crt

Then the study temperature sensor (part of the existing sensor block that had weather prediction):

sensor:
  - platform: mqtt
    name: "Study Temperature"
    state_topic: "collectd/mqtt.o362.us/mqtt/temperature-study"
    value_template: "{{ value.split(':')[1] }}"
    device_class: "temperature"
    unit_of_measurement: "°C"

The templating ability let me continue to log to MQTT in a format collectd could parse, while also being able to pull the information into Home Assistant.
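
The payload is “<unixtime>:<value>”, so the value_template above is doing the equivalent of:

# What the value_template does with the "<unixtime>:<value>" payload:
payload = '1527242946.87:21.3'    # example message from the reporter script
temperature = payload.split(':')[1]
print(temperature)  # 21.3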

Finally the Sonoff controlled light:

light:
  - platform: mqtt
    name: snug
    command_topic: 'cmnd/sonoff-snug/power'

I set http_password (to prevent unauthenticated access) and mqtt_password in /etc/homeassistant/secrets.yaml. Then systemctl start home-assistant brought the system up on http://hass-host:8123/, and the default interface presented the study temperature and a control for the snug light, as well as the default indicators of whether the sun is up or not and the local weather status.

I do have a few niggles with Home Assistant:

  • Single password for access: There’s one password for accessing the API endpoint, so no ability to give different users different access or limit what an external integration can do.
  • Wants an entire subdomain: This is a common issue with webapps; they don’t want to live in a subdirectory under a main site (I also have this issue with my UniFi controller and Keybase, who don’t want to believe my main website is old skool with /~noodles/). There’s an open configurable webroot feature request, but no sign of it getting resolved. Sadly it involves work to both the backend and the frontend - I think a modicum of hacking could fix up the backend bits, but have no idea where to start with a Polymer frontend.
  • Installs its own software: I don’t like the fact the installation of Python modules isn’t an up front thing. I’d rather be able to pull a dependency file easily into Ansible and lock down the installation of new things. I can probably get around this by enabling plugins, allowing the modules to be installed and then locking down permissions but it’s kludgy and feels fragile.
  • Textual configuration: I’m not really sure I have a good solution to this, but it’s clunky to have to do all the configuration via a text file (and I love scriptable configuration). This isn’t something that’s going to work out of the box for non-technical users, and even for those of us happy hand editing YAML there’s a lot of functionality that’s hard to discover without some digging. One of my original hopes with Home Automation was to get better central heating control, and if it’s not usable by every household member it isn’t going to count as better.

Some of these are works in progress, some are down to my personal preferences. There’s active development, which is great to see, and plenty of documentation - both official, on the project website, and in the community forums. And one of the nice things about tying everything together with MQTT is that if I do decide Home Assistant isn’t the right thing down the line, I should be able to drop in anything else that can deal with an MQTT broker.

Actually switching something with the Sonoff

Getting a working MQTT temperature monitoring setup is neat, but not really what we think of when someone talks about home automation. For that we need some element of control. There are various intelligent light bulb systems out there that are obvious candidates, but I decided I wanted the more simple approach of switching on and off an existing lamp. I ended up buying a pair of Sonoff Basic devices; I’d rather not get into trying to safely switch mains voltages myself. As well as being cheap the Sonoff is based upon an ESP8266, which I already had some experience in hacking around with (I have a long running project to build a clock I’ll eventually finish and post about). Even better, the Sonoff-Tasmota project exists, providing an alternative firmware that has some support for MQTT/TLS. Perfect for my needs!

There’s an experimental OTA upgrade approach to getting a new firmware on the Sonoff, but I went the traditional route of soldering a serial header onto the board and flashing using esptool. Additionally none of the precompiled images have MQTT/TLS enabled, so I needed to build the image myself. Both of these turned out to be the right move, because using the latest release (v5.13.1 at the time) I hit problems with the device rebooting as soon as it got connected to the MQTT broker. The serial console allowed me to see the reboot messages, and as I’d built the image myself it was easy to tweak things in the hope of improving matters. It seems the problem is related to the memory consumption that enabling TLS requires. I went back a few releases until I hit on one that worked, with everything else disabled. I also had to pin the Espressif Arduino library to an earlier version to get a reliable wifi connection - using the latest worked fine when the device was powered via USB from my laptop, but not once I hooked it up to the mains.

Once the image is installed on the device (just the normal ESP8266 esptool write_flash 0 sonoff-image.bin approach), start mosquitto_sub up somewhere. Plug the Sonoff in (you CANNOT have the Sonoff plugged into the mains while connected to the serial console, because it’s not fully isolated), and you should see something like the following:

$ mosquitto_sub -h mqtt-host -p 8883 --capath /etc/ssl/certs/ -v -t '#' -u user1 -P foo
tele/sonoff/LWT Online
cmnd/sonoff/POWER (null)
tele/sonoff/INFO1 {"Module":"Sonoff Basic","Version":"5.10.0","FallbackTopic":"DVES_123456","GroupTopic":"sonoffs"}
tele/sonoff/INFO3 {"RestartReason":"Power on"}
stat/sonoff/RESULT {"POWER":"OFF"}
stat/sonoff/POWER OFF
tele/sonoff/STATE {"Time":"2018-05-25T10:09:06","Uptime":0,"Vcc":3.176,"POWER":"OFF","Wifi":{"AP":1,"SSId":"My SSID Is Here","RSSI":100,"APMac":"AA:BB:CC:12:34:56"}}

Each of the Sonoff devices will want a different topic rather than the generic ‘sonoff’, and this can be set via MQTT:

$ mosquitto_pub -h mqtt.o362.us -p 8883 --capath /etc/ssl/certs/ -t 'cmnd/sonoff/topic' -m 'sonoff-snug' -u user1 -P foo

The device will provide details of the switchover via MQTT:

cmnd/sonoff/topic sonoff-snug
tele/sonoff/LWT (null)
stat/sonoff-snug/RESULT {"Topic":"sonoff-snug"}
tele/sonoff-snug/LWT Online
cmnd/sonoff-snug/POWER (null)
tele/sonoff-snug/INFO1 {"Module":"Sonoff Basic","Version":"5.10.0","FallbackTopic":"DVES_123456","GroupTopic":"sonoffs"}
tele/sonoff-snug/INFO3 {"RestartReason":"Software/System restart"}
stat/sonoff-snug/RESULT {"POWER":"OFF"}
stat/sonoff-snug/POWER OFF
tele/sonoff-snug/STATE {"Time":"2018-05-25T10:16:29","Uptime":0,"Vcc":3.103,"POWER":"OFF","Wifi":{"AP":1,"SSId":"My SSID Is Here","RSSI":76,"APMac":"AA:BB:CC:12:34:56"}}

Controlling the device is a matter of sending commands to the cmnd/sonoff-snug/power topic - 0 for off, 1 for on. All of the available commands are listed on the Sonoff-Tasmota wiki.
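
From Python that’s a one-liner with paho-mqtt; a sketch using the same user1/foo test credentials as the mosquitto examples above:

# Switch the snug light on over authenticated TLS MQTT.
import paho.mqtt.publish as publish

publish.single('cmnd/sonoff-snug/power', '1',
               hostname='mqtt-host', port=8883,
               auth={'username': 'user1', 'password': 'foo'},
               tls={'ca_certs': '/etc/ssl/certs/ca-certificates.crt'})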

At this point I have a wifi connected mains switch, controllable over MQTT via my internal MQTT broker.

(If you want to build your own Sonoff-Tasmota image it’s actually not too bad; the build system is Arduino style on top of PlatformIO. That means downloading a bunch of bits before you can actually build, but the core is Python based so it can be done as a normal user within a virtualenv. Here’s what I did:

# Make a directory to work in and change to it
mkdir sonoff-ws
cd sonoff-ws
# Build a virtual Python environment and activate it
virtualenv platformio
source platformio/bin/activate
# Install PlatformIO core
pip install -U platformio
# Clone Sonoff Tasmota tree
git clone https://github.com/arendst/Sonoff-Tasmota.git
cd Sonoff-Tasmota
# Checkout known to work release
git checkout v5.10.0
# Only build the sonoff firmware, not all the language variants
sed -i 's/;env_default = sonoff$/env_default = sonoff/' platformio.ini
# Force older version of espressif to get more reliable wifi
sed -i 's/platform = espressif8266$/&@1.5.0/' platformio.ini
# Edit the configuration to taste; essentially comment out all the USE_*
# defines and enable USE_MQTT_TLS
vim sonoff/user_config.h
# Actually build. Downloads a bunch of deps the first time.
platformio run

I’ve put my Sonoff-Tasmota user_config.h up in case it’s of help when trying to get up and going. At some point I need to try the latest version and see if I can disable enough to make it happy with MQTT/TLS, but for now I have an image that does what I need.)

Home Automation: Graphing MQTT sensor data

So I’ve set up an MQTT broker and I’m feeding it temperature data. How do I actually make use of this data? Turns out collectd has an MQTT plugin, so I went about setting it up to record temperature over time.

The first problem was that although the plugin supports MQTT/TLS, it only does so for subscriptions from 5.8 onwards, so I had to backport the fix to the 5.7.1 packages my main collectd host is running.

The other problem is that collectd is picky about the format it accepts for incoming data. The topic name should be of the format <host>/<plugin>-<plugin_instance>/<type>-<type_instance> and the data is <unixtime>:<value>. I modified my MQTT temperature reporter to publish to collectd/mqtt-host/mqtt/temperature-study, changed the publish line to include the timestamp:

# pub_topic, Broker, auth and temp are defined earlier in the reporter script
publish.single(pub_topic, str(time.time()) + ':' + str(temp),
               hostname=Broker, port=8883,
               auth=auth, tls={})

and added a new collectd user to the Mosquitto configuration:

mosquitto_passwd -b /etc/mosquitto/mosquitto.users collectd collectdpass

And granted it read-only access to the collectd/ prefix via /etc/mosquitto/mosquitto.acl:

user collectd
topic read collectd/#

(I also created an mqtt-temp user with write access to that prefix for the Python script to connect to.)

Then, on the collectd host, I created /etc/collectd/collectd.conf.d/mqtt.conf containing:

LoadPlugin mqtt

<Plugin "mqtt">
        <Subscribe "ha">
                Host "mqtt-host"
                Port "8883"
                User "collectd"
                Password "collectdpass"
                CACert "/etc/ssl/certs/ca-certificates.crt"
                Topic "collectd/#"
        </Subscribe>
</Plugin>

I had some initial problems when I tried setting CACert to the Let’s Encrypt certificate; it actually wants to point to the “DST Root CA X3” certificate that signs that. Alternatively, pointing it at the full set of installed root certificates, as I’ve done above, works too. Of course the errors you get back are just of the form:

collectd[8853]: mqtt plugin: mosquitto_loop failed: A TLS error occurred.

which is far from helpful. Once that was sorted collectd started happily receiving data via MQTT and producing graphs for me:

[Graph: study temperature over time]

This is a pretty long-winded way of ending up with some temperature graphs - I could have just graphed the temperature sensor using collectd on the Pi to send it to the monitoring host - but it has allowed a simple MQTT broker, publisher + subscriber setup with TLS and authentication to be constructed and confirmed as working.
