The (lack of a) return-to-office conspiracy
During COVID companies suddenly found themselves able to offer remote working where it hadn’t previously been on offer. That’s changed over the past 2 or so years, with most places I’m aware of moving back from a fully remote situation to either some sort of hybrid, or even full time office attendance. For example, last week Amazon announced a full return to office, having already pulled remote-hired workers in for 3 days a week.
I’ve seen a lot of folk stating they’ll never work in an office again, and that RTO is insanity. Despite being lucky enough to work fully remotely (for a role I’d been approached about before, but was never prepared to relocate for), I feel the objections from those who are pro-remote often fail to consider the nuances involved. So let’s talk about some of the reasons why companies might want to enforce some sort of RTO.
Real estate value
Let’s clear this one up first. It’s not about real estate value, for most companies. City planners and real estate investors might care, but even if your average company owned their building they’d close it in an instant, all other things being equal. An unoccupied building costs a lot less to maintain. And plenty of companies rent, and would save money even if there’s a substantial exit fee.
Occupancy levels
That said, once you have anyone in the building the equation changes. If you’re having to provide power, heating, internet, security/front desk staff etc, you want to make sure you’re getting your money’s worth. There’s no point heating a building that can seat 100 when only 10 people are present. One option is to downsize the building, but that leads to not being able to assign everyone a desk, for example. No one I know likes hot desking. There are also scheduling problems around ensuring there are enough desks for everyone who might turn up on a certain day, and you’ve ruled out the option of company/office wide events.
Coexistence builds relationships
As a remote worker I wish it wasn’t true that most people find it easier to form relationships in person, but it is. Some of this can be worked on with specific “teambuilding” style events, rather than in office working, but I know plenty of folk who hate those as much as they hate the idea of being in the office. I am lucky in that I work with a bunch of folk who are terminally online, so it’s much easier to have those casual conversations even being remote, but I also accept I miss out on some things because I’m just not in the office regularly enough. You might not care about this (“I just need to put my head down and code, not talk to people”), but don’t discount it as a valid reason why companies might want their workers to be in the office. This often matters even more for folk at the start of their career, where having a bunch of experienced folk around to help them learn and figure things out ends up working much better in person (my first job offered to let me go mostly remote when I moved to Norwich, but I said no as I knew I wasn’t ready for it yet).
Coexistence allows for unexpected interactions
People hate the phrase “water cooler chat”, and I get that, but it covers the idea of casual conversations that just won’t happen the same way when people are remote. I experienced this while running Black Cat; every time Simon and I met up in person we had a bunch of useful conversations even though we were on IRC together normally, and had a VoIP setup that meant we regularly talked too. Equally when I was at Nebulon there were conversations I overheard in the office where I was able to correct a misconception or provide extra context. Some of this can be replicated with the right online chat culture, but I’ve found many places end up with folk taking conversations to DMs, or they happen in “private” channels. It happens more naturally in an office environment.
It’s easier for bad managers to manage bad performers
Again, this falls into the category of things that shouldn’t be true, but are. Remote working has increased the ability for people who want to slack off to do so without being easily detected. Ideally what you want is that these folk, if they fail to perform, are then performance managed out of the organisation. That’s hard though; workers (rightly) have a bunch of rights (I’m writing from a UK perspective) around the procedure that needs to be followed. Managers need organisational support in this to make sure they get it right (and folk are given a chance to improve), which is often lacking.
Summary
Look, I get there are strong reasons why offering remote is a great thing from the company perspective, but what I’ve tried to outline here is that a return-to-office mandate can have some compelling reasons behind it too. Some of those might be things that wouldn’t exist in an ideal world, but unfortunately fixing them is a bigger issue than just changing where folk work from. Not acknowledging that just makes any reaction against office work seem ill-informed, to me.
Thoughts on Advent of Code + Rust
Diego wrote about his dislike for Advent of Code and that reminded me I hadn’t written up my experience from 2023. Mostly because, spoiler, I never actually completed it and always intended to do so and then write it up. I think it’s time to accept I’m not going to do that, and write down some thoughts before I forget all of them. These are somewhat vague, given the time that’s elapsed, but I think still relevant. You might also find Roger’s problem write up interesting.
I’ve tried AoC a couple of times before; I think I had a very brief attempt back in 2021, and I got 4 days in for 2022. For Advent of Code 2023 I tried much harder to actually complete the challenges, and got most of the way there. I didn’t allow myself to move on to the next day until fully completing the previous day, and didn’t end up doing the second half of December 24th, or any of December 25th.
Rust
First I want to talk about Rust, which is the language I chose to use for the problems. I’ve dabbled a little in it, but I’d like more familiarity with the basic language, and some programming problems seemed like a good way to get that. It’s a language I want to like; I’ve spent a lot of my career writing C, do more in Go these days, and generally think Rust promises a low level, run-time light environment like C but with the rough edges taken off.
I set myself the challenge of using just bare Rust; no external crates, no use of cargo. I was accused of playing on hard mode by doing this, but it really wasn’t the intention - I figured that I should be able to do what I needed without recourse to anything outside the core language, and didn’t want what seemed like the extra complexity of dealing with cargo.
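In practice that meant the day-to-day loop was nothing more than invoking rustc directly, something like the following (filenames illustrative):

# No cargo: one file per day, compiled and run directly
rustc day01.rs
./day01 < input.txt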
That caused problems, however. I’m used to by-default generic error handling in Go through the error type, but Rust seems to have much more tightly typed errors. I was pointed at anyhow as the right way to do this in Rust. I still find this surprising; I ended up using unwrap() a lot, when I think with more generic error handling I could have used the ? operator instead.
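To illustrate the difference, here’s a minimal sketch (not my actual AoC code; the second variant assumes the anyhow crate is available):

use std::fs;

// Bare std: each possible failure gets handled (or unwrap()ed) explicitly.
fn sum_lines_unwrap(path: &str) -> i64 {
    let input = fs::read_to_string(path).unwrap();
    input.lines().map(|l| l.parse::<i64>().unwrap()).sum()
}

// With anyhow, ? can propagate any error type behind one generic wrapper,
// much closer to returning error in Go.
fn sum_lines(path: &str) -> anyhow::Result<i64> {
    let input = fs::read_to_string(path)?;
    let mut total = 0;
    for line in input.lines() {
        total += line.parse::<i64>()?;
    }
    Ok(total)
}

fn main() {
    // assumes an input.txt of one integer per line
    println!("{}", sum_lines("input.txt").unwrap());
}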
The other thing I discovered is that by default rustc produces unoptimised debug builds. I got significantly better results on some of the solutions with rustc -O -C target-cpu=native source.rs. I probably shouldn’t be surprised by this, but it’s worth noting.
Rust, to me, has a syntax only a C++ programmer could love. I am not a C++ programmer. Coming from C I found Go to be a nice, simple syntax to learn. Rust has not been the same. There’s a lot more punctuation, and it’s not always clear to me what it’s doing. This applies more when reading other people’s code than when writing it myself, obviously, but I see a lot of Rust code that could give Perl a run for its money in terms of looking like line noise.
The borrow checker didn’t bug me too much, but did add overhead to my thinking. The Rust compiler is generally very good at outputting helpful error messages when the programmer is an idiot. I ended up having to use a RefCell for one solution, and using .iter() for loops rather than explicit iterators (why, why is this different?). I also kept forgetting to explicitly mark variables as mutable when declaring them.
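A minimal sketch of the sort of thing I mean (illustrative, not one of my solutions):

use std::cell::RefCell;

fn main() {
    // RefCell moves the borrow check to runtime, for when the static
    // checker can't see that two uses don't actually overlap.
    let counts = RefCell::new(vec![0u32; 3]);

    let bump = |i: usize| {
        // panics at runtime if something else already holds a borrow
        counts.borrow_mut()[i] += 1;
    };
    bump(0);
    bump(2);

    // .iter() borrows the Vec; a bare `for c in vec` would move it.
    for (i, c) in counts.borrow().iter().enumerate() {
        println!("bucket {i}: {c}");
    }
}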
Things I liked? There’s a rich set of first class data types. Look, I’m a C programmer, I’m easily pleased. You give me some sort of hash array and I’ll be happy. Rust manages that, tuples, strings, all the standard bits any modern language can provide. The whole impl thing for adding methods to structures I like as a way of providing some abstraction, though I think Go has a nicer syntax for it. The compiler, as mentioned, is great at spitting out useful errors for the most part. Also, although I wasn’t using external crates for AoC, I do appreciate there’s a decent ecosystem there now (though that brings up another gripe: Rust still seems to be a fairly fast moving target, to the extent I can no longer rely on the compiler in Debian stable to be able to compile random projects I find).
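For example, the pattern I mean (a hypothetical AoC-style grid, sketched):

// Methods live in a separate impl block rather than inside the struct itself.
struct Grid {
    width: usize,
    cells: Vec<char>,
}

impl Grid {
    fn new(width: usize, height: usize) -> Self {
        Grid { width, cells: vec!['.'; width * height] }
    }

    fn at(&self, x: usize, y: usize) -> char {
        self.cells[y * self.width + x]
    }
}

fn main() {
    let g = Grid::new(10, 5);
    println!("top-left is {}", g.at(0, 0));
}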
Advent of Code
Let’s talk about the Advent of Code bit now. Hopefully it’s long enough since it came out that this won’t be spoilers for anyone, but if you haven’t attempted the 2023 AoC and might still do so, you may want to stop reading here.
First, a refresher on the format for those who might not be aware of it. Problems are posted daily from December 1st until the 25th. Each is in 2 parts; the second part is not viewable until you have provided the correct answer for the first part. There’s a whole leaderboard thing going on, but the puzzle opens at midnight UTC-5 so generally by the time I wake up and have time to look the problem has been solved many times over; no chance of getting listed.
Credit to AoC creator, Eric Wastl, for writing up the set of problems in an entertaining fashion. I quite enjoyed seeing how the puzzle would be phrased each day, and the whole thing obviously brings a lot of joy to folk I know.
I always start AoC thinking it’ll be a fun set of puzzles to solve. Then something happens and I miss a day or two, and all of a sudden I’ve a bunch of catching up to do and it’s all a bit more of a chore. I hit that at some points this time, but made a concerted effort to try and power through it.
That perseverance was required up front, because I found the second part of Day 1 to be ill-specified, and had to iterate a few times to actually calculate the desired solution (IIRC, issues about whether sevenone at the end of a line ended up as 7 or 1 really tripped me up). I don’t recall any other problems that bit me as hard on the specification as this one, but it happening up front was unfortunate.
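For the avoidance of doubt (a reconstruction of the idea, not my original code), the trick is to scan every position so overlapping digit words each count:

fn digits(line: &str) -> Vec<u32> {
    const WORDS: [&str; 9] = [
        "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    ];
    let mut out = Vec::new();
    for (i, c) in line.char_indices() {
        if let Some(d) = c.to_digit(10) {
            out.push(d);
        } else {
            // Check for a digit word starting at every position, so that
            // e.g. "sevenone" yields both 7 and 1.
            for (w, word) in WORDS.iter().enumerate() {
                if line[i..].starts_with(*word) {
                    out.push(w as u32 + 1);
                }
            }
        }
    }
    out
}

fn main() {
    // "sevenone" must end in 1; naive substring replacement gets this wrong
    assert_eq!(digits("sevenone"), vec![7, 1]);
}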
The short example input doesn’t always help with this either: sometimes it’s not enough to be able to extrapolate patterns, or it doesn’t show all the variations you need to account for (that aren’t fully specified in the text), or in a few cases it turned out I needed to understand the shape of the actual data to produce a solution that could actually complete in a reasonable time.
Which brings me to another matter: sometimes brute force doesn’t actually work. This is fine, but the second part of the day’s problem can change the approach you’d take. So sometimes I got lucky in the way I handled the first half, and doing the second half was a simple 5 minute tweak, and sometimes I had to entirely change the way I was storing data.
You might claim that if I was a better programmer I’d have always produced a first half solution that was amenable to extension for the second half. First, I dispute that; I think there are always situations where the problem domain can change in enough directions that you can’t handle all of them without a lot of effort. Secondly, I didn’t find AoC an environment that encouraged me to optimise for generic solutions. Maybe some of the puzzles in isolation would allow for that, but a month of daily problems to solve while still engaging in regular life meant I hacked things up, took short cuts based on the knowledge I had of the input data, etc, etc.
Overall I can see the appeal, but the sheer quantity and the fact I write code as part of my day job just made it feel too much like a chore, rather than a fun mental exercise. I did wonder how they’d look as a set of interview puzzles (obviously a subset, rather than all of them), but I’m not sure how you’d actually use them for that - I wouldn’t want anyone to have to solve them in a live interview.
So, in case it’s not obvious, I’m not planning to engage in AoC again this year. But I’m continuing to persevere with Rust (though most of my work stuff is thankfully still Go).
Using QEmu for UEFI/TPM testing
This is one of those posts that’s more for my own reference than likely to be helpful for others. If you’re unlucky it’ll have some useful tips for you. If I’m lucky then I’ll get a bunch of folk pointing out some optimisations.
First, what I’m trying to achieve. I want a virtual machine environment where I can manually do tests on random kernels, and also various TPM related experiments. So I don’t want something as fixed as a libvirt setup. I’d like the following:
- It to be fairly lightweight, so I can run it on a laptop while usefully doing other things
- I need a TPM 2.0 device to appear to the guest OS, but it doesn’t need to be a real TPM
- Block device discard should work, so I can back it with a qcow2 image and use fstrim to keep the actual on-disk size small, without constraining my potential for file system expansion should I need it
- I’ve no need for graphics; in fact a serial console would be better, as it eases copy & paste, especially when I screw up kernel changes
That turns out to be possible, but it took a bunch of trial and error to get there. So I’m writing it down. I generally do this on a Fedora based host system (FC40 at present, but this all worked with FC38 + FC39 too), and I run Debian 12 (bookworm) as the guest. At present I’m using qemu 8.2.2 and swtpm 0.9.0, both from the FC40 packages.
One other issue I spent too long tracking down is that the version of grub 2.06 in bookworm does not manage to pass the TPMEventLog through to the guest kernel properly. The events get measured and the PCRs updated just fine, but /sys/kernel/security/tpm0/binary_bios_measurements doesn’t even get created. Using either grub 2.06 from FC40, or the 2.12 backport in bookworm-backports, makes this work just fine.
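With a working grub it’s easy to confirm from inside the guest (the second command assumes the tpm2-tools package is installed):

# The event log should exist once grub passes it through
ls -l /sys/kernel/security/tpm0/binary_bios_measurements
# And should parse as a sensible event log
tpm2_eventlog /sys/kernel/security/tpm0/binary_bios_measurements | head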
Anyway, for reference, the following is the script I use to start the swtpm, and then qemu. The debugcon line can be dropped if you’re not interested in OVMF debug logging. This needs the guest OS to be configured up for a serial console, but avoids the overhead of graphics emulation.
As I said at the start, I’m open to any hints about other options I should be passing; as long as I get acceptable performance in the guest I care more about reducing host load than optimising for the guest.
#!/bin/sh
BASEDIR=/home/noodles/debian-qemu
if [ ! -S ${BASEDIR}/swtpm/swtpm-sock ]; then
    echo Starting swtpm:
    swtpm socket --tpmstate dir=${BASEDIR}/swtpm \
        --tpm2 \
        --ctrl type=unixio,path=${BASEDIR}/swtpm/swtpm-sock &
fi
echo Starting QEMU:
qemu-system-x86_64 -enable-kvm -m 2048 \
    -machine type=q35 \
    -smbios type=1,serial=N00DL35,uuid=fd225315-f72a-4d66-9b16-55363c6c938b \
    -drive if=pflash,format=qcow2,readonly=on,file=/usr/share/edk2/ovmf/OVMF_CODE_4M.qcow2 \
    -drive if=pflash,format=raw,file=${BASEDIR}/OVMF_VARS.fd \
    -global isa-debugcon.iobase=0x402 -debugcon file:${BASEDIR}/debian.ovmf.log \
    -device virtio-blk-pci,drive=drive0,id=virblk0 \
    -drive file=${BASEDIR}/debian-12-efi.qcow2,if=none,id=drive0,discard=on \
    -net nic,model=virtio -net user \
    -chardev socket,id=chrtpm,path=${BASEDIR}/swtpm/swtpm-sock \
    -tpmdev emulator,id=tpm0,chardev=chrtpm \
    -device tpm-tis,tpmdev=tpm0 \
    -display none \
    -nographic \
    -boot menu=on
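It’s also worth confirming the discard path works end to end; a quick check is to trim in the guest and watch the actual on-disk allocation of the image on the host (path as per the script above):

# In the guest: discard unused blocks on all mounted filesystems
fstrim -av
# On the host: actual allocation of the qcow2 should shrink or hold steady
du -h ${BASEDIR}/debian-12-efi.qcow2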
Sorting out backup internet #4: IPv6
The final piece of my 5G (well, 4G) based backup internet connection I needed to sort out was IPv6. While Three do support IPv6 in their network, they only seem to enable it for certain devices, and the MC7010 is not one of those devices, even though it also supports IPv6.
I use v6 a lot - over 50% of my external traffic, last time I looked. One suggested option was that I could drop the IPv6 Router Advertisements when the main external link went down, but I have a number of internal services that are only presented on v6 addresses so I needed to ensure clients in the house continued to have access to those.
As it happens I’ve used the Hurricane Electric IPv6 Tunnel Broker in the past, so my plan was re-instating that. The 5G link has a real external IPv4 address, and it’s possible to update the tunnel endpoint using a simple HTTP GET. I added the following to my /etc/dhcp/dhclient-exit-hooks.d/modem-interface-route where we are dealing with an interface IP change:
# Update IPv6 tunnel with Hurricane Electric
curl --interface $interface 'https://username:password@ipv4.tunnelbroker.net/nic/update?hostname=1234'
I needed some additional configuration to bring things up, so /etc/network/interfaces got the following, configuring the 6in4 tunnel as well as the low preference default route, and source routing via the 5g table, similar to IPv4:
pre-up ip tunnel add he-ipv6 mode sit remote 216.66.80.26
pre-up ip link set he-ipv6 up
pre-up ip addr add 2001:db8:1234::2/64 dev he-ipv6
pre-up ip -6 rule add from 2001:db8:1234::/64 lookup 5g
pre-up ip -6 route add default dev he-ipv6 table 5g
pre-up ip -6 route add default dev he-ipv6 metric 1000
post-down ip tunnel del he-ipv6
We need to deal with IPv4 changes for the tunnel endpoint, so modem-interface-route also got:
ip tunnel change he-ipv6 local $new_ip_address
/etc/nftables.conf had to be taught to accept the 6in4 packets from the tunnel in the input chain:
# Allow HE tunnel
iifname "sfp.31" ip protocol 41 ip saddr 216.66.80.26 accept
Finally, I had to engage in something I never thought I’d deal with: IPv6 NAT. HE provide a /48, and my FTTP ISP provides me with a /56, so this meant I could do a nice stateless 1:1 mapping (for example, internal 2001:db8:f00d:42::1 maps straight to 2001:db8:666:42::1):
table ip6 nat {
    chain postrouting {
        type nat hook postrouting priority 0
        oifname "he-ipv6" snat ip6 prefix to ip6 saddr map { 2001:db8:f00d::/56 : 2001:db8:666::/56 }
    }
}
This works. Mostly. The problem is that HE, not unreasonably, expect your IPv4 address to be pingable. And it turns out Three have some ranges that this works on, and some that it doesn’t. Which means it’s a bit hit and miss whether you can set up the tunnel.
I spent a while trying to find an alternative free IPv6 tunnel provider with a UK endpoint. There’s less call for them these days, so I didn’t manage to find any that actually worked (or didn’t have a similar pingable requirement). I did consider whether I wanted to end up with routes via a VM, as I described in the failover post, but looking at costings for VMs with providers who could actually give me an IPv6 range I decided the cost didn’t make it worthwhile; the VM cost ended up being more than the backup SIM is costing monthly.
Finally, it turns out happy eyeballs mostly means that when the 5G ends up on an IP that we can’t setup the IPv6 tunnel on, things still mostly work. Browser usage fails over quickly and it’s mostly my own SSH use that needs me to force IPv4. Purists will groan, but this turns out to be an acceptable trade-off for me, at present. Perhaps if I was seeing frequent failures the diverse routes approach to a VM would start to make sense, but for now I’m pretty happy with the configuration in terms of having a mostly automatic backup link take over when the main link goes down.
Sorting out backup internet #3: failover
With local recursive DNS and a 5G modem in place the next thing was to work on some sort of automatic failover when the primary FTTP connection failed. My wife works from home too and I sometimes travel, so I wanted to make sure things didn’t require me to be around to kick them into switching the link in use.
First, let’s talk about what I didn’t do. One choice to try and ensure as seamless a failover as possible would be to get a VM somewhere out there. I’d then run Wireguard tunnels over both the FTTP + 5G links to the VM, and run some sort of routing protocol (RIP, OSPF?) over the links. Set preferences such that the FTTP is preferred, NAT v4 to the VM IP, and choose somewhere that gave me a v6 range I could just use directly.
This has the advantage that I’m actively checking link quality to the outside world, rather than just to the next hop. It also means, if the failover detection is fast enough, that existing sessions stay up rather than needing to be re-established.
The downsides are increased complexity, adding another point of potential failure (the VM + provider), the impact on connection quality (even with a decent endpoint it’s an extra hop and latency), and finally the increased cost involved.
I can cope with having to reconnect my SSH sessions in the event of a failure, and I’d rather be sure I can make full use of the FTTP connection, so I didn’t go this route. I chose to rely on local link failure detection to provide the signal for failover, and a set of policy routing on top of that to make things a bit more seamless.
Local link failure turns out to be fairly easy. My FTTP is a PPPoE configuration, so in /etc/ppp/peers/aquiss I have:
lcp-echo-interval 1
lcp-echo-failure 5
lcp-echo-adaptive
Which gives me a failover of ~5s (five missed echoes at one a second) if the link goes down.
I’m operating the 5G modem in “bridge” rather than “router” mode, which means I get the actual IP from the 5G network via DHCP. The DHCP lease the modem hands out is under a minute, and in the event of a network failure it only hands out a 192.168.254.x IP to talk to its web interface. As the 5G modem is the last resort path I choose not to do anything special with this, but the information is at least there if I need it.
To allow both interfaces to be up and the FTTP to be preferred I’m simply using route metrics. For the PPP configuration that’s:
defaultroute-metric 100
and for the 5G modem I have:
iface sfp.31 inet dhcp
metric 1000
vlan-raw-device sfp
There’s a wrinkle in that pppd will not replace an existing default route, so I’ve created /etc/ppp/ip-up.d/default-route to ensure it’s added:
#!/bin/bash
[ "$PPP_IFACE" = "pppoe-wan" ] || exit 0
# Ensure we add a default route; pppd will not do so if we have
# a lower pref route out the 5G modem
ip route add default dev pppoe-wan metric 100 || true
Additionally, in /etc/dhcp/dhclient.conf I’ve disabled asking for any server details (DNS, NTP, etc) - I have internal setups for the servers I want, and don’t want to be trying to select things over the 5G link by default.
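For reference that looks something like this (a sketch rather than my exact file; the point is the short request list):

# /etc/dhcp/dhclient.conf: only ask for addressing details, no DNS/NTP/etc
request subnet-mask, broadcast-address, routers;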
However, what I do want is to be able to access the 5G modem web interface and explicitly route some traffic out that link (e.g. so I can add it to my smokeping tests). For that I need some source based routing.
First step, add a 5g table to /etc/iproute2/rt_tables:
16 5g
Then I ended up with the following in /etc/dhcp/dhclient-exit-hooks.d/modem-interface-route, which is more complex than I’d like but seems to do what I want:
#!/bin/sh
case "$reason" in
BOUND|RENEW|REBIND|REBOOT)
# Check if we've actually changed IP address
if [ -z "$old_ip_address" ] ||
[ "$old_ip_address" != "$new_ip_address" ] ||
[ "$reason" = "BOUND" ] || [ "$reason" = "REBOOT" ]; then
if [ ! -z "$old_ip_address" ]; then
ip rule del from $old_ip_address lookup 5g
fi
ip rule add from $new_ip_address lookup 5g
ip route add default dev sfp.31 table 5g || true
ip route add 192.168.254.1 dev sfp.31 2>/dev/null || true
fi
;;
EXPIRE)
if [ ! -z "$old_ip_address" ]; then
ip rule del from $old_ip_address lookup 5g
fi
;;
*)
;;
esac
What does all that aim to do? We want to ensure traffic directed to the 5G WAN address goes out the 5G modem, so I can SSH into it even when the main link is up. So we add a rule directing traffic from that IP to hit the 5g routing table, and a default route in that table which uses the 5G link. There’s no configuration for the FTTP connection in that table, so if the 5G link is down the traffic gets dropped, which is what we want. We also configure 192.168.254.1 to go out the link to the modem, as that’s where the web interface lives.
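A quick way to sanity check the result once a lease is in place (a couple of read-only commands, nothing more):

# Confirm the source rule and the routes in the 5g table
ip rule show | grep 5g
ip route show table 5g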
I also have a curl callout (curl --interface sfp.31 … to ensure it goes out the 5G link) after the routes are configured, to set dynamic DNS with Mythic Beasts, which helps with knowing where to connect back to. I seem to see IP address changes on the 5G link every couple of days at least.
Additionally, I have an entry in the interfaces configuration carving out the top set of the netblock my smokeping server is in:
up ip rule add from 192.0.2.224/27 lookup 5g
My smokeping /etc/smokeping/config.d/Probes file then looks like:
*** Probes ***
+ FPing
binary = /usr/bin/fping
++ FPingNormal
++ FPing5G
sourceaddress = 192.0.2.225
+ FPing6
binary = /usr/bin/fping
which allows me to use probe = FPing5G for targets to test them over the 5G link.
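For example, a target stanza along these lines (host purely illustrative) in /etc/smokeping/config.d/Targets:

+ 5G
menu = 5G
title = Tests via the 5G link

++ GoogleDNS
probe = FPing5G
menu = Google DNS
title = Google DNS via the 5G link
host = 8.8.8.8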
That mostly covers the functionality I want for a backup link. There’s one piece that isn’t quite solved, however, IPv6, which can wait for another post.