Automatic Autossh Tunnel under Systemd

It took me half the afternoon to figure out how to do this with all the bells and whistles (mostly a learning experience on systemd), so I’d better write it down! Meanwhile, it took me at least a dozen reference docs to piece it together, because autossh has pretty brief documentation.

Edit 2021-03-27: One thing in the original version of this article that didn’t work as intended was keeping the systemd unit file in a different location and linking it where it needed to be. That turns out not to work right under reload. Further info below.

I had a remote server running a mail forwarder that is password protected but exposed to the internet, even though I was only using it from devices within my LAN. That’s not only less than ideal, it’s also a smidge less reliable due to various odd properties of my LAN (with regard to WAN failover behaviors), so I wanted to move this traffic into an SSH tunnel. My local tunnel endpoint would be a Debian 10 (Buster) machine that is fully within the LAN, not a perimeter device. I wanted this connection to come up fully automatically on boot/reboot, and to give me simple control as a service (hence systemd).

Remote: an unprivileged, high port PPPPP on server RRR.RRR.RRR.RRR, whose ssh server listens on port xxxxx.

Local: the same unprivileged, high port PPPPP on server LLL.LLL.LLL.LLL

My remote machine was already appropriately set up; all I needed to do was add an unprivileged user account for this connection. In the code below, you’ll see the account name (on both the local and remote servers) is raul, which is also the local server’s name on the network. Substitute your account name of choice wherever you see this. Before beginning the real work, you need this account set up on both machines, with key-based authentication (and no pass phrase on the key). Log into the remote machine from the local account at least once, to verify that key authentication works and to store the host key.
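If you haven’t set up that key yet, the sequence looks roughly like this (a minimal sketch; substitute your own account name, address, and ssh port):

# On the local machine, as the unprivileged account (raul here)
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519    # key with no pass phrase, so it can run unattended
ssh-copy-id -p xxxxx raul@RRR.RRR.RRR.RRR            # install the public key on the remote account
ssh -p xxxxx raul@RRR.RRR.RRR.RRR                    # log in once to verify key auth and store the host key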

Install autossh on your local Debian machine with sudo apt install autossh.

Since everything will be run as your unprivileged user, it’s actually a bit easier to do all your initial editing from that account so that you don’t have to play with permissions later. That’s not how I started, but it would’ve saved me some steps. So, switch to your new account with su raul or equivalent. Two of the three files needed to run this live in that account’s home directory at /home/raul/ – the startup script and the pid file – and the pid file is created automatically by the script, so the only things we actually have to write are the startup script and the systemd unit (which, as noted above, has to live under /etc/systemd/system/).

One thing to note beforehand – the formatting and word wrapping on this site can make it less than obvious what’s an actual newline in code snippets, and what’s just word wrap. Because of this, I’ve linked a copy of the maillink.sh script where you can get to it directly, and just change out your account name, addresses, and ports.

Startup Script

First, we’ll create the startup script, which is the meat of the work. Without further ado, create /home/raul/maillink.sh:

#!/bin/bash

# This script starts an ssh tunnel to the remote mail server to locally expose the mail port, PPPPP.

logger -t maillink.sh "Begin autossh setup script for mail link."
S_TIME=`awk '{print int($1)}' /proc/uptime`
MAX_TIME="300"

# First, verify connection to outside world is working. This bit is optional.

while true ; do
        M_RESPONSE=`ping -c 5 -A RRR.RRR.RRR.RRR | grep -c "64 bytes from"`
        C_TIME=`awk '{print int($1)}' /proc/uptime`
        E_TIME=`expr $C_TIME - $S_TIME`
        [[ $M_RESPONSE == "5" ]] && break
        if  [ $E_TIME -gt $MAX_TIME ]
        then
                logger -t maillink.sh "Waiting for network, timed out after $E_TIME seconds."                                                                  
                exit 1
        fi
        sleep 10
done

logger -t maillink.sh "Network detected up after $E_TIME seconds."

# Now, start the tunnel in the background.

export AUTOSSH_PIDFILE="/home/raul/maillink.pid"

autossh -f -M 0 raul@RRR.RRR.RRR.RRR -p xxxxx -N \
        -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" \
        -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP

MAILPID=`cat /home/raul/maillink.pid`

logger -t maillink.sh "Autostart command run, pid is $MAILPID."

As you can see, this script has a few more bells and whistles than the bare minimum. While systemd should take care of making sure we wait until the network is up, I want the script to verify that it can reach my actual remote server before it starts the tunnel. It will try every ten seconds for up to five minutes, until it gets 100% of its test pings back in one go.

I also like logs … putting a few log lines into your script sure makes it easier to figure out why things are going wrong.

The “AUTOSSH_PIDFILE” line is necessary for making this setup play nicely with systemd and for giving us a way to stop the tunnel the nice way (systemctl stop maillink) instead of manually killing it. That environment variable causes autossh to store its pid once it starts up. Autossh responds to a plain, default kill (SIGTERM) by neatly shutting down the ssh tunnel and then itself, so that pid makes control easy. Of course, finding that out and figuring out how to do it was … less easy, but it’s simple once you know. The pid file is the second important file in making this work, but this line creates it automatically whenever the tunnel is running.

Now, the meat of this file is the autossh command. Some of these flags are for autossh itself, and the rest get passed through to ssh. Here’s a breakdown of each part of the command:

  • -f moves the process to background on startup (this is an autossh flag)
  • -M 0 turns off the built in autossh connection checking system – we’re relying on newer monitoring built into OpenSSH instead.
  • raul@RRR.RRR.RRR.RRR -p xxxxx are the username, address, and port number of the remote server, to be passed to ssh (along with all that follows). If your remote server uses the default port of 22, you can leave the port flag out. If your local and remote accounts for this service will be the same, you can also leave out the account name, but I find it clearer left in.
  • -N tells ssh not to execute any remote commands. The manpage for ssh mentions this is useful for just forwarding ports. What nothing tells you is that, in my experience, this autossh tunnel simply won’t work without the -N flag. It failed silently until I added it in.
  • -o “ServerAliveInterval 60” -o “ServerAliveCountMax 3” are the flags for the built in connection monitoring we’ll be using. ServerAliveInterval causes the client to send a keep-alive null packet to the server at the specified interval in seconds, expecting a response. ServerAliveCountMax sets the limit of how many times in a row the response can fail to arrive before the client automatically disconnects. When it does disconnect, it will exit with a non-zero status (i.e. “something went wrong”), and autossh will restart the tunnel – that’s the main function of autossh. If you kill the ssh process intentionally, it returns 0 and autossh assumes that’s intentional, so it will end itself.
  • -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP is the real meat of the command, as this is the actual port forward. It translates to, “Take port PPPPP of the remote server’s loopback (127.0.0.1) interface, and expose it on port PPPPP of this local client’s interface at address LLL.LLL.LLL.LLL.” That local address is optional, but if you don’t put it in, it will default to exposing that port on the local client’s loopback interface, too. That’s great if you just need to access it from the client computer, but I needed this port exposed to the rest of my LAN.

One handy thing to note is that you can forward multiple ports through this single tunnel. You can just keep repeating that -L line to forward however many ports you need, as in the sketch below. Or, if you’re forwarding for various different services that you might not want all active at the same time, it’s easy to duplicate the startup script and service file, tweak the names and contents, and have a second separate service to control.
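For example, forwarding a hypothetical second port QQQQQ (a placeholder, not part of my setup) through the same tunnel would just mean stacking another -L flag onto the same command:

autossh -f -M 0 raul@RRR.RRR.RRR.RRR -p xxxxx -N \
        -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" \
        -L LLL.LLL.LLL.LLL:PPPPP:127.0.0.1:PPPPP \
        -L LLL.LLL.LLL.LLL:QQQQQ:127.0.0.1:QQQQQ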

Before you test this the first time, it’s important to make sure it’s executable!

chmod +x /home/raul/maillink.sh

At this point, if you aren’t looking to make this a systemd service, you’re done – you can actually just start your connection manually using “. /home/raul/maillink.sh” (note the . and space up front) and stop it manually using kill <pid>, where the pid is the number saved in the maillink.pid file. (If you’re planning to do this, it’s actually easiest to keep these in your main user’s home directory and modify the script for the pid location accordingly.) At this point, you should manually test the script to ensure everything is working the way you expected. You should see some helpful output in your syslog, and you should also see that port listening on your local machine if you run netstat -tunlp4.
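Put together, a manual test session looks something like this sketch (same paths as above):

. /home/raul/maillink.sh                # start the tunnel in the background
netstat -tunlp4                         # the forwarded port should now show as LISTEN
cat /home/raul/maillink.pid             # the pid autossh saved on startup
kill $(cat /home/raul/maillink.pid)     # clean shutdown of autossh and its ssh tunnel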

Systemd Unit

However, with just a little more work, making this a controllable systemd service turns out to be pretty simple. It took way longer to corral the info on how to do it than it would’ve taken to do if I’d already known how.

Edit 2021-03-27: I originally tried placing this unit file in /home/raul/ and symlinking it into /etc/systemd/system/. That … well, doesn’t work. It works fine the first time, when you run systemctl daemon-reload to pull the unit into the system. The problem is, for whatever reason systemd will not find that file on reboot, even though the link is still there. You’d have to reload manually every time, which just defeats the purpose. Edits have been made accordingly below.

First, create the systemd unit file, which must be located in /etc/systemd/system/maillink.service:

[Unit]
Description=Keeps a tunnel to remote mailserver open
Wants=network-online.target
After=network-online.target

[Service]
Type=forking
RemainAfterExit=yes
User=raul
Group=raul
ExecStart=/home/raul/maillink.sh                                                                                                                               
ExecStop=/bin/sh -c 'kill $(cat /home/raul/maillink.pid)'

[Install]
WantedBy=multi-user.target

This file contains:

  • A description
  • Two lines that ensure it won’t run until after networking is up
  • Two lines that instruct the system to run it under your “raul” account
  • A command to be run to start the service
  • A command to be run to stop the service – this shuts it down clean using the pid file we saved on startup (it’s wrapped in /bin/sh -c because systemd doesn’t do backtick or $() command substitution in Exec lines on its own)
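As an aside, an alternative I haven’t tested for this setup is to point systemd directly at the pid file with a PIDFile= line, so it tracks and stops the autossh process itself. That would look something like this in the [Service] section:

[Service]
Type=forking
PIDFile=/home/raul/maillink.pid
# ...remaining lines as above; ExecStop then becomes optional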

Next, we need to update systemd to see the new service:

sudo systemctl daemon-reload

Now your systemd unit is loaded. You should be able to verify this by running systemctl status maillink, which should give you a bit of information on your service and its description.

Next, we can test the systemd setup out. First start the service once using sudo systemctl start maillink, and make sure it starts without error messages. Check it as well with systemctl status maillink, and verify the port is there using netstat -tnlp4.

If all went well, that status command should give you some nice output, including the process tree for the service, and an excerpt of relevant parts of the syslog.

Make sure you also verify the stop command, with systemctl stop maillink. This should turn off the port in netstat, and you should also no longer see autossh or ssh when you run ps -e.

If all looks good, you’re good to set this up and leave it! Enable the service to autostart using sudo systemctl enable maillink, and if it’s not started already, start it back up with sudo systemctl start maillink.
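On reasonably recent systemd versions you can also do both at once:

sudo systemctl enable --now maillink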

And, here’s hoping this was clear and helpful. If you catch any bugs, please let me know!

Quick and Dirty Live View of rsyslog Output

I mentioned in a post yesterday that I was watching the syslog of my router to see when it sent a boot image over tftp. However, OpenWRT does not have a “live updating” syslog view – so how was I doing that, just clicking refresh over and over with my third hand, while holding down a reset button with the other two? No, there’s a really easy way to do this with a stupidly simple bash script.

My routers use remote logging to an internal rsyslog server on my LAN, and you’ll see my script is set up to support that. However, this is very easy to modify to support OpenWRT’s default logging, as well.

Without further ado, here’s the script, which lives in my log folder:

#!/bin/sh

# Live log display

while true; do
        tail -n 52 "$1"
        sleep 5
done

The various consoles I log into this from have anywhere from a 40 to 50 line display, hence the “52” – it’s reading and displaying the last 52 lines of the log every five seconds. If you always use shorter consoles, you can easily trim this down.
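If you’d rather have it size itself to whatever console you’re on, one small tweak (a sketch, assuming tput is available on that machine) is to ask the terminal for its height instead of hard-coding 52:

#!/bin/sh

# Live log display, sized to the current terminal height

while true; do
        tail -n "$(tput lines)" "$1"
        sleep 5
done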

By passing the name of the log file you want to read, this script has also been made “universal” in that it can be used to monitor any log on your machine. I also use it on a couple of my other servers, with no modifications. If you want to monitor “hexenbiest.log” you simply cd into the appropriate log folder, and run:

./loglive hexenbiest.log

Stock OpenWRT doesn’t write the log to a real file; it uses a ring buffer in memory that can be read with the logread command. To adapt this script to a stock OpenWRT router, place it in the home folder (/root/) instead, and change it to read as follows:

#!/bin/sh

# Live log display

while true; do
        logread | tail -n 52
        sleep 5
done

This way, instead of reading the last 52 lines of a file every five seconds, it’s reading the last 52 lines of the logread output instead.

You might think it would make sense to clear the terminal before each output, but I didn’t personally find that helpful – it just produced a visible flicker every time the log updated. It could be handy if you want an obvious cue each time it re-reads, but I left it out.

Using dnsmasq under OpenWRT as a TFTP boot server

Update (2022-03-13): this past year, Mikrotik did something that broke their ability to netboot over DHCP (affecting any RouterOS version newer than 6.46.6), which makes flashing these routers using most people’s usual methods (often tftp32, which is distributed with ROOter images, for example) much more difficult. The method in this article is unaffected, and still works fine. Using this method, dnsmasq is actually transparently handling the Mikrotik using BOOTP which is not broken, and is Mikrotik’s default netboot mode. I didn’t even realize this when first writing the article, because at the time it didn’t matter. However, I just flashed a new hEX yesterday using these instructions, and it was on RouterOS 6.47.9.

Lots of routers now offer a nice little web interface you can use to upload firmware. However, there are still a lot of routers that are easiest to flash using netboot methods like tftp. There are plenty of tutorials on doing this, but most focus on using a server installed on your computer. If this is a second router and you already have a working OpenWRT main router, it’s often actually much easier to just use your main router to TFTP boot, which is something dnsmasq (the default DHCP and DNS server) can do out of the box.

In my case, I already have a primary router with external USB storage up and running. This brief tutorial gives the bare bones steps on what you need to do to use this to flash a second router that supports netboot. I’ll be flashing a Mikrotik hEX RB750Gr3 in this example, since I had one I needed to do anyway. If you don’t already have some external storage set up on your main router, take care of that first – the existing tutorials for that are pretty good, so I won’t duplicate that here.

First, boot up your new router at least once and get its MAC address. For some reason things will go more smoothly if you assign it a static IP when it first boots up as a DHCP client.

Configure /etc/config/dhcp (which controls dnsmasq) on your main router. First, turn on the tftp server, and point it to your USB storage:

config dnsmasq
     ...
     option enable_tftp '1'
     option tftp_root '/mnt/stor/tftp'

Make sure that second line you added points to the correct folder on your USB storage.

Add a static IP for the box you’ll flash:

config host
      option mac 'B8:27:EB:2B:08:85'
      option name 'somehost'
      option ip '192.168.1.240'

Change that MAC to your new router’s, and give it whatever name and WAN address you’ll remember. You won’t actually need it once it boots up, and you can delete this section once your new router is flashed.
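One step worth spelling out: after editing /etc/config/dhcp, restart dnsmasq on the main router so it picks up the tftp and static host settings:

/etc/init.d/dnsmasq restart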

Now, drop the firmware files into that tftp folder. For TFTP booted routers, you usually need two firmware images: one the router can netboot from over TFTP (which usually has “factory” in the name), and the real copy that gets written to the onboard flash (usually with “upgrade” in the name). This is a two step process – the netbooted image will not actually be saved to the router, which incidentally makes this a great way to test an OpenWRT build before you flash. You then use the netbooted “factory” image to flash the router permanently with the “upgrade” image. If you skip that second step, when you reboot the router it’ll go straight back to its original OS and settings from its onboard flash.

Now, the critical part – take that netboot image in your folder (mine is “openwrt-RB750gr3-GO2021-02-03-factory.bin” for the OpenWRT ROOter fork), and rename it “vmlinux”.
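With the tftp root from earlier, that’s just a copy (or rename) under the new name; the filename here is my particular build, so substitute your own:

cd /mnt/stor/tftp
cp openwrt-RB750gr3-GO2021-02-03-factory.bin vmlinux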

Some router manufacturers also expect to find your TFTP server at a specific address. Mikrotik apparently expects 192.168.1.10. If your LAN is already on 192.168.1.0 and the .10 address is free, it is trivial to add .10 as a second address on your main router (this will not affect anything else of significance). From your main router’s command line, simply run:

ip addr add 192.168.1.10 dev eth0.5

Change the bit after “dev” to match whichever interface or port connects to your LAN. In my case, my LAN is on a VLAN port of my router, hence eth0.5.
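You can confirm the extra address took with something like:

ip addr show dev eth0.5 | grep 192.168.1.10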

Now, it’s time to netboot. Shut down your new router if it isn’t already, and plug its WAN port into your existing network.

For the Mikrotik hEX, to trigger a netboot you plug in the power jack with the reset button already held down. The button does several things depending on how long you hold it down; the LED comes on steady, then starts flashing, then goes off completely. Once it’s off completely you can release the button, as the router will then be looking for a netboot. If you’re watching the log on your main DHCP router, it’ll post a log line when it sends the boot image to a DHCP client.

Give it time to boot up, and then try connecting from a client plugged into the LAN side of the new router. One advantage of doing it this way is that you don’t tie up your main computer as both the boot tftp server and the tool you need to log into the new router with. If your OpenWRT image netbooted successfully, you should find your new router at 192.168.1.1 from your test computer.

Now, for the last important part – flash the permanent image! You need to go to System -> Backup / Flash Firmware on the new router and flash that upgrade image, or what you’ve just done won’t stay permanent.

Dell Latitude E6410 with GPU overheating – Solved!

This one took stupidly long to sort out.

My work, which shall remain unnamed, had a pile of these Dell Latitude E6410s for years, most of which worked adequately if never particularly well. They were quirky. They were slowly retired in favor of better equipment, and a handful were kept around as “emergency spares” until they got so out of date that they were finally kicked off the domain. They became off-network utility machines until my IT folks couldn’t even keep Windows working on them anymore. The last one finally got officially “disposed of” and handed to me to see if I could make anything useful for the office out of it, because I seem to be able to keep stuff alive.

Here’s the problem it had – you could get it to run for a few minutes, and then it’d just overheat and switch off. I switched it over to Debian because it’s a little lighter on resources (and we needed a spare linux box anyway), and that did improve things … slightly, for a year or so.

If you search “overheating E6410” on Google, you’ll see a pile of them, with almost no solutions. I did eventually conclude the thermal pads on its heat sink had died, and pulled it apart to replace the pads with decent thermal paste. This got us almost another year of usable performance out of it – the CPU performed well, but the GPU would still overheat if it did anything hard.

Finally, a year ago, it got back to just overheating the GPU after two minutes, and I stuffed it in a drawer until I had time to screw around with it.

I had a use for it this afternoon, and an hour to spare to look it back over, so I went ahead and pulled the heat sink back off to get a look. It’s easy to get to on these – one screw and the bottom cover slides off, two screws to remove the fan, and four to pull the rest of the heat sink. It’s one combined unit for the CPU and GPU:

[Image: Dell Latitude E6410 heat sink]

I still had decent thermal paste that hadn’t hardened, and the radiator on the right hand end wasn’t clogged up. I could hear the fan working. But I finally spotted the problem – the GPU wasn’t making very good contact with this oddly shaped heat sink module! The CPU would purr along at 47C, and the GPU would shoot up to 95C and trigger a shutoff within minutes.

Since this machine was already “technically trash” and had one foot in the recycle bin, I said heck with it. The GPU is under the little, studded bit of the aluminum casting, right under where the copper heat bar reverses curvature. I pulled the assembly back out, took it in both hands, and just bent the heat sink bar. I bent it down in the middle as shown, so that with the radiator in its case slot on the right, and the CPU screws mounted, it might have a shot in heck of actually having the heat sink properly contact the GPU.

Turns out that’s all it needed. Now it’s sitting here with the GPU running at 47C as well, and it’s useful again. Not bad for a machine I was about two minutes away from drop kicking toward the recycle bin.

So, if you’ve been wading through the dozens of search results on overheating E6410s, and you’re at your wits’ end – pull the heat sink off and bend it to get better contact. It’s quick, easy, and you might well save your sanity, too.

Update 2021-03-17: This little machine has now been running for eight days straight, without a single GPU excursion over about 60C that I’ve noticed. This was just bad contact between the GPU and heat sink all along. Heck of a note … but it bodes well for getting a few more years use out of the thing!