this post was submitted on 22 Oct 2023
427 points (98.2% liked)


Hi all home network administrators :) Haven't posted anything here since June, when I told you about Gatekeeper 1.1.0. Back then it was a pretty bare-bones (and maybe slightly buggy) DNS + DHCP server with a web UI with a list of LAN clients. Back at 1.1.0 Gatekeeper didn't even configure your LAN interface or set up NAT rules. It used to be something like dnsmasq - but with a web UI.

I've been improving it for the past couple of months: simplified it a lot, fixed bugs, optimized internals, improved the looks & added a bunch of quality-of-life features. Now it's a program that turns any Linux machine into a home internet gateway. It's closer to OpenWRT in a single executable file.

One big thing that happened was that I attempted to replace the ubiquitous nft-based NAT (where the kernel processes the packets according to a rule-based script) with nfqueue (where the kernel sends the packet to userspace so that it can be altered arbitrarily). I expected this to be buggy & slow. Well, it was hell to implement but it turns out that it's not slow at all. On the debug build my router can push 60GiB+ / second over TCP (over virtualized ethernet of course). And I'm not even using any io_uring magic yet. Quite honestly I don't even know how to explain it since it's slightly above the peak DDR4 transfer rate (I'm running dual channel DDR4-3200). Maybe the pages are not flushed to RAM & are only exchanged through CPU caches? Anyway I'm pretty excited because userspace access to all traffic opens a lot of new possibilities...
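For reference, the theoretical peak of dual channel DDR4-3200 can be worked out in a few lines. This is only a back-of-the-envelope sketch: it assumes the standard 64-bit (8-byte) channel width, and the variable names are made up for illustration.

```python
# Theoretical peak bandwidth of dual channel DDR4-3200.
# Assumption: each channel is 64 bits (8 bytes) wide, as on standard DDR4 DIMMs.
TRANSFERS_PER_SECOND = 3200 * 10**6   # DDR4-3200 = 3200 mega-transfers per second
BYTES_PER_TRANSFER = 8                # one 64-bit transfer per channel
CHANNELS = 2

peak_bytes_per_second = TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER * CHANNELS
peak_gib_per_second = peak_bytes_per_second / 2**30

print(f"{peak_bytes_per_second / 1e9:.1f} GB/s")   # 51.2 GB/s
print(f"{peak_gib_per_second:.1f} GiB/s")          # 47.7 GiB/s
```

So 60 GiB/s is indeed above what the RAM alone could deliver in theory, which fits the guess that a good part of the data never leaves the CPU caches.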

The first thing is NAT. By default Linux only supports symmetric NAT, which is pretty secure but is also fairly hard for the peer-to-peer protocols to pierce. There are some patches that make Linux full-cone, but they are not part of the mainline kernel and are not expected to become one (at least according to OpenWRT forums). Now, since we have access to every packet we can take care of this ourselves. We can create a couple of hash tables to track connections and alter the source & destination IPs, recomputing the checksums if necessary. Suddenly we can have full-cone NAT, on any Linux machine, without patching the kernel! At runtime it's not as configurable as netfilter + conntrack but it's a whole lot simpler, since now we can use a general purpose programming language rather than netfilter rules.
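To give an idea of what "alter the IP and recompute the checksum" looks like, here is a minimal sketch in Python (not Gatekeeper's actual C++ code). It rewrites the source address of a 20-byte IPv4 header and patches the header checksum incrementally, using the RFC 1624 formula HC' = ~(~HC + ~m + m'). All function names and the example addresses are invented for illustration. TCP and UDP checksums cover the addresses too (through the pseudo-header), so a real NAT would have to patch those in the same way.

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total

def ipv4_checksum(header: bytes) -> int:
    """Full checksum of an IPv4 header, treating the checksum field as zero."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum(zeroed) & 0xFFFF

def update_checksum(old_checksum: int, old_field: bytes, new_field: bytes) -> int:
    """Incremental update per RFC 1624: HC' = ~(~HC + ~m + m').

    Assumes old_field and new_field have the same, even length
    (true for IPv4 addresses and for ports).
    """
    total = ~old_checksum & 0xFFFF
    total += ones_complement_sum(bytes(b ^ 0xFF for b in old_field))  # ~m
    total += ones_complement_sum(new_field)                           # m'
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def rewrite_source(header: bytes, new_src: bytes) -> bytes:
    """Return a copy of the header with a new source IP and a patched checksum."""
    (old_checksum,) = struct.unpack("!H", header[10:12])
    new_checksum = update_checksum(old_checksum, header[12:16], new_src)
    return header[:10] + struct.pack("!H", new_checksum) + new_src + header[16:]

# Example: a LAN host's packet leaves through the WAN address.
lan_src = socket.inet_aton("192.168.1.10")
wan_src = socket.inet_aton("203.0.113.7")
dst = socket.inet_aton("198.51.100.1")
header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0x1234, 0, 64, 6, 0, lan_src, dst)
header = header[:10] + struct.pack("!H", ipv4_checksum(header)) + header[12:]

translated = rewrite_source(header, wan_src)
# The incrementally patched checksum matches a full recomputation.
print(struct.unpack("!H", translated[10:12])[0] == ipv4_checksum(translated))
```

The incremental form only touches the bytes that changed, so the cost per packet does not depend on the packet size.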

Another cool feature that we can now have is truly realtime traffic graphs. A summary of each packet traversing the network boundary is immediately streamed to the connected web UIs over WebSocket. This is way faster than the alternatives based on reading some /proc/ or /sys/ files every couple of seconds. Gatekeeper also aggregates the traffic from the last 24 hours between each pair of hosts into a histogram with 100ms resolution and allows clients to view it, scroll through it, compute stats, and download it as JSON or CSV. You can retroactively check which device talked with what IP, at what time with unprecedented resolution.
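The aggregation part could look roughly like the sketch below. It is Python rather than Gatekeeper's C++, the class and method names are hypothetical, and the CSV column names are made up; only the 100ms bucket size and the 24-hour window come from the description above.

```python
from collections import defaultdict

BUCKET_MS = 100                      # histogram resolution
WINDOW_MS = 24 * 60 * 60 * 1000      # keep the last 24 hours

class TrafficHistogram:
    def __init__(self):
        # (lan_host, remote_host) -> {bucket index: bytes seen in that bucket}
        self.buckets = defaultdict(lambda: defaultdict(int))

    def record(self, lan_host, remote_host, timestamp_ms, size):
        """Add one packet's size to the bucket that covers its timestamp."""
        self.buckets[(lan_host, remote_host)][timestamp_ms // BUCKET_MS] += size

    def expire(self, now_ms):
        """Drop buckets that start more than 24 hours before now_ms."""
        oldest = (now_ms - WINDOW_MS) // BUCKET_MS
        for series in self.buckets.values():
            for bucket in [b for b in series if b < oldest]:
                del series[bucket]

    def to_csv(self, lan_host, remote_host):
        """Export one host pair as CSV, one row per non-empty bucket."""
        series = self.buckets.get((lan_host, remote_host), {})
        lines = ["start_ms,bytes"]
        lines += [f"{b * BUCKET_MS},{series[b]}" for b in sorted(series)]
        return "\n".join(lines)

histogram = TrafficHistogram()
histogram.record("192.168.1.10", "1.2.3.4", 1000, 1500)
histogram.record("192.168.1.10", "1.2.3.4", 1050, 500)   # same 100ms bucket
histogram.record("192.168.1.10", "1.2.3.4", 1100, 40)    # next bucket
print(histogram.to_csv("192.168.1.10", "1.2.3.4"))
```

Storing only non-empty buckets keeps quiet host pairs cheap, even though a full day at 100ms resolution is 864,000 slots.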

My next step is going to be capturing the traffic that goes through into a 5MB circular buffer (separate buffer for each LAN client) & downloading it as Wireshark-compatible pcap files. Computationally it's almost free. IoT devices usually don't transmit a lot of data. 5MB may actually cover months of traffic for the simpler ones. If any device did anything weird, it will finally be possible to investigate it, even after it already happened.
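Since this feature is only planned, the following is just one way it might work: a size-bounded ring of packets that can be dumped in the classic pcap format (24-byte file header, then a 16-byte header in front of every packet). The class name is hypothetical, and the sketch assumes whole Ethernet frames are stored (link type 1); raw IP packets would need a different link type.

```python
import struct
from collections import deque

RECORD_HEADER_SIZE = 16
PCAP_GLOBAL_HEADER = struct.pack(
    "<IHHiIII",
    0xA1B2C3D4,  # magic number (microsecond timestamps)
    2, 4,        # pcap format version 2.4
    0,           # timezone offset, normally 0
    0,           # timestamp accuracy, normally 0
    65535,       # snaplen: maximum stored length of a packet
    1,           # link type 1 = Ethernet (assumption, see above)
)

class PacketRing:
    """Keeps the most recent packets of one LAN client within a byte budget."""

    def __init__(self, capacity_bytes=5 * 1024 * 1024):
        self.capacity = capacity_bytes
        self.size = 0
        self.records = deque()   # (timestamp in microseconds, frame bytes)

    def append(self, timestamp_us, frame):
        self.records.append((timestamp_us, frame))
        self.size += RECORD_HEADER_SIZE + len(frame)
        # Evict the oldest packets until we fit in the budget again.
        while self.size > self.capacity and self.records:
            _, dropped = self.records.popleft()
            self.size -= RECORD_HEADER_SIZE + len(dropped)

    def to_pcap(self):
        out = [PCAP_GLOBAL_HEADER]
        for timestamp_us, frame in self.records:
            seconds, micros = divmod(timestamp_us, 1_000_000)
            out.append(struct.pack("<IIII", seconds, micros, len(frame), len(frame)))
            out.append(frame)
        return b"".join(out)

ring = PacketRing(capacity_bytes=100)     # tiny budget to show the eviction
ring.append(1_000_000, b"a" * 40)
ring.append(2_500_000, b"b" * 40)         # pushes the first packet out
pcap_bytes = ring.to_pcap()
print(len(ring.records), len(pcap_bytes))
```

Counting the per-packet header against the budget means the exported file stays close to the configured size.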

Long-term Gatekeeper could do even more. For example offer assistance in setting up TLS MITM, perform some online grouping / analysis on the live traffic.

I still have some groundwork to do, like automatically setting up Wireless LAN and bridging multiple interfaces into a single one, and I think there may be a bug that causes crashes when checking GitHub for updates. But I wanted to share it sooner rather than later. I hope that despite its imperfections some of you will find it useful!

(I've had some issues with cross-instance posting. This is attempt #6)

[–] poVoq@slrpnk.net 7 points 1 year ago (1 children)

The readme is not so clear on this: does it also do port forwarding?

But seems pretty cool indeed 👍

[–] maf@lemmy.world 8 points 1 year ago* (last edited 1 year ago) (3 children)

Unfortunately explicit, stable port redirections are something that is still missing. I'll have to implement them (with a proper UI) eventually because under the hood they are also a necessary building block for other features. At the moment there are only "ephemeral" port redirects, which may be sufficient for you. They are created automatically when a LAN machine sends out a packet from some source port. That port is then implicitly forwarded back to that machine. This is actually a part of the "Full Cone NAT" thing.

This can be triggered manually for example with something like:

nc -p 80 1.2.3.4 1234 # send a dummy TCP packet from port 80

Ephemeral port redirections don't expire but can be taken over if another LAN host also uses the same source port for outgoing traffic. This may happen randomly because source ports are usually picked at random by the OS. Generally ports below ~32k should be fairly stable because Linux doesn't use those by default (I don't know about Windows). Redirecting ports below 1024 should be even more stable because they're reserved for specific well-known services.
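The behaviour described above can be modelled in a few lines. This is a simplified Python sketch, not the real implementation: the class, its methods and the helper function are hypothetical, and the port range is the Linux default for net.ipv4.ip_local_port_range (32768 to 60999), which an administrator may have changed.

```python
# Default Linux range for automatically picked source ports (assumption:
# the sysctl net.ipv4.ip_local_port_range has not been changed).
LINUX_EPHEMERAL_RANGE = (32768, 60999)

def in_default_linux_ephemeral_range(port):
    """True if a Linux host may pick this source port at random by default."""
    low, high = LINUX_EPHEMERAL_RANGE
    return low <= port <= high

class EphemeralRedirects:
    """Maps (protocol, port) to the LAN host that last sent from that port."""

    def __init__(self):
        self.owner = {}

    def on_outbound(self, protocol, lan_ip, source_port):
        """Record the sender; return the host that lost the port, if any."""
        previous = self.owner.get((protocol, source_port))
        self.owner[(protocol, source_port)] = lan_ip   # entries never expire
        return previous if previous != lan_ip else None

    def on_inbound(self, protocol, destination_port):
        """Return the LAN host that should receive traffic for this port."""
        return self.owner.get((protocol, destination_port))

redirects = EphemeralRedirects()
redirects.on_outbound("tcp", "192.168.1.10", 80)              # e.g. the nc command
print(redirects.on_inbound("tcp", 80))                        # 192.168.1.10
displaced = redirects.on_outbound("tcp", "192.168.1.20", 80)  # another host takes over
print(displaced, redirects.on_inbound("tcp", 80))             # 192.168.1.10 192.168.1.20
print(in_default_linux_ephemeral_range(80))                   # False
```

A port like 80 is only taken over if another LAN host deliberately sends from it, while a port inside the ephemeral range can be claimed by any host at any time.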

[–] astraeus@programming.dev 4 points 1 year ago (1 children)

What makes the port redirections difficult to implement in the code? I’m imagining the kernel has some way of handling this without too many external libraries but I’m not well-versed enough on this to know for sure.

[–] maf@lemmy.world 6 points 1 year ago (1 children)

The relevant part of NAT is actually just those 20 lines.

The hardest part is actually the UI :P The difficulty in building nice UI comes from potential ports listening on the local WAN interface (for example if the machine is also running any HTTP or SSH servers). I'd like the user to see at a glance what ports are used for what (port used by a local service - what service is that?, ephemeral port redirection using the full cone nat table - where is it redirected?, any symmetric nat connections together with their last activity / timeouts / traffic summary). Ideally the same interface should also allow the user to create new redirects.

[–] InvertedParallax@lemm.ee 3 points 1 year ago

I love what you did, especially the c++.

Using a unifi right now but this is the perfect replacement, especially since it's programmable, just put a few nic ports on a vm and let it run.

Just beautiful.

[–] nybble41@programming.dev 2 points 1 year ago (1 children)

So you're not remapping the source ports to be unique? There's no mechanism to avoid collisions when multiple clients use the same source port? Full Cone NAT implies that you have to remember the mapping (potentially indefinitely—if you ever reassign a given external IP:port combination to a different internal IP or port after it's been used you're not implementing Full Cone NAT), but not that the internal and external ports need to be identical. It would generally only be used when you have a large enough pool of external IP addresses available to assign a unique external IP:port for every internal IP:port. Which usually implies a unique external IP for each internal IP, as you can't restrict the number of unique ports used by each client. This is why most routers only implement Symmetric NAT.

(If you do have sufficient external IPs the Linux kernel can do Full Cone NAT by translating only the IP addresses and not the ports, via SNAT/DNAT prefix mapping. The part it lacks, for very practical reasons, is support for attempting to create permanent unique mappings from a larger number of unconstrained internal IP:port combinations to a smaller number of external ones.)

[–] maf@szmer.info 2 points 1 year ago* (last edited 1 year ago)

So you’re not remapping the source ports to be unique? There’s no mechanism to avoid collisions when multiple clients use the same source port?

Regarding port collisions: Gatekeeper has both a Symmetric NAT and a Full Cone NAT, and the two are used in tandem. I didn't mention the former before. Symmetric NAT takes precedence over Full Cone NAT when a connection has already been established (we observed the remote host and have a record of which LAN IP they're talking to). You're 100% correct that without Symmetric NAT there would be port collisions and computers in LAN would fight over ports. I actually started out with only the Full Cone NAT (where collisions can happen) and used it on my network for a couple of weeks. It seemed to work in my home environment but I was a little worried about potential flakiness, so I eventually implemented a backup mechanism.
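As a rough illustration of that precedence rule (a Python sketch with invented names, not the real code): inbound packets are first matched against the per-connection symmetric table, and only fall back to the full cone table when the remote endpoint is unknown. The sketch does not cover two LAN hosts talking to the same remote endpoint from the same source port; that case would need port remapping or some other tie-breaker.

```python
class TwoTierNat:
    def __init__(self):
        # Symmetric table: one entry per observed connection.
        # (protocol, remote_ip, remote_port, local_port) -> LAN IP
        self.symmetric = {}
        # Full cone table: last LAN host that sent from a given port.
        # (protocol, local_port) -> LAN IP
        self.full_cone = {}

    def on_outbound(self, protocol, lan_ip, lan_port, remote_ip, remote_port):
        self.symmetric[(protocol, remote_ip, remote_port, lan_port)] = lan_ip
        self.full_cone[(protocol, lan_port)] = lan_ip

    def route_inbound(self, protocol, remote_ip, remote_port, local_port):
        """Pick the LAN host for an inbound packet, or None to drop it."""
        key = (protocol, remote_ip, remote_port, local_port)
        if key in self.symmetric:             # known connection wins
            return self.symmetric[key]
        return self.full_cone.get((protocol, local_port))

nat = TwoTierNat()
# Two LAN hosts happen to use the same source port for different servers.
nat.on_outbound("udp", "192.168.1.10", 5000, "198.51.100.1", 443)
nat.on_outbound("udp", "192.168.1.20", 5000, "198.51.100.2", 443)

print(nat.route_inbound("udp", "198.51.100.1", 443, 5000))  # 192.168.1.10
print(nat.route_inbound("udp", "198.51.100.2", 443, 5000))  # 192.168.1.20
print(nat.route_inbound("udp", "203.0.113.9", 9999, 5000))  # 192.168.1.20
```

Replies on established connections reach the right host even after the full cone entry for the port was taken over; only traffic from previously unseen peers follows the latest owner.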

Full Cone NAT implies that you have to remember the mapping (potentially indefinitely—if you ever reassign a given external IP:port combination to a different internal IP or port after it’s been used you’re not implementing Full Cone NAT) (...)

Ah, I also recalled something like that! What you're saying about NAT assignments being permanent & requiring multiple IPs to avoid collisions. I think there was a course at my university or some Cisco course that taught that... I haven't been able to find any online sources that would confirm those definitions today but I also remember something along the lines of what you're describing. I have no idea what happened with those terms. Maybe the "permanent assignments" don't make much sense in wireless networks, where WiFi devices can appear and disappear at any time?

Edit: I found it - the proper term for this was "Static NAT" (as opposed to "Dynamic NAT" where the redirections expire).

(...) but not that the internal and external ports need to be identical.

Right. Port preservation is not a strictly necessary part of Full Cone NAT. It's a nice feature though. I guess the technical classification would be "Full Cone NAT with port preservation".

(If you do have sufficient external IPs the Linux kernel can do Full Cone NAT by translating only the IP addresses and not the ports, via SNAT/DNAT prefix mapping. The part it lacks, for very practical reasons, is support for attempting to create permanent unique mappings from a larger number of unconstrained internal IP:port combinations to a smaller number of external ones.)

This is very cool indeed. I didn't know that. Thanks!