TCP is the worst abstraction.

You are hopefully familiar with Leaky Abstractions as described by Joel Spolsky. The idea is that when you add layers to hide messy details, you can mostly avoid having to know what exactly is going on – until something breaks. Think of it as putting a smooth plastic coating on your car. Everything is really simple and zero-maintenance until your engine breaks and now you’re peeling plastic back trying to figure out which part is on fire…

TCP makes some big promises. “Your data will magically arrive in order and on time!” “Don’t worry about it, I’ll retry for you.” “Sure – I can send any amount of data!” “Hahah, packet sizes? I’m sure we don’t have to worry about those.”

Let’s talk about springing leaks. Just like when your upstairs neighbor’s toilet springs a leak and you have to deal with the concrete realities that a high flow water source above your bedroom ceiling introduces, springing leaks means you can’t use your abstraction anymore – you now have to work with the underlying system, often at one remove (or more!) because you’re working through the abstraction you chose to shield you from this in the first place!

TCP is leaky as a sieve. TCP says “I’ll just act like a stream and send bytes to someone on the internet!” But here are just a few areas where TCP breaks:

  • If you send too much data at once (the OS send buffer fills and the write blocks, fails, or accepts only part of your data; you then have to send the rest yourself).
  • If you send too little data at a time (the OS will sometimes fix this for you, see Nagle’s Algorithm, which can be good or bad depending on when that data needs to go over the wire).
  • If you try to read too much data at once (again, the OS receive buffer has limited size – so you have to be able to read your data in chunks that fit inside that limit).
  • If you transfer data at the wrong rate (the TCP flow control rules can be a problem).
  • If you try to read too little data at a time (then OS call overhead dominates your transfer speeds).
  • If you want to assume data has arrived (it may not have; you have to peek and see how much data there is and only read if there is enough, which necessitates careful design of your protocol to accommodate this).
  • If you want to initialize a connection in a deterministic fashion. (You have to do a bunch of careful checks of domain/IP/etc. to make sure it will even go through, and once the connection is initialized you have to figure out if it’s alive or not. It can also take quite a while to establish a connection and get data flowing; see efforts like SPDY.)
  • If you are on a lossy network (it will incur arbitrary overhead resending lost data).
  • If you want to manage latency (you have to take care to send data in correct packet boundaries).
  • If you want to connect through a firewall (good luck with that one).
  • If you want to use nonblocking IO. (You have to do a bunch of platform-specific crud and even then not all actions are nonblocking; for those you have to live in a separate thread and block there.)
  • If you want to run a popular service. (There are a lot of ways the OS can be tricked by outside parties into mismanaging its resources leading to starvation/denial of service attacks.)
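To make the first few leaks concrete, here is a minimal sketch of what "just send bytes" tends to turn into, assuming POSIX sockets in blocking mode. The helper names (send_all, recv_all, send_message, recv_message) and the 4-byte length-prefix framing are made up for illustration; they are not part of any library, and a real protocol would pick its own framing.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling send() until every byte has been handed to the OS; a single
   call may accept only part of the buffer. Returns 0 on success, -1 on error. */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted by a signal, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Keep calling recv() until exactly len bytes have arrived.
   Returns 0 on success, -1 on error or if the peer closed early. */
static int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (n == 0) return -1;              /* connection closed mid-message */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical framing: a 4-byte big-endian length, then the payload. */
static int send_message(int fd, const void *payload, uint32_t len)
{
    uint32_t header = htonl(len);
    if (send_all(fd, &header, sizeof header) != 0) return -1;
    return send_all(fd, payload, len);
}

/* Reads one framed message into out (capacity cap bytes).
   Returns the payload length, or -1 on error or if the message is too big. */
static long recv_message(int fd, void *out, uint32_t cap)
{
    uint32_t header;
    uint32_t len;
    if (recv_all(fd, &header, sizeof header) != 0) return -1;
    len = ntohl(header);
    if (len > cap) return -1;
    if (recv_all(fd, out, len) != 0) return -1;
    return (long)len;
}
```

The stream has no message boundaries of its own, so the sender and receiver both have to loop, and the length prefix is the only way the receiver knows when one message ends. None of this handles nonblocking sockets, timeouts, or a peer that disappears.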

IMHO, TCP is an abstraction in name only. If you want to get any kind of decent results from it, you have to fully understand the entire stack. So now not only do you have to know everything about TCP, you have to know everything (or at least most of it) about IP, about how the OS runs its networking stack, about what tricks routers and the internet will play on you, about how your protocol’s data is set up, and so on.

I came to networking in a roundabout way. I did a couple of small TCP projects in my teens, but I spent most of my formative programming years (18-23 or so) working with Torque, which uses the User Datagram Protocol (UDP). Here’s what UDP code looks like:

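// Send a packet. The address is passed as a generic struct sockaddr pointer.
sendto(mysocket, data, dataLen, 0, (struct sockaddr *)&destAddress, sizeof(destAddress));
// Receive a packet. recvfrom takes a pointer to the address length, not the length itself.
socklen_t fromLen = sizeof(fromAddress);
recvfrom(mysocket, data, dataLen, 0, (struct sockaddr *)&fromAddress, &fromLen);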

It’s very very simple and it maps almost directly to what the Internet actually gives you, which is the ability to send and receive routed packets from peers. These packets aren’t guaranteed to arrive in order nor are they guaranteed to arrive at all. In general they won’t be corrupted but it would behoove you to check that, too.
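For a slightly fuller picture, here is a minimal sketch, assuming a POSIX system, of one datagram making the trip between two sockets over loopback. The function name udp_loopback_roundtrip and the one-second timeout are choices made for this illustration only.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Sends msg from one UDP socket to another over 127.0.0.1 and copies what
   arrives into out. Returns the datagram length, or -1 on any failure. */
static long udp_loopback_roundtrip(const char *msg, char *out, size_t cap)
{
    long result = -1;
    struct sockaddr_in addr, from;
    socklen_t addrLen = sizeof addr;
    socklen_t fromLen = sizeof from;
    struct timeval timeout = { 1, 0 };      /* 1 second, 0 microseconds */
    ssize_t n;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx >= 0 && tx >= 0) {
        /* Bind the receiver to loopback and let the OS pick a free port. */
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;

        if (bind(rx, (struct sockaddr *)&addr, sizeof addr) == 0 &&
            getsockname(rx, (struct sockaddr *)&addr, &addrLen) == 0) {
            /* Don't wait forever: a datagram is allowed to never arrive. */
            setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout);

            if (sendto(tx, msg, strlen(msg), 0,
                       (struct sockaddr *)&addr, sizeof addr) >= 0) {
                n = recvfrom(rx, out, cap, 0,
                             (struct sockaddr *)&from, &fromLen);
                if (n >= 0) result = (long)n;
            }
        }
    }

    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return result;
}
```

Each recvfrom call hands back one whole datagram or nothing, so there is no reassembly loop. The timeout is there because nothing in UDP promises the packet will show up.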

This is primitive, like banging two rocks together! Why do this to yourself? Well – it depends. If you just need to create some basic networking behavior and don’t care if it’s subpar, TCP works well enough, and if you have to, you can get it to sing for certain situations. And sometimes TCP is required because of firewalls or other technical issues. But if you want to build something that is native for the network, and really works well, go with UDP. UDP is a flat abstraction. You have to take responsibility for the network’s behavior and handle packet loss and misdelivery. But by doing so you can skip leaky abstractions and take full advantage of what the network can do for you.
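As one example of taking that responsibility, here is a minimal sketch of spotting lost and late packets by stamping each one with a sequence number. The SeqTracker type and seq_observe function are hypothetical names for this illustration, not something from Torque or any other library, and the sketch assumes the usual two's-complement conversion when casting to int32_t.

```c
#include <stdint.h>

typedef struct {
    uint32_t next_expected;  /* sequence number we expect to see next */
    uint32_t lost;           /* packets skipped over so far */
    uint32_t stale;          /* packets that arrived after we moved past them */
} SeqTracker;

typedef enum { PKT_IN_ORDER, PKT_GAP, PKT_STALE } PacketStatus;

/* Call once per received packet with the sequence number from its header. */
static PacketStatus seq_observe(SeqTracker *t, uint32_t seq)
{
    /* The signed difference keeps working when the 32-bit counter wraps. */
    int32_t delta = (int32_t)(seq - t->next_expected);

    if (delta < 0) {
        /* Older than what we have already seen: a late or duplicate packet. */
        t->stale++;
        return PKT_STALE;
    }

    /* Anything we jumped over is presumed lost (it may still show up late). */
    t->lost += (uint32_t)delta;
    t->next_expected = seq + 1;
    return delta == 0 ? PKT_IN_ORDER : PKT_GAP;
}
```

What to do with a gap is up to the application: a game might ignore a stale position update entirely, while a file transfer would ask for a resend. That choice is exactly what TCP makes for you.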

Sometimes it’s better to solve hard problems up front, rather than ignoring them and hoping they go away.

Author: Ben Garney

See https://bengarney.com/ and @bengarney for details.
