Table of contents
- IP Alone Isn’t Enough
- Ports and Sockets — Numbers That Point to Processes
- Why TCP Is Called a “Reliable” Protocol
- 3-Way Handshake — Three Exchanges to Open a Connection
- Closing a Connection — 4-Way Handshake
- Flow Control vs. Congestion Control
- TCP State Machine
- UDP — Choosing to Forgo Reliability
- TCP vs. UDP — Direct Comparison
- Seeing TCP with Your Own Eyes — Capturing Handshakes with tcpdump
- Common Issues in Practice
IP Alone Isn’t Enough
In Part 2, we said IP addresses determine “which host to send to.” But a single host runs many programs — a web server, a mail server, an SSH daemon. With just an IP address, there’s no way to know “which program should this data go to.” That’s where ports come in. And the mechanism that connects processes on both ends using the combination of IP + port is the socket.
On top of that sit two protocols: TCP and UDP. Both belong to OSI Layer 4 (Transport), but their personalities are polar opposites. TCP bets everything on “reliable delivery,” while UDP bets everything on “fast and lightweight.” This post traces why each was designed the way it was, and what governs the choice between them.
Ports and Sockets — Numbers That Point to Processes
A port is a 16-bit number (0–65535), a ticket number that distinguishes the multiple processes sharing the same IP. Web servers listen on port 80 or 443, SSH on 22, PostgreSQL on 5432. Nothing technically binds these numbers, but they are conventionally assigned.
- 0 ~ 1023 (Well-Known Ports): Require administrator privileges to open. HTTP 80, HTTPS 443, SSH 22, DNS 53, etc.
- 1024 ~ 49151 (Registered Ports): Service ports registered with IANA. PostgreSQL 5432, Redis 6379, Prometheus 9090, etc.
- 49152 ~ 65535 (Ephemeral Ports): Temporarily assigned by the OS when a client connects. Every time your browser connects to a server, it grabs one from this range.
A socket is a communication endpoint that bundles an IP address and a port together. When two sockets are connected, the processes on both ends can exchange data as if reading and writing to a file. From the Linux perspective, a socket is just a file descriptor. You work with it using read and write.
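To make the "a socket is just a file descriptor" point concrete, here's a minimal sketch (Unix-like systems): `socketpair()` returns two already-connected endpoints, and plain `os.read` works on the raw descriptor just as it would on a file.

```python
import os
import socket

# Two connected endpoints; to the kernel, each one is a file descriptor.
a, b = socket.socketpair()

fd = b.fileno()                # the ordinary integer fd behind socket b
a.sendall(b"hello")            # writing to one end...
print(os.read(fd, 5))          # ...is readable from the other: b'hello'

a.close()
b.close()
```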
A single connection is uniquely identified by four values — source IP, source port, destination IP, destination port. These four values are called a 4-tuple. The reason a single server can handle tens of thousands of simultaneous connections on the same port (e.g., 443) is that the client-side IP or port differs, making every tuple unique.
```mermaid
flowchart LR
    subgraph C["Client 192.168.1.10"]
        CA["Browser<br/>sport: 54321"]
    end
    subgraph S["Server 93.184.216.34"]
        SA["Web server<br/>dport: 443"]
    end
    CA -- "4-tuple<br/>(192.168.1.10, 54321, 93.184.216.34, 443)" --> SA
```
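You can read the 4-tuple straight off a live socket. This sketch (all addresses and ports chosen by the OS on loopback, purely illustrative) connects to a local listener and prints both endpoints from the client's point of view.

```python
import socket

# A listener on a loopback port the OS picks (port 0 = "assign one for me").
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())
conn, _ = listener.accept()

src_ip, src_port = client.getsockname()   # source half of the 4-tuple
dst_ip, dst_port = client.getpeername()   # destination half
print((src_ip, src_port, dst_ip, dst_port))
# e.g. ('127.0.0.1', 51234, '127.0.0.1', 38500); the ports vary per run

conn.close()
client.close()
listener.close()
```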
You can view currently open sockets on Linux with the ss command.
```bash
ss -tunap
# TCP/UDP, numeric, all states, with process names
#
# Netid  State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port   Process
# tcp    LISTEN  0       128     0.0.0.0:22           0.0.0.0:*           users:(("sshd",...))
# tcp    ESTAB   0       0       192.168.1.10:54321   93.184.216.34:443   users:(("chrome",...))
```
LISTEN means “waiting for someone to connect on this port,” and ESTAB means “the connection is open.” This is the first command you reach for in practice when asking “what’s using this port?”
Why TCP Is Called a “Reliable” Protocol
When TCP (Transmission Control Protocol) is said to guarantee reliability, it means the following:
- Order guarantee: Delivered to the receiver in the order sent
- Loss recovery: If a packet is lost in transit, it’s retransmitted until it arrives
- Duplicate elimination: If the same packet arrives twice, only one copy is used
- Flow control: If the receiver is overwhelmed, the sending rate is reduced
- Congestion control: If the network itself is congested, the overall transmission rate is lowered
All of this is achieved on top of IP. IP is a “send it and hope for the best” protocol. TCP layers reliability onto that unreliable foundation. Thanks to this, applications can maintain the illusion that “data written will arrive at the other end in order.” This illusion is the essence of TCP.
3-Way Handshake — Three Exchanges to Open a Connection
TCP is a connection-oriented protocol. Before exchanging data, both sides first agree to “let’s open a connection.” Because this agreement takes three packets, it’s called the 3-way handshake.
```mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    Note over C,S: Connection establishment
    C->>S: SYN (seq=x)
    Note right of S: Ready to accept incoming connection
    S->>C: SYN-ACK (seq=y, ack=x+1)
    C->>S: ACK (ack=y+1)
    Note over C,S: Data exchange can now begin
    C->>S: HTTP GET /
    S->>C: HTTP 200 OK + body
```
Here’s what each step means.
- SYN (Synchronize): The client sends the first packet saying “Let’s open a connection. My initial sequence number is x”
- SYN-ACK: The server responds “Sure. My initial sequence number is y, and I’m expecting x+1 from you next”
- ACK: The client wraps up with “I’m expecting y+1 from you next”
Why three times instead of two? Because both sides need to tell the other their initial sequence number and confirm the other received it. If sequence numbers get misaligned, retransmission logic breaks down. The third exchange also prevents “ghost connections” caused by old SYN packets arriving late.
This handshake is also a primary source of latency. On a network with an RTT (Round-Trip Time) of 50ms, there’s at least 50ms of delay before the first data is exchanged. HTTPS stacks TLS handshake on top of this, easily adding hundreds of milliseconds. Because of this cost, HTTP/2 keeps a single connection alive for multiple requests, and HTTP/3 abandons TCP entirely in favor of QUIC (Quick UDP Internet Connections).
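You can feel this cost directly by timing `connect()`, which returns only after the SYN, SYN-ACK, ACK exchange completes. The sketch below runs on loopback, so the elapsed time is tiny; on a real 50ms link the same call would take at least one RTT.

```python
import socket
import time

# A loopback listener to connect to.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
start = time.perf_counter()
client.connect(listener.getsockname())   # blocks for the 3-way handshake
elapsed = time.perf_counter() - start
print(f"handshake took {elapsed * 1000:.3f} ms")

client.close()
listener.close()
```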
Closing a Connection — 4-Way Handshake
Opening takes three exchanges, closing takes four. Both sides independently declare “I have nothing more to send” and get acknowledgment.
```mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: FIN
    S->>C: ACK
    Note over S: Server may still have data to send
    S->>C: FIN
    C->>S: ACK
    Note over C: Enters TIME_WAIT state briefly
```
The side that sent a FIN can still receive data coming from the other direction. This is called half-close. It’s important for protocols that need pipeline processing (SSH, large file transfers, etc.).
The client that sent the final ACK briefly enters the TIME_WAIT state. This is a safety mechanism — “in case the last ACK was lost and the server retransmits its FIN.” The default is 2 * MSL (about 60 seconds). Because of this behavior, servers processing tens of thousands of short connections can accumulate TIME_WAIT sockets, leading to port exhaustion. In such cases, kernel parameters like net.ipv4.tcp_tw_reuse may need adjustment.
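A practical consequence of TIME_WAIT: a server restarted while its old socket still lingers can fail to bind the same address. Setting SO_REUSEADDR before bind() is the standard fix; here's a minimal sketch.

```python
import socket

# SO_REUSEADDR lets bind() succeed even when the same local address is
# held by a socket lingering in TIME_WAIT from a previous server process.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))

# Read the option back; prints a nonzero value once set.
print(server.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR))
server.close()
```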
Flow Control vs. Congestion Control
When TCP is called “reliable,” the heart of that reliability is speed regulation. There are two kinds.
Flow control protects the receiver. When the receiver’s buffer is full, it signals the sender to stop. TCP uses the Window Size field in the header to report “the number of bytes I can currently accept” with every acknowledgment. If this value hits 0, the sender pauses.
Congestion control protects the network itself. When routers along the path get overloaded and start dropping packets, the sender sharply reduces speed and then slowly ramps back up. The classic algorithm is TCP Reno, with its four phases: Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery.
```mermaid
flowchart LR
    SS["Slow Start<br/>exponential increase"] --> CA["Congestion Avoidance<br/>linear increase"]
    CA -- packet loss --> FR["Fast Retransmit<br/>immediate retransmission"]
    FR --> FCR["Fast Recovery<br/>halve window size"]
    FCR --> CA
```
- Slow Start: Begins with a small window (cwnd=1) and doubles it with each ACK. Cautious initial probing
- Congestion Avoidance: Once a threshold (ssthresh) is crossed, the window grows by only 1 per round. Approaching the limit slowly
- Fast Retransmit: Three duplicate ACKs trigger immediate retransmission without waiting for the timer
- Fast Recovery: On loss detection, the window is halved and returns to Congestion Avoidance
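The four phases above can be sketched as a toy model. This is illustrative only: real TCP counts bytes rather than segments, is ACK-clocked, and has many more refinements, but the shape of the curve is the same.

```python
# Toy Reno-style congestion window, one value per round trip.
def reno_cwnd(rounds, ssthresh=16, loss_at=(20,)):
    cwnd, history = 1, []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_at:                  # loss detected via duplicate ACKs
            ssthresh = max(cwnd // 2, 2)  # halve the window...
            cwnd = ssthresh               # ...and resume from there
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start: exponential growth
        else:
            cwnd += 1                     # congestion avoidance: linear growth
    return history

print(reno_cwnd(10))  # [1, 2, 4, 8, 16, 17, 18, 19, 20, 21]
```

The exponential-then-linear pattern is visible immediately: the window doubles until it crosses ssthresh, then creeps up by one per round until a loss halves it again.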
The more recent TCP BBR (Bottleneck Bandwidth and RTT) adjusts speed by measuring actual bandwidth and RTT rather than relying on packet loss. Google champions it, and it often outperforms Reno on long-distance or mobile connections. Detailed background is well covered at Wikipedia: TCP congestion control.
Having the instinct to distinguish flow control from congestion control — “protecting the receiver vs. protecting the network” — helps you decide which knob to turn when TCP tuning is needed.
TCP State Machine
A TCP connection passes through several states during its lifecycle. Here are the ones you’ll encounter most often.
- LISTEN: Server is waiting for incoming connections
- SYN-SENT: Client has sent a SYN and is waiting for a response
- SYN-RECEIVED: Server has received a SYN and sent a SYN-ACK
- ESTABLISHED: Connection is fully open, data can be exchanged
- FIN-WAIT-1 / FIN-WAIT-2: The side actively closing the connection
- CLOSE-WAIT: The other side said “let’s close” and it’s my turn to respond
- TIME-WAIT: Safely waiting after sending the final ACK
An accumulation of CLOSE_WAIT is a signal of an application bug. The other side has sent a “done” signal, but our process hasn’t called close(). If ss shows many CLOSE_WAIT connections, inspect the connection teardown logic in the code.
UDP — Choosing to Forgo Reliability
UDP (User Datagram Protocol) follows the exact opposite philosophy of TCP. “Fire and forget.” No order guarantee, no retransmission, no flow control. The header is a mere 8 bytes.
UDP header (8 bytes):

```
Source port       (16 bits)
Destination port  (16 bits)
Length            (16 bits)
Checksum          (16 bits)
```
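The entire header fits in a single `struct` pack: four big-endian 16-bit fields. A sketch building those 8 bytes by hand (the port numbers and payload are illustrative):

```python
import struct

src_port, dst_port = 54321, 53        # e.g. an ephemeral port querying DNS
payload = b"hello"
length = 8 + len(payload)             # header plus payload, in bytes
checksum = 0                          # 0 means "no checksum" over IPv4

# Four unsigned 16-bit fields, network (big-endian) byte order.
header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
print(len(header))  # 8
```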
Why is such a protocol needed? Because for certain types of data, arriving late is worse than not arriving at all.
- DNS queries: A single small request/response and it’s done. There’s no reason to pay the handshake cost
- Real-time video/audio: Receiving a retransmitted frame from one second ago is useless. Better to drop the missed frame and receive the latest one
- Game state synchronization: Player positions are refreshed every frame. There’s no need to retransmit old positions
- Large-scale broadcast/multicast: TCP, which establishes connections with each recipient, doesn’t fit
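The "fire and forget" model shows in the API too: no listen(), no accept(), no connect(). A minimal loopback sketch (ports chosen by the OS, purely illustrative):

```python
import socket

# Receiver: just bind to a loopback port; no listen/accept needed.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
recv_sock.settimeout(5)               # don't block forever if a datagram drops
addr = recv_sock.getsockname()

# Sender: no handshake, sendto() fires a single datagram and returns.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"ping", addr)

data, peer = recv_sock.recvfrom(1024)
print(data)  # b'ping'

send_sock.close()
recv_sock.close()
```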
Need reliability on top of UDP? The application implements it. QUIC is a prime example — a protocol built on UDP that combines TCP’s reliability with TLS encryption, and HTTP/3 runs on top of it. It offers faster handshakes (0-RTT possible), connection migration, and resolution of head-of-line blocking.
TCP vs. UDP — Direct Comparison
| Attribute | TCP | UDP |
|---|---|---|
| Connection | Connection-oriented (handshake) | Connectionless |
| Reliability | Guaranteed (retransmission, ordering) | None |
| Flow/congestion control | Yes | No |
| Header size | 20+ bytes | 8 bytes |
| Speed | Relatively slower | Fast |
| Use cases | HTTP, SSH, DB, file transfer | DNS, VoIP, gaming, QUIC |
The criterion for choosing is simple. “Must the data arrive, even if late? Or is late data useless?” If the former, TCP. If the latter, UDP.
Seeing TCP with Your Own Eyes — Capturing Handshakes with tcpdump
Words keep things abstract, so let’s capture actual packets with tcpdump. This command requires root privileges.
```bash
# Capture TCP packets to/from a target host
sudo tcpdump -i any -n 'host example.com and tcp port 443' -c 6

# Example output (simplified)
# IP 192.168.1.10.54321 > 93.184.216.34.443: Flags [S], seq 1000
# IP 93.184.216.34.443 > 192.168.1.10.54321: Flags [S.], seq 2000, ack 1001
# IP 192.168.1.10.54321 > 93.184.216.34.443: Flags [.], ack 2001
# IP 192.168.1.10.54321 > 93.184.216.34.443: Flags [P.], len 517 ← ClientHello
# ...
```
[S] is SYN, [S.] is SYN-ACK, [.] alone is ACK, and [P.] is Push (data). This Flags notation is a single-character abbreviation of the TCP header’s flag bits. Once you’ve seen a handshake with your own eyes, the protocol becomes much more concrete.
You can observe UDP the same way. Running sudo tcpdump -i any -n 'udp port 53' to capture DNS queries reveals the clean simplicity — one request, one response, no handshake.
Common Issues in Practice
Here are some typical real-world headaches related to TCP/UDP.
- Keep-Alive and idle connections: Cloud load balancers and firewalls drop connections that are idle for a certain period. The application doesn’t know the connection is dead and gets an error when trying to use it. SO_KEEPALIVE or application-level heartbeats are needed
- Nagle’s algorithm and latency: TCP’s default optimization (Nagle) bundles small packets before sending. For real-time needs like chat or command-line tools, disable it with TCP_NODELAY
- Port exhaustion: Processing tens of thousands of short connections per second can exhaust ephemeral ports. This often manifests alongside TIME_WAIT issues
- MTU and fragmentation: When a TCP segment exceeds the link’s MTU (typically 1500 bytes), it gets fragmented. In VPN or tunneling environments where MTU shrinks, this can cause a sharp performance drop
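The two socket options mentioned above are each a one-line setsockopt call. A sketch (the socket here is never connected; it just shows where the knobs live):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle: send small writes immediately, trading throughput for latency.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Enable kernel keepalive probes so dead idle connections get detected.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Both read back as nonzero once set.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
sock.close()
```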
Each of these topics could fill a separate post. Here, we’re just labeling them as “these exist.”
In the next post, we dissect one of the most widely used protocols running on top of TCP/UDP — DNS. We’ll follow how the human language of example.com becomes the number 93.184.216.34, and the roles played by the various servers participating in that translation.