Friday, March 18, 2011

TCP 3 Way Handshake


How do computers “connect” to each other over a network? At the transport layer (layer 4 of the OSI seven-layer networking model), TCP uses what we call the “Three Way Handshake”. The general idea is to make sure both sides of the connection (client and server) give positive acknowledgement of the connection. They send packets back and forth with certain flags set in the TCP headers. In the case of the handshake, the first packet has the SYN flag set, the second has both the SYN and ACK flags set, and the third has the ACK flag set. Imagine we have a client and a server. The communication for establishing the connection would be as follows:

Client                          Server

SYN          ->                                step 1

             <-                 SYN-ACK        step 2

ACK          ->                                step 3

Each step represents a packet being sent from one endpoint to the other. In order for this handshake to work, the Server has to be listening on a certain port number. When the Server receives a SYN packet on that port, it knows someone is trying to connect to it. It then checks the source IP address and port of the SYN packet. After some validation, it typically allocates state for a logical connection to the remote Client and replies with a SYN-ACK packet. Once the Client receives the SYN-ACK packet, it knows that the Server acknowledges its connection attempt, and it replies with an ACK packet of its own. When that ACK arrives, the Server considers the connection established and can hand it to the application listening on that port. The fact that both sides allocate memory and store a connection state means that TCP is a “stateful” protocol rather than a stateless one.
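
To make the three steps concrete, here is a minimal sketch in C that models the packets exchanged above. It is not real networking code: the segment struct and the make_syn(), make_syn_ack(), make_ack() and handshake_ok() helpers are hypothetical names made up for illustration. It also tracks sequence and acknowledgement numbers, a detail not covered above: each side picks an initial sequence number, and the other side acknowledges it by replying with that number plus one. The flag values (0x02 for SYN, 0x10 for ACK) are the bits used in the real TCP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as they appear in the TCP header. */
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_ACK 0x10

/* Hypothetical, stripped-down segment: only the fields the handshake needs. */
struct segment {
    uint8_t  flags;
    uint32_t seq;  /* sender's sequence number */
    uint32_t ack;  /* next sequence number expected from the peer */
};

/* Step 1: the client sends SYN with its initial sequence number. */
struct segment make_syn(uint32_t client_isn) {
    struct segment s = { TCP_FLAG_SYN, client_isn, 0 };
    return s;
}

/* Step 2: the server replies SYN+ACK, acknowledging client_isn + 1. */
struct segment make_syn_ack(const struct segment *syn, uint32_t server_isn) {
    assert(syn != NULL);
    struct segment s = { TCP_FLAG_SYN | TCP_FLAG_ACK, server_isn, syn->seq + 1 };
    return s;
}

/* Step 3: the client replies ACK, acknowledging server_isn + 1. */
struct segment make_ack(const struct segment *syn_ack) {
    assert(syn_ack != NULL);
    struct segment s = { TCP_FLAG_ACK, syn_ack->ack, syn_ack->seq + 1 };
    return s;
}

/* Returns 1 if the three segments form a consistent handshake, 0 otherwise. */
int handshake_ok(const struct segment *syn,
                 const struct segment *syn_ack,
                 const struct segment *ack) {
    return syn->flags == TCP_FLAG_SYN
        && syn_ack->flags == (TCP_FLAG_SYN | TCP_FLAG_ACK)
        && syn_ack->ack == syn->seq + 1
        && ack->flags == TCP_FLAG_ACK
        && ack->seq == syn->seq + 1
        && ack->ack == syn_ack->seq + 1;
}
```

For example, with a client initial sequence number of 1000 and a server one of 5000, the SYN-ACK carries ack 1001, and the final ACK carries seq 1001 and ack 5001.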

When writing network code, this whole process is abstracted out by system calls to the Operating System’s networking API. For the client, the call would look something like:

connect(sd, sinRemotePtr, sizeof(struct sockaddr_in));

In this case, connect() is the function we call, passing it a SOCKET handle (the socket descriptor), a pointer to a sockaddr structure holding the remote address, and the size of a sockaddr_in structure. On a blocking socket, this call takes care of sending the SYN, waiting for the SYN-ACK, and sending the ACK, and it returns 0 if the handshake succeeded. On the Server side, we would call:
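
For a fuller picture of the client side, here is a minimal sketch of everything that has to happen around the connect() call. It assumes POSIX sockets (Linux, macOS) rather than the Winsock API linked at the end of this post. On Winsock the descriptor type is SOCKET, you call closesocket() instead of close(), and WSAStartup() has to run first. The helper name connect_to() is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to ip:port (IPv4 only).
 * Returns a connected socket descriptor, or -1 on failure. */
int connect_to(const char *ip, uint16_t port) {
    assert(ip != NULL);

    /* Fill in the remote address that connect() will send the SYN to. */
    struct sockaddr_in remote;
    memset(&remote, 0, sizeof remote);
    remote.sin_family = AF_INET;
    remote.sin_port = htons(port);          /* network byte order */
    if (inet_pton(AF_INET, ip, &remote.sin_addr) != 1)
        return -1;                          /* not a valid IPv4 address */

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    /* On a blocking socket, connect() returns 0 only after the
     * handshake has completed. */
    if (connect(sd, (struct sockaddr *)&remote, sizeof remote) != 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

A call such as connect_to("127.0.0.1", 8080) would return a descriptor ready for send() and recv(), provided something is listening on that port.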

accept(ListeningSocket, sinRemotePtr, &nAddrSize);

This function takes a SOCKET handle for a socket that is already listening, a pointer to a structure where it can write the address of the incoming connection, and the address of a variable that holds the size of that structure. Once listen() has been called on the socket, the network stack answers an incoming SYN with a SYN-ACK on the Server’s behalf. accept() then waits for a connection whose handshake has completed and passes a new, connected socket up to the calling application.
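
Here is a minimal sketch of the server side, showing the calls that have to come before accept(). As an assumption, it uses POSIX sockets (Linux, macOS) rather than Winsock, and it binds to the loopback address only. The helper names listen_on_loopback() and accept_one() are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1:port (port 0 lets the OS pick one).
 * Returns the listening descriptor, or -1 on failure. */
int listen_on_loopback(uint16_t port, int backlog) {
    int ls = socket(AF_INET, SOCK_STREAM, 0);
    if (ls < 0)
        return -1;

    struct sockaddr_in local;
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    local.sin_port = htons(port);

    /* After listen() succeeds, the OS completes handshakes on this port
     * and queues the resulting connections until accept() collects them. */
    if (bind(ls, (struct sockaddr *)&local, sizeof local) != 0 ||
        listen(ls, backlog) != 0) {
        close(ls);
        return -1;
    }
    return ls;
}

/* Hypothetical helper: wait for one client and write its address to *peer.
 * Returns the new connected descriptor, or -1 on failure. */
int accept_one(int ls, struct sockaddr_in *peer) {
    assert(peer != NULL);
    socklen_t len = sizeof *peer;  /* in: buffer size, out: bytes written */
    return accept(ls, (struct sockaddr *)peer, &len);
}
```

Note that accept() returns a second socket for the new connection. The listening socket stays open and can keep accepting other clients.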

As you can see, there is much more happening under the hood than is immediately obvious when writing network code. From a programmer’s point of view, it’s just a network connection function call. Thanks to abstraction, an idea found across all engineering disciplines, the programmer doesn’t need to worry about the internal mechanics of these functions, just how to use them.

connect(): http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx

accept(): http://msdn.microsoft.com/en-us/library/ms737526(VS.85).aspx
