CptS/EE 455 - Socket Patterns

Socket Patterns

There are a few essential patterns that you need to learn for effective socket programming. The most basic ones were covered in the last lecture. Let's reconstruct them. (Book references: chapters 2 and 4)

Stream communications

Server

    s1 = socket(...); # args: protocol family, datagram or stream, protocol
    bind(s1, ...); # passing address family and address
    listen(s1, backlog); 
    while (still serving) {
       s2 = accept(s1); 
       while (still talking to current client) {
          send(s2, ...) and recv(s2, ...) 
       }
       close(s2);
    }
    close(s1);

Client

    s1 = socket(...);
    connect(s1, ...); # passing address family and address
    while (still talking to server) {
       send(s1, ...) and recv(s1, ...);
    }
    close(s1);
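
As a concrete instance of the two patterns above, here is a minimal sketch in C of the setup half of each side, assuming IPv4 on the loopback address and a system-chosen port. The helper names listen_on_loopback and connect_to_loopback are made up for this illustration, and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket, bind, listen on 127.0.0.1 with a system-chosen port.
 * Stores the chosen port (host byte order) in *port.
 * Returns the listening descriptor, or -1 on error. */
int listen_on_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int s1 = socket(AF_INET, SOCK_STREAM, 0);
    if (s1 < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);               /* 0 = let the system pick */

    if (bind(s1, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(s1, 5) < 0 ||
        getsockname(s1, (struct sockaddr *)&addr, &len) < 0) {
        close(s1);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return s1;
}

/* Client side: socket, connect to 127.0.0.1 on the given port.
 * Returns the connected descriptor, or -1 on error. */
int connect_to_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int s1 = socket(AF_INET, SOCK_STREAM, 0);
    if (s1 < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(s1, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(s1);
        return -1;
    }
    return s1;
}
```

The server would then call accept() on the listening descriptor in its outer loop, and both sides would use send() and recv() on their connected descriptors.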

Remember that a stream socket does not preserve message boundaries: the nth recv may return more or fewer bytes than the nth send supplied. However, bytes are always received in the order in which they were sent. Also, a send() call may accept fewer bytes than it was asked to send. Why? What are the consequences of these two facts for programming?
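
One common consequence is that programs wrap send() and recv() in loops. The following is a minimal sketch, not the only way to do it; send_all and recv_all are names invented for this example. It assumes blocking sockets and that the receiver already knows how many bytes to expect (for instance from a fixed-size header).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Call send() repeatedly until all len bytes have been accepted.
 * Returns 0 on success, -1 on error. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                     /* skip the bytes that were accepted */
        len -= (size_t)n;
    }
    return 0;
}

/* Call recv() repeatedly until exactly len bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;              /* connection closed before len bytes */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With these helpers a sender can call send_all(s1, msg, msgLen) and the receiver recv_all(s2, buf, msgLen), regardless of how the stream happens to be chopped up in between.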

Patterns for datagrams

Server

    s1 = socket(...);
    bind(s1, ...); # server only: clients need a known address to send to
    while (still serving) {
       sendto(s1, address (in), ...) and recvfrom(s1, address (out), ...) to
       talk to any and all clients
    }
    close(s1);

Client

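Just like the server! (The server must bind to an address that clients know; a client can usually skip the bind and let the system pick its port.) But note the requirements on the order of sendto and recvfrom in the two programs: the client has to send first, because the server learns a client's address only from recvfrom. Datagram communication has the advantage that message boundaries are preserved, but messages may be lost, re-ordered, or duplicated.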
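
The datagram pattern can be sketched in C as follows, assuming IPv4 UDP on the loopback address. The helpers udp_bind_loopback and udp_echo_once are hypothetical names for this example, and the 1500-byte buffer is an arbitrary choice. The server step shows how the address filled in by recvfrom is passed straight back to sendto.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on a system-chosen port.
 * The address actually bound is written to *bound.
 * Returns the descriptor, or -1 on error. */
int udp_bind_loopback(struct sockaddr_in *bound)
{
    socklen_t len = sizeof *bound;
    int s1 = socket(AF_INET, SOCK_DGRAM, 0);
    if (s1 < 0)
        return -1;

    memset(bound, 0, sizeof *bound);
    bound->sin_family = AF_INET;
    bound->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bound->sin_port = htons(0);             /* 0 = let the system pick */

    if (bind(s1, (struct sockaddr *)bound, sizeof *bound) < 0 ||
        getsockname(s1, (struct sockaddr *)bound, &len) < 0) {
        close(s1);
        return -1;
    }
    return s1;
}

/* One server step: receive a datagram and send it back to its sender.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t udp_echo_once(int s1)
{
    char buf[1500];
    struct sockaddr_in from;                /* address (out) from recvfrom */
    socklen_t fromlen = sizeof from;

    ssize_t n = recvfrom(s1, buf, sizeof buf, 0,
                         (struct sockaddr *)&from, &fromlen);
    if (n < 0)
        return -1;
    /* the same address is now the address (in) for sendto */
    return sendto(s1, buf, (size_t)n, 0, (struct sockaddr *)&from, fromlen);
}
```

A client creates its own SOCK_DGRAM socket, calls sendto with the server's address, and then waits in recv or recvfrom for the reply.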

Limitations of the basic patterns

Stream server serves only one client at a time
Datagram client/server is susceptible to deadlock if even a single message is lost
Not apparent above, but the stream server has a built-in performance limitation: a typical optimization (Nagle's algorithm) says "don't actually transmit for a while after a program performs a send, in the hope that it will send more"
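
The delayed-send optimization in the last item is commonly known as Nagle's algorithm, and on TCP sockets it can be switched off with the TCP_NODELAY socket option. A minimal sketch follows; the function name disable_send_coalescing is made up for this example, and whether turning the optimization off actually helps depends on the application's traffic pattern.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask TCP to transmit small sends immediately instead of holding them
 * back in the hope of coalescing them with later data.
 * Returns 0 on success, -1 on error (as setsockopt does). */
int disable_send_coalescing(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}
```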

Forking server pattern

We can overcome the first problem by using a separate process (or thread) to handle each client. Once a connection is accepted, responsibility for reading and writing the new socket descriptor is handed off to a new (or pre-existing) process (or thread).

Note the additional closes in the pattern below. Remember that file and socket descriptors are copied into the new process. The child does not need a descriptor for the listening socket, and the parent does not need a descriptor for the client interaction socket.

    int childCount = 0;
    s1 = socket(...); # args: protocol family, datagram or stream, protocol
    bind(s1, ...); # passing address family and address
    listen(s1, backlog); 
    while (still serving) {
       s2 = accept(s1); 
       if (fork()==0) {
          # child: talks to this one client, then exits
          close(s1); # closing fd does not close socket unless last reference
          while (still talking to current client) {
             send(s2, ...) and recv(s2, ...) 
          }
          close(s2);
          exit(0); # otherwise the child would fall into the parent's loop
       } 
       # parent (a real server must also check for fork() returning -1)
       close(s2);
       childCount++;
       # reap children here!
       while (childCount) {
          pid = waitpid(-1, NULL, WNOHANG);
          if (pid <= 0) break; # 0: no child has exited yet; -1: error
          else childCount--;
       }
    }
    close(s1);
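
The per-connection part of this pattern can be sketched in C as two helpers. This is only an illustration under stated assumptions: the child's "conversation" is a plain echo, and the names spawn_echo_child and reap_children are invented here. The accept loop that would call them is omitted.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that echoes everything received on conn_fd until the
 * client stops sending. listen_fd is closed in the child (pass -1 if
 * there is none). The parent's copy of conn_fd is always closed.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_echo_child(int listen_fd, int conn_fd)
{
    pid_t pid = fork();
    if (pid == 0) {
        char buf[512];
        ssize_t n;
        if (listen_fd >= 0)
            close(listen_fd);       /* child never accepts connections */
        while ((n = recv(conn_fd, buf, sizeof buf, 0)) > 0) {
            ssize_t off = 0;
            while (off < n) {       /* send() may accept only part */
                ssize_t m = send(conn_fd, buf + off, (size_t)(n - off), 0);
                if (m <= 0)
                    _exit(1);
                off += m;
            }
        }
        close(conn_fd);
        _exit(0);                   /* never return into the parent's loop */
    }
    close(conn_fd);                 /* parent does not talk to the client */
    return pid;
}

/* Reap every child that has already exited, without blocking. */
void reap_children(int *child_count)
{
    while (*child_count > 0 && waitpid(-1, NULL, WNOHANG) > 0)
        (*child_count)--;
}
```

In the server loop, each successful accept would be followed by spawn_echo_child(s1, s2), an increment of the child counter if the returned pid is positive, and a call to reap_children.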

Using select

The select system call allows a process to figure out which send and receive operations it can perform without waiting, and to wait for one of many fds to become ready for reading, writing, or error handling. The format is

    int select(int maxDescPlus1, fd_set *readDescs, fd_set *writeDescs, fd_set *exceptionDescs, struct timeval *timeout);

Select suspends the process until either one of the specified fds becomes ready or the timeout expires. To wait forever, pass NULL as the timeout. To return immediately without waiting, pass a timeval representing zero time.

The fd_set type represents a bit vector in which a bit is '1' iff the corresponding fd is in the set. fd_sets are manipulated with these macros:

    FD_ZERO(fd_set *set)           # remove all fds from the set
    FD_CLR(int fd, fd_set *set)    # remove fd from the set
    FD_SET(int fd, fd_set *set)    # add fd to the set
    FD_ISSET(int fd, fd_set *set)  # nonzero iff fd is in the set

Note that the first parameter of select needs to be one more than the maximum file descriptor number put into any of the sets using FD_SET. Why? (This is a common mistake! -- at least for me)
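
One way to avoid the off-by-one mistake is to compute the first argument while building the set. The sketch below uses a hypothetical helper, fill_fd_set, and assumes every descriptor is non-negative and smaller than FD_SETSIZE. The first argument tells select how many descriptor numbers, counting from 0, it has to examine, which is why it is the largest descriptor plus one and not the number of descriptors in the set.

```c
#include <sys/select.h>

/* Put fds[0..n-1] into *set and return the value to pass as select's
 * first argument: the highest descriptor plus one (0 if n == 0).
 * Assumes 0 <= fds[i] < FD_SETSIZE for every i. */
int fill_fd_set(const int *fds, int n, fd_set *set)
{
    int maxfd = -1;
    FD_ZERO(set);                   /* always start from an empty set */
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}
```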

Select returns the number of descriptors ready to perform I/O, and it changes the sets to reflect which descriptors are ready. It returns 0 if no descriptors are ready, meaning that it timed out, and it returns -1 if an error occurred, usually because an invalid descriptor was included in one of the sets. On Linux, select also changes the timeout structure to hold the time that was left when it returned. Another common mistake is to forget to re-initialize the timeout (and the fd sets, which select overwrites) prior to each call. After select returns with a non-zero result you have to check each of the fds that you asked about, first using FD_ISSET and then performing at least (and usually only) one recv (read) or send (write) operation.

You can use select with all fd sets empty to sleep with sub-second precision on systems that offer only the sleep system call, which sleeps for a whole number of seconds.
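
A minimal sketch of this trick follows. The name sleep_ms is made up for the example, and the actual sleep can be somewhat longer than requested, depending on the system's timer resolution and scheduling.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for about ms milliseconds by calling select with no descriptors.
 * Returns 0 when the timeout expires, -1 on error (for example if a
 * signal interrupts the call). */
int sleep_ms(long ms)
{
    struct timeval timeout;
    timeout.tv_sec = ms / 1000;
    timeout.tv_usec = (ms % 1000) * 1000;
    /* no fds to watch, so select can only return by timing out */
    return select(0, NULL, NULL, NULL, &timeout);
}
```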

CptS/EE 455 - Computer Communication Networks - Fall 2016, Washington State University
(c) 2004-2006 Carl H. Hauser E-mail questions or comments to Prof. Carl Hauser