A Simple Web Server

I started thinking about web servers and RESTful APIs in 2021 after watching a talk by Malcom Sparks for the London Clojurians meetup. The nitty-gritty of the talk is how to build a ring middleware handler (the one to rule them all) a little bit at a time as he follows the various HTTP RFCs. Malcom’s talk assumes a web server and is only concerned with the middleware piece. (While Malcom claims that this will be the last ring handler we will ever write, I think the real value of the exercise is that it provides a basis for understanding the gory details of HTTP request handling.)

Much later, I stumbled on a short Computerphile video by Laurence Tratt that shows how to implement a web server in 25 lines of Rust. And then I read a blog post by Daniel Szmulewicz that took Computerphile’s idea further, implementing a more complex (but still relatively simple) web server in Clojure from the ground up, including writing basic handlers. This short piece reminded me of Malcom’s talk, which I still wanted to revisit. (There have also been a number of articles on the HTTP protocol, e.g., HTTP/1.0 from Scratch.)

I decided to write a little version of the server in Guile as a way to learn about network socket programming and bring up my Guile game a bit. The notes below are just my jotting down of what I discovered as I followed this idea and are not meant to provide any guidance: network programming is tricky, reader beware!

Guile’s Example Server

The Guile Reference manual’s section on networking has two code examples of how to use network sockets: a client and a server–we take them as our starting point:

The client

(let ((s (socket PF_INET SOCK_STREAM 0)))
  (connect s AF_INET (inet-pton AF_INET "127.0.0.1") 80)
  (display "GET / HTTP/1.0\r\n\r\n" s)

  (do ((line (read-line s) (read-line s)))
      ((eof-object? line))
    (display line)
    (newline)))

and the server

(let ((s (socket PF_INET SOCK_STREAM 0)))
  (setsockopt s SOL_SOCKET SO_REUSEADDR 1)
  ;; Specific address?
  ;; (bind s AF_INET (inet-pton AF_INET "127.0.0.1") 2904)
  (bind s AF_INET INADDR_ANY 2904)
  (listen s 5)

  (simple-format #t "Listening for clients in pid: ~S" (getpid))
  (newline)

  (while #t
    (let* ((client-connection (accept s))
           (client-details (cdr client-connection))
           (client (car client-connection)))
      (simple-format #t "Got new client connection: ~S"
                     client-details)
      (newline)
      (simple-format #t "Client address: ~S"
                     (gethostbyaddr
                      (sockaddr:addr client-details)))
      (newline)
      ;; Send back the greeting to the client port
      (display "Hello client\r\n" client)
      (close client))))

Neat! Let’s try these examples to see what happens!

The setup:

Run the server program in one terminal window
Run the client program in another terminal window
Connect to the server using Brave and Firefox

The terminal running the server starts up fine, telling us the pid of the process that is listening. The terminal executing the client program produces the expected output, Hello client, but then throws a backtrace:

Hello client
Backtrace:
In ice-9/boot-9.scm:
  1755:12  8 (with-exception-handler _ _ #:unwind? _ # _)
In unknown file:
           7 (apply-smob/0 #<thunk 7f5942315300>)
In ice-9/boot-9.scm:
    724:2  6 (call-with-prompt ("prompt") #<procedure 7f5942322140 …> …)
In ice-9/eval.scm:
    619:8  5 (_ #(#(#<directory (guile-user) 7f5942318c80>)))
In ice-9/boot-9.scm:
   2858:4  4 (save-module-excursion #<procedure 7f594230a2d0 at ice-…>)
  4410:12  3 (_)
In /home/afm/data/repos/web-server/client.scm:
     7:27  2 (_)
In ice-9/rdelim.scm:
   195:24  1 (read-line _ _)
In unknown file:
           0 (%read-line #<input-output: socket 5>)

ERROR: In procedure %read-line:
In procedure fport_read: Connection reset by peer

It looks as though the client program was expecting input but the connection was reset by the server.

We get the same message from Firefox (connection reset) when we try to connect to http://127.0.0.1:2904. Brave throws a different error when we try to connect to the same URL: ERR_INVALID_HTTP_RESPONSE. The message Hello client that should have been sent by the server is not shown on either browser.

My first hypothesis was that the GET request data sent by the client was still waiting in the socket since the server doesn’t read any data sent by the client, and that this somehow triggered an error when the server closed the socket. I added a bit of code to the server.scm file that reads data from the client to test this idea (the snippet relies on knowing that the client’s GET request is followed by a blank line consisting of \r\n):

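;; read the client request up to the blank line that ends the headers;
;; read-line strips the trailing \n, so the blank line arrives as "\r"
(do ((line (read-line client) (read-line client)))
    ;; also stop on EOF, so a client that hangs up early does not
    ;; make string=? fail on the EOF object
    ((or (eof-object? line) (string=? line "\r")))
  (display line)
  (newline))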

Sure enough: now the client.scm program succeeds, printing the server message and exiting with no errors. This behaviour, however, seems arbitrary: why would the client care whether or not the server read its message? (For more details on POSIX socket programming, look at Beej’s Guide to Network Programming. It’s a really readable tutorial on socket programming, notwithstanding that it targets C.) Now that we know the real issue, we can go hunting for an answer, which I found in this article, where the author points out a corner case of RFC 1122:

A host MAY implement a ‘half-duplex’ TCP close sequence, so that an application that has called CLOSE cannot continue to read data from the connection. If such a host issues a CLOSE call while received data is still pending in TCP, or if new data is received after CLOSE is called, its TCP SHOULD send a RST to show that data was lost.

Bingo! The GET request from the client is pending readable data, so TCP issues a RST, i.e., a connection reset message to let the client know that data has been lost.

But making sure that the server reads all the data from the client doesn’t feel like a robust approach, particularly since it establishes a coupling between the client and the host. So there must be a better way: the server can invoke (shutdown client 2) before closing the socket, where the 2 means that we are closing the socket for reception and transmission. (The shutdown procedure takes three values for its how parameter: 0, stop receiving; 1, stop transmitting, which discards data waiting to be sent, stops looking for acknowledgements, and does not retransmit lost data; 2, stop both receiving and transmitting.) From a TCP perspective, this means that the client socket receives a FIN signal, which results in the read-line function call receiving the EOF object expected by the client. Voila! This simple code change fixes the behaviour of the Guile client program: it now prints the server message and exits normally with no errors.

The fix also works in Firefox, which is now able to display the expected server message, Hello client, in the browser window; the browser’s network panel shows 16-byte transfers for the GET / and the favicon requests, with return codes of 200 for both.

Brave still complains about the invalid HTTP response, which suggests that we should provide a valid HTTP response so that Brave (and Chrome) will display the server message. Thus, we change the response in the server program:

(display
  (string-append "HTTP/1.1 200 OK\r\n"
                 "Content-Length: 12\r\n"
                 "Content-Type: text/plain; charset=utf-8\r\n"
                 "\r\n"
                 "Hello World!") client)

We can now see our message Hello World! displayed in the browser window in both Brave and Firefox. The client also displays the message, including the response headers (it doesn’t know anything about the HTTP protocol so the 200 code and headers are displayed as strings).
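The Content-Length value of 12 in the response is counted by hand from "Hello World!", so it breaks as soon as the body changes. A minimal sketch of how the header could be computed instead; make-response is a hypothetical helper that is not part of the original server, and it assumes the client port writes strings as UTF-8 (which is what the charset in the Content-Type header promises):

```scheme
(use-modules (rnrs bytevectors))   ; string->utf8, bytevector-length

;; Hypothetical helper (an assumption, not from the original code):
;; build a plain-text 200 response whose Content-Length is the UTF-8
;; byte count of BODY rather than a hand-counted constant.
(define (make-response body)
  (string-append "HTTP/1.1 200 OK\r\n"
                 "Content-Length: "
                 (number->string (bytevector-length (string->utf8 body)))
                 "\r\n"
                 "Content-Type: text/plain; charset=utf-8\r\n"
                 "\r\n"
                 body))
```

With this in place, the server could call (display (make-response "Hello World!") client). Counting bytes rather than characters matters once the body contains non-ASCII text, since Content-Length is measured in bytes.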

Our Guile code is not as succinct as the Rust and Clojure examples of the articles referenced above: those languages expose a higher-level socket abstraction, while Guile requires us to deal with the details of POSIX sockets. (We could have used a Guile web server library for its lower-level functionality, but this would not honour the spirit of the exercise.)

Tidying Up...

The code for the server program now looks like this:

(let ((s (socket PF_INET SOCK_STREAM 0)))
  (setsockopt s SOL_SOCKET SO_REUSEADDR 1)
  ;; Specific address?
  ;; (bind s AF_INET (inet-pton AF_INET "127.0.0.1") 2904)
  (bind s AF_INET INADDR_ANY 2904)
  (listen s 5)

  (simple-format #t "Listening for clients in pid: ~S" (getpid))
  (newline)

  (while #t
    (let* ((client-connection (accept s))
           (client-details (cdr client-connection))
           (client (car client-connection)))
      (simple-format #t "Got new client connection: ~S"
                     client-details)
      (newline)
      (simple-format #t "Client address: ~S"
                     (gethostbyaddr
                      (sockaddr:addr client-details)))
      (newline)
      ;; Send back the greeting to the client port
      (display (string-append "HTTP/1.1 200 OK\r\n"
                                "Content-Length: 12\r\n"
                                "Content-Type: text/plain; charset=utf-8\r\n"
                                "\r\n"
                                "Hello World!") client)
      (shutdown client 2) ;; sends FIN to client--EOF on client socket read
      (close client))))

Now is a good time to refactor the server code a bit. The first thing we need to do is factor out the code that prints the connection details to the REPL. Note that we also never close the socket used by the server to listen for connections on port 2904. And we use an infinite while loop to handle connection requests; we should stop responding to connection requests after the server’s listening socket has been closed, since there will be no more incoming requests.

Accordingly, print-connection-info! now holds all the code that prints the details of each connection request. We isolate the creation of the listening socket into its own start-server! function, and also define its counterpart, stop-server!. Finally, the connection-handling code now takes the server socket as a parameter and uses the closing of that socket as its terminating condition. Thus:

(use-modules (ice-9 futures))

(define (print-connection-info! client-details)
  (simple-format #t "Got new client connection: ~S"
                 client-details)
  (newline)
  (simple-format #t "Client address: ~S"
                 (gethostbyaddr
                  (sockaddr:addr client-details)))
  (newline))

;; server is the socket connection that the server listens on
(define (tcp-transport server)
  (future
   (while (not (port-closed? server))
     (let* ((client-connection (accept server))
            (client-details (cdr client-connection))
            (client (car client-connection)))

       (print-connection-info! client-details)

       ;; call-with-port closes the client port once the handler returns
       (call-with-port client
         (lambda (client)
           ;; send minimal OK message back to client
           (display (string-append "HTTP/1.1 200 OK\r\n"
                                   "Content-Length: 12\r\n"
                                   "Content-Type: text/plain; charset=utf-8\r\n"
                                   "\r\n"
                                   "Hello World!") client)
           ;; send FIN to client: EOF on client socket read
           (shutdown client 2)))))))

(define start-server!
  (lambda ()
    (let ((s (socket PF_INET SOCK_STREAM 0)))
      (setsockopt s SOL_SOCKET SO_REUSEADDR 1)
      ;; Specific address?
      ;; (bind s AF_INET (inet-pton AF_INET "127.0.0.1") 2904)
      (bind s AF_INET INADDR_ANY 2904)
      (listen s 5)

      (simple-format #t "Listening for clients in pid: ~S" (getpid))
      (newline)

      s)))

(define (stop-server! server)
  (shutdown server 2)
  (close server))

Note that we are also using a future to handle connection requests, so the server code no longer blocks the REPL. This allows the following workflow at the REPL:

> (define server (start-server!))
> (tcp-transport server)
> (stop-server! server)

The first line creates a socket listening for requests on port 2904 (of course, we could parameterize the port number). The next command starts up a non-blocking function that will respond to connection requests arriving at the server socket. Finally, the last line closes the socket, resulting in the termination of the execution of tcp-transport.
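As a rough illustration of parameterizing the port number, here is one way it might look using Guile’s define* with an optional argument. The name start-server-on-port! is hypothetical, chosen so that it does not clash with the start-server! defined above; the body is otherwise the same socket setup.

```scheme
;; Hypothetical variant of start-server! (an assumption, not the
;; original code): PORT is optional and defaults to 2904.
(define* (start-server-on-port! #:optional (port 2904))
  (let ((s (socket PF_INET SOCK_STREAM 0)))
    (setsockopt s SOL_SOCKET SO_REUSEADDR 1)
    (bind s AF_INET INADDR_ANY port)
    (listen s 5)

    (simple-format #t "Listening on port ~S in pid: ~S" port (getpid))
    (newline)

    ;; return the listening socket so the caller can stop it later
    s))
```

At the REPL, (define server (start-server-on-port! 8080)) would then listen on port 8080, while calling it with no arguments keeps the original behaviour.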

Next Iteration

We need to do one more thing to achieve parity with the toy web server of the article and video: read the request from the client and generate a response. In particular, we should be able to serve a file from a valid GET request.

Our next step is to read the request from the client and (temporarily) display it in the REPL so that we can figure out the logic needed for parsing the client request:

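(use-modules (ice-9 rdelim)) ; read-line

;; Collect the request lines up to the blank line that ends the
;; headers, or up to EOF if the client hangs up early.
(define (get-request port)
  (let ((line (read-line port)))
    (if (or (eof-object? line)
            (string=? "" (string-trim line)))
        '()
        (cons line (get-request port)))))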

and we update the body of the function passed to call-with-port so that we can see the client request:

(lambda (client)
  (let ((req (get-request client)))
    ;; temporarily show the request lines at the REPL
    (simple-format #t "~S" req)
    ;; send minimal OK message back to client
    (display (string-append "HTTP/1.1 200 OK\r\n"
                            "Content-Length: 12\r\n"
                            "Content-Type: text/plain; charset=utf-8\r\n"
                            "\r\n"
                            "Hello World!") client))
  ;; send FIN to client: EOF on client socket read
  (shutdown client 2))

Connecting to the server using a browser yields a list of strings (formatted for clarity):

("GET / HTTP/1.1\r"
"Host: localhost:2904\r"
"Connection: keep-alive\r"
"sec-ch-ua: \"Not/A)Brand\";v=\"8\", \"Chromium\";v=\"126\", \"Brave\";v=\"126\"\r"
"sec-ch-ua-mobile: ?0\r"
"sec-ch-ua-platform: \"Linux\"\r" "Upgrade-Insecure-Requests: 1\r"
"User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\r"
"Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8\r"
"Sec-GPC: 1\r"
"Accept-Language: en-GB,en\r"
"Sec-Fetch-Site: none\r"
"Sec-Fetch-Mode: navigate\r"
"Sec-Fetch-User: ?1\r"
"Sec-Fetch-Dest: document\r"
"Accept-Encoding: gzip, deflate, br, zstd\r")

Our task consists in parsing the GET string in terms of the command, the resource requested, and the protocol used. We then collect all the headers as a list.
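A minimal sketch of what that parsing might look like, taking the list of strings returned by get-request as input. All of the helper names here (strip-cr, parse-request-line, parse-header, parse-request) are hypothetical, and the sketch does no validation: it assumes a well-formed request line with single spaces between its three parts.

```scheme
;; Hypothetical helpers (assumptions, not from the original code).

;; Drop the trailing \r that read-line leaves on each line.
(define (strip-cr line)
  (string-trim-right line #\return))

;; "GET / HTTP/1.1\r" -> ("GET" "/" "HTTP/1.1")
(define (parse-request-line line)
  (string-split (strip-cr line) #\space))

;; "Host: localhost:2904\r" -> ("Host" . "localhost:2904")
;; Split at the first colon only, since values may contain colons.
(define (parse-header line)
  (let* ((clean (strip-cr line))
         (idx (string-index clean #\:)))
    (if idx
        (cons (substring clean 0 idx)
              (string-trim-both (substring clean (+ idx 1))))
        (cons clean ""))))

;; LINES is the list returned by get-request. Returns a two-element
;; list (request-line-parts headers), or #f for an empty request.
(define (parse-request lines)
  (if (null? lines)
      #f
      (list (parse-request-line (car lines))
            (map parse-header (cdr lines)))))
```

For example, (parse-request '("GET / HTTP/1.1\r" "Host: localhost:2904\r")) evaluates to (("GET" "/" "HTTP/1.1") (("Host" . "localhost:2904"))). Keeping the headers as an association list means a later handler could look one up with assoc.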