Insights from the cURL developer
The first talk of the day was by Daniel Stenberg. If you have ever used cURL, you might recognize his name from his personal domain haxx.se. The world is changing, and a tool like cURL has to evolve with it. One of those big changes is support for HTTP/2, a fairly new protocol.
Daniel started off by explaining the importance of HTTP for the internet. We run almost everything on it, and usage has grown tremendously. In the last four years, the average website got heavier. The number of objects (HTML, CSS, JavaScript, etc.) grew from 80 to 100, for example. More impressive is the size of an average web page: four years ago that was about 800 KB, now it is 2300 KB.
Problem: latency
The main issue with the “current” HTTP/1.1 protocol is what Daniel calls the “round trip bonanza”. A lot of connections are initiated (usually around 40 for the average website), each requesting a single object. The requests themselves need to ping-pong between sender and receiver, and as they are queued, the delays add up. The slightest addition of latency makes the current HTTP protocol inefficient. For most users bandwidth is no longer the issue; this inefficient queuing is. Daniel also referred to it as “head of line blocking”, like picking the quickest line in the supermarket: the first person in that line has a huge impact on everyone behind them.
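To see where the time goes on a single request, curl itself can report timing details. A minimal sketch, using curl's --write-out variables and example.com as a placeholder host:

# Report DNS lookup, TCP connect, and time-to-first-byte for one request.
curl -s -o /dev/null \
  -w 'DNS: %{time_namelookup}s  connect: %{time_connect}s  TTFB: %{time_starttransfer}s\n' \
  https://example.com/

On a high-latency link, the gap between the connect time and the time-to-first-byte is exactly the round-trip cost Daniel described.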
HTTP/1.1 workarounds
Over the last years, people came up with clever ideas, or rather hacks, to improve efficiency. Things like spriting, where many objects are merged and delivered as one single object. Think of small icons, which are merged into a single PNG. Then there is concatenation (cat *.js > newfile.js), to reduce the number of files. Another example is embedding images in your CSS files with base64 encoding, which is called inlining. Last but not least, Daniel mentioned sharding. As browsers generally allow up to 6 connections per domain, people started to spread their objects over more domains, which effectively allows more connections. With many tabs open in your browser, you can imagine the impact your browser has on the TCP stack of your operating system.
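As an illustration of the inlining trick, such a data URI can be generated from the shell. A small sketch, with icon.png as a placeholder file (-w0 is the GNU coreutils base64 flag to disable line wrapping):

# Emit a CSS rule with the image embedded as a base64 data URI.
printf 'background: url(data:image/png;base64,%s);\n' "$(base64 -w0 icon.png)"

This saves a round trip for the image, at the cost of a somewhat larger payload that is now cached together with the CSS file.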
HTTP History
The HTTP protocol itself has a rich history, even though it does not have that many versions. It started in 1996 with HTTP/1.0 (RFC 1945), followed by HTTP/1.1 in 1997 (RFC 2068) and an update in 1999 (RFC 2616). Then it was silent for a long time, until 2007, when the httpbis working group started refreshing HTTP/1.1.
The real big change started in 2009 at the well-known search engine Google. Their SPDY protocol set out to solve many of the inefficiencies of HTTP/1.1, and it was implemented in their data centers and in the Chrome browser in 2011. The next year, the work on HTTP/2 started. In 2014 we got a slight revision to HTTP/1.1 (RFC 7230), finally making room for HTTP/2 in 2015 (RFC 7540).
What is HTTP/2?
If you create a new protocol, adoption is key. This is especially true for something with the importance of HTTP/2, which seriously changes the web. Daniel explained that the protocol is actually a framing layer that maintains most of the existing semantics: it does not change the old protocol, nor too many related things. For example, POST and GET requests remain, and the naming conventions (http:// and https://) are unchanged as well. Another thing he pointed out is that HTTP/1.1 will remain around for a long time.
One of the biggest lessons learned is that protocols should not have too many optional components. HTTP/2 therefore consists mostly of mandatory items, specifying how it should be implemented: you are either compliant with the specification or not at all. That keeps things simple. For that same reason there is no minor version; the next version will most likely be HTTP/3.
Binary versus plaintext
One of the biggest changes is that the new protocol is no longer plaintext. Telnetting to your web server no longer gives you the typical response. Instead, the protocol is binary, which makes framing much easier. It also has performance benefits, as fewer translations are needed. With default support for TLS and compression, things have definitely changed. Asked how to do analysis now, Daniel noted that popular tools like Wireshark already have support; other tools for debugging or security testing will have to follow.
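Even though telnet is out, curl can still show you the exchange. A quick sketch, assuming a curl built with HTTP/2 support and example.com as a placeholder host:

# -v prints the ALPN negotiation and the request/response headers;
# --http2 asks the server to use the new protocol.
curl -v --http2 https://example.com/ -o /dev/null

The verbose output shows whether h2 was negotiated and translates the binary frames back into readable headers for you.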
Headers
The new protocol has two main frame types: headers and data. Regarding those headers, it is interesting to know that they have grown over the past years, as more and more data is included, like cookies. With HTTP/2 they are compressed as well (using HPACK), which pays off especially because there is a lot of repetition involved.
Connections
One of the biggest changes is the reuse of connections. Data streams are now multiplexed, enhancing data transfers and reducing overhead. The two endpoints decide together how many parallel streams occur. Another improvement is so-called “connection coalescing”: if the client detects that multiple sites are served by the same system, those connections are “unsharded” and merged into an existing connection.
Where HTTP requests previously were done one by one, in the new protocol they are done at the same time. So there is no more waiting for the first request to finish, which reduces the initial delay. To ensure the most optimal data flow, streams can be prioritized by giving them a weight.
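Multiplexing is easy to try from the command line. A small sketch with a newer curl release (7.66 or later for --parallel); the URLs are placeholders:

# Fetch several objects at once; over HTTP/2 curl multiplexes them
# onto a single connection instead of opening one per object.
curl --http2 --parallel -s -o /dev/null \
  https://example.com/a.css https://example.com/b.js https://example.com/c.png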
Server push
Normally the client does the requests: it fetches the first pieces of HTML, then decides what else to request, like CSS and JavaScript files. With the new protocol that changes. The server is allowed to make suggestions and push data, that is, if the client wants to accept it.
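To watch server push in action, the nghttp client from the nghttp2 project is handy. A sketch, assuming a server that actually pushes resources; example.com is a placeholder:

# -v prints the incoming frames, so any PUSH_PROMISE frames from the
# server become visible; -n discards the downloaded data.
nghttp -nv https://example.com/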
Adoption rates
As of April 2016, Daniel shared that Firefox already did 23% of its traffic over HTTP/2, while HTTPS is used for 35% of the traffic. Most browsers already support the new protocol. When it comes to the top 10 million websites, 7% have adopted the new protocol. That increases to 9% for the top 1 million, and even more for the top 500 (19%). Not bad at all, considering that most traffic goes to that last group. One of the big content delivery networks, Akamai, made the switch. The Google bots and the Amazon platform are expected to follow soon.
How you can adopt
Daniel explained that you can easily run HTTP/2 yourself. Make sure to use an up-to-date browser. To see if you are using the new protocol, you can install a browser extension (search for “HTTP/2 indicator”).
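You can also check from the command line with curl itself. A sketch with a newer curl release (7.50 or later for the http_version variable); example.com is a placeholder:

# Prints the negotiated HTTP version, e.g. "2" when HTTP/2 was used.
curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://example.com/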
If you are doing things on the server side, it gets a little bit more complicated. First of all you need a web server that supports it, and the version from your distribution's repositories might not be recent enough. So have a look at the latest version of LiteSpeed, nginx, or Apache. Newer web server daemons like nghttp2, H2O, and Caddy definitely support it.
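For nginx, enabling the protocol is a one-word change on the listen directive. A minimal sketch for nginx 1.9.5 or later; the domain and certificate paths are placeholders for your own:

# Minimal HTTPS server block with HTTP/2 enabled.
server {
    listen 443 ssl http2;
    server_name example.com;
    ssl_certificate     /etc/ssl/certs/example.pem;
    ssl_certificate_key /etc/ssl/private/example.key;
}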
The second problem is that OpenSSL might be too old and therefore missing ALPN support. ALPN stands for Application-Layer Protocol Negotiation, which defines what protocol to use on a connection. You might have more luck on the very latest versions of your Linux distribution. Last but not least, you will have to configure a certificate. Fortunately the Let’s Encrypt project makes things easier now, for both the configuration and the price tag.
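Whether your OpenSSL build can do ALPN is easy to verify: the -alpn option of s_client appeared in OpenSSL 1.0.2. A sketch, with example.com as a placeholder host:

# Offer h2 via ALPN and see whether the server selects it.
openssl version
openssl s_client -alpn h2 -connect example.com:443 </dev/null 2>/dev/null | grep -i alpn

If the output contains “ALPN protocol: h2”, both your TLS library and the server are ready for HTTP/2.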
So what about HTTP/2 security?
As we mainly cover security on this blog, it was interesting to note that a few items around security popped up. One of those was dealing with some recent attacks like BEAST and CRIME.
HTTPS
A big misconception is that you need HTTPS when running HTTP/2. HTTPS is not mandatory according to the specification; however, browsers do not allow plaintext HTTP/2. This effectively means that you will need an SSL/TLS certificate. But if you really want, you can still run the new protocol on port 80. It might be a good thing to have HTTPS available anyway, as it provides privacy and protects your users.
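Plaintext HTTP/2 (known as h2c) can be tested with curl. A sketch, assuming a server with h2c enabled; localhost is a placeholder, and the flag requires curl 7.49 or later:

# Skip the HTTP/1.1 upgrade dance and speak HTTP/2 directly over port 80.
curl -v --http2-prior-knowledge http://localhost/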
Safe defaults
The new protocol does not support things like SSL renegotiation or SSL compression, and it requires at least TLS 1.2 without the known weak ciphers. So on that level the protocol makes the web a lot safer.
Server push security
The security of server push is still vague; more development is needed in this area.
The future
A few areas are currently still vague. The security and implementation of server push are under development, and client certificates are not supported yet. Daniel listed this as a possible improvement, together with more tuning of the TCP stack and minor changes to the handling of cookies.
Beyond HTTP/2, the goal is to slowly drop the legacy of the first HTTP protocols (1.0 and 1.1). HTTP/3 should also arrive a lot faster, not after the 16 years it took this time. One of the more interesting developments is QUIC, or Quick UDP Internet Connections. It is TCP, TLS, and HTTP/2 over UDP! No more head of line blocking, the main issue with using TCP.
About Daniel Stenberg
Daniel is from Sweden and works for Mozilla. He still keeps evolving the popular cURL toolkit (curl and libcurl). I had the honor of sitting across the table from him and the other speakers: knowledgeable, friendly, and with a great sense of humor.