Searching for Signal

the n01se blog

The Most Important Parts of HTML5

or Why <video> and <audio> are Boring

or The New Web Platform

or An Introduction to HTML5

 

A Little Perspective

The Birth of the Web

20 years ago today (Aug 6th, 1991), Tim Berners-Lee released the World Wide Web on the world while working at CERN. Actually what he released was a program called "WorldWideWeb" which was eventually renamed "Nexus" to clarify the distinction between the concept of the World Wide Web and the browser itself.

The initial browser could render documents written in HyperText Markup Language (HTML). The first HTML supported some limited formatting and hyperlinks (we just call them links now) to other documents. Here is a screenshot from 2 years later when color and inline images were added:

WorldWideWeb browser running on a NeXT System

The Rise of the Web

In the next 10 years "the Web" exploded in terms of innovation, standardization (Footnote 1), number of users, browsers (installs and variations), web servers (installs and variations), and economic impact. With the release of Mosaic (at NCSA) in 1993, with its seamless integration of graphics and text, the World Wide Web quickly grew to become the dominant use of the Internet and the driving force behind Internet adoption. In fact, for most computer users "the Web" and "the Internet" have become synonymous.

The Fall of the Web

At the turn of the century, Web innovation slowed due to the emergence of Microsoft's Internet Explorer as a near monopoly in the web browser market. Once Microsoft achieved a controlling share of the web browser market they lost interest in driving or cooperating on new Web standards and technologies because this might threaten their profitable Windows platform.

The Rise of the Web (Again)

In the last three years there has been a new explosion of Web innovation. The Web has been released from its cage by three main trends: the rise of Mozilla Firefox (a spiritual descendant of the Mosaic browser), the rise of Google Chrome, and the rise of mobile devices.

Web Browser Market Share 2008-2011 (Wikimedia)

 

A core part of human nature is to give names to everything, including abstract concepts. The new energy and innovation surrounding the Web platform is no exception; it needs a name. Which brings us to "HTML5" ...

 

"HTML5"

Technically, HTML5 is a specification from the World Wide Web Consortium (W3C) (Footnote 2). Many pedants will claim this is the only correct usage. For the rest of us, HTML5 is a useful term to describe the rapid changes that are currently happening to the Web platform. This is what I will mean when I use the term "HTML5". I will refer to the formal specification as "W3C HTML5".

W3C HTML5

Although the W3C HTML5 specification is not going to be officially complete until 2014, the specification finished last call review earlier this week (Aug 3rd) so there are unlikely to be any radical changes in the next three years before finalization.

W3C HTML5 is Boring

The W3C HTML5 is very important at one level, but it is also pretty boring. It is basically a formal description of the state of the art in the Web platform from 3 years ago. The actual content of the specification is pretty mundane (even apart from the dry and technical nature of specification documents). The most interesting API in the W3C HTML5 is the Canvas 2D Context and that is defined in a separate document.

W3C HTML5 is Important

However, the W3C HTML5 is important because it makes official all the good ideas that have been learned over the years and it attempts to remove most of the things that are now considered mistakes. It also brings a great deal of consistency and completeness to the various DOM APIs and HTML elements. And probably most importantly, it has brought the various browser makers into agreement. This means web developers who develop against what is defined in the W3C HTML5 specification should have one application that works well on all recent browser versions without the need for browser-specific kludges.

W3C HTML5 vs HTML5

If I were to sum up the differences between the W3C HTML5 specification and the larger concept of HTML5 it would be this:

  • The W3C HTML5 promotes many of the existing second class elements of the Web such as video, audio, animations, smart forms, etc. into first class elements.
  • HTML5 (the common usage) takes those new elements and adds power and functionality to them that was not previously possible. HTML5 also creates a whole new set of first class elements out of technologies that were not part of the Web in the first place such as hardware device access, binary data, file system access, multiprocessing, etc.

Or another way of summing up the relationship between the two:

The W3C HTML5 specification serves as the foundation and framework upon which all the interesting HTML5 developments are happening.

 

The Most Important Parts of HTML5

Now we come to my purpose for this article: to list and describe the HTML5 APIs, standards, and technologies that are most important (and most interesting). I have tried to imagine the Web as it will exist five years from now and from that vantage point determine what parts of HTML5 were most crucial in bringing us to that imagined future.

The following list is ordered from most to least important. Obviously this is just my opinion. I do web application development (noVNC, websockify) and participate in HTML5 working groups and discussions, but the future is always more interesting (and less) than any experts can predict.

So without further ado...

 

1. Faster Javascript Engines

The Web as an application platform is built on this more than anything else. The new Javascript engines are the warp drive for the Web. Without the warp drive, Star Trek is a story that takes place on Earth (or at best in one solar system). Without the massive increase in Javascript performance we would still be talking about web pages and not web applications.

2. WebSockets

This moves the browser solidly into the space of highly interactive networked applications. After fast Javascript engines, low-latency networking has the largest potential for allowing the Web to conquer new application domains.

3. Binary Data Types (Typed Arrays and Blobs)

Javascript started its life as a way to do validation of textual form data. However, many of the first class elements introduced in HTML5 contain, receive and/or output binary data and so native binary data types in Javascript have become a necessity. Developers have used various hacks to encode binary data in old Javascript data types for many years. But using these hacks is a significant barrier and the full power of HTML5 will not be unleashed without native binary data support.
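To make the idea of native binary data concrete, here is a minimal sketch using an ArrayBuffer with a DataView and a Uint8Array. The message layout (a 1-byte type tag, a 16-bit length, then the payload) and the function names are made up for illustration; they are not part of any standard.

```typescript
// Hypothetical wire format: [type: 1 byte][length: 2 bytes, big-endian][payload bytes].
// The payload must be shorter than 65536 bytes to fit the 16-bit length field.
function encodeMessage(type: number, payload: Uint8Array): ArrayBuffer {
  const buffer = new ArrayBuffer(3 + payload.length);
  const view = new DataView(buffer);
  view.setUint8(0, type);
  view.setUint16(1, payload.length, false); // false = big-endian
  new Uint8Array(buffer, 3).set(payload);   // copy the payload in after the header
  return buffer;
}

function decodeMessage(buffer: ArrayBuffer): { type: number; payload: Uint8Array } {
  const view = new DataView(buffer);
  const type = view.getUint8(0);
  const length = view.getUint16(1, false);
  // A typed array can be a window onto part of the buffer; no copy is made here.
  return { type, payload: new Uint8Array(buffer, 3, length) };
}

const wire = encodeMessage(7, new Uint8Array([1, 2, 255]));
// The bytes of `wire` are: 7, 0, 3, 1, 2, 255
const decoded = decodeMessage(wire);
```

Compare this with the older hacks, where each byte had to be smuggled through a string character or a plain array of numbers.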

4. Web Audio API

This is not the <audio> tag but rather the APIs/proposals for allowing low-latency, direct audio manipulation from Javascript. The <audio> tag (which is part of W3C HTML5) allows an audio file to be embedded directly in a web page and it provides a playback and synchronization API.

The Web Audio API proposals allow for direct creation and manipulation of audio waveforms and also address issues of high-latency playback that exist in current <audio> tag implementations. These proposals are still very young and the final solutions may merge with the <audio> tag, but the issues addressed by the Web Audio API proposals will be in future browsers in one form or another.
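As a rough illustration of what "direct creation of audio waveforms" means, the sketch below fills a buffer with raw samples of a sine tone. The function name and parameters are my own; how the samples are handed to the browser depends on which proposal wins, so that part is deliberately left out.

```typescript
// Generate a sine tone as raw samples in the range [-1, 1].
// `sineWave` is a hypothetical helper, not part of any audio proposal.
function sineWave(frequencyHz: number, sampleRate: number, seconds: number): Float32Array {
  const samples = new Float32Array(Math.floor(sampleRate * seconds));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = Math.sin((2 * Math.PI * frequencyHz * i) / sampleRate);
  }
  return samples;
}

// Half a second of A440 at a 44.1 kHz sample rate: 22050 samples.
const tone = sineWave(440, 44100, 0.5);
```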

5. Canvas 2D Context

Direct pixel manipulation. Everybody agrees it is important so there is not much I will add. I put it below the Web Audio API because much of what can be done with Canvas 2D Context can be done with other methods (SVG, WebGL, CSS3).
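For readers who have not seen it, direct pixel manipulation looks roughly like the sketch below. The function works on a byte array with the same layout the canvas uses for its pixel data (4 bytes per pixel: red, green, blue, alpha). The function name is made up, and the canvas wiring is shown only as a comment because it needs a browser page with a canvas element.

```typescript
// Invert the colors of RGBA pixel data in place, leaving alpha untouched.
function invertColors(pixels: Uint8ClampedArray): void {
  for (let i = 0; i < pixels.length; i += 4) {
    pixels[i] = 255 - pixels[i];         // red
    pixels[i + 1] = 255 - pixels[i + 1]; // green
    pixels[i + 2] = 255 - pixels[i + 2]; // blue
    // pixels[i + 3] is alpha and is left as-is
  }
}

// In a browser, assuming `canvas` refers to an existing canvas element:
//   const ctx = canvas.getContext("2d");
//   const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   invertColors(image.data);
//   ctx.putImageData(image, 0, 0);
```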

6. CSS3 and WOFF

Cascading Style Sheets 3 and the Web Open Font Format together bring the full power of design, typography, layout, and visual transformation to the web. Also, with CSS3 (in particular the Flexible Box Model), web applications will finally have a simple and powerful way of doing user interface layout without the element positioning mess that is necessary with CSS2.

7. Local Storage, Offline Applications and the File APIs

There are many application domains that are just not feasible without fast, local and persistent storage (at least not until everyone has cheap Gigabit Internet connections). Some form of local storage is also necessary for web applications to be usable when there is no Internet connection available. There are a number of APIs/standards being developed in this area but they are all addressing different aspects of the same fundamental limitation of pre-HTML5 browsers.

8. Web Workers

Moore's Law is dead, long live Moore's Law! The year-after-year exponential increase in CPU frequency due to Moore's Law ended several years ago. But Moore's Law was actually a statement about transistor cost/density and this has not changed; it simply has a new face: processor cores per square inch of chip. In a few years, even your mobile phone will have more processor cores than you have fingers.

New software models are required to take full advantage of the new multi-core reality of Moore's Law. Fortunately, even though Javascript has always been a single threaded language it was also designed from the beginning to be an event driven language. This means that while multiple lines of Javascript in the same web application cannot be running simultaneously, the browser can be doing multiple things at once on behalf of that Javascript code that is running.

Being event driven only goes so far. The Web Workers specification was created to allow a single web application to have multiple threads of Javascript running simultaneously. To avoid the massive complexity that usually comes with multi-threaded programming (locks, special data structures, etc), Web Workers are independent Javascript contexts and they can only interact with each other and with the main Javascript thread using event driven message passing.
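The message passing model can be sketched as follows. The message shapes, the function name, and the file name "sum-worker.js" are all hypothetical. The worker wiring is shown as comments because it only runs inside a browser; the important point is that the two sides exchange plain data and never share variables.

```typescript
// Messages are plain data. They are copied between contexts, so nothing is shared.
type SumRequest = { id: number; values: number[] };
type SumResponse = { id: number; sum: number };

// The work the worker would do for each incoming message.
function handleRequest(request: SumRequest): SumResponse {
  let sum = 0;
  for (let i = 0; i < request.values.length; i++) {
    sum += request.values[i];
  }
  return { id: request.id, sum };
}

// Inside the worker script (hypothetical file "sum-worker.js"):
//   self.onmessage = (event) => self.postMessage(handleRequest(event.data));
//
// On the main thread:
//   const worker = new Worker("sum-worker.js");
//   worker.onmessage = (event) => console.log(event.data.id, event.data.sum);
//   worker.postMessage({ id: 1, values: [1, 2, 3] });
```

The `id` field is there so the main thread can match a response to the request that caused it, since replies arrive as events rather than return values.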

9. SVG 1.1/2.0

The SVG (Scalable Vector Graphics) format has been around for a long time and some browsers have been able to embed SVG images into web pages. SVG is finally starting to be adopted by all browser makers in a form that adds powerful APIs and that allows full access to and manipulation of the contained elements (i.e. a truly first class element).

SVG is actually a difficult one to place in the list. Many (perhaps even most) of the uses of the Canvas 2D Context are actually more appropriate to SVG and in many ways SVG is far more powerful. However, SVG has had an uneven history and I fear that it has accumulated some unjustified mental baggage that will prevent it from being as fundamental and important as it otherwise would be. I will be happy to be proved wrong if it turns out to be more important than I have rated it.

10. WebGL

This is the Canvas 3D Context and it is basically a hardware accelerated OpenGL API for the Web. Like SVG, this is potentially a very important piece of HTML5. But I say potentially because Microsoft has been somewhat dismissive of WebGL (possibly since it is defined in terms of OpenGL rather than their own DirectX API) and so there is uncertainty about whether this will ever make it into Internet Explorer. If there were less uncertainty I would place this higher in the list because it brings the Web to the doorstep of so many new application domains (including 3D games).

11. All the Rest

The ultimate vision of many who are pushing forward the Web platform is to make the Web platform as powerful, capable and comprehensive as native applications. The first 10 items each open up the Web platform to large new application domains that have historically only been possible with native applications. But they leave many gaps that must be filled before we reach a future where the question asked by developers is no longer "Can I build this as a web application?" but rather "Do I want to build this as a web application?"

There are numerous proposals that are being worked on to address the gaps in web application functionality. Several of them would probably be higher on the list if they were further along or had less uncertainty about whether they will be universally adopted by all browser makers. Here are just a few of the proposals/APIs that are attempting to fill the gaps:

  • WebRTC/Stream API: Peer-to-peer video conferencing.
  • Geolocation: Where in the World am I?
  • Orientation: Which way is up?
  • Crypto: Encrypt/decrypt efficiently in Javascript.
  • WebCL: The Web version of OpenCL. Who wouldn't want to use the GPU directly for computation from Javascript?
  • WebNotifications: Tell me what's happening, but gently.
  • Web Intents: Associate data types with default actions and pass data back and forth between web applications. Your favorite web application for editing images will inevitably be different than the default one for that web based slideshow application. Shout with me: "OLE!"
  • Page Visibility: Imagine how much energy the world would save if those animations and movies stopped rendering when you aren't looking at them.
  • requestAnimationFrame: With setTimeout you get 100 FPS or you get 2 FPS (often within the same second). Now you can get a consistent 30 FPS. And there was much rejoicing.
  • Microdata: unambiguous parsing of embedded machine-readable data.
  • Etc
  • Etc (Footnote 3)

 

Mac vs PC vs Browser (xkcd.com)

What about <video> and <audio>?

The <video> and <audio> tags are probably the HTML5 features that have caused the most excitement (and angst) on the web regarding HTML5. In fact, some of you may have scanned ahead looking for them in the list and were surprised they were not #1 and #2. These two elements are part of W3C HTML5 and while they do make first class elements out of what has traditionally been second class (e.g. done with Flash), they are still really just a better way of doing what has already been done. The <video> and <audio> tags don't significantly expand the scope of what web applications are capable of so that is one reason why they are not on the list. Another problem with these tags is that the list of supported media formats is inconsistent across browsers. Until this is resolved, their adoption will be hampered. (Footnote 4)


 

Footnotes

These were originally inline in the article and they seemed to interrupt the flow so I moved them here. And even without these notes the article is still too long.

Footnote 1

Here are some notable standards that developed around the Web platform in the first few years after the first web browser was released by Tim Berners-Lee:

  • 1995 - HTML 2.0 is published as the first official HTML standard.
  • 1995 - HTML makes the first baby step toward becoming dynamic and interactive when JavaScript is created by Brendan Eich and added to Netscape Navigator 2.
  • 1996 - The CSS1 (Cascading Style Sheets) specification is completed and parts of it appear in Internet Explorer 3. Part of the reason for CSS is to separate web page appearance from content and functionality. This follows a well established principle in computer science of separating concerns.
  • 1998 - The DOM Level 1 (Document Object Model) specification, the API of HTML elements on a page, is published by the W3C.

Footnote 2

I should note that another important group (with a long and forgettable name) that is involved in HTML standards is the Web Hypertext Application Technology Working Group (WHATWG). Many (perhaps most) of the people at WHATWG also participate in work at the W3C. The WHATWG can be thought of as a more dynamic and less restricted version of the W3C.

The WHATWG maintains two overarching standards documents. The first is HTML. Many of the ideas that are in the W3C HTML5 were first documented here. This document can be thought of as HTML5+. The second standard document is called Web Apps 1.0. This contains the content of the HTML document and adds many of the APIs that are considered important for the Web to become a full fledged application platform.

The WHATWG Web Apps 1.0 document is much closer to what people generally mean when they use the term "HTML5". However, even that does not encapsulate the whole meaning. There are many other APIs and proposals that are part of the new energy of the Web that have made or will make their way into most browsers (either via the standards process or not).

Footnote 3

Etc. I am not aware of a web proposal/API yet with those three letters. The question is: will this article make it more or less likely that there will be one soon?

Footnote 4

By combining fast Javascript, WebSockets, Canvas, Web Audio, and binary data types you can have video and audio without the <video>/<audio> tags and without any plugins. The frame rate and resolution might be lower than desired but it is possible.


 
