Browser <3 Plugin

I should stop reading blogs and Twitter ;-)

I’ve seen a lot of commentary saying something along the lines of;

“if only ‘HTML5’ will come along and then we won’t need all these plugins”

and I find it quite an odd idea.

I suspect what it really means is something like;

“if only ‘HTML5’ will come along and then I won’t need this plugin to play this simple video or a bit of audio”

but who knows?

By the way, I use “HTML5” in quotes like that because I mean “HTML5 and associated standards” but it’s long-winded to write that out every time.

I find it fascinating that folks that seem to like one piece of software (the browser) running on another piece of software (the operating system) then don’t like another piece of software (the plugin) for doing a similar thing. I understand why though.

It does feel to me like this has analogies to the OS (albeit imperfect ones);

Operating Systems

We run operating systems. Myself – I’m running Windows 7 (on a number of machines), Windows Phone 7, Mac OS X Snow Leopard and iOS.

Operating systems provide platforms for applications. New versions come along relatively infrequently. In most cases, the platform is controlled by a vendor like Microsoft or Apple.

No operating system can foresee every client-side need and so we add plugins. We might call them drivers or extensions or even applications.

It’s worth remembering that in the not-so-distant past these plugins included things like a network stack (which if I remember correctly came in as a standard part of the OS in Windows 3.11) and in more recent history have included things like Bluetooth support and multi-touch support.

Over time, functionality that at one time resided in an operating system plug-in becomes so generally accepted as useful that the operating system broadens out to include it, and we’ve seen that happen with all 3 of the examples I gave above (the network stack, Bluetooth support and multi-touch support).

Browsers

We run web browsers. Myself – I’m running IE9, Firefox 4.0b6, Chrome 5.0.x and a Safari version or two that I can’t remember as I’m writing this post on my Windows 7 laptop.

Browsers provide a platform for markup display and code execution and surface a subset of the underlying operating system’s functionality.

New versions come along relatively infrequently.

In most cases, the browser is controlled by a vendor like Microsoft, Apple, Mozilla or Google and, in all cases, the browser is implementing a whole tonne of web standards.

The markup/code that the browser runs falls into one of 2 categories;

  1. Standards compliant.
  2. Proprietary, which I split out into 2 different categories;
    1. Using proprietary APIs or markup elements that are not standardised but are, instead, limited to one vendor’s particular browser.
    2. Using APIs or markup elements that are not implemented correctly by the browser and so behave differently in one or more vendors’ browsers.

No browser standard addresses every client-side need and so we add plugins. We tend to call them plugins. :-)

Interestingly, the content authored for the browser spans across to other browsers, other platforms and even other devices.

Plugins

We run plugins. Myself – I’m running Silverlight and Flash in the browsers that I run and that’s on both OS X and Windows but not, of course, on iOS.

The OS hosts the browser and the browser hosts the plugin. Whilst the plugin has access to the browser it, crucially, also has access to the underlying operating system.

That is, the plugin is not limited by the browser that is hosting it.

New versions tend to come along relatively frequently compared to browsers or operating systems and people adopt them quickly (e.g. 90%+ of all Silverlight users are already on Silverlight 4).

Thinking specifically about plugins like Silverlight/Flash – these plugins provide a proprietary platform to developers that is owned by a vendor (Microsoft and Adobe in this case).

Interestingly again, the content authored for these plugins spans browsers, platforms and onto other devices.

Applying this (Loosely) to Video

An existing feature like video provides a telling story. Here’s how it went or, at least, how I think it went – there’s a bit of artistic license being applied here;

  1. In the beginning, there was the operating system and the operating system could play videos.
  2. Along came the browser.
  3. People put video content onto the web.
  4. The browser could not play video so when a user clicked on a video, the browser handed the file over to the local media player on the operating system.
  5. Users saw this and demanded “video in the browser”.
  6. Developers writing markup and JavaScript cannot meet this need. They cannot reach beyond the browser to the underlying OS and its obvious ability to play videos. They are sandboxed and this is a good thing.
  7. Plugins sprang up to reach through the browser to the underlying OS and play video. The plugin is not sandboxed; it can reach out to the OS.
  8. Video became successful.
  9. Video playing becomes standardised in future browser versions by adding it to the capabilities inside the browser’s sandbox.
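The end of that story is roughly where a page author sits today: ask the browser whether it can play the video inside its own sandbox and, if it can't, fall back to a plugin or to the operating system's media player. Here's a minimal sketch of that decision in TypeScript. `canPlayType` is the standard method on HTML media elements, but the `VideoProbe` interface, the `choosePlayback` function and the `pluginAvailable` flag are names I've made up for illustration, and how you'd actually detect a plugin is left out entirely.

```typescript
// Hypothetical minimal shape of a <video> element, so the sketch also runs outside a browser.
interface VideoProbe {
  canPlayType?: (mimeType: string) => string;
}

type Playback = "native" | "plugin" | "external-player";

// Hypothetical helper: decide where a video of the given MIME type should play.
function choosePlayback(
  probe: VideoProbe,
  mimeType: string,
  pluginAvailable: boolean
): Playback {
  // canPlayType answers "", "maybe" or "probably"; an empty string means "no".
  const answer =
    typeof probe.canPlayType === "function" ? probe.canPlayType(mimeType) : "";
  if (answer !== "") {
    return "native"; // step 9: video lives inside the browser's sandbox
  }
  if (pluginAvailable) {
    return "plugin"; // step 7: the plugin reaches through to the OS
  }
  return "external-player"; // step 4: hand the file to the OS media player
}
```

In a real page you'd probably call this as `choosePlayback(document.createElement("video"), "video/mp4", hasPlugin)`, where `hasPlugin` stands in for whatever plugin check you already use.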

Imagine There’s No Plugins

What if we all had an “HTML5” browser and a new client-side requirement came along? What could be done to get it to show up in the browser?

  1. It’s not possible. There are no requirements that “HTML5” does not address. Go rethink. You’re an idiot.
  2. Ignore it. Maybe it will go away.
  3. Implement the new requirement using JavaScript and existing HTML markup elements.
  4. Have the browser vendor implement it and provide a proprietary API to their implementation.
  5. Have the standards bodies convene, standardise the requirement and have browser vendors implement it.

I suspect only (3), (4) and (5) are really options.

(3) would be the ideal as no-one has to develop or install a new browser version.

However, because the browser doesn’t (quite rightly!) surface all of the OS functionality to someone writing HTML/JavaScript, there are a lot of things you can’t do without someone else doing some work in the browser layer itself to make sure your code is supported.

Multi-touch might be a good example. In order to get multi-touch events into your HTML/JS I’d say that you need a browser that catches those events from the OS and passes them through to HTML/JS. You need to go beyond the sandbox.

Applying this (Loosely) to Multi-Touch

If I want to pick up multi-touch events and do something reasonable with them inside the browser window today then what are my realistic choices?

4) Have the browser vendor implement it and provide a proprietary API to their implementation.

There are some touch event implementations out there – the window.ontouchstart, ontouchmove etc. events that some browsers surface today.

I struggled quite a lot to figure out which browsers do/don’t support these and on which operating systems. The best I could find was the Modernizr test page. From what I read;

  1. Safari supports these on iOS but I’m not sure whether it does on OS X.
  2. As far as I know (from here) Firefox 4 on Windows 7 supports similar events.
  3. This suggests that Chrome should but that they don’t currently fire.
  4. IE doesn’t to the best of my knowledge.

So, you might be able to get away with this if you could mandate a (non IE) browser choice for your users. That might work out ok on iOS devices of course.

But it’s clear that all these browser vendors implementing a feature in a slightly different way is not really great.
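If you do go down this route, the usual way to cope with the differences is to feature-detect rather than guess from the browser's name. Here's a minimal TypeScript sketch under that assumption. The `"ontouchstart" in window` check and the `touchstart` / `mousedown` event names are real; `pickInputStrategy`, `wirePointerDown` and the `ListenerTarget` interface are hypothetical names of mine. As the Chrome case above shows, a browser exposing the property doesn't guarantee that the events ever fire.

```typescript
type InputStrategy = "touch-events" | "mouse-only";

// Hypothetical minimal shape of `window`, so the sketch also runs outside a browser.
interface ListenerTarget {
  addEventListener(type: string, listener: (event: unknown) => void): void;
}

// Hypothetical helper: `target` would be `window` in a real page.
function pickInputStrategy(target: object): InputStrategy {
  // The `in` check only asks whether the browser exposes the handler property;
  // it says nothing about touch hardware or whether events are actually raised.
  return "ontouchstart" in target ? "touch-events" : "mouse-only";
}

// Hypothetical wiring: listen for touches where offered, otherwise fall back to the mouse.
function wirePointerDown(
  target: ListenerTarget,
  onDown: (event: unknown) => void
): string {
  const eventName =
    pickInputStrategy(target) === "touch-events" ? "touchstart" : "mousedown";
  target.addEventListener(eventName, onDown);
  return eventName; // returned so the caller can see which path was taken
}
```

This only papers over one difference (touch versus mouse); anything vendor-specific beyond that still needs its own branch, which is rather the point.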

5) Have the standards bodies convene, standardise the requirement and have the browser vendors implement it.

I think the W3C is working on this under the “Web Events” banner and is looking to make a recommendation around August 2012 in this area.

This would be the right way to go in the long term but it’s 2010 and developers are already doing a lot of multi-touch work on all kinds of different devices and operating systems.

So, what could I do today?

  1. Constrain my users to particular browsers on particular devices (not necessarily unrealistic depending on the devices they use) and go with the proprietary APIs while waiting for standards to catch up.
  2. Use a plugin.
    1. Silverlight has support for multi-touch today and it has that support in IE6, 7, 8, 9 and Firefox, Safari and Chrome. What it doesn’t have is that support on iOS and Android. As far as I know, Flash also supports multi-touch today.

Is This Just About Video and Multi-Touch?

No, that was just an example.

I think this is more general than video or multi-touch. Another example might be 3D.

It’d be a surprise if 3D didn’t show up more on the web in the future. There are efforts out there to get X3D into HTML at some point, but I’m not too clear where that’s up to right now, although there are some implementations of WebGL out there.

I’d imagine that this is going to show up broadly in plugins first. Adobe announced their “Molehill” APIs for Flash just the other week and are working away at that.

But, again, that’s just another specific example.

The Browser has a Plugin Shaped Hole

My general point is that I think the browser needs an extensibility model to enable people to innovate around browser-based content and, right now, the only model that’s there which lets you take advantage of capabilities not already present in the browser is the plugin model.

Or did I miss the point?