Zooming & Panning with CSS in IE 10

Something that I’ve been experimenting with a little lately in the IE10 Preview (available on the Windows Developer Preview) is how simple HTML content (e.g. a DIV or an IMG) can respond to touch in order to support the gesture-based interactions that a user might expect on a touch-first device, specifically zooming and panning.

Where touch isn’t the primary input mechanism, the same interactions can be driven by mouse/keyboard.

In the past, I'd have expected to write code to provide the sort of functionality that IE10 supports via some (currently vendor-specific) CSS properties. These make it easy to enable scenarios that are common and simple to use but aren’t necessarily simple to implement.

Zooming

Let’s say that I’ve got a really simple piece of “UI” represented by this block of HTML;

<html style="height:100%">

<head>
	<style type="text/css">
		body
		{
			height: 100%;
			-ms-content-zooming:none;
		}
		#grid
		{
			display: -ms-grid;
			-ms-grid-columns: 1fr 8fr 1fr;
			-ms-grid-rows: 1fr 8fr 1fr;
			height:100%;
		}
		#container
		{
			-ms-grid-column: 2;
			-ms-grid-row: 2;
			overflow: hidden;
		}
		#image
		{
			width: 100%;
			height: 100%;
		}
		#text
		{
			-ms-grid-row:3;
			-ms-grid-column:2;
			-ms-grid-row-alignment:stretch;
			-ms-grid-column-alignment:stretch;
			background-color: #FF0000;
		}
	</style>

</head>

<body>


	<div id="grid">
		<div id="container">
			<img id="image" src="img0.jpg" style="width: 100%; height: 100%" />
		</div>
		<div id="text">this is just some text to check for zooming</div>
	</div>
</body>
</html>

which displays;

image

With just this HTML in IE10, if I use my touch screen and try and do something like a pinch gesture on the content then the browser does nothing. For example, if I do a "shrink" pinch then nothing happens. Note that this is because I've set the -ms-content-zooming on my body element to none – otherwise the browser would have scaled the whole content and I wanted to start from a clean slate here with no zooming anywhere.

That HTML has a DIV that contains an IMG and if I want to do something "active" in response to a zoom gesture on that DIV then I could say so via my CSS;

#container
{
	-ms-grid-column: 2;
	-ms-grid-row: 2;
	overflow: scroll;
	-ms-content-zooming: zoom;
}

changing the overflow there and adding the -ms-content-zooming property set to zoom allows the DIV that contains my IMG to be zoomed via a pinch/zoom gesture (or equivalent mouse/keyboard shortcut).

However, I can only zoom in, enlarging the image and adding scrollbars;

image

that's purely because I haven't set up the limits that I want for zooming. If I was to specify the maximum and minimum sizes for zooming;

#container
{
	-ms-grid-column: 2;
	-ms-grid-row: 2;
	overflow: scroll;
	-ms-content-zooming: zoom;
	-ms-content-zoom-boundary-max: 500%;
	-ms-content-zoom-boundary-min: 75%;
}

then I can now zoom the DIV down (or out) with a pinch gesture down to 3/4 of its original size and in (or up) to 5 times as large;

image

and all of that comes for free in the browser. Where might I use it? You can think of scenarios where you might have content that you want to display at a default size and yet offer the capability to zoom in and see a little more detail (maybe a tube map or something along those lines?). It's nice functionality to have to hand for zero effort.
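To make the effect of the two boundary values concrete, here's a minimal sketch in Python (not anything the browser exposes) of what clamping a requested zoom level to those limits amounts to. The function name clamp_zoom and its parameters are my own invention for illustration, and the defaults just mirror the 75% and 500% values from the CSS above.

```python
def clamp_zoom(requested_pct: float,
               min_pct: float = 75.0,
               max_pct: float = 500.0) -> float:
    """Clamp a requested zoom level (in percent) to the zoom boundaries.

    min_pct and max_pct stand in for -ms-content-zoom-boundary-min and
    -ms-content-zoom-boundary-max; the function itself is hypothetical.
    """
    return max(min_pct, min(max_pct, requested_pct))


if __name__ == "__main__":
    # A pinch that asks for less than the minimum or more than the
    # maximum ends up pinned to the nearest boundary.
    for requested in (40.0, 120.0, 900.0):
        print(requested, "->", clamp_zoom(requested))
```

Anything between the two limits passes through untouched, which is why the zoom still feels "unguided" until snap points are added.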

At the moment, my zooming is unguided in that I can zoom to 75, 76, 77% or wherever I happen to release the pinch gesture. Often, touch interactions are a little tricky to be accurate about and so you can provide "guide points" or “snap points” for the zooming to guide the gesture.

There are two ways of specifying these "snap points": as a snapInterval or as a snapList. Quick examples below;

 

  • snapInterval(25%, 25%) means that we want to guide the user to be able to scale the DIV to 25%, 50%, 75%, 100% and so on up to the 500% max that we've specified as the boundary.
  • snapList(25%, 100%, 200%, 300%, 400%, 500%) is a more explicit list of the points that we are interested in guiding the user’s pinch/zoom gestures to.

and either way I set those up using something like;

#container
{
	-ms-grid-column: 2;
	-ms-grid-row: 2;
	overflow: scroll;
	-ms-content-zooming: zoom;
	-ms-content-zoom-boundary-max: 500%;
	-ms-content-zoom-boundary-min: 25%;
	-ms-content-zoom-snap-type: mandatory;
	-ms-content-zoom-snap-points: snapList(75%, 150%, 200%, 300%, 400%, 500%);
}

and then the browser is going to take my expand gesture and guide it towards one of those values that I've specified as a snap point.

I also have a choice as to how I want those snap-points to be handled.

If I go with mandatory snap points, as the previous CSS does, then when the zoom/pinch gesture completes the zoom level will always be adjusted from its end point to the nearest snap point.

If I choose proximity for -ms-content-zoom-snap-type instead then the zoom/pinch gesture will only be adjusted to the nearest snap point if it happens to end near one of those snap points in the first place.
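As a way of picturing the difference between the two snap types, here's a rough Python sketch of the arithmetic as I understand it. It is not how the browser is implemented. The names interval_points and snap are made up for this example, and the proximity threshold in particular is a pure assumption: the browser decides for itself what counts as "near" and I don't know the value it uses.

```python
from typing import List


def interval_points(start_pct: int, step_pct: int, max_pct: int) -> List[int]:
    """Expand snapInterval(start, step) into explicit points up to max_pct."""
    return list(range(start_pct, max_pct + 1, step_pct))


def snap(end_pct: float, points: List[int],
         snap_type: str = "mandatory", proximity_pct: float = 10.0) -> float:
    """Return the zoom level a gesture ending at end_pct settles on.

    proximity_pct is a hypothetical threshold (in percentage points)
    chosen only so that the example has something to compare against.
    """
    nearest = min(points, key=lambda p: abs(p - end_pct))
    if snap_type == "mandatory":
        return nearest                      # always pulled to a snap point
    if snap_type == "proximity" and abs(nearest - end_pct) <= proximity_pct:
        return nearest                      # pulled only when already close
    return end_pct                          # otherwise left where it ended


if __name__ == "__main__":
    snap_list = [75, 150, 200, 300, 400, 500]
    print(interval_points(25, 25, 500)[:4])   # first few interval points
    print(snap(160, snap_list))               # mandatory
    print(snap(260, snap_list, "proximity"))  # too far from 300 to snap
```

With the snapList from the CSS above, a mandatory gesture that ends at 160% settles on 150%, whereas under proximity a gesture that ends at 260% stays at 260% because (with my assumed threshold) it isn't close enough to 300%.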

It’s not quite as easy to see the effect of these snap points on zooming as it is on panning unless you choose them somewhat artificially just to make it obvious. With panning though it’s far more obvious…

Panning ( or scrolling )

Rather than zooming, let's say that I want to control the ability to scroll/pan. If I change my HTML to provide a "2 x 2" viewport onto content which is actually a 6 x 4 grid as below;

<html style="height:100%">

<head>
	<style type="text/css">
		body
		{
			height: 100%;
			-ms-content-zooming:none;
		}
		#grid
		{
			display: -ms-grid;
			-ms-grid-columns: 1fr 8fr 1fr;
			-ms-grid-rows: 1fr 8fr 1fr;
			height:100%;
		}
		#container
		{
			-ms-grid-column: 2;
			-ms-grid-row: 2;
			overflow: auto;
		}
		#content
		{
			width: 300%;
			height: 200%;
			display: -ms-grid;
			-ms-grid-columns: 1fr 1fr 1fr 1fr 1fr 1fr;
			-ms-grid-rows: 1fr 1fr 1fr 1fr;
		}

	</style>

</head>

<body>
	<div id="grid">
		<div id="container">
            		<div id="content">
				<div style="-ms-grid-column:1;-ms-grid-row:1;background-color:red"></div>
				<div style="-ms-grid-column:2;-ms-grid-row:1;background-color:green"></div>
				<div style="-ms-grid-column:1;-ms-grid-row:2;background-color:yellow"></div>
				<div style="-ms-grid-column:2;-ms-grid-row:2;background-color:blue"></div>
				<div style="-ms-grid-column:3;-ms-grid-row:1;background-color:orange"></div>
				<div style="-ms-grid-column:4;-ms-grid-row:1;background-color:pink"></div>
				<div style="-ms-grid-column:3;-ms-grid-row:2;background-color:black"></div>
				<div style="-ms-grid-column:4;-ms-grid-row:2;background-color:silver"></div>
				<div style="-ms-grid-column:5;-ms-grid-row:1;background-color:red"></div>
				<div style="-ms-grid-column:6;-ms-grid-row:1;background-color:green"></div>
				<div style="-ms-grid-column:5;-ms-grid-row:2;background-color:yellow"></div>
				<div style="-ms-grid-column:6;-ms-grid-row:2;background-color:blue"></div>
				<div style="-ms-grid-column:1;-ms-grid-row:3;background-color:orange"></div>
				<div style="-ms-grid-column:2;-ms-grid-row:3;background-color:pink"></div>
				<div style="-ms-grid-column:1;-ms-grid-row:4;background-color:black"></div>
				<div style="-ms-grid-column:2;-ms-grid-row:4;background-color:silver"></div>
				<div style="-ms-grid-column:3;-ms-grid-row:3;background-color:red"></div>
				<div style="-ms-grid-column:4;-ms-grid-row:3;background-color:green"></div>
				<div style="-ms-grid-column:3;-ms-grid-row:4;background-color:yellow"></div>
				<div style="-ms-grid-column:4;-ms-grid-row:4;background-color:blue"></div>
				<div style="-ms-grid-column:5;-ms-grid-row:3;background-color:orange"></div>
				<div style="-ms-grid-column:6;-ms-grid-row:3;background-color:pink"></div>
				<div style="-ms-grid-column:5;-ms-grid-row:4;background-color:black"></div>
				<div style="-ms-grid-column:6;-ms-grid-row:4;background-color:silver"></div>
			</div>
        	</div>
	</div>
</body>
</html>

then I immediately get an area that is 3 times wider and twice as high as the DIV which is providing a “viewport” onto the content and so scrollbars appear (because of my auto setting);

image

Rather than drag around on the scrollbars (which aren’t very touch friendly), I can just pan around on the content itself to move it along on the X and Y axes;

image

If I wanted to limit the scrolling then I can use -ms-scroll-boundary ( left, right, top, bottom ) so with this additional CSS;

#container
{
	-ms-scroll-boundary-left: 0px;
	-ms-scroll-boundary-right: 200%;
}

I can now only scroll my grid to the right by "2 squares", or 100% of the width of the grid container itself, which is as far as this;

image

what that picture was trying to show is that there is content off to the right here but I can't pan to it because of the -ms-scroll-boundary-right setting.

That's all great but, once again, from the point of view of a touch user, if this content was actually meaningful rather than just a bunch of coloured squares I might want to provide some guides for the user's scrolling. I can do that by changing my CSS in a similar way to providing guides for zooming;

		#container
		{
			-ms-grid-column: 2;
			-ms-grid-row: 2;
			overflow: auto;
			-ms-scroll-snap-x: mandatory snapInterval(0%, 50%);
		}

and then the scrolling will snap me (in a mandatory manner here) to each of my grid squares as I pan to the right/left;

image

image

I should say that the text there was drawn with a mouse rather than a pen or a finger. I can also do this vertically;

		#container
		{
			-ms-grid-column: 2;
			-ms-grid-row: 2;
			overflow: auto;
			-ms-scroll-snap-x: mandatory snapInterval(0%, 50%);
			-ms-scroll-snap-y: mandatory snapInterval(0%, 50%);
		}

and now I'm snapping in 2 directions of panning;

image
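To see why snapInterval(0%, 50%) lands on each grid square, it helps to turn the percentages into pixels. The Python sketch below does that under an assumption of mine: that the percentages are relative to the width of the scrolling container, which matches what I see (each square is half the width of the viewport). The function names snap_offsets and settle and the 800px container are made up for the example.

```python
from typing import List


def snap_offsets(container_px: int, content_px: int,
                 start_pct: int, step_pct: int) -> List[int]:
    """Pixel offsets that snapInterval(start_pct, step_pct) would produce.

    Assumes percentages are relative to the container width; offsets
    beyond the furthest possible scroll position are left out.
    """
    if step_pct <= 0:
        raise ValueError("step_pct must be positive")
    max_scroll = max(0, content_px - container_px)
    offsets = []
    pct = start_pct
    while container_px * pct // 100 <= max_scroll:
        offsets.append(container_px * pct // 100)
        pct += step_pct
    return offsets


def settle(scroll_left: float, offsets: List[int]) -> float:
    """Mandatory snapping: move to the closest offset, if there is one."""
    if not offsets:
        return scroll_left
    return min(offsets, key=lambda o: abs(o - scroll_left))


if __name__ == "__main__":
    # An 800px wide container showing content that is 300% as wide.
    offsets = snap_offsets(800, 2400, 0, 50)
    print(offsets)
    print(settle(950, offsets))
```

For an 800px container the snap offsets come out 400px apart, i.e. one square at a time, and a pan that ends at 950px settles back to 800px. The same calculation applies on the Y axis with the container's height.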

Going back to my content when not constrained by snap points – i.e. with my #container DIV styled by;

#container
{
	-ms-grid-column: 2;
	-ms-grid-row: 2;
	overflow: auto;
}

then the content will largely just pan around in any direction and in any amount by following my finger as I pan around. If I want to again provide a bit more guidance around the panning and constrain it to largely be either horizontal or vertical then I can add “rails”;

#container
{
	-ms-grid-column: 2;
	-ms-grid-row: 2;
	overflow: auto;
	-ms-scroll-rails: railed;
}

then the user gets a more “guided” experience for their pan gesture in that if a pan along a particular axis is detected then slight movements along the other axis will be ignored so that the pan is “guided” horizontally or vertically.
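Here's a small Python sketch of the idea behind rails as I understand it, rather than the browser's actual algorithm. The function name railed_delta and the ratio used to decide that a pan is "mostly" along one axis are assumptions of mine; I don't know how the browser makes that decision internally.

```python
from typing import Tuple


def railed_delta(dx: float, dy: float, ratio: float = 2.0) -> Tuple[float, float]:
    """Lock a pan to one axis when that axis clearly dominates.

    ratio is a hypothetical threshold: movement along one axis has to be
    at least `ratio` times the movement along the other to engage a rail.
    """
    if abs(dx) >= ratio * abs(dy):
        return (dx, 0.0)      # horizontal rail: ignore vertical wobble
    if abs(dy) >= ratio * abs(dx):
        return (0.0, dy)      # vertical rail: ignore horizontal wobble
    return (dx, dy)           # clearly diagonal: pan freely


if __name__ == "__main__":
    print(railed_delta(40, 5))    # mostly horizontal
    print(railed_delta(3, 30))    # mostly vertical
    print(railed_delta(20, 15))   # diagonal, left alone
```

The point is just that small movements on the "other" axis get thrown away once a dominant direction has been detected, while a genuinely diagonal pan is still allowed.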

I only came across these capabilities the other day (when someone pointed them out to me) and it was a minor revelation that the browser could handle these kinds of gestures for me with minimal effort. The doc page is available if you want to look in some more detail or see the other properties that I left out here.

Silverlight and NESL Redux–Touch

Stephen pointed out last week that there’s a preview update to the “Native Extensions to Silverlight” up on CodePlex. This preview adds some capabilities around touch and, having written a little around that topic recently, I thought I’d take a look at what was going on in the NESL preview. This follows on from my previous post where I took more of an “overview” approach to NESL.

I downloaded the preview, installed it. Then I wanted to put together a simple example of a FlickR search with a touch gesture or two to flick through pictures.

I built a little basic support in order that my application had a few different pieces of UI depending on whether it was running in the browser and/or whether NESL was installed;

image

and that gets me to a point where my app can know that NESL is installed and so start to rely on it. My previous post did this in more detail.

I hit a problem with the NESL installation on this preview in that the call Installer.CheckNESLInstalled failed on me and so I came up with my own hacky way of denoting whether NESL had or hadn’t been installed although this would only work for my own app rather than across the system. It’s probably a “preview” thing or a user error on my part as it worked fine the last time I used it.

With the NESL 2 preview installed, it’s then time to figure out what can be done about touch and there’s a new namespace Microsoft.Silverlight.Windows.Touch ( I ended up referencing Microsoft.Silverlight.Windows.dll and Microsoft.Silverlight.Windows.Touch.dll ) and within there you’ll find;

image

and these classes look pretty similar to what you see in WPF 4 in terms of a ManipulationProcessor and an InertiaProcessor although I think that the ManipulationProcessor is a little different here in that it also has the notion of raising Gestures whereas in WPF 4 you specify whether you want gestures or not up-front if I remember correctly. See my previous post for more on that.

This is convenient then as you have both options in one place. I figured that I’d try and put together a simple photo viewer which searches FlickR and then lets you move forwards through result-sets by using a simple flick gesture.

Signing up for gestures is pretty easy. If I’m running out of browser then I can set up a ManipulationProcessor and tell it what I want to do;

      // This is my own flag to determine app & NESL installation status
      if (this._appState == AppState.OutOfBrowserNeslInstalled)
      {
        this._manipProcessor = new ManipulationProcessor();

        this._manipProcessor.RaiseGestures =
          Gesture.Flick | Gesture.TwoFingerTap;

        Touch.FrameReported += (s, e) =>
          {
            this._manipProcessor.ProcessTouchPoints(
              e.GetTouchPoints(Application.Current.RootVisual),
              e.Timestamp);
          };

        this._manipProcessor.Flick += OnFlick;
        this._manipProcessor.TwoFingerTap += OnTap;
      }

and that should serve me fine.

I wanted to re-use the code that I’d worked up in a previous post which used the Reactive Extensions in order to bring back some images from FlickR.

The key part of this code was a routine that made an observable sequence of images to be brought back from FlickR. This ended up looking like;

  class FlickrSearch
  {
    public FlickrSearch(string searchText,
      TimeSpan interval)
    {
      this.state = new FlickSearchState(searchText);
      this.interval = interval;

      this.photoResults = new Lazy<IObservable<Stream>>(
        MakePhotoResults);
    }
    public IObservable<Stream> PhotoResults
    {
      get
      {
        return (this.photoResults.Value);
      }
    }
    IObservable<Stream> MakePhotoResults()
    {
      var webRequests =
        this.state.GetObservableSearchUris().SelectMany(
          uri =>
          {
            HttpWebRequest wr = WebRequest.Create(uri) as HttpWebRequest;

            return (Observable.Defer(Observable.FromAsyncPattern<WebResponse>(
              wr.BeginGetResponse, wr.EndGetResponse)));
          });

      var xmlResponses =
        webRequests.Select(
          wr =>
          {
            XElement xml = null;
            FlickrPhotoPageResult result = null;

            using (Stream stream = wr.GetResponseStream())
            {
              xml = XElement.Load(stream);
              result = new FlickrPhotoPageResult(xml);
            };
            return (result);
          });

      var photoResponses =
        xmlResponses.SelectMany(
          pageResponse =>
          {
            return (pageResponse.GetPhotos());
          });

      var slowedDownResponses =
        photoResponses.Zip(
          Observable.Interval(this.interval),
          (photo, count) =>
          {
            return (photo);
          });

      var slowedDownLoadsNextPage =
        slowedDownResponses.Do(p =>
        {
          // TODO: not sure about this having a side-effect which
          // can advance/end the observable at the "tail of the
          // chain" here.
          if (p.IsLastImage)
          {
            this.state.Advance(end: true);
          }
          else if (p.IsLastImageOnCurrentPage)
          {
            this.state.Advance(end: false);
          }
        });

      var streams =
        slowedDownLoadsNextPage.SelectMany(
          photo =>
          {
            HttpWebRequest request = WebRequest.Create(photo.ImageUri) as HttpWebRequest;

            return(Observable.Defer(
              Observable.FromAsyncPattern<WebResponse>(request.BeginGetResponse,
                request.EndGetResponse)));
          });

      var byteArrays =
        streams.Select(
          resp =>
          {
            HttpWebResponse webResponse = (HttpWebResponse)resp;

            MemoryStream ms = new MemoryStream();

            using (Stream s = resp.GetResponseStream())
            {
              s.CopyTo(ms);
            }
            ms.Seek(0, SeekOrigin.Begin);
            return (ms);
          });

      return (byteArrays);
    }
    Lazy<IObservable<Stream>> photoResults;
    FlickSearchState state;
    TimeSpan interval;
  }

and what it is trying to do is bring back pages of results from FlickR and throttle them down so that they don’t arrive too frequently but, instead, only arrive at most every this.interval period where that period is configurable by whoever instantiates the search instance in the first place.

However, in my Silverlight app I didn’t want to produce a new image every so many seconds. I wanted to produce one every time the user did a flick gesture on the screen.

My FlickrSearch class needed to be able to adapt so that rather than taking an interval and then plugging it in to Observable.Interval it could perhaps take an IObservable<T> and use that as part of the Zip operation which it makes use of for throttling.

I also realised that my previous code throttled really at the wrong point in that it did;

  1. Make new URI for FlickR search based on a page size and a page number
  2. Make HTTP request for that URI
  3. Parse Response into XML
  4. Split Response into individual photos
  5. Throttle
  6. Make HTTP request for each photo’s image
  7. Surface that image

and I figured I might move the throttling from step 5 here to after step 7 in order that a page of images is ready to go at pretty much the same time.

Here’s what I ended up with ( not too pretty – I’m sure there’s about 100 ways to do this better and/or correctly );

  class FlickrSearch<T>
  {
    public FlickrSearch(string searchText,
      IObservable<T> throttlingObservable)
    {
      this.state = new FlickSearchState(searchText);
      this.throttlingObservable = throttlingObservable;

      this.photoResults = new Lazy<IObservable<MemoryStream>>(
        MakePhotoResults);
    }
    public IObservable<MemoryStream> PhotoResults
    {
      get
      {
        return (this.photoResults.Value);
      }
    }
    IObservable<MemoryStream> MakePhotoResults()
    {
      // As new URIs are produced, we turn them into HttpWebResponses
      var webRequests =
        this.state.GetObservableSearchUris().SelectMany(
          uri =>
          {
            HttpWebRequest wr = WebRequest.Create(uri) as HttpWebRequest;

            return (Observable.Defer(Observable.FromAsyncPattern<WebResponse>(
              wr.BeginGetResponse, wr.EndGetResponse)));
          });

      // We take each of those responses and we load the XML back from it
      // as a page of data.
      var xmlResponses =
        webRequests.Select(
          wr =>
          {
            XElement xml = null;
            FlickrPhotoPageResult result = null;

            using (Stream stream = wr.GetResponseStream())
            {
              xml = XElement.Load(stream);
              result = new FlickrPhotoPageResult(xml);
            };
            return (result);
          });

      // We take that page of data and turn it into photograph instances.
      var photoResponses =
        xmlResponses.SelectMany(
          pageResponse =>
          {
            return (pageResponse.GetPhotos());
          });

      // We're happy to make the web requests in order to get the streams
      // containing the photo bits.
      var photoResponseStreams =
        photoResponses.SelectMany(
          photo =>
          {
            HttpWebRequest request = WebRequest.Create(photo.ImageUri) as HttpWebRequest;

            var response = Observable.FromAsyncPattern<WebResponse>(
              request.BeginGetResponse, request.EndGetResponse);

            return (Observable.Defer(response).Select(
              resp => Tuple.Create(photo, resp)));
          });

      // And produce a stream from it.
      var photoAndStreams =
        photoResponseStreams.Select(
          photoResp =>
          {
            HttpWebResponse webResponse = (HttpWebResponse)photoResp.Item2;

            MemoryStream ms = new MemoryStream();

            using (Stream s = webResponse.GetResponseStream())
            {
              s.CopyTo(ms);
            }
            ms.Seek(0, SeekOrigin.Begin);

            return (Tuple.Create(photoResp.Item1, ms));
          });

      // We always want the first one because the user has searched and
      // expects a result.
      var first =
        photoAndStreams.Take(1);

      // We also want the rest but we want to defer them until a later 
      // time.
      var rest =
        photoAndStreams.Skip(1);

      // We produce the first result followed by the remaining results
      // only when the "throttlingObservable" produces a value so the
      // remaining results won't come through until some timer tick
      // or event fires or whatever.
      var slowedDownResponses =
        first.Concat(
          rest.Zip(
            this.throttlingObservable,
            (photoResp, throttle) =>
            {
              return (photoResp);
            }));

      // We need to make sure that we request the next page if necessary
      // by updating the URI or that we signal the result-set is complete.
      var slowedDownLoadsNextPage =
        slowedDownResponses.Do(photoAndStream =>
        {
          // TODO: not sure about this having a side-effect which
          // can advance/end the observable at the "tail of the
          // chain" here.
          if (photoAndStream.Item1.IsLastImage)
          {
            this.state.Advance(end: true);
          }
          else if (photoAndStream.Item1.IsLastImageOnCurrentPage)
          {
            this.state.Advance(end: false);
          }
        });

      return (slowedDownLoadsNextPage.Select(
        photoAndStream => photoAndStream.Item2));
    }
    Lazy<IObservable<MemoryStream>> photoResults;
    FlickSearchState state;
    IObservable<T> throttlingObservable;
  }

and so what this is trying to do is to throttle back the web requests and the production of images such that we only move on to the next image when the IObservable<T> passed to the constructor produces some value.

I’m feeling that I need a lot more practise with Rx but I’ll get there one day.

What it’s also trying to do (in the first and rest variables, via Take(1) and Skip(1)) is make sure that we always produce the first image regardless, because if the user has done a search then they expect the first image to show up without having to do a flick gesture to see it.
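If the Take/Skip/Zip combination is hard to picture, here's the same shape written as a minimal Python sketch using plain iterators. To be clear about the assumptions: this is a pull-based analogy and not Rx, which is push-based, and the name gated and the sample values are made up for illustration. The first item comes through immediately and every later item is paired with one "tick" (a flick, a timer, whatever) before it's released.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def gated(items: Iterable[T], ticks: Iterable[object]) -> Iterator[T]:
    """Yield the first item straight away, then one item per tick.

    Mirrors first.Concat(rest.Zip(throttle, ...)) in spirit only; this is
    a hypothetical helper, not part of any library.
    """
    it = iter(items)
    first = islice(it, 1)                       # the "Take(1)" part
    # zip pulls from ticks before items, so an item is only consumed
    # once there is a tick available to release it.
    rest = (item for _tick, item in zip(ticks, it))
    return chain(first, rest)


if __name__ == "__main__":
    photos = ["photo1", "photo2", "photo3", "photo4"]
    flicks = ["flick", "flick"]                 # the user flicks twice
    print(list(gated(photos, flicks)))
```

With four photos and two flicks, three photos come out: the first one for free and one more per flick. The fourth is never requested from the source, which is the behaviour I was after with the observable version.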

With this in place, I clumsily tied together the Flick event from the ManipulationProcessor with an instance of FlickrSearch, first by turning the event into an observable;

      this._flickEvents =
        Observable
          .FromEvent<FlickEventArgs>(this._manipProcessor, "Flick");

and then tying that into the search such that the Flick event becomes the throttle for the production of photos;

        FlickrSearch<IEvent<FlickEventArgs>> search =
          new FlickrSearch<IEvent<FlickEventArgs>>(
            this.SearchText, this._flickEvents);

        this._searchSubscription =
          search
            .PhotoResults
            .ObserveOnDispatcher()
            .Subscribe(
            ms =>
            {
              BitmapImage image = new BitmapImage();
              image.SetSource(ms);
              this.CurrentImage = image;
            });

this all seems to hang together “reasonably well” and I also sync’d up the TwoFingerTap gesture to raise a search dialog box;

      this._manipProcessor.TwoFingerTap += OnTap;
    }
    void OnTap(object sender, TapEventArgs e)
    {
      this.ShowSearch = Visibility.Visible;
    }

which is simple enough and here’s a quick video of me using the thing and doing the two-finger-tap to raise the search box and then using a flick gesture to move to the next photos (I only support going forwards);

In conclusion – the touch support here seems pretty nice and combines the 2 different options of going with pre-determined gestures like flick or doing manipulation and inertia processing on your own object. I haven’t explored those parts of the APIs because I’m assuming it’s pretty similar to doing the work in WPF 4.

When would I use this though? I’d use it in situations where I’d already decided that I was going to make use of the Native Extensions. These come with a fairly big additional installation requirement, whereas there are steps that I can take in Silverlight 4 to make use of touch support (as discussed at length in those previous posts at the top of this article) without having to install anything or even leave the confines of a browser application. So I’d probably follow the path of “least dependency” and go with those options if I could, and go down the NESL path only if my app was already using NESL for other reasons.

Here’s the source-code if you want to have an experiment yourself. You’ll need the NESL 2 Preview bits and Rx to build it.