Mike Taulty's Blog
Bits and Bytes from Microsoft UK
Rx and Getting Paged Data from the MovieDb API

I haven’t written any code with the Reactive Extensions for a little while, so much so that I’d forgotten how much fun it can be.

I got a ping from Javanie on Twitter about some code that he was a bit blocked on. It needed to call the MovieDb API, and the particular problem was the way in which the API hands paged results back to the caller.

What was nice was that the code he sent me was already using RestSharp and he’d already created the classes needed to deserialize the results that come back from the service, so I only had to do a bit of work to get those pieces to plug nicely into Rx.

The code was intended to query the MovieDb API for its list of movie genres, which can be found by hitting the URL;

http://api.themoviedb.org/3/genre/list?api_key=MY_API_KEY

but you’ll need an API key to get proper results, which come back looking something like this;

{"genres":[{"id":28,"name":"Action"},{"id":12,"name":"Adventure"},{"id":16,"name":"Animation"},{"id":35,"name":"Comedy"},{"id":80,"name":"Crime"},{"id":105,"name":"Disaster"},{"id":99,"name":"Documentary"},{"id":18,"name":"Drama"},{"id":82,"name":"Eastern"},{"id":2916,"name":"Erotic"},{"id":10751,"name":"Family"},{"id":10750,"name":"Fan Film"},{"id":14,"name":"Fantasy"},{"id":10753,"name":"Film Noir"},{"id":10769,"name":"Foreign"},{"id":36,"name":"History"},{"id":10595,"name":"Holiday"},{"id":27,"name":"Horror"},{"id":10756,"name":"Indie"},{"id":10402,"name":"Music"},{"id":22,"name":"Musical"},{"id":9648,"name":"Mystery"},{"id":10754,"name":"Neo-noir"},{"id":1115,"name":"Road Movie"},{"id":10749,"name":"Romance"},{"id":878,"name":"Science Fiction"},{"id":10755,"name":"Short"},{"id":9805,"name":"Sport"},{"id":10758,"name":"Sporting Event"},{"id":10757,"name":"Sports Film"},{"id":10748,"name":"Suspense"},{"id":10770,"name":"TV movie"},{"id":53,"name":"Thriller"},{"id":10752,"name":"War"},{"id":37,"name":"Western"}]}

from there, the code wants to get hold of the “Action” genre, grab its Id and use it to form a query to get all the movies of that genre with a URL like;

http://api.themoviedb.org/3/genre/28/movies?api_key=MY_API_KEY

and that returns the first page of results for that particular genre;

{"id":28,"page":1,"results":[{"adult":false,"backdrop_path":"/hXOWCT2HjZSK7qM4Se7TBjbktZF.jpg","id":26842,"original_title":"The Message: The Story of Islam","release_date":"1977-03-09","poster_path":"/6zOWn1mtcSPfWekydQKbxnCyaTw.jpg","popularity":1.510080781875,"title":"The Message: The Story of Islam","vote_average":8.5,"vote_count":13},{"adult":false,"backdrop_path":"/6I3GKrpNGGFEXx7VhUZGMgpCqaV.jpg","id":3482,"original_title":"The Train","release_date":"1964-09-22","poster_path":"/idpWIwrJK5I8eiMkNYBddXvBlbW.jpg","popularity":1.05039008694455,"title":"The Train","vote_average":8.4,"vote_count":10},{"adult":false,"backdrop_path":"/dfSaFmcQBTu68JcTaNhaaSB0If1.jpg","id":11712,"original_title":"Tsubaki Sanjûrô","release_date":"1962-01-01","poster_path":"/fOP13eRzTGAvWo7fEiuWfrYlihC.jpg","popularity":0.625777660667093,"title":"Sanjuro","vote_average":8.4,"vote_count":15},{"adult":false,"backdrop_path":"/14qGZyC10dSWadsNuctFEM1QqHZ.jpg","id":832,"original_title":"M","release_date":"1931-05-11","poster_path":"/5jAaU3LStpQNlaBnZYsExMlvsBQ.jpg","popularity":3.83229663224416,"title":"M","vote_average":8.4,"vote_count":33},{"adult":false,"backdrop_path":"/jxIscmrDkgTEcYpKD9FePOoSLWk.jpg","id":118406,"original_title":"Gekijōban Naruto: Rōdo tu Ninja","release_date":"2012-07-28","poster_path":"/hELlIPO1u294ICcderb4PLG3NCk.jpg","popularity":1.731211250625,"title":"Naruto Shippuden the Movie: Road to Ninja","vote_average":8.3,"vote_count":12},{"adult":false,"backdrop_path":"/6XxsHC0Ty35lyL7nUJicE7vHMTP.jpg","id":11016,"original_title":"Key Largo","release_date":"1948-07-16","poster_path":"/iU9j8ACFKjTB1b7nTfVfLjMVGAe.jpg","popularity":2.07485774914062,"title":"Key Largo","vote_average":8.3,"vote_count":10},{"adult":false,"backdrop_path":"/z9qC0tssrzo88UKy0V8RCuZ12fj.jpg","id":15003,"original_title":"ช็อคโกแลต","release_date":"2008-02-06","poster_path":"/daFKjI9gZKcglJhmn5i2lz4PLA.jpg","popularity":1.578905725,"title":"Chocolate","vote_average":8.3,"vote_count":14},{"adult":false,"backdrop_path":"/qtxiemVTpPjDoueStK6fRliU46Z.jpg","id":13855,"original_title":"Chugyeogja","release_date":"2008-02-14","poster_path":"/6Zqbrb7y8FdwOS5zLTEKbEZyXxz.jpg","popularity":0.873446149678136,"title":"The Chaser","vote_average":8.2,"vote_count":19}],"total_pages":74,"total_results":1477}

I snipped out most of that JSON but the important bit is that it contains an array of results along with total_pages and total_results values. That makes it possible to figure out how many pages are required, and appending a page parameter to the query string requests a particular page.
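As a rough sketch of that paging arithmetic, the snippet below builds one URL per page from a total page count. The class and method names (PagedUrls, ForGenre) are made up for illustration, and the URL shape is simply the one shown above with a page parameter appended.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagedUrls
{
    // Builds one URL per page, 1..totalPages, by appending a "page" parameter
    // to the genre URL shown in the post. Hypothetical helper, not part of any API.
    public static IEnumerable<string> ForGenre(int genreId, int totalPages, string apiKey)
    {
        return Enumerable.Range(1, totalPages).Select(page =>
            string.Format(
                "http://api.themoviedb.org/3/genre/{0}/movies?api_key={1}&page={2}",
                genreId, Uri.EscapeDataString(apiKey), page));
    }

    public static void Main()
    {
        // Three pages for genre 28 ("Action"), using a placeholder key.
        foreach (var url in ForGenre(28, 3, "MY_API_KEY"))
        {
            Console.WriteLine(url);
        }
    }
}
```

With total_pages at 74, as in the response above, the same call would yield 74 URLs.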

What Javanie wanted from his code here was to bring back some details of each movie in this genre (dealing with the paged nature of the data). He handed me a bunch of really useful classes;

  public class Genre
  {
    public int Id { get; set; }
    public String Name { get; set; }
  }

  public class GenreCollection
  {
    public List<Genre> Genres { get; set; }
  }

  public class MovieCollection
  {
    public int Id { get; set; }
    public int Page { get; set; }
    public List<MovieResult> Results { get; set; }
    public int TotalPages { get; set; }
    public int TotalResults { get; set; }
  }

  public class MovieResult
  {
    public int Id { get; set; }
    public String BackdropPath { get; set; }
    public String OriginalTitle { get; set; }
    public String PosterPath { get; set; }
    public String ReleaseDate { get; set; }
    public String Title { get; set; }
    public Double VoteAverage { get; set; }
    public int VoteCount { get; set; }
  }

and so the task became one of using a bit of Rx to make a web request for the GenreCollection, find the “Action” genre within it, and then make some more web requests to bring back all the movies.

Something a little like;

1. Web request for the genre list
2. Extract the ID of the “Action” genre
3. Web request for the first page of the movies in the “Action” genre
4. Extract the total pages count
5. For each page, web request for that page
6. For each movie on that page, extract some detail of that movie

It’s very procedural when we hit the “loop” parts of that logic and the trick (if there is a trick) is to perhaps think of that looping in terms of sequences rather than loops.
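To make the "sequences rather than loops" idea a little more concrete, here is a minimal sketch that uses plain LINQ over in-memory data rather than Rx or the real service. The FakePages dictionary and the SequenceSketch class are invented stand-ins; the point is only that the two nested loops collapse into a Range followed by a SelectMany.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceSketch
{
    // Invented stand-in for the web service: page number -> titles on that page.
    static readonly Dictionary<int, string[]> FakePages = new Dictionary<int, string[]>
    {
        { 1, new[] { "M", "The Train" } },
        { 2, new[] { "Sanjuro", "Key Largo" } },
        { 3, new[] { "Chocolate" } }
    };

    // "For each page, for each movie on that page" expressed as a sequence:
    // page numbers 1..N are flattened into the titles found on each page.
    public static IEnumerable<string> AllTitles()
    {
        int totalPages = FakePages.Count;

        return Enumerable.Range(1, totalPages)
            .SelectMany(page => FakePages[page]);
    }

    public static void Main()
    {
        foreach (var title in AllTitles())
        {
            Console.WriteLine(title);
        }
    }
}
```

The Rx version has the same shape, except that each "page" is an asynchronous request and not a dictionary lookup.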

As is often the case on this blog, I wrote a bit of code to experiment with this. I didn’t spend a lot of time on it. It might not be quite right. “Buyer beware” as they say :-)

Step 1 – Making a Web Request in an Observable Way

The first thing I wanted to do was to use the RestClient from RestSharp in an observable way. Because it has methods that return Task, it’s fairly natural to link it up with Rx and produce an IObservable. I wanted a method that would hit a URL, deserialize the results into some T and then return a sequence of those values, and so I came up with;

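    // Namespaces needed by the snippets in this post: System,
    // System.Collections.Generic, System.Linq, System.Reactive.Linq,
    // System.Text and RestSharp.
    static readonly string baseUrl = @"http://api.themoviedb.org/3/";
    static readonly string apiKey = "get your own key :-)";

    static string MakeUrl(string insert)
    {
      return (string.Format("{0}?api_key={1}", insert, apiKey));
    }

    static IObservable<T> MakeObservableMovieDbRequest<T>(string url,
      params KeyValuePair<string,string>[] additionalParameters)
    {
      // FromAsync defers the work: the lambda below (and so the HTTP request)
      // only runs when something subscribes to the observable.
      var observable = 
        Observable.FromAsync<IRestResponse<T>>(
          () =>
          {
            RestClient client = new RestClient(baseUrl);

            StringBuilder parameterisedUrl = new StringBuilder(MakeUrl(url));

            foreach (var keyValuePair in additionalParameters)
            {
              // escape the value so that it is safe to put in a query string.
              parameterisedUrl.AppendFormat(@"&{0}={1}",
                keyValuePair.Key, Uri.EscapeDataString(keyValuePair.Value));
            }

            RestRequest request = new RestRequest(parameterisedUrl.ToString());

            return (client.ExecuteGetTaskAsync<T>(request));
          }
        );

      // the response wraps the deserialized object, so unwrap it here.
      var observableData = observable.Select(o => o.Data);

      return(observableData);
    }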

And then I could make use of this with something along the lines of;

      var genresCollection = MakeObservableMovieDbRequest<GenreCollection>(
        @"genre/list");

      // this observable feels like a sequence but it really only ever returns one
      // entry which then has a number of genres within it. we can then flatten 
      // that out here.
      var genres = genresCollection.SelectMany(collection => collection.Genres);

      genres.Subscribe(
        genre =>
        {
          Console.WriteLine(genre.Name);
        });

      Console.ReadLine();

One of the “interesting” parts of this piece of code is the question of when the HTTP request is sent to the service. That is, is it sent when MakeObservableMovieDbRequest is called at the top of the snippet, or is it sent when the subscription starts at the call to Subscribe? In Rx terminology that’s the difference between a “hot” observable and a “cold” observable.

In this case, without that call to Subscribe, no HTTP request is sent to the service. It’s produced “on demand”, which is probably what I want.
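The "on demand" behaviour can be seen in isolation with a small sketch. It assumes the System.Reactive NuGet package is referenced, and it swaps the real HTTP call for a counter (ColdDemo, FakeRequest and RequestsStarted are all invented names) so that the moment at which the work starts is visible.

```csharp
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class ColdDemo
{
    // Counts how many times the "request" has actually been started.
    public static int RequestsStarted = 0;

    // Invented stand-in for an HTTP request wrapped with Observable.FromAsync.
    public static IObservable<string> FakeRequest()
    {
        return Observable.FromAsync(() =>
        {
            RequestsStarted++;
            return Task.FromResult("response");
        });
    }

    public static void Main()
    {
        var request = FakeRequest();
        Console.WriteLine(RequestsStarted); // 0: building the observable starts nothing

        request.Subscribe(value => Console.WriteLine(value));
        Console.WriteLine(RequestsStarted); // 1: the work ran on Subscribe

        request.Subscribe(value => Console.WriteLine(value));
        Console.WriteLine(RequestsStarted); // 2: each subscription runs it again
    }
}
```

The last line is worth noting: because the observable is cold, every new subscriber triggers a fresh request, which matters if the same observable is subscribed to more than once.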

Step 2 – Getting to the “Action” Genre and Figuring out Page Counts

Having got the ability to make a web request, I can now call the API to get hold of the id of the genre relating to “Action” movies and I can then grab the first page of data back from that web request to figure out how many pages of data there are in total.

That is, I can write code such as;

      var genresCollection = MakeObservableMovieDbRequest<GenreCollection>(
        @"genre/list");

      // this observable feels like a sequence but it really only ever returns one
      // entry which then has a number of genres within it. we can then flatten 
      // that out here.
      var genres = genresCollection.SelectMany(collection => collection.Genres);

      // likely to only be one of these in reality.
      var actionGenres = genres.Where(genre => genre.Name == "Action");

      // this should represent the first page of data for any genre that
      // identifies itself as "Action"
      var actionGenreMovies = actionGenres.SelectMany(
        actionGenre =>
        {
          string url = string.Format(@"genre/{0}/movies", actionGenre.Id);

          return (MakeObservableMovieDbRequest<MovieCollection>(url));
        }
      );

      actionGenreMovies.Subscribe(
        movies =>
        {
          Console.WriteLine(movies.TotalPages);
        });

and so now we’ve managed to produce a sequence of MovieCollection (probably a sequence of 1), one for the first page of each genre that identifies itself as “Action”, and the subscription prints the total number of pages of data available for each.

Step 3 – Producing a Sequence of Page Numbers

It takes a bit of head-scratching to conjure up a way to go from this sequence containing the total number of pages to a sequence of HTTP calls which return pages of data, but it’s not too bad. We have functions like Enumerable.Range() (or Observable.Range()) which can produce a sequence of integers for us, so that could be used to produce page numbers 1…N where N is the total number of pages available.

If we work that way then we can produce all the pages of the result set by going back to the service again as below;

      var genresCollection = MakeObservableMovieDbRequest<GenreCollection>(
        @"genre/list");

      // this observable feels like a sequence but it really only ever returns one
      // entry which then has a number of genres within it. we can then flatten 
      // that out here.
      var genres = genresCollection.SelectMany(collection => collection.Genres);

      // likely to only be one of these in reality.
      var actionGenres = genres.Where(genre => genre.Name == "Action");

      // this should represent the first page of data for any genre that
      // identifies itself as "Action"
      var actionGenreMovies = actionGenres.SelectMany(
        actionGenre =>
        {
          string url = string.Format(@"genre/{0}/movies", actionGenre.Id);

          return (MakeObservableMovieDbRequest<MovieCollection>(url));
        }
      );

      // produce a sequence of page numbers 1..N but don't lose the genre
      // data, pass it along as part of the sequence.
      var genrePagesToRequest = actionGenreMovies.SelectMany(
        genreData =>
        {
          var numberSequence = Enumerable.Range(1, genreData.TotalPages).ToObservable();

          return (numberSequence.Select(
            n => new 
              { 
                Page = n, 
                GenreInfo = genreData 
              }
            ));
        }
      );

and subscribing to that sequence and printing each entry produces a nice output to the console of (1, N), (2, N), (3, N) … (N, N), with the next step being to produce a web request for each of those pages in order to get the actual movie data.
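The listing above stops short of the Subscribe call that prints those pairs, so here is a cut-down sketch of the same step. It assumes the System.Reactive NuGet package, replaces the web request with Observable.Return(3) as a pretend total page count, and uses a Tuple in place of the anonymous type; PageNumbers and PagesFor are hypothetical names.

```csharp
using System;
using System.Reactive.Linq;

public static class PageNumbers
{
    // For each total page count N in the input, produce (1, N), (2, N) ... (N, N).
    public static IObservable<Tuple<int, int>> PagesFor(IObservable<int> totalPages)
    {
        return totalPages.SelectMany(total =>
            Observable.Range(1, total).Select(n => Tuple.Create(n, total)));
    }

    public static void Main()
    {
        // Pretend the first page of results reported three pages in total.
        PagesFor(Observable.Return(3))
            .Subscribe(p => Console.WriteLine("({0}, {1})", p.Item1, p.Item2));
    }
}
```

Run as is, this prints (1, 3), (2, 3) and (3, 3), one per line.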

Step 4 – Producing a Sequence of Page HTTP Requests

The next job is to turn that previous sequence of page numbers into a sequence of results from HTTP requests, one per page. That’s not too difficult and just means adding another step to the pipeline;

      var genresCollection = MakeObservableMovieDbRequest<GenreCollection>(
        @"genre/list");

      // this observable feels like a sequence but it really only ever returns one
      // entry which then has a number of genres within it. we can then flatten 
      // that out here.
      var genres = genresCollection.SelectMany(collection => collection.Genres);

      // likely to only be one of these in reality.
      var actionGenres = genres.Where(genre => genre.Name == "Action");

      // this should represent the first page of data for any genre that
      // identifies itself as "Action"
      var actionGenreMovies = actionGenres.SelectMany(
        actionGenre =>
        {
          string url = string.Format(@"genre/{0}/movies", actionGenre.Id);

          return (MakeObservableMovieDbRequest<MovieCollection>(url));
        }
      );

      // produce a sequence of page numbers 1..N but don't lose the genre
      // data, pass it along as part of the sequence.
      var genrePagesToRequest = actionGenreMovies.SelectMany(
        genreData =>
        {
          var numberSequence = Enumerable.Range(1, genreData.TotalPages).ToObservable();

          return (numberSequence.Select(
            n => new 
              { 
                Page = n, 
                GenreInfo = genreData 
              }
            ));
        }
      );

      // take each one of those entries and produce an HTTP request which goes and
      // gets the data for that particular page's movies.
      var movieDetails = genrePagesToRequest.SelectMany(
          pageInfo =>
          {
            string url = string.Format(@"genre/{0}/movies", pageInfo.GenreInfo.Id);

            var parameter = new KeyValuePair<string, string>("page",
              pageInfo.Page.ToString());

            var request = MakeObservableMovieDbRequest<MovieCollection>(url,
              parameter);

            return (request.SelectMany(collection => collection.Results));
          }
        );

      movieDetails.Subscribe(
          movie =>
          {
            Console.WriteLine("Movie {0}", movie.Title);
          }
        );

and so now we end up with a sequence of MovieResult, pulled from each web request that we make to the service, with one request bringing back multiple movie results.

The only downside of this is that it’s a bit haphazard. There’s a bunch of concurrent work going on in there. For instance, if I set a breakpoint on the Console.WriteLine call inside the subscription above and then take a look at what’s going on in Fiddler, I see;

[image: Fiddler showing the HTTP requests that have been sent]

that’s quite a lot of HTTP requests! The order in which the responses come back is also somewhat random. For instance, each time I run this code I can find a different movie returned as the first movie, which might not be what I’m looking for.
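If the number of simultaneous requests is the main worry (rather than the ordering), one option that isn't part of the code in this post is the Merge overload that takes a maximum concurrency. The sketch below assumes the System.Reactive NuGet package and uses an invented FakePage method with a short delay in place of the real page request, tracking how many are in flight at once.

```csharp
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class BoundedPaging
{
    static readonly object gate = new object();
    static int inFlight = 0;

    // Highest number of fake requests seen running at the same time.
    public static int MaxInFlight = 0;

    // Invented stand-in for one page request; takes roughly 50ms.
    static IObservable<string> FakePage(int page)
    {
        return Observable.FromAsync(async () =>
        {
            lock (gate)
            {
                inFlight++;
                if (inFlight > MaxInFlight) MaxInFlight = inFlight;
            }

            await Task.Delay(50);

            lock (gate) { inFlight--; }

            return "page " + page;
        });
    }

    // Select builds a sequence of cold requests; Merge(maxConcurrent) subscribes
    // to at most maxConcurrent of them at any one time.
    public static IObservable<string> FetchAll(int totalPages, int maxConcurrent)
    {
        return Observable.Range(1, totalPages)
            .Select(page => FakePage(page))
            .Merge(maxConcurrent);
    }

    public static void Main()
    {
        var results = FetchAll(10, 2).ToList().Wait();

        Console.WriteLine(results.Count); // 10
        Console.WriteLine(MaxInFlight);   // never more than 2
    }
}
```

This caps the load on the service but still delivers pages in whatever order they complete, so it does not fix the ordering problem on its own.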

Step 5 – Keeping the Pages in Sequence

Back in that previous code snippet, in the statement that assigns movieDetails, I’m using the SelectMany operator to go from a sequence of anonymous types (containing a page number and a MovieCollection) to a sequence of MovieResults. The way in which that is done, though, is to produce an IObservable<MovieResult> for each element of the input sequence, and each of those IObservable<MovieResult> is produced by making an asynchronous web request to get that particular page of details from the web service.

The order in which those web requests complete is not deterministic, so it’s possible for the second web request to complete before the first, in which case movies from the second page of results are produced before movies from the first page, and so on.

I think that’s the nature of the SelectMany operator – it takes an IObservable<T> and for each T it runs some function to produce an IObservable<U> which means that you ultimately end up with a set of IObservable<U> and values of type U are going to be produced by those observables as and when they are available.
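A small experiment makes that behaviour easier to see. It assumes the System.Reactive NuGet package and uses an invented FakePage method where page 1 is deliberately slower than page 2, so the values usually come out of SelectMany in arrival order and not in page order.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class ArrivalOrderDemo
{
    // Invented stand-in for a page request that completes after delayMs.
    static IObservable<string> FakePage(int page, int delayMs)
    {
        return Observable.FromAsync(async () =>
        {
            await Task.Delay(delayMs);
            return "page " + page;
        });
    }

    // Both fake requests start straight away; SelectMany forwards each value
    // as soon as its inner observable produces it.
    public static IList<string> Flattened()
    {
        return Observable.Range(1, 2)
            .SelectMany(page => FakePage(page, page == 1 ? 300 : 50))
            .ToList()
            .Wait();
    }

    public static void Main()
    {
        // Timing dependent, but "page 2" will usually be listed first.
        Console.WriteLine(string.Join(", ", Flattened()));
    }
}
```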

If I replace that call to SelectMany with a call to Select then things are a little different, as below;

      var movieDetails = genrePagesToRequest.Select(
          pageInfo =>
          {
            string url = string.Format(@"genre/{0}/movies", pageInfo.GenreInfo.Id);

            var parameter = new KeyValuePair<string, string>("page",
              pageInfo.Page.ToString());

            var request = MakeObservableMovieDbRequest<MovieCollection>(url,
              parameter);

            return (request.SelectMany(collection => collection.Results));
          }
        );

the type of movieDetails is now IObservable<IObservable<MovieResult>>; it’s a sequence of sequences because it is no longer being flattened for me by SelectMany.

What that means is that I can use some other operator to combine these sequences. For instance, the Concat operator preserves ordering, so if I write something like the code below (including the entire code again);

      var genresCollection = MakeObservableMovieDbRequest<GenreCollection>(
        @"genre/list");

      // this observable feels like a sequence but it really only ever returns one
      // entry which then has a number of genres within it. we can then flatten 
      // that out here.
      var genres = genresCollection.SelectMany(collection => collection.Genres);

      // likely to only be one of these in reality.
      var actionGenres = genres.Where(genre => genre.Name == "Action");

      // this should represent the first page of data for any genre that
      // identifies itself as "Action"
      var actionGenreMovies = actionGenres.SelectMany(
        actionGenre =>
        {
          string url = string.Format(@"genre/{0}/movies", actionGenre.Id);

          return (MakeObservableMovieDbRequest<MovieCollection>(url));
        }
      );

      // produce a sequence of page numbers 1..N but don't lose the genre
      // data, pass it along as part of the sequence that comes out of
      // this operator.
      var genrePagesToRequest = actionGenreMovies.SelectMany(
        genreData =>
        {
          var numberSequence = Observable.Range(1, genreData.TotalPages);

          return (numberSequence.Select(
            n => new 
              { 
                Page = n, 
                GenreInfo = genreData 
              }
            ));
        }
      );

      // take each one of those entries and produce an HTTP request which goes and
      // gets the data for that particular page's movies.
      var movieDetails = genrePagesToRequest.Select(
          pageInfo =>
          {
            string url = string.Format(@"genre/{0}/movies", pageInfo.GenreInfo.Id);

            var parameter = new KeyValuePair<string, string>("page",
              pageInfo.Page.ToString());

            var request = MakeObservableMovieDbRequest<MovieCollection>(url,
              parameter);

            return (request.SelectMany(collection => collection.Results));
          }
        );

      var concatenatedMovieDetails = Observable.Concat(movieDetails);

      concatenatedMovieDetails.Subscribe(
          movie =>
          {
            Console.WriteLine("Movie {0}", movie.Title);
          }
        );


      Console.ReadLine();

Then the movies are consumed in the order in which they appear in the web service results starting at Page 1, Movie 1 and only moving on to Page 2 when we’ve exhausted Page 1.

This also does away with the huge number of HTTP requests that were going out to the web concurrently. What I now see is requests going out as they are needed. If I put a breakpoint on the Console.WriteLine call inside the subscription above then I see in Fiddler;

[image: Fiddler showing the HTTP requests that have been sent so far]

and the request for page 2 goes out once page 1 has been consumed.
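The same "one request at a time" behaviour can be reproduced without Fiddler. This sketch assumes the System.Reactive NuGet package and logs when each invented FakePage request starts and when its result is consumed (LazyConcat and Log are hypothetical names).

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class LazyConcat
{
    // Records the order in which requests start and results are consumed.
    public static readonly List<string> Log = new List<string>();

    // Invented stand-in for one page request.
    static IObservable<int> FakePage(int page)
    {
        return Observable.FromAsync(async () =>
        {
            lock (Log) { Log.Add("request " + page); }
            await Task.Delay(20);
            return page;
        });
    }

    public static void Run()
    {
        Observable.Range(1, 3)
            .Select(page => FakePage(page))   // cold requests, nothing sent yet
            .Concat()                         // subscribe to one inner sequence at a time
            .Do(page => { lock (Log) { Log.Add("result " + page); } })
            .Wait();                          // block until the whole sequence completes
    }

    public static void Main()
    {
        Run();
        Console.WriteLine(string.Join(", ", Log));
    }
}
```

The log comes out as request 1, result 1, request 2, result 2, request 3, result 3, because Concat only subscribes to the next inner observable once the previous one has completed.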

Summary

I’m sure that I did some things wrong but I had a lot of fun putting together that little bit of code. Rx is such a great framework to work with – I wish it was more built into the .NET framework pieces as I’d love to see it show up in app development for Windows/Phone – e.g. rather than firing an event from the UI why not just have a sequence of data flowing from the UI. Naturally, you can always plug Rx in but it’d be great to see it natively part of more framework pieces.


Posted Wed, Feb 5 2014 3:23 PM by mtaulty

Comments

James World wrote re: Rx and Getting Paged Data from the MovieDb API
on Mon, Feb 10 2014 2:09 PM

Hey Mike, interesting post... I had some thoughts on it: www.zerobugbuild.com