Tuesday, August 9, 2011

Use WebClient to decompress a GZip'ed JSON response from a cross-domain API

I frequently use jQuery to access external APIs. How you access the data varies from API to API, and some external APIs do not allow cross-domain requests for JSON. A common way around this is to request JSONP, if the API supports it.

Take Twitter's API, for example: it returns JSON from a jQuery .getJSON() request very nicely.

Stack Overflow's API does not respond nicely to a jQuery .getJSON() request, because it is picky about cross-domain requests. However, it gives developers the option to request GZip'ed JSONP instead of GZip'ed JSON, which does work with a jQuery .getJSON() request. The fact that we know to expect GZip'ed data from the API will become important later.

Dave Ward's method of requesting cross-domain API data through an ASP.NET "Generic" HttpHandler is a more reliable workaround that does not depend on JSONP. The handler runs on your own server and retrieves JSON from any API using System.Net.WebClient, so the browser only ever makes a same-domain request. His example of requesting the latest tweets from a Twitter user works very well, allows for caching the results, and has other added benefits.

If you apply Dave's method to Stack Overflow's API, for example, you can request JSON through the HttpHandler without resorting to JSONP. Retrieving the data works, but now we are dealing with GZip'ed JSON data. System.Net.WebClient's DownloadString() method does not decompress GZip'ed data by default, so the handler would pass along unreadable text and your .getJSON() call would fail.

I have built on Dave Ward's method of requesting JSON data by also decompressing the GZip'ed response (please note, this is only intended for API responses that you know will be GZip'ed):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Net;
using System.IO;
using System.IO.Compression;

namespace HttpHandler_Proxy.StackoverflowAPI
{
    public class UserData : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            WebClient stack = new WebClient();

            stack.Headers["Accept-Encoding"] = "gzip";  

            context.Response.ContentType = "application/json";

            string Id = context.Request.QueryString["Id"];

            if (string.IsNullOrWhiteSpace(Id))
                Id = "285714";

            object userDataCache = context.Cache["stackUserData-" + Id];
            if (userDataCache != null)
            {
                string cachedUserData = userDataCache.ToString();
                context.Response.Write(cachedUserData);
                return;
            }

            string baseUrl = "http://api.stackoverflow.com/1.1/";
            
            string request = "users/" + Id;

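            // Download the raw (still GZip'ed) bytes, decompress them, then decode the JSON text.
            // JSON is normally UTF-8; decoding it as ASCII would turn non-ASCII characters into '?'.
            byte[] gzippedResponse = stack.DownloadData(baseUrl + request);
            byte[] decompressedResponse = Decompress(gzippedResponse);
            string userData = System.Text.Encoding.UTF8.GetString(decompressedResponse);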

            context.Cache.Add("stackUserData-" + Id, userData,
              null, DateTime.Now.AddMinutes(5),
              System.Web.Caching.Cache.NoSlidingExpiration,
              System.Web.Caching.CacheItemPriority.Normal,
              null);

            context.Response.Write(userData);
        }

        // Method to decompress byte array containing gzipped data
        //   borrowed from: http://www.dotnetperls.com/decompress-web-page
        static byte[] Decompress(byte[] gzip)
        {
            using (GZipStream stream = new GZipStream(new MemoryStream(gzip),
                                  CompressionMode.Decompress))
            {
                const int size = 4096;
                byte[] buffer = new byte[size];
                using (MemoryStream memory = new MemoryStream())
                {
                    int count = 0;
                    do
                    {
                        count = stream.Read(buffer, 0, size);
                        if (count > 0)
                        {
                            memory.Write(buffer, 0, count);
                        }
                    }
                    while (count > 0);
                    return memory.ToArray();
                }
            }
        }

        public bool IsReusable
        {
            get
            {
                return false;
            }
        }
    }
}
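
Because the handler is only intended for responses that are known to be GZip'ed, one possible safeguard is to look at the first two bytes before calling Decompress(). A GZip stream starts with the magic number 0x1F 0x8B (RFC 1952). The sketch below is a minimal illustration under that assumption; the GZipSniffer class and LooksGZipped method are hypothetical names and are not part of the handler above.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original handler.
public static class GZipSniffer
{
    // Returns true when the buffer starts with the GZip magic number 0x1F 0x8B.
    public static bool LooksGZipped(byte[] data)
    {
        return data != null
            && data.Length >= 2
            && data[0] == 0x1F
            && data[1] == 0x8B;
    }

    public static void Main()
    {
        byte[] gzipHeader = new byte[] { 0x1F, 0x8B, 0x08 };
        byte[] plainJson = Encoding.UTF8.GetBytes("{}");

        Console.WriteLine(LooksGZipped(gzipHeader)); // True
        Console.WriteLine(LooksGZipped(plainJson));  // False
    }
}
```

In ProcessRequest, such a check could decide whether to pass the downloaded bytes through Decompress() or to decode them directly.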

The following snippets are what I added to Dave Ward's code in order to decompress GZip'ed results:

The Decompress() method relies on System.IO and System.IO.Compression:
using System.IO;
using System.IO.Compression;

Set the Accept-Encoding header to accept the Stack Overflow API's GZip'ed results:
stack.Headers["Accept-Encoding"] = "gzip";

Our request to the Stack Overflow API becomes a 3-step process: we store the GZip'ed result in a byte array, decompress it, and then turn it into a string:
byte[] gzippedResponse = stack.DownloadData(baseUrl + request);
byte[] decompressedResponse = Decompress(gzippedResponse);
// JSON is normally UTF-8; decoding it as ASCII would turn non-ASCII characters into '?'
string userData = System.Text.Encoding.UTF8.GetString(decompressedResponse);
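
As an alternative to the three manual steps, the .NET Framework can decompress the response itself when HttpWebRequest.AutomaticDecompression is set on the underlying request. The following is a minimal sketch of that approach, not what the handler above does. GZipWebClient is a hypothetical class name, and the sketch has not been verified against the Stack Overflow API.

```csharp
using System;
using System.Net;

// Hypothetical WebClient subclass, not part of the original handler.
public class GZipWebClient : WebClient
{
    // WebClient calls this for every request it makes, which gives us
    // a place to switch on automatic decompression.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);

        // Only HTTP(S) requests expose AutomaticDecompression.
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }

        return request;
    }
}
```

With a client like this, DownloadString() should return the decompressed text directly, and setting the Accept-Encoding header by hand should no longer be necessary because the framework adds it when AutomaticDecompression is set.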

The method that decompresses a byte array of GZip'ed data was borrowed from: http://www.dotnetperls.com/decompress-web-page
static byte[] Decompress(byte[] gzip)
{
   using (GZipStream stream = new GZipStream(new MemoryStream(gzip),
                          CompressionMode.Decompress))
   {
      const int size = 4096;
      byte[] buffer = new byte[size];
      using (MemoryStream memory = new MemoryStream())
      {  
         int count = 0;
         do
         {
            count = stream.Read(buffer, 0, size);
            if (count > 0)
            {
               memory.Write(buffer, 0, count);
            }
         }
         while (count > 0);
         return memory.ToArray();
      }
   }
}
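
To try Decompress() without calling a live API, the read loop can be exercised with a round trip in a small console program. This is only a sketch: the Compress() helper, the GZipRoundTrip class and the sample JSON string are made up for illustration, and Decompress() is copied here so that the example is self-contained.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipRoundTrip
{
    // Hypothetical helper: GZip-compresses a string after encoding it as UTF-8.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream memory = new MemoryStream())
        {
            // The GZipStream has to be disposed before reading the result,
            // otherwise the compressed data is not completely flushed.
            using (GZipStream gzip = new GZipStream(memory, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return memory.ToArray();
        }
    }

    // Same read loop as the Decompress() method shown above.
    public static byte[] Decompress(byte[] gzip)
    {
        using (GZipStream stream = new GZipStream(new MemoryStream(gzip),
                                  CompressionMode.Decompress))
        using (MemoryStream memory = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int count;
            while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                memory.Write(buffer, 0, count);
            }
            return memory.ToArray();
        }
    }

    public static void Main()
    {
        // Made-up sample payload with a non-ASCII character.
        string json = "{\"name\":\"Zo\u00eb\"}";

        byte[] compressed = Compress(json);
        string roundTripped = Encoding.UTF8.GetString(Decompress(compressed));

        Console.WriteLine(roundTripped == json); // True
    }
}
```

The sample string contains a non-ASCII character on purpose: decoding the decompressed bytes as ASCII instead of UTF-8 would not reproduce the original text.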