Parsing a GeoRSS Atom feed using XML to LINQ in Silverlight

…or how to put a map of your blogposts on your blog.

I recently started a little pet project with a “photo a day” blog. I thought it could be fun to geocode each blogpost with to the place where each photograph was taken, and then place each photo on a map, similar to the flickr map I created earlier.

The most common way of geocoding an atom feed, is by adding a <georss:point>[latitude] [longitude]</georss:point> field for each entry. However, the blogengine I’m using currently doesn’t support that, so I will also try and find the location using a magic string in the blogpost, in this case “Location: [latitude] [longitude]”. You will notice that the posts I made so far, all have this at the end of the post. This might not be the most elegant solution to geocoding your blogposts, but it should work for any type of blog.

So the first step is to create a WebRequest that will download the feed (this could be simplified by using the WebClient, but I like the full control of the WebRequest):

System.Net.WebRequest request = System.Net.WebRequest.Create(
     new Uri("http://socaldaily.sharpgis.net/syndication.axd?format=atom", UriKind.Absolute));
request.BeginGetResponse(new AsyncCallback(createRssRequest), new object[] { request, this });

Begin get response handler:

private static void createRssRequest(IAsyncResult asyncRes)
{
      object[] state = (object[])asyncRes.AsyncState;
      System.Net.HttpWebRequest httpRequest = (System.Net.HttpWebRequest)state[0];
      Page page = (Page)state[1];
 
      if (!httpRequest.HaveResponse) { return; }
 
      System.Net.HttpWebResponse httpResponse = (System.Net.HttpWebResponse)httpRequest.EndGetResponse(asyncRes);
      if (httpResponse.StatusCode != System.Net.HttpStatusCode.OK) { return; }
      Stream stream = httpResponse.GetResponseStream();
      page.Dispatcher.BeginInvoke(() => { page.ParseAtomUsingLinq(stream); });
}

When the request comes back, it will call our ParseAtomUsingLinq method with the response stream. The LINQ expression will select basic parameters like title, date, link, contents and if available, the georss location point.

private void ParseAtomUsingLinq(System.IO.Stream stream)
{
    System.Xml.Linq.XDocument feedXML = System.Xml.Linq.XDocument.Load(stream);
    System.Xml.Linq.XNamespace xmlns = "http://www.w3.org/2005/Atom"; //Atom namespace
    System.Xml.Linq.XNamespace georssns = "http://www.georss.org/georss"; //GeoRSS Namespace
 
    //Use LINQ to select all entries
    var posts = from item in feedXML.Descendants(xmlns + "entry")
                select new 
                {
                    Title = item.Element(xmlns + "title").Value,
                    Published = DateTime.Parse(item.Element(xmlns + "updated").Value),
                    Url = item.Element(xmlns + "link").Attribute("href").Value,
                    Description = item.Element(xmlns + "summary").Value,
                    Location = fromGeoRssPoint(item.Element(georssns + "point"))  //Simple GeoRSS <georss:point>X Y</georss.point>
                };
     foreach (var post in postsOrdered)
     {
          ESRI.ArcGIS.Geometry.MapPoint point = null;
          if (post.Location != null)
          {
               point = post.Location; //Use GeoRSS location
          }
          else
          {
               point = ExtractLocation(post.Description); //Search for location in blog content
          }
          if (point == null) continue; //We didn't find a point
          string imageSrc = ExtractImageSource(post.Description, post.Url); //try and find an image to use for symbol
          //TODO: Add points to map...
     }
}

You will notice in the above code we use three utility methods for extracting data. First we have a method that converts the simple GeoRSS format “<georss:point>X Y</georss.point>” to a point. In this case I’m using the MapPoint class from the ESRI ArcGIS Silverlight API, since I wan’t to use that to draw my entries on the map.

private ESRI.ArcGIS.Geometry.MapPoint fromGeoRssPoint(System.Xml.Linq.XElement elm)
{
    if (elm == null) return null;
    string val = elm.Value;
    string[] vals = val.Split(new char[] { ' ' });
    if (vals.Length != 2) return null;
    double x = double.NaN;
    double y = double.NaN;
    if (double.TryParse(vals[1], out x) && double.TryParse(vals[0], out y))
    {
        return new ESRI.ArcGIS.Geometry.MapPoint(x, y);
    }
    return null;
}

If this doesn’t return any results (usually if the feed is not georss enabled), we will during the loop look for the location tag in the entry content, using the following helper method:

private ESRI.ArcGIS.Geometry.MapPoint ExtractLocation(string description)
{    
    int idx = description.LastIndexOf("Location: ");
    int idx2 = description.LastIndexOf("</p>");
    double x = double.NaN;
    double y = double.NaN;
    if (idx < idx2 && idx > -1)
    {
        string sub = description.Substring(idx, idx2 - idx);
        string[] vals = sub.Split(new char[] { ' ' });
        foreach (string val in vals)
        {
            if (val[0] == 'N')
            {
                double.TryParse(val.Substring(1), out y);
            }
            else if (val[0] == 'S')
            {
                if (double.TryParse(val.Substring(1), out y)) y *= -1;
            }
            else if (val[0] == 'E')
            {
                double.TryParse(val.Substring(1), out x);
            }
            else if (val[0] == 'W')
            {
                if (double.TryParse(val.Substring(1), out x)) x *= -1;
            }
        }
        if (!double.IsNaN(x) && !double.IsNaN(y))
        {
            return new ESRI.ArcGIS.Geometry.MapPoint(x, y);
        }
    }
    return null;
}

Lastly, we will search for the first <img src=”…”/> entry in the body and use that for displaying the entries on the map:

private string ExtractImageSource(string description, string feedlink)
{
    int idxSrc = description.IndexOf("<img ");
    if (idxSrc >= 0)
    {
        int idxSrc2 = description.Substring(idxSrc).IndexOf("src=\"");
        int idxSrc3 = description.Substring(idxSrc + idxSrc2 + 5).IndexOf("\"");
        if (idxSrc2 >= 0 && idxSrc3 >= 0)
        {
            string src = description.Substring(idxSrc + idxSrc2 + 5, idxSrc3);
            Uri uri = new Uri(src, UriKind.RelativeOrAbsolute);
            if (uri.IsAbsoluteUri) return uri.AbsoluteUri;
            Uri page = new Uri(feedlink, UriKind.Absolute);
            return new UriBuilder(page.Scheme, page.Host, page.Port, src).Uri.AbsoluteUri;
        }
    }
    return null;
}

In our feed entries loop, we can now simply construct a graphic, set a few attributes that we will use for binding the look for the symbol (ie title and image) and add it to our graphics layer. The symbol I use here is the same as used for the flickr application mentioned above.

ESRI.ArcGIS.Graphic g = new ESRI.ArcGIS.Graphic()
{
    Geometry = point,
    Symbol = myFeedSymbol
};
g.Attributes.Add("Title", title);
g.Attributes.Add("ImageURI", src);
g.Attributes.Add("WebURI", link);
myGraphicsLayer.Graphics.Add(g);

You can view the blog map in action here: http://socaldaily.sharpgis.net/page/Photo-map.aspx

Note that there are several ways to geocode blogposts, and this approach only deals with the simplest version.

The Microsoft Live Maps and Google Maps projection

I have lately seen several blogposts confused about which datum and projection Microsoft’s Live Maps (Virtual Earth) and Google Maps use. As most people already know by now, they render the round earth onto a flat screen using a Mercator projection.

I think the confusion comes from that they actually use two spatial reference systems at the same time:

  1. Geographic  Longitude/Latitude coordinatesystem based on the standard WGS84 datum.
  2. Mercator projection using a datum based on WGS84, BUT modified to be spheric.

So when is what used?

The Javascript API’s use (1) as input when you want to add points, lines and polygons. That is, they expect you to input any geometry in geographical coordinates, and click events etc. will also return geometry in this spatial reference. This is the coordinate system most javascript developers will use. The API will automatically project it to the spheric mercator projection.

If you want to create image overlays, or roll your own tile server on top of the map, you will need to project your images into a spheric mercator projection. The JavaScript APIs are not able to do this for you.

Here’s a bit of facts about the two projections:

The valid range of (1) is: [-180,-85.05112877980659] to [180, 85.05112877980659].

The valid range of (2) is: [-20037508.3427892, -20037508.3427892] to [20037508.3427892, 20037508.3427892]

Well-known Text for (1):
GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]]

Well-known Text for (2):
PROJCS["Mercator Spheric", GEOGCS["WGS84basedSpheric_GCS", DATUM["WGS84basedSpheric_Datum", SPHEROID["WGS84based_Sphere", 6378137, 0], TOWGS84[0, 0, 0, 0, 0, 0, 0]], PRIMEM["Greenwich", 0, AUTHORITY["EPSG", "8901"]], UNIT["degree", 0.0174532925199433, AUTHORITY["EPSG", "9102"]], AXIS["E", EAST], AXIS["N", NORTH]], PROJECTION["Mercator"], PARAMETER["False_Easting", 0], PARAMETER["False_Northing", 0], PARAMETER["Central_Meridian", 0], PARAMETER["Latitude_of_origin", 0], UNIT["metre", 1, AUTHORITY["EPSG", "9001"]], AXIS["East", EAST], AXIS["North", NORTH]]

Proj.4 definition for (1):
+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs

Proj.4 definition for (2) (see here for an explanation of the weird ’nadgrids’ parameter):
+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs

So why this weird extent for the latitude? First of all, the poles in the Mercator projection extends towards infinity, so at some point they have to cut them off (and who cares about ice anyway - apart from that its melting). If you look at (2) these latitude/longitude values projected into the spheric mercator results in a perfect square, fitting perfectly with squared image tiles, that are simple to sub-divide over and over again, as you zoom in. I expect the reason for the spheric datum is for simplicity and perfomance when reprojecting points from longitude/latitude to screen coordinates. Charlie Savage also has a more mathematical approach to deriving these values.

For a quick introduction to projections, coordinate systems and datums see here.

If you want to know more about how these mapping api's work, keep an eye on Jayant's blog.

Update: We now have an offical EPSG code for the projection. See details here.