Parsing a GeoRSS Atom feed using XML to LINQ in Silverlight

…or how to put a map of your blogposts on your blog.

I recently started a little pet project with a “photo a day” blog. I thought it could be fun to geocode each blogpost with to the place where each photograph was taken, and then place each photo on a map, similar to the flickr map I created earlier.

The most common way of geocoding an atom feed, is by adding a <georss:point>[latitude] [longitude]</georss:point> field for each entry. However, the blogengine I’m using currently doesn’t support that, so I will also try and find the location using a magic string in the blogpost, in this case “Location: [latitude] [longitude]”. You will notice that the posts I made so far, all have this at the end of the post. This might not be the most elegant solution to geocoding your blogposts, but it should work for any type of blog.

So the first step is to create a WebRequest that will download the feed (this could be simplified by using the WebClient, but I like the full control of the WebRequest):

System.Net.WebRequest request = System.Net.WebRequest.Create(
     new Uri("http://socaldaily.sharpgis.net/syndication.axd?format=atom", UriKind.Absolute));
request.BeginGetResponse(new AsyncCallback(createRssRequest), new object[] { request, this });

Begin get response handler:

private static void createRssRequest(IAsyncResult asyncRes)
{
      object[] state = (object[])asyncRes.AsyncState;
      System.Net.HttpWebRequest httpRequest = (System.Net.HttpWebRequest)state[0];
      Page page = (Page)state[1];
 
      if (!httpRequest.HaveResponse) { return; }
 
      System.Net.HttpWebResponse httpResponse = (System.Net.HttpWebResponse)httpRequest.EndGetResponse(asyncRes);
      if (httpResponse.StatusCode != System.Net.HttpStatusCode.OK) { return; }
      Stream stream = httpResponse.GetResponseStream();
      page.Dispatcher.BeginInvoke(() => { page.ParseAtomUsingLinq(stream); });
}

When the request comes back, it will call our ParseAtomUsingLinq method with the response stream. The LINQ expression will select basic parameters like title, date, link, contents and if available, the georss location point.

private void ParseAtomUsingLinq(System.IO.Stream stream)
{
    System.Xml.Linq.XDocument feedXML = System.Xml.Linq.XDocument.Load(stream);
    System.Xml.Linq.XNamespace xmlns = "http://www.w3.org/2005/Atom"; //Atom namespace
    System.Xml.Linq.XNamespace georssns = "http://www.georss.org/georss"; //GeoRSS Namespace
 
    //Use LINQ to select all entries
    var posts = from item in feedXML.Descendants(xmlns + "entry")
                select new 
                {
                    Title = item.Element(xmlns + "title").Value,
                    Published = DateTime.Parse(item.Element(xmlns + "updated").Value),
                    Url = item.Element(xmlns + "link").Attribute("href").Value,
                    Description = item.Element(xmlns + "summary").Value,
                    Location = fromGeoRssPoint(item.Element(georssns + "point"))  //Simple GeoRSS <georss:point>X Y</georss.point>
                };
     foreach (var post in postsOrdered)
     {
          ESRI.ArcGIS.Geometry.MapPoint point = null;
          if (post.Location != null)
          {
               point = post.Location; //Use GeoRSS location
          }
          else
          {
               point = ExtractLocation(post.Description); //Search for location in blog content
          }
          if (point == null) continue; //We didn't find a point
          string imageSrc = ExtractImageSource(post.Description, post.Url); //try and find an image to use for symbol
          //TODO: Add points to map...
     }
}

You will notice in the above code we use three utility methods for extracting data. First we have a method that converts the simple GeoRSS format “<georss:point>X Y</georss.point>” to a point. In this case I’m using the MapPoint class from the ESRI ArcGIS Silverlight API, since I wan’t to use that to draw my entries on the map.

private ESRI.ArcGIS.Geometry.MapPoint fromGeoRssPoint(System.Xml.Linq.XElement elm)
{
    if (elm == null) return null;
    string val = elm.Value;
    string[] vals = val.Split(new char[] { ' ' });
    if (vals.Length != 2) return null;
    double x = double.NaN;
    double y = double.NaN;
    if (double.TryParse(vals[1], out x) && double.TryParse(vals[0], out y))
    {
        return new ESRI.ArcGIS.Geometry.MapPoint(x, y);
    }
    return null;
}

If this doesn’t return any results (usually if the feed is not georss enabled), we will during the loop look for the location tag in the entry content, using the following helper method:

private ESRI.ArcGIS.Geometry.MapPoint ExtractLocation(string description)
{    
    int idx = description.LastIndexOf("Location: ");
    int idx2 = description.LastIndexOf("</p>");
    double x = double.NaN;
    double y = double.NaN;
    if (idx < idx2 && idx > -1)
    {
        string sub = description.Substring(idx, idx2 - idx);
        string[] vals = sub.Split(new char[] { ' ' });
        foreach (string val in vals)
        {
            if (val[0] == 'N')
            {
                double.TryParse(val.Substring(1), out y);
            }
            else if (val[0] == 'S')
            {
                if (double.TryParse(val.Substring(1), out y)) y *= -1;
            }
            else if (val[0] == 'E')
            {
                double.TryParse(val.Substring(1), out x);
            }
            else if (val[0] == 'W')
            {
                if (double.TryParse(val.Substring(1), out x)) x *= -1;
            }
        }
        if (!double.IsNaN(x) && !double.IsNaN(y))
        {
            return new ESRI.ArcGIS.Geometry.MapPoint(x, y);
        }
    }
    return null;
}

Lastly, we will search for the first <img src=”…”/> entry in the body and use that for displaying the entries on the map:

private string ExtractImageSource(string description, string feedlink)
{
    int idxSrc = description.IndexOf("<img ");
    if (idxSrc >= 0)
    {
        int idxSrc2 = description.Substring(idxSrc).IndexOf("src=\"");
        int idxSrc3 = description.Substring(idxSrc + idxSrc2 + 5).IndexOf("\"");
        if (idxSrc2 >= 0 && idxSrc3 >= 0)
        {
            string src = description.Substring(idxSrc + idxSrc2 + 5, idxSrc3);
            Uri uri = new Uri(src, UriKind.RelativeOrAbsolute);
            if (uri.IsAbsoluteUri) return uri.AbsoluteUri;
            Uri page = new Uri(feedlink, UriKind.Absolute);
            return new UriBuilder(page.Scheme, page.Host, page.Port, src).Uri.AbsoluteUri;
        }
    }
    return null;
}

In our feed entries loop, we can now simply construct a graphic, set a few attributes that we will use for binding the look for the symbol (ie title and image) and add it to our graphics layer. The symbol I use here is the same as used for the flickr application mentioned above.

ESRI.ArcGIS.Graphic g = new ESRI.ArcGIS.Graphic()
{
    Geometry = point,
    Symbol = myFeedSymbol
};
g.Attributes.Add("Title", title);
g.Attributes.Add("ImageURI", src);
g.Attributes.Add("WebURI", link);
myGraphicsLayer.Graphics.Add(g);

You can view the blog map in action here: http://socaldaily.sharpgis.net/page/Photo-map.aspx

Note that there are several ways to geocode blogposts, and this approach only deals with the simplest version.