Archive for the ‘OGC’ Category

Representing an XML qualified name as a string

May 31st, 2011 1 comment

I am working on a project where we need to store qualified XML names (QNames, i.e. a namespace plus a local name) as strings outside of an XML document. This includes QNames from any third-party namespace that a user of our package wants to include. So I set out to find the standard way of doing this, one that would give other apps the best chance of parsing the string back into a QName, especially for QNames that already have a somewhat widely used string representation. We are storing meta-data about “things” (documents, sensor recordings, you name it), so I paid particular attention to popular schemas in the semantic web space. Should we use ns:name, ns/name, ns#name, or something else? After spending way too much time on this, here is what I found:

  • There is no official standard. A qualified name is officially defined as two strings – the namespace and the local name. Oh, great.
  • One of the first papers on this by James Clark says {namespace}local is proper. This is what javax.xml.namespace.QName.toString produces, and the QName.valueOf method will parse that format. This form is also what the Groovy QName class uses but, interestingly, the equals method of that class will also accept a string that uses a colon delimiter.
  • http://docstore.mik.ua/orelly/xml/xmlnut/ch04_02.htm talks of both {namespace}local and namespace#local
  • http://www.rpbourret.com/xml/NamespacesFAQ.htm#names_15 has great detail on namespaces overall. It talks of {namespace}local and another form, namespace^local, which is what SAX filter uses, according to the page. I found no other examples or mention of this “caret” format.
  • javax.xml.soap.Name uses namespace:local. Apache Axis does the same thing, which is not surprising since I believe one came from the other.
  • ECMAScript for XML (and, thus, Adobe ActionScript) uses 2 colons – namespace::local. This is partly because it uses the two colons as an operator of sorts, and needed to separate it from other uses of a colon in the ECMAScript syntax.
  • Dublin Core (DC) explicitly defines the URIs of the terms in its schema. It uses the path divider ‘/’ as the delimiter between namespace and local name. Of note, if you try to put one of those URIs into a web browser as a URL, it will redirect to a page which uses ‘#’ to mark the fragment in an RDF schema. For example, http://purl.org/dc/terms/ will resolve to http://dublincore.org/2010/10/11/dcterms.rdf#name. I didn’t find any other schema/taxonomy that explicitly defines the URI for each element.
  • Regardless of the above behaviour, the Dublin Core XSD defines the namespace to include the ending ‘/’.
  • The namespaces of the RDF and OWL specifications include an ending ‘#’.
  • All namespaces included in the output from pingthesemanticweb, which lists the most popular semantic schemas, end in ‘/’ or ‘#’. Even the few that use urn format end in ‘#’ (e.g. urn:x-inspire:specification:gmlas:HydroPhysicalWaters:3.0#).
  • The Department of Defense Discovery Metadata Specification (DDMS) namespace, based heavily on Dublin Core, includes the ending ‘/’ just as DC does.
  • I could not find any namespaces that end in ‘}’, ‘^’, or ‘:’ (the first two of which are illegal in a namespace, I think).
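
As a quick illustration of the {namespace}local form mentioned in the list above, Python’s standard xml.etree.ElementTree module happens to use the same braces notation for element tags. This is only a sketch: the Dublin Core namespace comes from the list above, while the local name title and the tiny sample document are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

DC_TERMS = "http://purl.org/dc/terms/"  # namespace from the list above

# Build the braces form from its two parts ("title" is an assumed local name).
clark = ET.QName(DC_TERMS, "title").text

# A parsed document reports its element tags in the same form.
doc = ET.fromstring('<dct:title xmlns:dct="http://purl.org/dc/terms/">x</dct:title>')
same_form = doc.tag == clark

# Reversing it is unambiguous: drop the opening brace, split at the closing one.
namespace, local = clark[1:].split("}", 1)
print(clark, namespace, local)
```

The point of the braces is that neither ‘{’ nor ‘}’ can be confused with a character that belongs to the namespace, so no guessing is needed when splitting.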

  • So, you might be thinking that we could just concatenate the namespace and local name together to form the string. To parse it, we could then split the string at the last occurrence of the delimiter character, keeping the delimiter as part of the namespace if it is a ‘/’ or a ‘#’. But wait! There’s more…

  • Many non-semantic-web schemas, like the XML Schema itself, xlink, and the OGC standards like gml, do not include the ending delimiter in their namespaces.
  • National Information Exchange Model (NIEM) namespaces, arguably somewhat-semantic, also do not include a trailing delimiter.
  • Neither does the Intelligence Community Metadata Standard for Information Security Marking (IC-ISM) namespace (which is in urn format).
  • Nor does the DOD core metadata OWL schema, at least as far as I can tell. Sorry, I couldn’t find an exact reference to that one.
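
To make the problem concrete, here is a minimal sketch of the naive concatenate-and-split idea described above. The function name naive_split is hypothetical, and joining the XML Schema namespace to a local name with ‘#’ is an assumption made purely to show the failure.

```python
def naive_split(text):
    """Split at the last '/' or '#', keeping the delimiter in the namespace."""
    cut = max(text.rfind("/"), text.rfind("#"))
    if cut < 0:
        raise ValueError("no '/' or '#' delimiter in %r" % text)
    return text[:cut + 1], text[cut + 1:]


# Works when the namespace really ends in the delimiter (Dublin Core style).
dc = naive_split("http://purl.org/dc/terms/" + "title")

# The XML Schema namespace has no trailing delimiter, so a joiner is needed.
# After the round trip the namespace has gained a '#' it never had.
xsd = naive_split("http://www.w3.org/2001/XMLSchema" + "#" + "string")

print(dc)
print(xsd)
```

The second result comes back with the namespace http://www.w3.org/2001/XMLSchema# rather than http://www.w3.org/2001/XMLSchema, which is exactly the ambiguity that the bullets above run into.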

Resolution Rules

So if you want to represent a particular qualified name as a string, in a way that others are most likely to recognize as the “accepted” representation of that particular QName, and you want it to be reversible (at least within your own app), the best rules I could come up with are:

Creating the String

Call the path divider ‘/’ and fragment ‘#’ symbols sticky delimiters because they may be a part of (i.e. stick to) a namespace. Call the other possibilities (‘:’, ‘::’, ‘}’, ‘^’) formal delimiters because you know they only serve the purpose of being a delimiter.

  1. If the namespace ends in a delimiter of any form, simply append the local name directly to it.
  2. Else, use ‘:’, ‘^’ or, to be totally safe, surround the namespace string with ‘{}’ and then append the local name. I chose ‘:’ because I at least saw some uses of that form on various pages while I never saw any uses of the caret ‘^’ or the surrounding ‘{}’. If you have total control of your input and output, use the surrounding braces format since it is totally unambiguous.
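
The two creation rules above could be sketched roughly as follows. This is not my project code, just a minimal illustration; the function name qname_to_string and its braces flag are hypothetical.

```python
STICKY = ("/", "#")       # may be part of the namespace itself
FORMAL = (":", "}", "^")  # only ever delimiters ('::' also ends in ':')


def qname_to_string(namespace, local, braces=False):
    """Join a namespace and local name following the two rules above."""
    if braces:
        # Unambiguous form, for when you control both input and output.
        return "{%s}%s" % (namespace, local)
    if namespace.endswith(STICKY + FORMAL):
        # Rule 1: the namespace already ends in a delimiter.
        return namespace + local
    # Rule 2: no trailing delimiter, so join with ':'.
    return namespace + ":" + local


print(qname_to_string("http://purl.org/dc/terms/", "title"))
print(qname_to_string("http://www.w3.org/2001/XMLSchema", "string"))
print(qname_to_string("http://www.w3.org/2001/XMLSchema", "string", braces=True))
```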

Parsing the String

  1. If there is a ‘{}’ pair, you can assume the form is {namespace}local.
  2. Else, find the last possible delimiter in the string. If it is a “formal” delimiter, then drop the delimiter and make the namespace the chars before it and local name the chars after it.
  3. Else, if the last delimiter is “sticky”, you have to guess whether to keep it in the namespace. I put some basic logic in my code to recognize well known namespaces (like those above) that do not end in a delimiter, but then otherwise assume that a sticky delimiter should be included in the namespace.
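
A rough sketch of the three parsing rules above might look like this. The function name parse_qname and the KNOWN_BARE_NAMESPACES set are hypothetical; the three namespaces listed are only a sample of well-known ones without a trailing delimiter, not the list from my code.

```python
# Sample of well-known namespaces that do NOT end in a delimiter (assumed list).
KNOWN_BARE_NAMESPACES = {
    "http://www.w3.org/2001/XMLSchema",
    "http://www.w3.org/1999/xlink",
    "http://www.opengis.net/gml",
}


def parse_qname(text):
    """Return (namespace, local) for a string built with the rules above."""
    # Rule 1: braces form, {namespace}local.
    if text.startswith("{") and "}" in text:
        namespace, local = text[1:].split("}", 1)
        return namespace, local

    # Rule 2: find the last possible delimiter of any kind.
    cut = max(text.rfind(ch) for ch in "/#:^")
    if cut < 0:
        return "", text  # no delimiter, so no namespace
    delimiter, local = text[cut], text[cut + 1:]

    if delimiter in ":^":
        # Formal delimiter: drop it (and its twin, for the '::' form).
        if delimiter == ":" and cut > 0 and text[cut - 1] == ":":
            return text[:cut - 1], local
        return text[:cut], local

    # Rule 3: sticky delimiter. Drop it only for known bare namespaces,
    # otherwise guess that it belongs to the namespace.
    if text[:cut] in KNOWN_BARE_NAMESPACES:
        return text[:cut], local
    return text[:cut + 1], local


print(parse_qname("http://purl.org/dc/terms/title"))
print(parse_qname("http://www.w3.org/2001/XMLSchema:string"))
print(parse_qname("http://www.w3.org/2001/XMLSchema#string"))
```

Rule 3 is still a guess: a namespace that is missing from the known list and has no trailing delimiter will come back with the ‘/’ or ‘#’ attached.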

It’s not a perfect solution, but that’s what you get when there is no standard.

Categories: groovy, OGC, semantic web, XML

Newbie Notes for GIS Web Development

July 7th, 2009 1 comment

I have spent some time over the past couple of weeks getting to know the Open Source GIS arena, from spatial databases (PostGIS) to server software (GeoServer) to web display techs (OpenLayers). When I started, I decided to map my trail as I explored this space since I will have co-workers following along behind me as new projects in our group ramp up. Hopefully, these breadcrumbs will help other newbies in this area as well.

begin tangent
Have you ever felt that sometimes the worst people to write documentation for something are actually those that know the most about it? Once you are an expert in something, it’s really hard to consciously remember all the questions that came up and problems you had to solve and research you had to do to reach nirvana. That’s why with this project I tried to jot down questions that I had as I went, and kept the question list even after I figured out the answers. It helped me not forget what I didn’t know. I’ve used this technique before, and strongly encourage new hires who come onto my projects to do the same so we can fill in the holes in our developer documentation. Hmmm… sounds like another blog entry in there…
end tangent

This first entry is about the “what’s what” of GIS and the “who’s who” in the Open Source area. It is not intended to provide all the information you need to work on a GIS application. It is intended to tell you where to get that information. It also covers a few questions and misconceptions I had as I started down this path.

What is GIS?

First, to get your head around the GIS terms and concepts, read this short overview on GIS concepts from developerworks. One thing the article does is define what a layer is and relate that to the term feature. What it leaves out is the term FeatureType. A FeatureType defines a type of data, listing the attributes that go along with it, such as name, shape, and other meta-data (e.g. population for cities, or road type (highway, secondary, dirt, etc.) for a set of roads). The layer concept is a way of visualizing a bunch of features of the same FeatureType. For example, in GoogleEarth, you don’t think about turning on and off the road FeatureType or the city FeatureType. You think about showing/hiding those layers. Some software (like GeoServer) uses the term FeatureType in places where we more naturally think in layers. Just get used to moving back and forth between the two terms.
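
If it helps to see the relationship between the three terms as data structures, here is a small sketch. All of the class and attribute names are hypothetical and are not taken from GeoServer or any other library.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureType:
    """Describes a kind of data: its name and the attributes it carries."""
    name: str
    attributes: dict  # attribute name -> Python type


@dataclass
class Feature:
    """One concrete thing of a given FeatureType."""
    feature_type: FeatureType
    values: dict


@dataclass
class Layer:
    """A showable/hideable collection of features of a single FeatureType."""
    feature_type: FeatureType
    features: list = field(default_factory=list)
    visible: bool = True

    def add(self, feature):
        if feature.feature_type is not self.feature_type:
            raise ValueError("a layer holds features of a single FeatureType")
        self.features.append(feature)


city = FeatureType("city", {"name": str, "shape": str, "population": int})
cities = Layer(city)
cities.add(Feature(city, {"name": "Springfield", "shape": "POINT(0 0)", "population": 30000}))
cities.visible = False  # hiding the layer hides every feature in it
```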

What’s a Map?

Something that is missing from all the literature is a strict definition of what a map is. Tutorials describe layers, FeatureTypes, and features. Standards define ways of storing geographic information, retrieving that information, applying styles to it, and rendering it. But I couldn’t find anything that strictly defines what a map is. What this means is that each application or library has its own concept of what comprises a map, or it may not have a unified concept of a map at all. OpenLayers (the most popular way of rendering maps in web pages) is based around the concept of a map. Its map has layers of data that can be shown/hidden independently. The map also has tools for zooming, paging, measuring, highlighting, and possibly even editing the data. GeoServer (an Open Source map server), on the other hand, is more focused on individual data sets (i.e. independent layers or FeatureTypes). How those sets of data are combined into a single, visual display that we would call a map is up to the consumer of the data (such as a web app using OpenLayers or a desktop app like uDig).

If you are looking for the least common denominator for the concept of a map, think of a display of layered geospatial information with one or more “base layers” comprised of static (or nearly so) data (e.g. geographic features, political boundaries, rivers, roads, cities, etc) and zero or more “live layers” comprised of data that can change with relatively high frequency (e.g. weather images, traffic patterns, earthquake epicenters, recent Elvis sightings, etc).

Who’s Who

To learn about the Open Source standards and tools in the GIS space, flip through Scott Davis’ GIS for Web Developers presentation while you listen to his GIS podcast. He has a lot of other great content on his mapmap site. If you like his presentation style, pick up his GIS for web developers book. It comes in a PDF format for instant gratification.

Now that you’ve been exposed to some of the concepts and heard mention of the major players and most popular apps, you can read what Wikipedia has to say about them. As usual, the Wikipedia pages have links to the organizations’ sites as well as important related concepts.

  • Open Geospatial Consortium (OGC) – the first thing to know about the OGC is that they publish the WMS and WFS standards. They have many other standards as well, but those two are the primary protocols by which you will get data from a map server (like GeoServer) to a UI (like OpenLayers).
  • ESRI ArcGIS – ESRI is the 800-pound gorilla in the GIS space. It is sort of like the Oracle of GIS. It has its own commercial, proprietary software suite called ArcGIS. If you have heard of Shape files, this is the company that invented that format. It’s not open source, but it’s good to know who they are. In my situation, I have existing systems that feed into ArcGIS layers, so I have to work with it as well as with other data via the OGC standards.
  • GeoServer – Highly extensible, open source WMS/WFS server. A good choice to keep in mind if you want to run a server that is a single source for both “base layer” data and your app-specific data. Something that was critical for my project is that it can pull data from an ESRI ArcGIS server as well as other sources like PostGIS or raw images. Its online user manual contains some good sections on basic concepts for serving and formatting geo data over HTTP, including Styled Layer Descriptor (SLD), WMS, and WFS.
  • PostGIS – Geo-spatial extensions to PostgreSQL. This is the most popular (and powerful) Open Source geo-enabled DB. MySQL also has geo-extensions, as do MS SQL Server and Oracle (called Oracle Spatial). BostonGis has a great tutorial on installing PostGIS and the basics for using it.
  • OpenLayers – JavaScript library for displaying maps and mapping tools. If you see a map on a web site and it isn’t an embedded GoogleMap or MapQuest map, it’s probably being rendered by OpenLayers.
  • Open Source Geospatial Foundation (OSGeo) – not to be confused with the OGC, above. The OGC is a standards organization. OSGeo is a non-profit that supports open-source geo software projects and related initiatives. They support web mapping, desktop apps, geospatial libraries, and other types of projects, including GeoTools and OpenLayers.

If you thought that keeping OGC and OSGeo straight was confusing, just wait! There’s one more. OpenGeo is a company (sorry, a “social enterprise”) that integrates the most popular Open Source GIS technologies (like those listed above) into a single, supported stack or application framework. I have no experience with them, but if you need to get a GIS app up and running quickly, they sound like good people to call. I am sure there are other such organizations out there that can help write your software or train your dev team. I just now found that Scott Davis’ ThirstyHead company is offering a 3-day GIS training course.

GIS Blogs

There are probably 100 GIS-oriented blogs. Start with planetgs, which is an aggregator for many others. If you find that articles coming from a particular source are good, you can follow it directly. I happen to like Fuzzy Tolerance for its good content on GIS and Open Source web development in general and its concise monthly roundups.

Where to Get Data

If you want to get “base layer” data (geographic, political, structural, etc) to display underneath your app-specific data, browse through these sites:


Geocoding

If you have address data or other geographic text and want to find out how to plot it on a map, there are a few free geocoding services. geocoder.us is a good starting place for testing your app if you just have data in the US. There are sister services for other countries. Google also has a geocoding service, but the license requirement says you have to use the data for display on a Google map. (At least, it does in one place. In another place, it just says display on a map, without specifically stating Google map.)

Getting Data Via WMS versus WFS

Before going into more detail about some of the above apps and libraries, I want to clear up something that confused me at first. How data is stored (vectors or rasters) is a separate concept from how it is distributed and displayed. When you request data from a WMS service, you will get back an image. It doesn’t matter if the data is stored as a jpeg or tiff or as a Shape file or a set of XML files or it is in a database table. Whatever the source, the map server converts that data into an image using some standard styling rules and sends that image over the wire. On the other hand, if you request the same data via WFS, you will get some form of data list, usually in an XML format known as GML. How that data is then transformed into some visual display is up to the client.
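
To make the difference concrete, here is a sketch of how the two kinds of request might be built for the same layer. The host, the /wms and /wfs endpoint paths, the layer name topp:states, and the bounding box are all assumptions for illustration; the query parameters follow the WMS 1.1.1 GetMap and WFS 1.0.0 GetFeature key-value conventions.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8080/geoserver"  # assumed local map server
LAYER = "topp:states"                     # assumed layer / FeatureType name


def wms_getmap_url(base, layer, bbox, width, height):
    """WMS GetMap: the server renders the data and returns an image."""
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty means the server's default style
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "width": width,
        "height": height,
        "format": "image/png",
    }
    return base + "/wms?" + urlencode(params)


def wfs_getfeature_url(base, type_name, max_features=10):
    """WFS GetFeature: the server returns the feature data itself (GML)."""
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": max_features,
    }
    return base + "/wfs?" + urlencode(params)


wms_url = wms_getmap_url(BASE, LAYER, (-125, 24, -66, 50), 600, 300)
wfs_url = wfs_getfeature_url(BASE, LAYER)
print(wms_url)
print(wfs_url)
```

Notice that only the WMS request says anything about presentation (style, image size, image format). The WFS request just names the data it wants and leaves the drawing to the client.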

The difference between WMS and WFS has an impact on how you can combine data from different sources in the same web-based map when using OpenLayers. WFS layers can be subject to the browser’s cross-domain scripting limitation (the same-origin policy). But that is the basis for another post.

Reference Apps

Finally, here are some apps that let you drool over the possibilities of what GIS tools can do for you.

Have your own cool web-based GIS app? Post a comment and I’ll add it in.

Next up… A few notes on setting up datasources in GeoServer.

Categories: GeoServer, GIS, OGC, OpenLayers, PostGIS, WFS, WMS Tags: