Towards metadata for KML

<meta name="description" content="A long-form piece that tries to zoom out a bit for a better big picture. You will either know everything in it already, you won’t care, or it will be fundamentally flawed in some way. Please choose one.">

It’s always fun to show off Google Earth to a first-time user while surreptitiously observing their reaction; it really is an impressive application. Don’t forget that all this immersive goodness is just a lure, however; it’s meant to draw us in until we find ourselves using its most innovative feature: The network link.

Network links allow anyone to dynamically deliver content to Google Earth. The markup language used for this is KML, a form of XML that can describe objects as simple as a single labelled coordinate pair, or as complex as collections of points, paths, 3D shapes, views, map overlays and icons, all with coordinates, descriptions and links to further content elsewhere on the web. Google Earth renders the KML delivered by these network links onto the surface of the globe, just as it does with most of its own pre-configured layers.

You can thus think of Google Earth as a geospatially aware browser. The network links are “websites” or “feeds”, except that individual content items are wedded to specific relevant places on the globe. Just as with websites, this content can be static — uploaded once, and updated infrequently — or dynamic — uploaded at regular intervals in anticipation of new data, or even with content that depends on where on Earth your view lingers. If you see something worthwhile amid all this delivered content, you can save it by dragging the item to your own list of static content (or you can make your own content).

A traditional web browser renders and displays only one HTML document at a time. Trying to layer several unrelated pages on top of one another would result in a cacophony of content. Google Earth, however, lets you turn on as many KML items as your computer can handle, and often the result is greater than the sum of its parts. This makes Google Earth far more powerful than a traditional web browser.

There already are dozens, if not hundreds, of network links available for download. Weather overlays, sports competitions, blogs, vlogs, Flickr photos, live Space Shuttle positions, real estate listings, 911 calls, live traffic, newsfeeds… The ability to simultaneously view subsets of these layers makes the possibilities for serendipity huge, though at the price of complexity. As the number of available network links grows, simply listing them will be found wanting as a means of organizing such a sprawling data set. Folders aren’t a feasible solution in the long run either — they force sorting by one category only, and that category usually becomes subject matter.

Enter metatags.

A taxonomy for HTML documents is already well established, and is encoded as <meta> tags in page headers. I think KML objects in general, and network links in particular, will soon benefit from having a similar taxonomy, so that it becomes easier to select collections of items based on their common properties, rather than pecking away ad hoc at checkmark boxes every time we want to view a new configuration.

Some of the meta-information used for traditional HTML websites would already be of immediate value: Author, language and keywords, for example. But network links and their component objects have other properties that would be well worth encoding in metatags. For example:

Scope: Local, region, country, continent, global…

At what scale does the object affect the view in Google Earth?

Media: Text, video, photos, sound…

For placemarks that reference other sites on the web, what kind of media is being referenced?

Persistence: Years, months, days, seconds…

How long, once updated, can the content be expected to be accurate (though perhaps not complete)? A layer of hospital locations has high persistence; a layer for live Shuttle positions has low persistence.

Recommended Mode: On view, once, periodically, manually…

What does the author recommend as the best way to refresh the data in a network link? This is different from the <viewRefreshMode> and <viewRefreshTime> KML elements, because the settings the user ends up using are not necessarily the settings the author recommends.

Recommended Interval: yearly, monthly, weekly, daily, hourly, x seconds…

With what frequency does the author envisage updating the data in her network link? This is different from the <refreshInterval> KML element because the user might want to check for new content more frequently than the likely content update frequency.

AuthorType: man, nature, mixed…

Mixed because some layers contain objects that refer to both kinds of data.

Dimension: events, objects, flow…

This one is a bit more abstract. I actually had a go at classifying KML objects more fully along these lines a few weeks ago. This metatag is needed because KML can describe very distinct kinds of data. A map overlay showing wind speeds is a fundamentally different category of data from a path describing a Tour de France stage, or a model of the Golden Gate Bridge. These data sets all exploit the available degrees of freedom in different ways, and are classifiable accordingly. (Again, check out my earlier post.)

What does adding such metadata to content allow? Well, for example, you could nix the viewing of low-persistence layers when you’re off-line. You could choose to focus on just man-made events, like concert tour locations for subscribed bands, or on just natural events, like earthquakes and volcano eruptions. Turning on just text-based links with low persistence would give you all your newsfeeds. Choosing text-based links with high persistence would steer you to Wikipedia articles.
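To make that kind of selection concrete, here is a minimal Python sketch. It is an illustration only: the `LayerMeta` record, the `select` helper, the sample layers and the split of persistence values into "low" and "high" are all assumptions made up for this example, not anything Google Earth or KML provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerMeta:
    """Hypothetical metadata record for one network link (not a KML type)."""
    name: str
    media: str        # "text", "video", "photos", "sound"
    persistence: str  # "seconds", "days", "months", "years"
    author_type: str  # "man", "nature", "mixed"
    dimension: str    # "events", "objects", "flow"


# Assumed grouping of persistence values; where to draw the line is a judgment call.
LOW_PERSISTENCE = {"seconds", "days"}
HIGH_PERSISTENCE = {"months", "years"}


def select(layers, **wanted):
    """Return the layers matching every keyword; a set value means 'any of these'."""
    def matches(layer):
        for key, value in wanted.items():
            allowed = value if isinstance(value, (set, frozenset)) else {value}
            if getattr(layer, key) not in allowed:
                return False
        return True

    return [layer for layer in layers if matches(layer)]


# Made-up sample layers, loosely based on the examples in the text.
layers = [
    LayerMeta("Newsfeed", "text", "days", "man", "events"),
    LayerMeta("Wikipedia articles", "text", "years", "man", "objects"),
    LayerMeta("Earthquakes", "text", "days", "nature", "events"),
    LayerMeta("Shuttle position", "text", "seconds", "man", "flow"),
    LayerMeta("Hospitals", "text", "years", "man", "objects"),
]

# Off-line: keep only the layers that stay accurate for a long time.
offline_safe = select(layers, persistence=HIGH_PERSISTENCE)

# Natural events only.
natural_events = select(layers, author_type="nature", dimension="events")

# Text-based links with low persistence.
fresh_text = select(layers, media="text", persistence=LOW_PERSISTENCE)
```

Each query is just a conjunction of tag values, which is the point: once layers carry tags like these, a "configuration" becomes a short filter rather than a long session of ticking boxes.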

One final thing. Where and how would this metadata be injected into KML? This is where I betray my shallow knowledge of markup languages. Metadata could be added as comments, but that feels so wrong. There is, however, the schema element, which seems to allow you to set up key/value pairs like the ones above. I’m just not sure that’s what this element was intended for, and it also doesn’t constrain the allowed values. So perhaps this is something that needs official work for KML 3.0.
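For illustration, here is a rough Python sketch of what reading such key/value pairs out of a network link might look like, including the value check that plain key/value pairs lack. It assumes the pairs are carried in an <ExtendedData> block of <Data name="…"><value>…</value></Data> elements, a construct found in KML 2.2; whether that is the right home for this taxonomy is an open question. The key names, the `ALLOWED` table, the `read_metadata` helper and the sample document (including its example.com URL) are all hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical vocabulary, taken from the tag values proposed in the text.
ALLOWED = {
    "scope": {"local", "region", "country", "continent", "global"},
    "media": {"text", "video", "photos", "sound"},
    "persistence": {"years", "months", "days", "seconds"},
}

# Made-up sample document; the URL is a placeholder.
SAMPLE = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <NetworkLink>
    <name>Live traffic</name>
    <ExtendedData>
      <Data name="scope"><value>local</value></Data>
      <Data name="media"><value>text</value></Data>
      <Data name="persistence"><value>seconds</value></Data>
    </ExtendedData>
    <Link><href>http://example.com/traffic.kml</href></Link>
  </NetworkLink>
</kml>"""


def read_metadata(kml_text):
    """Map each network link's name to its key/value metadata, rejecting unknown values."""
    root = ET.fromstring(kml_text)
    result = {}
    for link in root.iter("{http://www.opengis.net/kml/2.2}NetworkLink"):
        name = link.findtext("kml:name", default="", namespaces=KML_NS)
        meta = {}
        for data in link.findall("kml:ExtendedData/kml:Data", KML_NS):
            key = data.get("name")
            value = data.findtext("kml:value", default="", namespaces=KML_NS).strip()
            # This check is the part a bare key/value mechanism does not do for us.
            if key in ALLOWED and value not in ALLOWED[key]:
                raise ValueError(f"{name}: {value!r} is not a valid {key}")
            meta[key] = value
        result[name] = meta
    return result


metadata = read_metadata(SAMPLE)
```

Validation lives in the reading code here only because the markup itself does not enforce it; an official vocabulary would let the format carry that constraint instead.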

PS. Was that long enough? No? I just saw James Fee is all over metadata, and he points to interesting examples of existing metadata for other geospatial file formats.