Introduction
When setting up a Web-based journey planner that takes individual user needs into account, developers are confronted with the daunting prospect of integrating heterogeneous datasets. Industry players offer an alternative to this complexity by providing services on top of datasets that they centralize in one location. They are able to cover roughly 80% of worldwide needs and build a business around that. When trying to cater to the remaining 20%, industrial route planners are quickly confronted with diminishing returns on integrating more datasets. Take for example the use cases of (i) people who own a foldable bike and can take it on public transport, (ii) companies trying to find an optimal route for their delivery service where some vehicles cannot pass through low emission zones, (iii) routes based on the real-time state of traffic control systems and the probability of hitting a green light, or (iv) people with special constraints or disabilities. For each of these four use cases, extra datasets are needed to calculate the end-user-specific routing graphs. Such datasets are published by different organizations, which often publish them openly because of strategic goals or legal mandates. Every route planner will somehow need to compile its own sources, over which it can execute its own route planning algorithm.
In this paper, we aim to automate data adoption in route planners. As a first step, we introduce Routable Tiles, a hypermedia specification for geospatial road networks. Following this specification, we republished all the roads in OpenStreetMap. In the next section we will see that the ideas behind Routable Tiles are not novel in themselves. The contribution of this paper lies in applying the geospatial indexing idea from the database world to Web APIs, introducing an ontology for describing geospatial hypermedia controls, and launching this worldwide dataset as a resource for the Semantic Web community.
Related work
Geospatial data on the Web
Spatial Data on the Web [1] is a W3C and OGC collaboration to create a list of best practices for spatial data on the Web. It takes a strong position in favor of HTTP URIs to identify resources. The rationale is that Linked Data principles, such as the use of HTTP URIs as global identifiers, raise the interoperability of geospatial datasets by providing a common set of semantics that can be reused by data publishers.
Slippy maps are maps, often embedded in web pages, on which the user can pan around. In order to reduce server load, the client is preconfigured with a URL template pointing to the web server that hosts the image tiles. When the map is loaded, the client can calculate all the URLs it needs. Vector tiles reuse this idea but publish the raw data behind the tiles instead of raster images. The maps can then be rendered on the client side. This gives clear benefits over raster images: (i) the styling of the maps can be done by UI developers, who can use CSS and scripts to style the road elements, (ii) vector tiles can be smaller in size, as vectors are typically much smaller than a rendered bitmap, and (iii) all elements on the map can become interactive. Existing implementations include the Mapbox vector tiles and Open Map Tiles. Each has its own specificities and schemas (see https://docs.mapbox.com/vector-tiles/mapbox-streets-v8/ and https://openmaptiles.org/schema/) with a strong focus on rendering maps.
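The URL calculation that a slippy-map client performs can be sketched as follows. This is a minimal illustration that assumes the widely used Web Mercator z/x/y tiling scheme; the function names and the example.org URL template are hypothetical and are not taken from any of the implementations cited above.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the tile containing a WGS84 coordinate.

    Assumes the common Web Mercator z/x/y scheme with y growing southwards.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that coordinates on the outer edge still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_lonlat(x, y, zoom):
    """Inverse mapping: integer x, y give the north-west corner of a tile."""
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat


def tile_url(template, zoom, x, y):
    """Fill a preconfigured URL template (hypothetical placeholder names)."""
    return template.format(z=zoom, x=x, y=y)
```

For instance, tile_url("https://example.org/{z}/{x}/{y}.png", 14, *lonlat_to_tile(lon, lat, 14)) yields the address of the tile covering a given point, without the client ever asking the server which tiles exist.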
Valhalla, created by Mapzen and now hosted by the Linux Foundation, is the first open-source project that implements the idea of vector tiles for route planners. The technology proposes a tiling specification for storing routing information on disk. Tiling the data ensures that the server can be selective about the data that needs to be loaded into memory in order to execute an individual request. This tiling specification is, however, not used as an exchange format in Valhalla (although offline routing is an upcoming feature), and interoperability with other datasets is not a focus.
Linked Geo Data [2] is an initiative that maps OpenStreetMap data to Linked Data. It releases data dumps, subject pages and a SPARQL endpoint. Furthermore, it has its own mappings from the OpenStreetMap data model to a Linked Geo Data ontology.
Open Data publishing
Open Data can be published through various interfaces. In order to be able to query these interfaces, Comunica [3] was built. It is a Linked Data user agent that can run federated queries over several heterogeneous Web APIs, such as data dumps, SPARQL endpoints, Linked Data documents and Triple Pattern Fragments [4]. This engine has been developed to make it easy to plug in specific types of functionality as separate modules. Such modules can be added or removed depending on the configuration. As such, by looking for affordances in Web APIs, more intelligent user agents can be created. The preconditions for an engine like Comunica to work with a dataset are that the dataset supports a Linked Data representation and that the HTTP responses carry cross-origin resource sharing headers. The better the data is split into fragments, the more effectively caching can improve user-perceived performance.
For public transport systems, instead of publishing a dump of time schedules or a full-fledged route planner, Linked Connections [5] proposes a publishing mechanism that gives clients access to the data in time fragments. It uses departure-arrival pairs from one station to another (a connection), and orders these connections by departure time. It then fragments this dataset into documents that can be published over HTTP. Links in the responses ensure that a client can always find more information to take into account.
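The fragmentation idea described above can be sketched in a few lines. This is only an illustration and not the Linked Connections implementation: the Connection fields, the fixed ten-minute window and the "next" key are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


@dataclass(frozen=True)
class Connection:
    """A departure-arrival pair between two stops (hypothetical field names)."""
    departure_stop: str
    arrival_stop: str
    departure_time: datetime
    arrival_time: datetime


def fragment(connections, window=timedelta(minutes=10)):
    """Order connections by departure time and group them into time windows.

    Each returned page records its window start and the start of the next
    non-empty page, mimicking a link that a client could follow.
    """
    pages = {}
    for conn in sorted(connections, key=lambda c: c.departure_time):
        # Floor the departure time to the start of its window.
        index = (conn.departure_time - EPOCH) // window
        pages.setdefault(EPOCH + index * window, []).append(conn)
    starts = sorted(pages)
    return [
        {
            "start": start,
            "connections": pages[start],
            "next": starts[i + 1] if i + 1 < len(starts) else None,
        }
        for i, start in enumerate(starts)
    ]
```

A client that needs departures after a given time would fetch the page for that window and keep following "next" until it has enough connections for its algorithm.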
Implementation
Routable Tiles is a JSON-LD specification for which the working draft can be found at https://openplanner.team/specs/2018-11-routable-tiles.html. It has three main aspects: (i) a hypermedia specification that reuses Hydra Collections for describing a tile server, (ii) a way to describe OpenStreetMap’s nodes, ways and relations, and (iii) a mapping of the basic OpenStreetMap terms to an RDFS vocabulary.
The Linked Geo Data vocabulary has been unavailable since 2018 and has not been updated since 2015. We therefore introduced our own vocabulary, which takes a different approach: instead of mapping everything, we map only the bare minimum needed for the use case of route planning, and we keep the ontology as close as possible to the actual OSM data model. We added links to the appropriate Linked Geo Data classes (the ontology has yet to be published, awaiting a third-party implementation).
We define three main classes: osm:Way, osm:Relation and osm:Node. The property osm:members links a relation to an rdf:List of osm:Member items, and osm:role describes the function of each member in the relation. osm:restriction is used to model turn restrictions. The property osm:nodes links a way to an rdf:List of osm:Node items. In one page, multiple lists can be described. If a Way crosses a tile boundary, the other tile also mentions the border Node in one of its rdf:Lists.
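To illustrate how a client might turn such a page into a routing graph, the sketch below walks the node list of every way and connects consecutive nodes. The document shape is a simplified assumption: a compacted JSON-LD object with an "@graph" array, the rdf:List behind osm:nodes serialized as a plain array of node identifiers, and hypothetical "geo:lat" and "geo:long" keys for coordinates. All ways are treated as bidirectional, and relations and turn restrictions are ignored.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def build_graph(tile):
    """Build an undirected weighted adjacency map from one tile document.

    The keys "@graph", "geo:lat" and "geo:long" are assumptions about the
    serialization; "osm:nodes" follows the property named in the text.
    """
    coords = {}
    ways = []
    for item in tile.get("@graph", []):
        if item.get("@type") == "osm:Node":
            coords[item["@id"]] = (item["geo:lat"], item["geo:long"])
        elif item.get("@type") == "osm:Way":
            ways.append(item)

    graph = defaultdict(dict)
    for way in ways:
        nodes = way.get("osm:nodes", [])
        for a, b in zip(nodes, nodes[1:]):
            if a in coords and b in coords:  # skip nodes not described here
                dist = haversine_m(coords[a], coords[b])
                graph[a][b] = dist
                graph[b][a] = dist
    return graph
```

Because border nodes appear in both adjacent tiles, graphs built from neighbouring tiles can be merged by simply taking the union of their adjacency maps, keyed on the node identifiers.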
The mapping scripts and server interface can be found at https://github.com/openplannerteam/routeable-tiles. Every tile in OpenStreetMap at zoom level 14 is made available on the fly as JSON-LD. Special attention was given to the HTTP server, which includes a server-side cache and compresses the HTTP response with gzip. Furthermore, it sets both an ETag header and a Cache-Control max-age header for client-side cache control. Finally, it also allows web pages on other domains to request its data by setting the appropriate Cross-Origin Resource Sharing headers.
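Browsers apply these caching headers automatically. For a client outside the browser, the behaviour could look like the following sketch, which keeps a tile until its max-age expires and then revalidates it with If-None-Match. The TileCache class and the injected fetch callable, which returns a (status, headers, body) tuple, are hypothetical; header names are assumed to arrive with the capitalization shown.

```python
import time


def parse_max_age(cache_control):
    """Extract max-age in seconds from a Cache-Control value, or 0 if absent."""
    for directive in cache_control.split(","):
        name, _, arg = directive.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return int(arg)
    return 0


class TileCache:
    """In-memory cache honouring max-age and ETag revalidation (a sketch)."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch      # hypothetical: fetch(url, headers) -> tuple
        self._clock = clock
        self._entries = {}       # url -> (body, etag, expires_at)

    def get(self, url):
        entry = self._entries.get(url)
        now = self._clock()
        if entry is not None and now < entry[2]:
            return entry[0]      # still fresh: no request at all

        headers = {"Accept-Encoding": "gzip"}
        if entry is not None and entry[1]:
            headers["If-None-Match"] = entry[1]
        status, resp_headers, body = self._fetch(url, headers)

        if status == 304 and entry is not None:
            body = entry[0]      # unchanged on the server: reuse cached body
        elif status != 200:
            raise RuntimeError(f"unexpected status {status} for {url}")

        etag = resp_headers.get("ETag") or (entry[1] if entry else None)
        max_age = parse_max_age(resp_headers.get("Cache-Control", ""))
        self._entries[url] = (body, etag, now + max_age)
        return body
```

Injecting fetch and clock keeps the sketch independent of any particular HTTP library and makes the expiry logic easy to exercise in isolation.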
For the URIs of Ways and Nodes, we decided to reuse the subject pages provided by the OpenStreetMap project itself. We hope that at some point, OpenStreetMap decides to support a Linked Data representation on these URIs. For example: openstreetmap.org/node/366934331 and openstreetmap.org/way/242536619.
Demonstrator
A server instance is set up publishing data for the entire world. Entry points into the hypermedia API can be found through tiles.openplanner.team/planet/14/{x}/{y}/ (e.g., with x 8411 and y 5485). A live demonstrator using this data can be found at https://openplannerteam.github.io/leaflet-routable-tiles/.
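When a route approaches the edge of a tile, a client needs the adjacent tiles as well. The sketch below fills the URL pattern given above for a tile and its neighbours. It assumes the https scheme and the common z/x/y numbering in which x wraps around at the antimeridian and y does not; the zoom placeholder and the function names are hypothetical, and a hypermedia-driven client would presumably read the template from the server's responses rather than hard-code it.

```python
# URL pattern from the text, with the zoom level turned into a placeholder.
# The https scheme is an assumption.
TEMPLATE = "https://tiles.openplanner.team/planet/{z}/{x}/{y}/"


def tile_url(x, y, zoom=14, template=TEMPLATE):
    """Fill the tile URL template for one tile."""
    return template.format(z=zoom, x=x, y=y)


def neighbour_tile_urls(x, y, zoom=14, template=TEMPLATE):
    """Return the URLs of the (up to eight) tiles surrounding tile (x, y)."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    urls = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue             # skip the centre tile itself
            ny = y + dy
            if not 0 <= ny < n:
                continue             # no tiles beyond the poles
            nx = (x + dx) % n        # wrap around the antimeridian
            urls.append(tile_url(nx, ny, zoom, template))
    return urls
```

With the example coordinates from the text, neighbour_tile_urls(8411, 5485) lists the eight tiles around the entry point.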
Conclusion and future work
This demo introduced a tiling mechanism for publishing road networks as Linked Data. Compared to a SPARQL endpoint approach or a server answering every individual route planning request, the server costs in our approach are indisputably lower. Thanks to the hypermedia controls, the full dataset can be automatically discovered, and HTTP caching can be leveraged. Furthermore, developers of route planning applications are given more flexibility, as they are now in full control of the algorithm. Just as with vector tile styling, they can tailor the routing algorithm to their end-user needs in the browser. This also opens the door to use cases where multiple sources are queried at once.
Nevertheless, it remains an open question whether applying this kind of Linked Data model with HTTP URIs to road-related data is the optimal approach. For example, when another source wants to make a statement about a part of a road, chances are low that this part is already available as a separate osm:Way instance, and thus the source data would have to be altered to split the original entity into two.
There might be a need in the future to support a binary format in order to reduce size. Therefore, we will benchmark the difference in size and performance between Valhalla tiles (protobuf) and Routable Tiles (JSON-LD + gzip). In doing so, we will also look at other optimizations, such as applying summaries over different zoom levels. In order to achieve better adoption of the hypermedia controls, an actor will be added to the Linked Data query agent Comunica [3]. This way, we will help to solve geospatial queries over any data source, even beyond road networks, by downloading only the right tiles.