Why is geospatial data sharing so important?

November 19, 2015

Gordon Plunkett

There are many reasons for sharing your geospatial data: others can use it, and sharing helps reduce duplicate data collection. However, have you ever thought of sharing your data simply to help with data interoperability? By publishing your data, you let others see how you’ve defined it, and perhaps adopt the same data model and data definitions.

I recently attended Geomatics Innovation Day at Natural Resources Canada (NRCan) in Sherbrooke, Quebec. The presentations focused on surface water mapping. Topics included how to collect water data, how to identify and map water areas, and how to use and analyze the collected water data for decision-making. I personally found the presentations and the event very useful.

One of the initial presentations was given by the head of an international working group tasked with harmonizing hydrographic data across the Canada-US border. This task requires federal cooperation in both Canada and the US, plus cooperation from the provinces, territories and states. The usual interoperability issues were identified: data collection standards and data models differ between the two countries. Both countries have their data, but it is not yet interoperable enough to support cross-border hydrographic visualization and analysis. This international working group is busy harmonizing the data one drainage basin at a time, which means that new standards, attributes and data models for these regions are being created and used.

Canada-US boundary water basins (Source: Watching Over our Transboundary Waters from Coast to Coast, International Joint Commission)

There was also an interesting presentation on how the USGS National StreamStats program collects, processes and publishes stream data in the US. There were other presentations on analysis of water quality, use of water data for calculating flood risk areas in cities and the use of RADARSAT data for automated real-time surface water mapping. As well, there were presentations on the Canadian National Hydrological Services basin delineation project and on the status of the NRCan National Hydrographic Network.

So, what do this work and these presentations have to do with data sharing? Surface water is part of a basemap: it is the stream, river and lake data. It’s not surprising that many of these folks have run headlong into the same problems that the basemapping community ran into many years ago. These problems can be boiled down to two main concerns and associated principles.

The first concern is related to the data collection methodology and the data definition standard. Principle 1: If two (or more) organizations are collecting the same or similar types of geospatial data independently, then the resulting data will almost never be easily interoperable.
The second concern is related to data sharing. Principle 2: Although spatial data infrastructure (SDI) technology has been available for years, many internally focused organizations don’t use it to share data with their external community.

Let’s take a closer look at these two concerns and their related principles.

For the data definition issue, when you collect any type of spatial data, whether it’s for water, road, soil or forest classes, you necessarily do so to meet your organization’s own needs and uses. This would be termed your primary reason for collecting and maintaining the data. It’s also important to consider that there may be additional downstream uses of this data. These are termed the secondary or supplemental uses of this data. However, even if you collect and store the data with the broadest application in mind, it may still not be interoperable. This is because an organization in a neighbouring jurisdiction may be collecting the same or similar data to quite different specifications, since the two jurisdictions are working independently.

For example, I’ve seen water quality data collection where similar organizations were doing excellent jobs of collecting data within their jurisdictions. However, the problem was that one of the organizations collected its data as a specific measurement, another collected its data as a percentage and yet another used a different water quality chemical analysis altogether. As you can imagine, when these organizations tried to integrate this data across the combined geography, it was an impossible task. Another example of limited interoperability is the different road classifications used in the provinces, which make it difficult to create national road coverage.

So, what’s the solution to getting the spatial data definitions across Canada closer to some kind of interoperable data standard? Well, this is a very difficult task. First, there’s the issue of getting subject matter experts to agree on a common set of classes and attributes. Next is the issue of getting organizations to buy into and implement the new data definitions. Finally, there’s the organizational inertia that causes a delay in implementation. I’m certainly not saying that this task will be easy, but if we don’t start now, it will take even more time and effort in the future.

These maps show the Alberta-Saskatchewan provincial border. The map on the left shows many roads that change their road type at the provincial border. This is because the road type definitions are different, but in reality, there is no change in the road at the border. The map on the right shows the same area with the common road type definitions.
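To make the idea of common road type definitions concrete, here is a minimal sketch of a classification crosswalk in Python. The class names, the common scheme and the sample segments are all invented for illustration; they are not the actual Alberta or Saskatchewan road classifications, and a real crosswalk would have to be agreed on by subject matter experts.

```python
# Hypothetical crosswalk: each province's own road classes (invented names)
# mapped to one shared, equally hypothetical, common scheme.
CROSSWALK = {
    "AB": {
        "Primary Highway": "highway",
        "Secondary Highway": "arterial",
        "Local Road": "local",
    },
    "SK": {
        "Provincial Highway": "highway",
        "Grid Road": "arterial",
        "Municipal Road": "local",
    },
}


def harmonize(segments):
    """Return copies of the segments with a 'common_type' attribute added.

    Classes missing from the crosswalk are flagged as 'unclassified'
    rather than guessed, so they can be reviewed by hand.
    """
    result = []
    for seg in segments:
        province_map = CROSSWALK.get(seg["province"], {})
        common = province_map.get(seg["road_type"], "unclassified")
        result.append({**seg, "common_type": common})
    return result


# Two illustrative segments of the same physical road, one on each side
# of the border, carrying different provincial class names.
segments = [
    {"id": 1, "province": "AB", "road_type": "Secondary Highway"},
    {"id": 2, "province": "SK", "road_type": "Grid Road"},
    {"id": 3, "province": "SK", "road_type": "Winter Trail"},
]

harmonized = harmonize(segments)
for seg in harmonized:
    print(seg["id"], seg["province"], seg["road_type"], "->", seg["common_type"])
```

In this sketch the first two segments end up with the same common type, so a map styled on `common_type` would no longer show a change at the border. The third segment has no match and is flagged instead of being silently forced into a class.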

The second interoperability principle is the use of SDI technology for data sharing. This means publishing your organization’s data so that others can search, find, download and use it. Using SDI makes your data more visible. While you can just email the data file to your neighbouring jurisdiction, this doesn’t make it more public. It also doesn’t allow others to use your data for secondary uses, or to adopt your data definitions as a best practice. Publishing the data, by contrast, would permit others to start collecting data in their jurisdiction using the same data definitions and maybe even the same data model.

Your organization does not need to make a big investment to use the SDI approach either. You or your organization could simply publish the data on ArcGIS Online as a feature service. As long as the layer is public and the publisher has allowed others to export it, the layer can be searched, examined, used in maps and downloaded. This SDI approach would allow others to use your data and perhaps style their data in a similar fashion.
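As a rough illustration of what "examined" and "downloaded" can look like for someone on the receiving end, the sketch below builds the request URLs for a public feature service layer and summarizes its field definitions. The layer URL and the sample metadata are hypothetical placeholders. The `f=json` and `/query` patterns follow the ArcGIS REST API, but whether a given layer allows queries or GeoJSON output depends on how its publisher configured it, so treat this as a starting point rather than a recipe.

```python
from urllib.parse import urlencode

# Hypothetical public layer URL; substitute a real feature service layer.
LAYER_URL = (
    "https://services.arcgis.com/EXAMPLE_ORG/arcgis/rest/services/"
    "Roads/FeatureServer/0"
)


def metadata_url(layer_url):
    """URL that asks the layer to describe itself (fields, geometry type) as JSON."""
    return f"{layer_url}?{urlencode({'f': 'json'})}"


def query_url(layer_url, where="1=1", out_fields="*"):
    """URL that requests features from the layer, assuming GeoJSON output is enabled."""
    params = {"where": where, "outFields": out_fields, "f": "geojson"}
    return f"{layer_url}/query?{urlencode(params)}"


def field_summary(layer_info):
    """List (name, type) pairs from a layer description, i.e. its data definition."""
    return [(f["name"], f["type"]) for f in layer_info.get("fields", [])]


# Invented example of the kind of description a layer might return.
sample_layer_info = {
    "name": "Roads",
    "fields": [
        {"name": "OBJECTID", "type": "esriFieldTypeOID"},
        {"name": "road_type", "type": "esriFieldTypeString"},
    ],
}

print(metadata_url(LAYER_URL))
print(query_url(LAYER_URL))
print(field_summary(sample_layer_info))
```

The field summary is the part that matters most for interoperability: it is how a neighbouring jurisdiction can see your attribute names and types and decide to collect its own data the same way.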

Why is data sharing so important? If you share your data, others may use it in their maps and analysis, but perhaps equally important, others can see what data you have collected, as well as the structure, model and other aspects of your data. This may spur other jurisdictions to follow your lead and collect data in their jurisdiction in a similar fashion so that your data will become more interoperable over a wider geographic coverage. So, share your geospatial data if, when and where you can! There are lots of benefits.