Spatial databases – Specialised databases series – Part 3

Spatial Database System

Put simply, spatial databases are concerned with storing data in relation to space. Güting describes a spatial database as one that supports spatial data types in its data model and query language and provides, at a minimum, spatial indexing and spatial join methods. He goes on to state:

“Spatial database systems offer the underlying database technology for geographic information systems and other applications” (Güting, 1994)

Spatial databases were designed to simplify the querying of spatial data. Shekhar and Chawla (2003) explain:

“a query like “List the top ten customers, in terms of sales, in the year 1998” will be very efficiently answered by a DBMS even if the database has to scan through a very large customer database”

and further clarify:

“a relatively simple query such as “List all the customers who reside within fifty miles of the company headquarters” will confound the database”

This is caused by a lack of suitable indexing for narrowing the search: traditional indices order values along a single dimension and are incapable of ordering multidimensional coordinate data. Spatial databases were conceived to address this problem.
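To make the indexing point concrete, here is a minimal sketch of one very simple spatial index, a uniform grid, answering the "within fifty miles" query. It is an illustration only, not how any particular DBMS works: the customer names and coordinates are made up, the headquarters is assumed to sit at the origin, and coordinates are assumed to be planar miles rather than latitude and longitude.

```python
import math
from collections import defaultdict

def build_grid(points, cell):
    """Bucket (id, x, y) points into square cells of side `cell`."""
    grid = defaultdict(list)
    for pid, x, y in points:
        grid[(math.floor(x / cell), math.floor(y / cell))].append((pid, x, y))
    return grid

def within_radius(grid, cell, cx, cy, radius):
    """Return ids within `radius` of (cx, cy), visiting only nearby cells."""
    lo_i = math.floor((cx - radius) / cell)
    hi_i = math.floor((cx + radius) / cell)
    lo_j = math.floor((cy - radius) / cell)
    hi_j = math.floor((cy + radius) / cell)
    hits = []
    for i in range(lo_i, hi_i + 1):
        for j in range(lo_j, hi_j + 1):
            # Only points in cells overlapping the search square are examined.
            for pid, x, y in grid.get((i, j), []):
                if math.hypot(x - cx, y - cy) <= radius:
                    hits.append(pid)
    return sorted(hits)

# Hypothetical customers as (id, x, y) in miles relative to headquarters.
customers = [
    ("ann", 10.0, 12.0),
    ("bob", 48.0, 5.0),
    ("cat", 120.0, 80.0),
    ("dan", -30.0, 39.0),
]
grid = build_grid(customers, cell=25.0)
nearby = within_radius(grid, 25.0, 0.0, 0.0, 50.0)
print(nearby)
```

The cell holding "cat" is never visited, which is the whole point: the index narrows the search in two dimensions at once, something a single sorted column cannot do. Real systems tend to use more adaptive structures such as R-trees, but the idea is the same.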

There are a number of uses for spatial databases. Elmasri and Navathe (2003) give cartographic databases as one example. A cartographic database contains coordinates defining a geographical area, which can then be combined with other variables (such as age, gender or class) to map their distribution across that area.

Another example of spatial databases in use is taken from a presentation by Bettina Berendt. Figure 1 shows a customer subscribing to be notified of shopping coupons: when they are near a shop that offers one, they are alerted. This provides a clear financial benefit for both consumer and provider; the consumer gets a discount and the provider makes a sale to a customer who might otherwise not have known about them.

Figure 1 – Customer being sent shopping coupon (Berendt, 2007)

Further Analysis

The initial problem standard DBMSs had with handling spatial data was implementing a spatial algebra and integrating it into query processing (Güting, 1994). These DBMSs were not designed to provide spatial data types and their atomic operations, spatial indexing to support spatial selection, or support for spatial join. A DBMS that wanted to support spatial data would have to accommodate the following in its architecture:

  • representations for the data types of a spatial algebra,
  • procedures for the atomic operations,
  • spatial index structures,
  • access operations for spatial indices,
  • filter and refine techniques,
  • spatial join algorithms,
  • cost functions for all these operations,
  • statistics for estimating selectivity of spatial selection and spatial join,
  • extensions of the optimizer to map queries into the specialized query processing methods,
  • spatial data types and operations within data definition and query language,
  • user interface extensions to handle graphical representation and input of SDT values.
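Of the items above, "filter and refine" is perhaps the easiest to show in a few lines. The sketch below is my own illustration rather than anything from Güting's paper: the filter step uses cheap bounding-box tests to pick candidates, and the refine step runs an exact (and more expensive) point-in-polygon test on the candidates only. The region names and coordinates are hypothetical.

```python
def bbox(polygon):
    """Axis-aligned bounding box of a polygon as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def point_in_polygon(px, py, polygon):
    """Exact test using ray casting (even-odd rule)."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def regions_containing(px, py, regions):
    """Return (candidates after filter, results after refine)."""
    candidates = []
    for name, polygon in regions.items():
        xmin, ymin, xmax, ymax = bbox(polygon)
        if xmin <= px <= xmax and ymin <= py <= ymax:  # filter step
            candidates.append(name)
    results = [name for name in candidates
               if point_in_polygon(px, py, regions[name])]  # refine step
    return candidates, results

# Hypothetical regions.
regions = {
    "triangle": [(0, 0), (10, 0), (0, 10)],
    "square": [(20, 20), (30, 20), (30, 30), (20, 30)],
}
print(regions_containing(2, 2, regions))
print(regions_containing(8, 8, regions))
```

The point (8, 8) passes the filter for the triangle because it lies inside the bounding box, but it is rejected by the refine step. In a real system the bounding boxes would live in a spatial index so that the filter step does not have to scan every region.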

When relational systems were developed, attempts were made to use them as a basis for spatial databases. Two main architectures emerged: the layered architecture, also called pure relational, and the dual architecture, also called loosely coupled.

Figure 2 – Layered architecture – Spatial functionality is implemented on top of underlying functionality

To represent spatial data types (SDTs) in the layered architecture there were two strategies. The first was to break the SDT value down and place its components in tuples. The issue with this was that an instance of the SDT would have to be reconstructed before it could be used, which is processing-heavy and thus expensive. The second option was to represent SDT values in “long fields”. This was better but still came with problems: the DBMS would handle the geometries only in the form of uninterpreted byte strings, so any operation or evaluation on the actual geometry instance could only be done in the top layer.
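The "long field" strategy can be sketched roughly as follows. This is an assumption-laden toy, not any real DBMS format: the byte layout (a vertex count followed by little-endian doubles) is invented for the example, and a plain dictionary stands in for the table. The thing to notice is that the store only ever sees opaque bytes, so the top layer has to decode them before it can compute anything geometric.

```python
import struct

def encode_polygon(points):
    """Serialise vertices as a vertex count followed by little-endian doubles."""
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<I{len(flat)}d", len(points), *flat)

def decode_polygon(blob):
    """Top-layer routine: turn the byte string back into a vertex list."""
    (n,) = struct.unpack_from("<I", blob, 0)
    flat = struct.unpack_from(f"<{2 * n}d", blob, 4)
    return list(zip(flat[0::2], flat[1::2]))

def area(points):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for k in range(len(points)):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# The "DBMS" here is just a dict mapping a row id to a long field of bytes.
table = {1: encode_polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])}

# The store cannot answer "what is the area?"; the top layer must decode first.
parcel_area = area(decode_polygon(table[1]))
print(type(table[1]).__name__, len(table[1]), parcel_area)
```

Every geometric question costs a fetch and a decode in the top layer, which is why this approach, although simpler than scattering vertices across tuples, still leaves the DBMS unable to optimise spatial operations.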

Dual architecture brought in an integration layer. The top layer integrates the standard DBMS and the spatial subsystem.

Figure 3 – Dual Architecture

Here the SDT is broken down: the non-spatial attributes are stored in the standard DBMS, the spatial attributes are stored in the spatial subsystem, and the two pieces are connected by logical pointers.
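As a rough sketch of the dual architecture, the toy below keeps attributes and geometries in two separate stores linked by a logical pointer. Everything in it is hypothetical (the shop names, the `geom_id` field, the two dictionaries standing in for the two systems); it is only meant to show where the integration layer does its work.

```python
# Standard DBMS side: non-spatial attributes plus a logical pointer (geom_id).
attribute_store = {
    "shop-1": {"name": "Bakery", "geom_id": 101},
    "shop-2": {"name": "Bookshop", "geom_id": 102},
}

# Spatial subsystem side: geometries keyed by the same logical pointer.
geometry_store = {
    101: (2.0, 3.0),
    102: (40.0, 41.0),
}

def shops_in_window(xmin, ymin, xmax, ymax):
    """Integration layer: combine answers from the two subsystems."""
    # Step 1: the spatial subsystem finds the geometries inside the window.
    geom_ids = {
        gid for gid, (x, y) in geometry_store.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    }
    # Step 2: the standard DBMS rows are matched through the logical pointers.
    return sorted(
        row["name"] for row in attribute_store.values()
        if row["geom_id"] in geom_ids
    )

print(shops_in_window(0.0, 0.0, 10.0, 10.0))
```

Even in this toy the query has to be split in two and the partial results stitched back together, which hints at why keeping the two stores consistent and optimising queries across them is awkward in a loosely coupled design.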

The previous two architectures assumed that the DBMS was closed to extensibility. There is a third option, which is the most popular and is used by systems such as PostgreSQL and Oracle (Browne, 2009): the DBMS itself is extended to cater for spatial data. The query language is extended and the new spatial types are handled as basic types by the DBMS. A disadvantage of this type of architecture is that additional software is needed to make an attractive map from the raw vectors stored within the DBMS.

Designers have had to approach spatial databases in a more formal manner. The design of a spatial database has a huge impact on spatial querying, analysis, data exchange and system interoperability (Bédard, 1999). Spatial data is highly complex, and handling that complexity with high performance encourages database structures that are hard to understand, so the analysis and design of spatial databases must rely on formal, effective methods and patterns. Bédard explains that when developing spatial databases there is a clear split between analysis and design, and that this split is essential for multiplatform environments. Analysis relates to the user’s perspective of the application, say in the form of roads and houses, whereas design is more focused on the technology selected.

Object-relational DBMSs are often used for handling spatial data due to their ability to handle complex objects (Browne, 2009). An object-relational DBMS is strongly linked with object-oriented programming (OOP), and OOP systems can be engineered using the Unified Modelling Language (UML). Bédard suggests a similar approach to designing for spatial data and hints that progress in reverse engineering tools would be an advantage.

Over the next 5-10 years we should also expect to see more uniform, cost-efficient solutions. Jayant Sharma, Oracle Spatial’s technical director, comments:

“While in the past, systems and applications focused on solving a single problem using custom data models, and specialized databases, enterprises are moving away from these isolated systems to reduce duplication of effort, reduce costs through better utilization of resources, and improve quality of service.”

Research appears to point towards the development of a more comprehensive and effective tool set for spatial data analysis (Sharma, 2005), which will make it easier for users to extract valid data. The GIS community also mentions ease of use as something we can look forward to, with one member stating:

“We are hiding more of the underlying technology and making it easier for users to ask GIS questions, and make maps” (Snape, 2011)

In the same paragraph Snape says that he views the future of spatial databases as divided: while we are starting to make things easier, we are also starting to look at other options for analysis. One option he suggests is graph (NoSQL) databases for the storage and retrieval of spatial data.

One application that utilises a spatial database is the Public Land Survey System (PLSS) (Wikipedia, n.d.). It is used in the United States to survey and spatially identify land parcels before designation of eventual ownership.

Figure 4 – This BLM map depicts the principal meridians and baselines used for surveying states (coloured) in the PLSS.

The information recorded is perfectly suited to spatial database technology. Storing the data in any other kind of database would not be suitable because it would not provide the querying and data types needed to represent meridians and baselines. The system presents highly complex spatial data to the user, and that data needs to be manipulated in a variety of ways that only a spatial database is suited to.

That’s it for part 3, until next week….

Thanks,

Sara :)

If you can’t wait for the series here is the full text.

References

1 Response

  1. Erik Says:

    It seems to me that the spatial database could be extended beyond representing geographical data. I’ve never used these alternative kinds of databases (which is why I’m enjoying your post series), but it seems to me as if this model is ideally suited for persisting R tree lookup structures in general.

    So, I think these databases could be extended to statistical modeling or optical character recognition schemes, for example. Does that fit with your understanding of them?

    Posted on February 21st, 2012 at 22:35
