11 Conclusions and future work

There are a great number of possible representations for spatial information, each of which with its own benefits and drawbacks and only some of which have been discussed in this thesis. From an engineering point of view, choosing an appropriate representation for a given system or task is a fundamental issue, as this choice cascades down to almost every engineering decision and indirectly affects a GIS program’s every functionality. Among other aspects, it affects the type of objects that can be efficiently stored and the operations that can be easily performed on them, and it also has important computational consequences in terms of both memory and processing time.

The 2D representations that dominate the GIS world work well for 2D datasets and problems that are essentially two-dimensional, but many problems arise when they are adapted to model 3D, spatiotemporal and multi-scale geographic information. Most research in GIS is devoted to improving these adaptations, as well as to developing new methods that build on them to solve problems both old and new.

This thesis pushes GIS research in a different direction, starting from the assumption that many issues in GIS can probably be better solved by using a new, fundamentally different modelling approach—modelling both spatial and non-spatial characteristics as dimensions in the geometric sense, thus using higher-dimensional representations to create, manipulate and visualise geographic information. Accordingly, this thesis’ main research objective was to realise the fundamental aspects of a higher-dimensional Geographic Information System, and therefore focusing on the development of higher-dimensional representations and methods for GIS.

This concluding chapter starts with a short outlook on higher-dimensional GIS in §11.1, describing concisely when and where it makes sense to use higher-dimensional models. Afterwards, §11.2 describes in detail the lessons learned by pursuing this thesis’ research objective. Finally, §11.3 lists the main contributions of this thesis, and §11.4 discusses the topics that I think would be most useful for future research on this topic.

11.1 An outlook on higher-dimensional GIS

As this thesis has shown, there are many potential advantages to the use of higher-dimensional modelling in GIS. This approach provides a simple and consistent way to store geometry, attributes and topological relationships between objects of any dimension. This generic technique can be easily extended to handle other non-spatial characteristics, enabling better data management and more powerful operations. At the same time, higher-dimensional representations are undoubtedly memory-intensive and often hard to work with, both due to their level of abstraction and the unintuitiveness of working with dimensions higher than three. Admittedly, current GIS use cases do not easily justify the higher-dimensional approach.

However, a far stronger case for higher-dimensional representations emerges when considering what sort of new tools could be developed using this type of representations, both in GIS and in related fields. For instance, 3D modelling software could consider time-varying topology, much as Dalstein et al. [2015] do for 2D vector drawings. 4D topology (as 3D+time) could then be used to automatically generate smooth transitions for animations.

Similarly, CAD tools could use 4D topology to model buildings at different scales and timeframes automatically, keeping track of all relationships between objects and providing immediate user input during interactive editing. For example, a program could display how different changes affect construction time, the size of the model at predefined LODs and when certain safety constraints were violated.

At a lower-level, 4D geometric modelling operators could be developed to operate directly on 4D primitives, such as splitting and merging 4-cells. These could also be used intuitively in an interactive environment, allowing for instance to change a building’s configuration by adding and removing walls while always ensuring that the representation remains a valid 4D space partition.

Considering that current GIS are very often used to manage large heterogeneous datasets and keep them up to date, a future system using a higher-dimensional underlying representation could be used to enforce certain validity constraints at the data structure level, such as avoiding 4D intersections or preserving a certain degree of continuity at LOD transitions.

Is the higher-dimensional approach worthwhile?

In short, yes, but only given certain conditions.

Higher-dimensional modelling is advantageous, but only when the added functionality that will be built with them—as compared to the simpler and more compact 2D/3D models—justifies it, such as when queries across space and time can be implemented as higher-dimensional geometric/topological operations. As discussed in §4.2.1, higher-dimensional modelling makes sense only when the characteristics depicted as dimensions are parametrisable and independent from each other, as is the case for space/time/scale, and where objects occur along a dimension as intervals, not as discrete points (which can be easily stored as attributes). Based on current hardware, software and the typical GIS datasets, there is another important practical requirement: the manageable number of dimensions is limited to 6–8. Finally, it is also worth noting that higher-dimensional models are not incompatible with standard 2D/3D data structures and methods—the best tool for the job can be chosen depending on the need at hand and both approaches can be combined.

11.2 Lessons learned

Current representations are not suitable in higher dimensions The mathematical foundations of spatial data modelling (Chapter 2) are defined in a dimension-independent manner, including all the basic tenets of geometry and topology. This dimensional independence is also true for all spatial data models or representation schemes (Chapter 3)—at least when they are analysed at a high level—but is generally not preserved when they are implemented into more concrete data structures. Most 2D and 3D data structures in GIS thus only encode a few chosen geometric and topological properties (e.g. the coordinates of each point and the adjacencies between polygons), which are often defined with a formulation that is different per dimension.

This modelling approach can make for custom structures that are compact and efficient when used exactly as intended (i.e. for a particular class of objects of a given dimension), but it can limit functionality or introduce inefficiencies when the data structures are adapted to be used under a different set of circumstances. Case in point, the typical data structures of 2D GIS are frequently used with minimal changes for 3D, spatiotemporal and multi-scale GIS. This results various problems, such as inefficient representations (e.g. due to duplicate elements), an inability to represent common 3D objects (e.g. those with a non-2-manifold boundary), difficulties in expressing 3D topological relationships (e.g. adjacencies between solids), and the widespread availability of invalid datasets (e.g. 3D models that do not formally enclose any space), among others.
Higher-dimensional modelling as a solution An alternative to the use of ad hoc adaptations to 2D data structures is to model both spatial and non-spatial characteristics as dimensions in the geometric sense (Chapter 4). While this approach can be memory-intensive, it provides a generic solution that can be applied to the representation of \(n\)D space, time, scale and any other parametrisable characteristics.

A tuple of \(n\) parametrisable spatial and non-spatial characteristics can thus define a coordinate system in \(\mathbb{R}^n\), and lower-dimensional 0D–3D objects existing across these characteristics can thus be modelled as higher-dimensional 0D–\(n\)D objects embedded in higher-dimensional space. As GIS objects are usually non-overlapping¹⁵⁶, they should form an \(n\)D space partition and can thus be represented using \(n\)D topological data structures, reducing the total number of elements and ensuring that it is easy to navigate between the objects. However, even when the objects do not form a space partition, the objects themselves can be partitioned by using an intermediate representation where the original objects are transformed into a set of non-overlapping regions, such that each of these regions represents a set of the original objects [Rossignac and O'Connor, 1989].

\(n\)D space partitions are also ideal in a practical sense, as they simplify many of the operations that can be defined with them, including simple point-in-polytope queries and all of the constructions methods presented in Chapters 6–8. \(n\)D point clouds, while outside the scope of this thesis, are a good complement to \(n\)D space partitions, as they are close to data as it is acquired and are relatively easy to store and manipulate.
The most promising higher-dimensional representations Spatial data structures often consist of two aspects: 1. a combinatorial part, which consists of a set of primitives and some topological relationships between them, and 2. an embedding that links these primitives to their geometry and attributes [Lienhardt, 1994]. As argued in this thesis, the knowledge of higher-dimensional topological relationships in a data structure is the main aspect that differentiates higher-dimensional data structures from 2D/3D ones.

Some data structures that are frequently used in 2D/3D GIS have straightforward extensions to higher dimensions, such as \(n\)D rasters and hierarchies of trees. These use similar structures and algorithms as their 2D/3D counterparts and are therefore easy to understand and implement.

Other data structures implement the models of an \(n\)D simplicial complex or an \(n\)D cell complex. As the simplices in a simplicial complex have a known number of adjacent simplices and bounding facets, they are most efficiently stored using simplex-based data structures. Meanwhile, cell complexes can be easily stored using incidence graphs and related structures. However, Nef polyhedra [Bieri and Nef, 1988] are probably the most promising representation for an \(n\)D cell complex, as they provide a good base to develop Boolean set operations, enabling a wide range of geometric operations.

Ordered topological models such as the cell-tuple [Brisson, 1993] and generalised/combinatorial maps [Lienhardt, 1994] also deserve a special mention. By combining the strong algebra and easy navigation of a simplicial complex with the easy representation of a cell complex, they provide the most important benefits of both. They are rather memory-intensive, but it is important to note that can still be more compact than a non-topological approach (Chapter 5).

Nevertheless, non-topological higher-dimensional representations do have a clear role to play as exchange formats, much as is the case for those based around Simple Features in 2D [OGC, 2011], and CityGML [Gröger et al., 2012] and IFC¹⁵⁷ in 3D.
Three construction methods for higher-dimensional objects Creating computer representations of higher-dimensional objects can be complex. Common construction methods used in 2D and 3D, such as directly manipulating combinatorial primitives, or using primitive-level construction operations (e.g. Euler operators [Mäntylä, 1988]), rely on our intuition of 2D/3D geometry, and thus do not work well in higher dimensions. It is therefore all too easy to create invalid objects, which then cannot be easily interpreted or fixed—a problem that is already exceedingly apparent in most 3D datasets. As an alternative to the use of simple operations on combinatorial primitives, this thesis thus proposed three higher-level methods, all of which are relatively easy to use and attempt to create valid output.
Method I. constructing objects using \(n\)D extrusion Extrusion as used in GIS has a natural extension into a dimension-independent formulation (Chapter 6). Starting from an \((n-1)\)-dimensional space partition as an \((n-1)\)-dimensional cell complex and a set of intervals per cell, it is possible to extrude them to create an \(n\)-dimensional cell complex. It is the easiest method to load existing 2D or 3D data into a higher-dimensional structure, representing a set of cells that exist along a given dimension, such as a length of time or a range of scales. It is also easy to guarantee that the output cell complex is valid and can be used as a base for further operations, such as dimension-independent generalisation algorithms.

The extrusion algorithm developed in this thesis works on the basis of a generalised map representation of the cell complex and is relatively fast, with a worst case complexity of \(O(ndr)\) in the main algorithm, where \(n\) is the extrusion dimension, \(d\) is the total number of darts in the input map and \(r\) is the total number of intervals in the input, but offers better complexity in practice. It is also memory-efficient, as only three layers of darts (of the size of the input cell complex) need to be kept in memory at the same time.
Method II. constructing \(n\)D objects incrementally Based on the Jordan-Brouwer separation theorem [Lebesgue, 1911; Brouwer, 1911], it is known that an \(i\)-cell can be described based on a set of its bounding \((i-1)\)-cells (Chapter 7). Since individual \((i-1)\)-cells are easier to describe than the \(i\)-cell, this can be used to subdivide a complex representation problem into a set of simpler, more intuitive ones. This method can be incrementally applied to construct cell complexes of any dimension, starting from a set of vertices in \(\mathbb{R}^n\) defined by a \(n\)-tuple of their coordinates, and continuing with cells of increasing dimension—creating edges from vertices, faces from vertices or edges, volumes from faces and so on.

The incremental construction algorithm developed in this thesis solves this problem in a practical setting by computing the topological relationships connecting the bounding \((i-1)\)-cells. It uses indices on the lexicographically smallest vertex of every cell per dimension, as well as an added index using the lexicographically smallest vertex of the ridges around the bounding facets of the cell that is being built. It generates an \(i\)-cell in \(O(d^{2})\) in the worst case, with \(d\) the total number of darts in the cell. However, it fares markedly better in real-world datasets, as cells do not generally share the same lexicographically smallest vertex. By checking all matching ridges within a cell’s facets, the algorithm can optionally verify that the cell being constructed forms a combinatorially valid quasi-manifold, avoiding the construction of invalid configurations.
Method III. linking 3D models at different LODs into a 4D model As an example high-level higher-dimensional object construction method, a 4D model can be constructed from a series of different 3D models at different LODs (Chapter 8). The method presented in this thesis consists of three steps: identifying corresponding elements in different LODs, deciding how these should be connected according to a linking scheme, and finally linking relevant 3-cells into 4-cells. Different linking schemes yield 4D models having different properties, such as objects that suddenly appear and disappear, gradually change in size or morph into different objects along the fourth dimension.

By modelling the LOD as a dimension, the correspondences between equivalent objects across LODs become geometric primitives, making it possible to perform geometric operations with them (e.g. extracting an intermediate LOD for visualisation purposes) or to attach attributes to them (e.g. general semantics or the meaning of these correspondences), just as is done to other geometric primitives. These topological relationships and correspondences can then be used for multiple applications, such as updating and maintaining series of 3D models at different LODs, or testing the consistency of multi-LOD models (e.g. by using the validity checks in Gröger and Plümer [2011a]).
Extracting 2D/3D subsets from an \(n\)D model The process to obtain a lower-dimensional subset of a higher-dimensional dataset can be regarded as a function that maps a subset of \(\mathbb{R}^n\) to a subset of \(\mathbb{R}^m\), \(m < n\), which is obtained by cutting through the dataset in a geometrically meaningful way (Chapter 9). Broadly, this process consists of two steps: (i) selecting a subset of the objects in the model and (ii) projecting this subset to a lower dimension. Both of these steps can vary substantially. Selecting a subset of the objects can be as simple as obtaining those within a axis-aligned bounding box, or can be as complex as a Boolean set intersection operation, such as for the computation of cross-sections. Meanwhile, there are a wide variety of transformations that apply different projections with different properties, such as the \(n\)-dimensional to (\(n-1\))-dimensional orthographic and perspective projections derived in this thesis and the \(\mathbb{R}^n\) to \(S^{n-1}\) spherical projection used in its cover.
Methods to create valid objects and space partitions in 2D and 3D Most algorithms described in computational geometry and GIS assume that their input datasets are flawless and they are processable using real numbers. However, invalid datasets are widespread in GIS (Chapter 10), and they are represented and processed using limited-precision arithmetic (Appendix A).

In order to cope with 2D invalid datasets, this thesis further developed methods to create valid polygons and planar partitions using a constrained triangulation of the input. These were based on the work done in Arroyo Ohori [2010], improving the reconstruction algorithm, fixing edge cases, implementing an odd-even constraint counting mechanism and improving the quality of the implementation. Similarly, a method to repair 3D objects and space subdivisions was developed by snapping together lower-dimensional primitives and removing overlaps using Boolean set operations on Nef polyhedra [Bieri and Nef, 1988; Hachenberger, 2006]. These methods were used in this thesis in order to use real-world datasets in practice, such as when applying the construction algorithms.

11.3 Contributions

The main contribution of this thesis is the realisation of the fundamental aspects of a higher-dimensional Geographic Information System. By approaching this problem in a practical manner, many of the technical issues of its development were investigated, including an analysis of its possible internal (in-memory) and external (exchange format) representations, the development of basic algorithms for object construction and visualisation, and the development of GIS data repair tools for 2D and 3D datasets. By taking a model that was previously only described at a conceptual level [van Oosterom and Stoter, 2010] and realising it, it is now possible to more fully evaluate the consequences of this higher-dimensional approach.

In more concrete terms, there were several smaller contributions that were necessary to be able to achieve this realisation. The most significant ones are:

Survey and analysis of higher-dimensional models and structures I conducted a survey of all of the main data models and data structures used in GIS, geometric modelling and related fields, considering 2D, 3D and \(n\)D data structures. These were analysed in terms of their feasibility for the higher-dimensional modelling of geographic information, including how they could handle different geometry classes, topology and attributes, either in their current form or through modifications, as well as their ease of implementation in practice.
Three construction methods I developed three easy-to-use construction methods for objects of any dimension. The two methods lower-level methods—extrusion and incremental construction—were implemented using CGAL Combinatorial Maps and tested using real-world datasets. The third method has been tested with a few synthetic datasets, but more work is necessary to fully realise it and automate it. All of the implementations were made available publicly under an open source licence.
Higher-dimensional real-world models As part of this thesis, I created higher-dimensional models from real-world 2D and 3D datasets. To the best of my knowledge, these are the only datasets consisting of realistic higher-dimensional objects in a GIS setting.
Combinatorial map reversal As part of the development of the incremental construction operation, I developed additional functions that were added to CGAL Combinatorial Maps after being approved by the CGAL Editorial Board. These functions involved reversing the orientation of a combinatorial map of any dimension.
Simple formulation of \(n\)D to (\(n-1\))D projections I developed intuitive formulations of \(n\)-dimensional to (\(n-1\))-dimensional orthographic and perspective projections. While other formulations exist, my formulations based on normal vectors are in my opinion the easiest to understand and manipulate, at least for a GIS audience.
Repair methods tools I developed methods to automatically repair 2D polygons and planar partitions, as well as 3D polyhedra and space partitions. The 2D methods were also released publicly under an open source licence as prepair and pprepair. The 3D methods will also be released publicly after further improvements are made, together with more thorough testing and the addition of basic documentation.

11.4 Future work

As this thesis challenges many of the assumptions underpinning current GIS, there are many potential lines of research that can be formulated for higher-dimensional GIS and higher-dimensional modelling in general. Most algorithms for 2D/3D GIS have a dimension-dependent formulation and would result in open problems in an \(n\)D context.

However, while these are worthy of attention, I would like to focus on what are still significant gaps in knowledge for the implementation of a higher-dimensional GIS and that could not be solved in this thesis’ timeframe. While some of these research topics are not within the main subject matter of GIS research, they are what I consider to be the key steps for a more complete implementation of a working system. They are:

Low-level linking algorithms The linking schemes from Chapter 8 have only been described in terms of high-level algorithms. These should be further developed by finding adequate low-level algorithms to identify matching elements using customisable constraints. For instance, it is reasonable to attempt to minimise a certain distance function between two models (e.g. Earth mover’s distance), but it is also important to do so in a manner that preserves the topological relationships between the objects. The special treatment of holes of different dimensions should also be investigated, together with adequate methods to ensure that holes are linked correctly between themselves and to other primitives.
High-level construction algorithms The three object construction methods described in this thesis operate mostly on lower-dimensional primitives and consequently cover only a few use cases. There is a need to develop intuitive methods that operate directly on higher-dimensional primitives, which should preferably be usable in an interactive environment. Note however that this does not mean that the end user would be viewing the higher-dimensional model directly, as it could be shown in simplified form or as a 2D or 3D representative subset.
\(n\)D constrained triangulator There are high-quality robust implementations of constrained Delaunay triangulations in 2D [Shewchuk, 1996a] and 3D [Si and Gärtner, 2005], as well as good descriptions of a constrained Delaunay triangulation in \(n\)D [Shewchuk, 2008]. However, in order to realise a higher-dimensional GIS based on a simplicial complex model, it is necessary to have a robust \(n\)D constrained triangulator which should be preferably Delaunay.
Hyperspherical projective geometry kernel I deemed Nef polyhedra one of the most promising models for a higher-dimensional GIS, but so far they have only been implemented in 2D and 3D. In order to implement \(n\)-dimensional Nef polyhedra, it is necessary to develop a hyperspherical projective geometry kernel. This kernel would then be used to compute the local \((n-1)\)-dimensional pyramids around every vertex. While the projective mathematics for this are relatively simple, it is a complex engineering problem to implement it robustly.
\(n\)D Boolean set operations Using the above mentioned hyperspherical projective kernel or another method, it is necessary to develop robust algorithms to compute \(n\)D Boolean set operations. For instance, an \(n\)D Nef polyhedra implementation using recursive boundary definitions could compute these operations at the local pyramid level. \(n\)D Boolean set operations would be an excellent base for most geometric operations in a higher-dimensional GIS.
\(n\)D to/from 2D and 3D projective kernel Many operations that are required for object manipulation in \(n\)D are actually well-defined operations applied to 2D and 3D objects that are merely embedded in \(n\)D. A robust kernel that could handle on-the-fly conversions of \(n\)D geometries to 2D and 3D without loss of precision would enable many of these operations to be applied. Relatedly, CGAL currently has simple kernels that apply 3D to 2D orthographic projections using the planes defined by the \(xy\), \(yz\) or \(zx\) planes. These are useful as they can be wrapped around the basic 2D kernels (i.e. \(\mathbb{R}^2\) with floating-point, interval or exact representations) using the traits programming paradigm available throughout CGAL, and so they can be used to apply 2D operations to 2D objects that are embedded in 3D. However, these kernels do not handle the conversions from 2D back to 3D automatically nor are easily extensible to higher dimensions.
Visualisation of higher-dimensional geographic information Chapter 9 described how data selection and projection methods can work in arbitrary dimensions, while §5.2 described the \(n\)D mathematics behind the basic manipulation operations for higher-dimensional objects. However, there are significant issues to tackle in order to create a useful visualiser for higher-dimensional datasets. Namely, there are significant hurdles in user interaction, dealing with large datasets, computing higher-dimensional cross-sections and the definition of useful visual cues in higher dimensions. The simple implementation used for the cover of this thesis make several hard-coded assumptions for the dataset that was used and is far from optimal, but it will be published together with the rest of the open source tools developed for this thesis once it is improved to handle more general input in the form of \(n\)D linear cell complexes.
2D/3D/\(n\)D repair methods with quality guarantees The data repair methods described in Chapter 10 are able to recover from most simple invalid 2D and 3D configurations. However, they are not easily extensible to higher dimensions due to the lack of a robust \(n\)D constrained triangulator (see above), they do not provide quality guarantees in their output, and can be numerically unstable when there are many nearly coplanar planes. In order to develop robust systems, it is highly desirable to be able to specify a robustness criterion (e.g. a minimum distance between vertices or ensuring that the geometries are not collapsed/flipped in a floating-point representation), which is guaranteed by a data repair algorithm. Additional geometric constraints could also be implemented, such as guaranteeing that coplanar planes stay coplanar after repair.
Higher-dimensional modification operations By necessity, this thesis focused on operations for object creation in order to generate initial higher-dimensional datasets. Now that it is possible to generate them, it is important to think of intuitive higher-dimensional object modification operations. For instance, how can a 4-cell split operation be intuitively defined, or how can collapsed geometries (e.g. from the extruded and collapsed models in §6.5) be processed to remove combinatorial elements. Ideally, these operations should also be intuitively usable in an interactive environment, such as a geometric modeller.
Real-world 4D spatiotemporal datasets Using timestamped 3D volumetric datasets, it should be possible to create true 4D datasets using spatiotemporal information. However, obtaining reasonably clean volumetric datasets is nearly impossible at this point. Every dataset that was found during this project had only surfaces embedded in 3D, had severe validity problems up to the point that it would require substantial manual work to fix, or was missing the temporal information. Exporting the temporal information that is present in some closed commercial formats is also an issue, as doing do in a naïve way generally means losing the links to the timestamps’ corresponding geometries.

156. At least in theory.↩

157. http://www.buildingsmart-tech.org/specifications/ifc-releases ↩

10 Processing real-world datasets into clean geometric models

Contents

A Implementing higher-dimensional representations and operations