# Are your polygons the same as my polygons?

29 Dec 2014

When someone outside the field of GIS thinks of a ‘polygon’, she usually pictures something like one of these:

And I can’t blame her: that’s what we learned in school. Squares, triangles, pentagons, hexagons. However, when I speak about a polygon with my colleagues, we usually think of something more complex:

## Simple Features polygons

We use the definition as found in the *Simple Feature Access* document of the OGC:

A Polygon is defined as a planar Surface defined by 1 exterior boundary and 0 or more interior boundaries. Each interior boundary defines a hole in the Polygon.

There are different rules for the interactions between the different rings (boundaries) of a polygon, and it is easy to imagine that things can quickly get complex and messy. The following figure shows 12 examples of rather simple polygons that are invalid, and these highlight the most important validity rules.

- each ring defining the exterior and interior boundaries should be *simple*, ie non-self-intersecting (*p*<sub>1</sub> and *p*<sub>10</sub>). Notice that this prevents the existence of rings with zero-area (*p*<sub>6</sub>), and of rings having two consecutive points at the same location. It should be observed that the polygon *p*<sub>1</sub> is not allowed either (in a valid representation of the polygon, the triangle should be represented as an interior boundary touching the exterior boundary), but some implementations do allow it (eg ESRI’s Shapefile).
- each ring should be closed (*p*<sub>11</sub>): its first and its last points should be the same.
- the rings of a polygon should not cross (*p*<sub>3</sub>, *p*<sub>7</sub>, *p*<sub>8</sub> and *p*<sub>12</sub>) but may intersect at one tangent point (the interior ring of *p*<sub>2</sub> is a valid case, although *p*<sub>2</sub> as a whole is not since the other interior ring is located *outside* the exterior one).
- a polygon may not have cut lines, spikes or punctures (*p*<sub>5</sub> or *p*<sub>6</sub>); removing these is known as the *regularisation* of a polygon (a standard point-set topology operation).
- the interior of every polygon is a connected point set, ie one can ‘walk’ everywhere within its interior without having to go outside (*p*<sub>4</sub>).
- each interior ring creates a new area that is disconnected from the exterior. Thus, an interior ring cannot be located outside the exterior ring (*p*<sub>2</sub>) or inside other interior rings (*p*<sub>9</sub>).
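To make the per-ring rules above a bit more concrete, here is a minimal Python sketch that checks a single ring for closure, repeated consecutive points, zero area and self-intersection. It is only an illustration: the function names (`ring_problems` and its helpers) are hypothetical, a ring is assumed to be a plain list of `(x, y)` tuples, and coordinates are compared exactly, which is probably too naive for real floating-point data. The rules about how different rings interact (crossing, nesting, connected interior) are not covered, and the pairwise edge test is quadratic, so it would be far too slow for the big polygons discussed below.

```python
from itertools import combinations


def _orient(a, b, c):
    """Sign of the turn a -> b -> c: 1 for left, -1 for right, 0 for collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _in_box(a, b, c):
    """True if c lies inside the bounding box of segment ab."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def _segments_touch(p1, p2, q1, q2):
    """True if segments p1p2 and q1q2 share at least one point."""
    o1, o2 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    o3, o4 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and _in_box(p1, p2, q1))
            or (o2 == 0 and _in_box(p1, p2, q2))
            or (o3 == 0 and _in_box(q1, q2, p1))
            or (o4 == 0 and _in_box(q1, q2, p2)))


def _folds_back(a, b, c):
    """True if edge bc doubles back over edge ab (a spike at b)."""
    dot = (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1])
    return _orient(a, b, c) == 0 and dot > 0


def ring_problems(ring):
    """Return a list of rule violations for one ring (empty if none found)."""
    if len(ring) < 4:
        return ["fewer than 4 points"]
    if ring[0] != ring[-1]:
        return ["not closed"]
    if any(a == b for a, b in zip(ring, ring[1:])):
        # Later tests assume every edge has non-zero length, so stop here.
        return ["consecutive duplicate points"]

    problems = []
    # Shoelace formula: twice the signed area of the ring.
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    if twice_area == 0:
        problems.append("zero area")

    n = len(ring) - 1  # number of edges
    edges = [(ring[i], ring[i + 1]) for i in range(n)]
    for i, j in combinations(range(n), 2):
        (p1, p2), (q1, q2) = edges[i], edges[j]
        if j == i + 1:                    # neighbours sharing vertex p2
            bad = _folds_back(p1, p2, q2)
        elif i == 0 and j == n - 1:       # first and last edge share ring[0]
            bad = _folds_back(q1, q2, p2)
        else:                             # any contact at all is a violation
            bad = _segments_touch(p1, p2, q1, q2)
        if bad:
            problems.append("self-intersecting")
            break
    return problems
```

For instance, `ring_problems([(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)])` returns an empty list, while a ‘bow-tie’ ring such as `[(0, 0), (4, 4), (4, 0), (0, 4), (0, 0)]` is reported as self-intersecting. Note that a ring touching itself at a single vertex is also flagged, which matches the strict reading of the first rule.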

## BIG polygons

Many real-world polygons are very large, both in number of vertices and in number of rings. An example is this polygon, taken from the Canada Land Cover. It has 148,612 vertices in total, and 5,132 interior rings. “How was such a large polygon created in the first place?” I hear you ask. It was generated automatically from reclassified raster imagery. Like several polygons of that size, it is invalid since it contains self-intersections.
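If you are curious how such figures can be obtained, here is a rough Python sketch that counts rings and points in a polygon. It assumes the polygon is available as a single WKT `POLYGON` string; the function name `polygon_stats` is hypothetical, and it counts every stored coordinate, including the closing point that repeats the first one, so the totals may differ slightly from numbers counted with another convention.

```python
import re


def polygon_stats(wkt):
    """Count rings and stored points in a WKT POLYGON string."""
    text = wkt.strip()
    if not text.upper().startswith("POLYGON"):
        raise ValueError("expected a WKT POLYGON")
    # Each innermost pair of parentheses holds the coordinates of one ring;
    # the first ring is the exterior boundary, the others are interior rings.
    rings = re.findall(r"\(([^()]+)\)", text)
    if not rings:
        raise ValueError("no rings found")
    sizes = [len(ring.split(",")) for ring in rings]
    return {
        "interior_rings": len(rings) - 1,
        "points": sum(sizes),
        "largest_ring": max(sizes),
    }


wkt = "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0), (2 2, 4 2, 4 4, 2 2))"
stats = polygon_stats(wkt)
print(stats)
```

For a multi-million-vertex polygon, loading the whole WKT string in memory and running a regular expression over it is probably not the most efficient route, but it is enough to sanity-check a file.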

## My all-time favourite polygons

Of all the polygons I’ve seen in my life, these 2 are my favourite.

### 1. the Swedish polygon

The polygon with ID ‘EU-199948’ in the Corine land cover dataset has 1,189,903 vertices, and 7,672 inner rings. You can get it and zoom in on it there.

### 2. the Cleveland polygon

I obtained it from someone working at the Cleveland Metroparks, and you can get it there.
Although it covers a much smaller area than the Swedish one, it is **monstrous**: 1,689,703 vertices, 66,908 inner rings, and its biggest ring has 500,373 vertices. Wow.

I can’t decide which one I like most, perhaps because the Swedish polygon should probably be bigger than the Cleveland one: observe that it was truncated at the bottom and on the left, probably because of a different CRS zone, or perhaps because the GIS couldn’t handle the full one?

## A repository of BIG polygons

I’ve set up a repository in which I keep these big polygons and other interesting ones:

github.com/hugoledoux/BIGpolygons

I usually use them to test the capabilities of code I write to process GIS datasets, eg prepair.

I’d be grateful if you submitted other interesting ones and/or your favourite ones.