SQLite Forum

Recursive CTE

Recursive CTE

(1) By anonymous on 2020-08-27 06:55:18 [link] [source]

Given a polygon (Well Known Text or Well Known Binary), I want to enumerate the points. Normally, I would assign the polygon to a session variable and build an explicit loop from 1 to n.

SQLite supports neither session variables nor explicit loops which leaves me with one option - recursive Common Table Expression. 

My code, shown below, does the job but I am not comfortable with the repeated embedded SELECTs to refer to the polygon (I am sure it impacts on performance adversely). 

Is there a better way to do this?

The code:

CREATE TEMP TABLE tmp (Place text, shape blob);

INSERT INTO tmp (place,shape)
SELECT 'New York',GeomFromText('POLYGON((-79.78 42.27,-79.78 42.525,-79.77 42.535,-79.05 42.805,-78.95 42.845,-78.945 42.855,-78.925 42.905,-78.925 42.925,-79.025 42.965,-79.045 42.995,-79.045 43.03,-79.025 43.05,-79.085 43.06,-79.095 43.07,-79.095 43.09,-79.085 43.1,-79.09 43.13,-79.065 43.145,-79.075 43.17,-79.075 43.245,-79.085 43.245,-79.22 43.445,-79.22 43.46,-79.21 43.47,-78.695 43.65,-76.805 43.65,-76.455 44.105,-76.365 44.15,-76.32 44.22,-76.25 44.225,-76.2 44.24,-76.185 44.25,-76.18 44.29,-76.17 44.3,-76.14 44.315,-76.115 44.315,-76.005 44.37,-75.96 44.365,-75.94 44.375,-75.935 44.385,-75.84 44.445,-75.825 44.48,-75.78 44.53,-75.515 44.725,-75.44 44.77,-75.425 44.79,-75.385 44.8,-75.36 44.825,-75.325 44.835,-75.325 44.85,-75.22 44.9,-75.155 44.915,-75.15 44.93,-75.14 44.935,-75.035 44.965,-75.005 44.995,-74.98 45.005,-74.915 45.005,-74.895 45.02,-74.87 45.02,-74.845 45.035,-74.795 45.035,-74.785 45.025,-74.745 45.02,-74.745 45.01,-74.705 45.025,-74.335 45.01,-74.125 45.01,-73.815 45.025,-73.335 45.03,-73.325 45.02,-73.33 44.985,-73.32 44.975,-73.32 44.91,-73.33 44.895,-73.345 44.89,-73.345 44.87,-73.36 44.845,-73.35 44.845,-73.32 44.82,-73.315 44.775,-73.33 44.765,-73.345 44.735,-73.34 44.69,-73.365 44.615,-73.355 44.605,-73.355 44.585,-73.32 44.56,-73.32 44.55,-73.285 44.505,-73.275 44.42,-73.295 44.38,-73.315 44.36,-73.305 44.345,-73.305 44.305,-73.29 44.28,-73.3 44.24,-73.37 44.18,-73.38 44.145,-73.395 44.125,-73.395 44.095,-73.415 44.055,-73.385 44.025,-73.39 43.935,-73.355 43.885,-73.36 43.86,-73.35 43.845,-73.365 43.82,-73.34 43.8,-73.33 43.78,-73.33 43.76,-73.35 43.74,-73.355 43.71,-73.375 43.695,-73.4 43.645,-73.4 43.61,-73.38 43.645,-73.3 43.65,-73.285 43.64,-73.27 43.595,-73.23 43.565,-73.22 43.535,-73.23 43.495,-73.255 42.865,-73.27 42.8,-73.26 42.765,-73.245 42.755,-73.245 42.74,-73.485 42.085,-73.465 42.045,-73.53 41.3,-73.465 41.225,-73.465 41.205,-73.47 41.195,-73.5 41.18,-73.58 41.15,-73.695 41.09,-73.635 41.02,-73.64 40.995,-73.61 40.97,-72.09 41.28,-72.025 41.31,-71.925 41.33,-71.9 41.325,-71.78 41.195,-71.775 41.055,-71.79 41.025,-71.83 40.99,-72.32 40.82,-73.04 40.595,-73.245 40.55,-73.355 40.555,-73.53 40.515,-73.545 40.505,-73.6 40.505,-73.62 40.515,-73.635 40.51,-73.8 40.515,-73.845 40.495,-73.88 40.49,-73.89 40.475,-73.91 40.475,-73.98 40.5,-74.23 40.455,-74.27 40.475,-74.28 40.49,-74.28 40.51,-74.27 40.525,-74.27 40.555,-74.225 40.585,-74.22 40.645,-74.195 40.665,-74.145 40.66,-74.115 40.67,-74.065 40.67,-74.065 40.7,-74.045 40.72,-74.03 40.77,-73.955 40.88,-73.915 40.98,-74.165 41.095,-74.305 41.15,-74.325 41.165,-74.34 41.165,-74.45 41.225,-74.71 41.34,-74.715 41.36,-74.745 41.38,-74.76 41.395,-74.76 41.405,-74.795 41.4,-74.81 41.415,-74.84 41.41,-74.86 41.425,-74.905 41.42,-74.93 41.46,-74.955 41.455,-74.99 41.46,-75.005 41.475,-75.005 41.49,-75.02 41.495,-75.025 41.515,-75.045 41.525,-75.045 41.55,-75.095 41.595,-75.095 41.615,-75.07 41.635,-75.07 41.655,-75.08 41.665,-75.075 41.685,-75.085 41.69,-75.09 41.705,-75.085 41.725,-75.075 41.73,-75.075 41.75,-75.12 41.755,-75.125 41.785,-75.115 41.805,-75.125 41.805,-75.145 41.83,-75.17 41.83,-75.195 41.845,-75.225 41.835,-75.245 41.845,-75.27 41.845,-75.295 41.89,-75.295 41.925,-75.32 41.93,-75.335 41.95,-75.355 41.955,-75.36 41.98,-78.855 41.975,-79.775 41.98,-79.78 42.27))', 4326);
.mode column
.headers on

AS (
			SELECT X(PointN(exteriorring(shape), 1))
			FROM tmp
			SELECT Y(PointN(exteriorring(shape), 1))
			FROM tmp
	SELECT n + 1,
			SELECT X(PointN(exteriorring(shape), n + 1))
			FROM tmp
			SELECT Y(PointN(exteriorring(shape), n + 1))
			FROM tmp
	FROM points
	WHERE n < (
			SELECT NumPoints(exteriorring(shape))
			FROM tmp
FROM points;

(2) By David Raymond (dvdraymond) on 2020-08-27 12:59:06 in reply to 1 [link] [source]

What you have probably isn't as bad as you think it is.

Why not something along the lines of

with recursive pointNums (n) as (
    select 1
    union all
    select n+1 from pointNums
    where n < (select NumPoints(exteriorring(shape)) from tmp)
X(PointN(exteriorring(shape), n)) as Longitude,
Y(PointN(exteriorring(shape), n)) as Latitude
tmp inner join pointNums;

If you have the generate_series extension I think it'd be the equivalent of

X(PointN(exteriorring(shape), value)) as Longitude,
Y(PointN(exteriorring(shape), value)) as Latitude
tmp, generate_series(1, (SELECT NumPoints(exteriorring(shape)) from tmp), 1);

...but I have been known to be wrong. This is all assuming a single record in tmp, but could probably be modified for a tmp table with multiple entries.

(3) By anonymous on 2020-08-27 14:36:48 in reply to 2 [link] [source]

I tried your first solution. 

It works (and it is shorter); thanks.

However, I am rather baffled by 'inner join' NOT being followed by an 'on' clause.

Trying to use generate_series gives me 'Error: no such function: generate_series'; therefore I do not have the extension. Which DLL does it come from?

(4) By anonymous on 2020-08-27 14:52:07 in reply to 3 [link] [source]

See https://sqlite.org/forum/forumpost/afbaaddd7f

It's probably in that ZIP file Keith linked to.

(6) By Larry Brasfield (LarryBrasfield) on 2020-08-27 15:46:20 in reply to 4 [link] [source]

The generate_series() function is here.

(5) By JayKreibich (jkreibich) on 2020-08-27 15:40:55 in reply to 3 [link] [source]

An inner join without a condition is just a cross join... it will match every row in "tmp" to every row in "pointNums". The table "tmp" contains one row. The table "pointNums" just contains N rows with the integers 1..N+1. Joining the tables together, without a condition, matches the single "tmp" row to each index number in "pointNums". The core query than extracts each value from the single row of "tmp" using the index values in "pointNums."

The key point is that "tmp" only has one row. Adding an "on" clause wouldn't filter anything... there is nothing to match against.

(8) By anonymous on 2020-08-27 19:39:41 in reply to 5 [link] [source]

SQL where every row in the LEFT table inherits every row in the RIGHT table i.e. Cartesion or CROSS JOIN makes for very poor runtime performance.

My example had just one row in "tmp", for making my point (workable in the CLI). In practice, it will have multiple rows. Since SQLite does not have explicit loops, the iteration will need to be in the application.

(9) By David Raymond (dvdraymond) on 2020-08-27 20:29:30 in reply to 8 [link] [source]

How about this? (assuming a rowid table)

with recursive pointNums (orig_rowid, n, max_n) as (
    select rowid, 1, NumPoints(exteriorring(shape)) from tmp
    union all
    select orig_rowid, n + 1, max_n
    from tmp
    where n < max_n
X(PointN(exteriorring(shape), n)) as Longitude,
Y(PointN(exteriorring(shape), n)) as Latitude
tmp inner join pointNums
on pointNums.orig_rowid = tmp.rowid
order by rowid, n;

(11) By anonymous on 2020-08-27 22:31:21 in reply to 9 [link] [source]

I had thought of inner join but could not figure out what condition would follow ON (there seemed to be nothing!). I had forgotten that ROWID tables have a column that I could use. 

I have been looking at the window function ROW_NUMBER() (not sure if this is more reliable than rowid).

Your first solution shortened (less to do means quicker to finish, usually) my code. Your present solution has the ON clause.

My 'real' table consists of the administrative boundaries of 265 countries, with the polygon for Canada having (the most points) 255,955 points or vertices.

I need to do some more tests (& timings); this might take a while not least because I am looking for means of simplifying-read reducing the number of vertices- the polygons.

SpatiaLite does not have a Reduce method for simplifying polygons i.e. reducing the number of vertices - any one know the equivalent method with SpatiaLite?

Thanks for sharing your insights, everyone.

(12) By anonymous on 2020-08-28 15:47:15 in reply to 11 [link] [source]

Reducing the number of vertices, using Spatialite:

ST_Simplify: return a geometric object representing a simplified version of c applying the Douglas-Peukert algorithm with given tolerance

See for instance http://www.gaia-gis.it/gaia-sins/spatialite-sql-4.2.0.html

(13) By anonymous on 2020-08-28 16:09:18 in reply to 12 [link] [source]


By the time I read your hint, I'd already had it figured.

My current struggle is to find a way to determine whether a polygon has been defined counter-clockwise or clockwise. Any clues?

(14.1) By Larry Brasfield (LarryBrasfield) on 2020-08-28 17:13:10 edited from 14.0 in reply to 13 [link] [source]

The differences between angles of successive vectors as you traverse the polygon (including the last-to-first segment angle) will sum to either +360 or -360 degrees depending upon the traversal direction, unless it crosses itself.

If you lookup the algorithm for determining whether a point is inside a polygon, you will find that it also reveals the same information.

(16) By anonymous on 2020-08-28 18:36:41 in reply to 14.1 [link] [source]

SpatiaLite there no built-in function?

(17) By Larry Brasfield (LarryBrasfield) on 2020-08-28 18:58:58 in reply to 16 [source]

Look at geopoly_ccw().

(19) By anonymous on 2020-08-28 19:09:51 in reply to 17 [link] [source]

Thank you; search over, finally!

Now I need to find a function that will tell me the orientation.

(20) By Larry Brasfield (LarryBrasfield) on 2020-08-28 21:09:11 in reply to 19 [link] [source]

The function mentioned reverses the orientation if it had CW orientation. You can either use it as is and do a before-to-after comparison, or examine its source to see how it determined the orientation and extend (or refactor) the extension to make that detection usable on its own.

(21) By anonymous on 2020-08-28 21:34:23 in reply to 20 [link] [source]

As polygon orientation is so critical in spatial computation, I feel that SpatiaLite should include

- a method to determine the orientation
- a method to re-orient where necessary

Is there a formal route to putting this suggestion to the SpatialLite developers?

(25) By anonymous on 2020-08-29 10:32:14 in reply to 21 [link] [source]

The Spatialite forum is here where you could ask the questions but you might find the functions are available in Spatialite and can be found in the documentation here which would be a good place to start. Spatialite is an extension to SQLite not part of it, and not supported by this forum, although many are having a good go at it.

(26) By anonymous on 2020-08-29 11:00:56 in reply to 25 [link] [source]

Thanks for the link to the forum; I'll make a request there. 
I have spent quire some time looking for the functions/methods to no avail.

(27) By anonymous on 2020-08-29 11:14:33 in reply to 21 [link] [source]

You could check if these fit your needs as per the doc link






(28) By anonymous on 2020-08-29 12:02:22 in reply to 21 [link] [source]

Found them in V 5.0.0


(15) By David Raymond (dvdraymond) on 2020-08-28 17:37:05 in reply to 13 [link] [source]

I know for example than in shapefiles the direction defines the shape. The right hand side of the lines defines inside and the left hand side is outside. Otherwise you would never know which side was the actual shape. If you could define it as either way you would never know if it's "this square" or "everything except this square".

https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry According to wikipedia for Well known text format anyway: "The OGC standard definition requires a polygon to be topologically closed. It also states that if the exterior linear ring of a polygon is defined in a counterclockwise direction it will be seen from the "top". Any interior linear rings should be defined in opposite fashion compared to the exterior ring, in this case, clockwise."

So the order for WKT and shape are the opposite.

(18) By anonymous on 2020-08-28 19:04:38 in reply to 15 [link] [source]

A polygon:
- defined counter-clockwise focuses INSIDE the polygon.
- defined clockwise focuses OUTSIDE the polygon.

England defined counter-clockwise looks inside England, defined clockwise looks at the world except England.

If the intention is to focus inside a polygon, and it happens to be defined clockwise it needs to be re-oriented counter-clockwise.

Take South Africa (a polygon with a hole): it will have 
- a counter-clockwise polygon representing its administrative boundary AND 
- a clockwise polygon representing Lesotho. 

The counter-clockwise polygon focuses INSIDE and the clockwise polygon focuses OUTSIDE.

Therefore, a point inside Lesotho will NOT be inside South Africa and vice-versa.

Lesotho itself will have a counter-clockwise polygon.

SpatiaLite calls the counter-clockwise polygon the 'exteriorring' and the clockwise polygons 'interiorring'.

A polygon has ONE 'exteriorring' and zero or more 'interiorring'; Italy has 1 'exteriorring' (obviously) and 2 'interiorring' (Vatican City & San Marino).

I am looking to 
- establish the orientation of a given polygon
- a means of re-orienting a polygon (counter- to -clockwise or vice-versa)

Does SpatiaLite have these two functions/methods?

(22) By TripeHound on 2020-08-28 21:52:00 in reply to 18 [link] [source]

Italy has 1 'exteriorring' (obviously)

What about Sardinia, Sicily etc.? Should Italy not have additional 'exteriorring' polygons for them?

(23) By anonymous on 2020-08-28 22:08:43 in reply to 22 [link] [source]

Possible sloppy 'obviously' from me.

I meant mainland Italy has one exterior ring and two interior rings making up a POLYGON with hole(s).

Italy as a jurisdiction, i.e. including Sardinia, Sicily etc. is a MULTIPOLYGON which would include the POLYGONs for Sardinia, Sicily etc - these POLYGONs will each have just ONE 'exteriorring'.

(24.1) By Keith Medcalf (kmedcalf) on 2020-08-28 22:24:20 edited from 24.0 in reply to 18 [link] [source]

If geopoly_area is +ve then the polygon is "enclosing"
If geopoly_area is -ve then the polygon is a "hole"
If geopoly_area is NULL then the polygon is not a polygon
If geopoly_area is 0 then the polygon is a point

No idea about spatialite however it probably has documentation that would answer the question.

(7) By Keith Medcalf (kmedcalf) on 2020-08-27 17:09:22 in reply to 3 [link] [source]

The "ON expr" clause is merely syntactic sugar for "AND/WHERE (expr)" except in the specific case where the ON clause follows an OUTER JOIN in which case it is bound to the RHS table descent loop rather than applying to the projection conditions.

(10.1) By Keith Medcalf (kmedcalf) on 2020-08-27 21:19:19 edited from 10.0 in reply to 7 [link] [source]

To be more specific the "expr" in the "ON expr" following an outer join is bound into the particular descent (the LH table for a RIGHT OUTER JOIN and the RH table for a LEFT OUTER JOIN and to both tables for a both-ways OUTER JOIN) as compared to the expr in "ON expr" following an inner join which is merely syntactic sugar for an parethetically enlclosed AND term tacked on to the projection WHERE clause.

table1 JOIN table2

does not in any way imply a cross join unless there are absolutely no constraints specified in the WHERE conditions between table1 and table2 (directly or indirectly), although as a "thought experiment" for solving "set algebra" some school children might consider it thus for learning purposes before they have learned more efficient methods.