SQLite3 Array?

(1) By anonymous on 2021-02-01 13:56:06 [link] [source]

At this location I read the following:

Thus the input string: 53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79 Corresponds to a puzzle like this: (this is depicted as a 2D array).

The whole query works; however, I am interested in simply reading a numeric vector into what seems to be a 2D array.

This does not work:

WITH RECURSIVE
   input(sud) AS (  VALUES('53,,7,,,,6,,195,,,,98,,,,6,8,,,6,,,34,,8,3,,17,,,2,,,6,6,,,,28,,,,419,,5,,,,8,,79')
  ) SELECT * FROM input;

How do I build an array from a numeric vector? And use it to solve a shortest path problem as described here?

(2) By ddevienne on 2021-02-01 15:20:48 in reply to 1 [link] [source]

is that what you want?

C:\Users\ddevienne>sqlite3
SQLite version 3.33.0 2020-08-14 13:23:32
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> with input(sud) as (values (53), (null), (7), (null), (null), (79))
   ...> select * from input;
53

7


79
sqlite>

I shortened the list, too tedious to type it all.
I also assumed you wanted nulls when you had nothing between commas.

(3) By Gunter Hick (gunter_hick) on 2021-02-01 16:08:10 in reply to 1 [link] [source]

The original query is reading a SINGLE STRING, where each character position corresponds to a cell in the sudoku puzzle and contains either a digit (for a known cell) or the full stop character (for an unknown cell).

Replacing full stops with commas does not make a list of values out of the string.

SQLite does not have an ARRAY type, as would be required for the solution to the shortest path problem in the page you referenced.

(4) By ddevienne on 2021-02-01 16:35:08 in reply to 3 [link] [source]

I know that. But as the OP wasn't very clear, I extrapolated a little.

SQLite does not have an ARRAY type

Well, there's JSON1, json_each() and json_group_array allow
to take-apart and reconstitute JSON arrays, so that close :).

The Sudoku solver also uses the string as an array of sort too.

So you're of course right that SQLite doesn't have arrays per-se,
but packing several values in strings is a common work-around.
That likely complicates calculating a shortest-path in SQL, but
it is still probably possible.

(5) By Keith Medcalf (kmedcalf) on 2021-02-01 19:02:17 in reply to 1 [link] [source]

I think we have done this before:

create table edges
(
    fromNode    integer not null,
    toNode      integer not null,
    distance    numeric not null default 1,
    isOneWay    integer not null check(isOneWay in (True, False)) default True,
    primary key (fromNode, toNode)
) without rowid;

create unique index backedges on edges (toNode, fromNode, distance) where not isOneWay;

insert into edges(fromNode, toNode, isOneWay, distance) values
    (1, 2, True, 100),
    (1, 3, True,  70),
    (2, 3, False, 20),
    (3, 4, True,  40),
    (2, 5, True,  30),
    (4, 5, True,  20),
    (5, 6, True,  20),
    (4, 6, True,  80);

.parameter init
.parameter set :start 1
.parameter set :end 6
.parameter set :maxhops 10

with paths (startAt, endAt, visited, hops, pathDistance)
  as (
         select :start                      as startAt,
                :start                      as endAt,
                '/' || :start || '/'        as visited,
                0                           as hops,
                0                           as pathDistance
      union all
         select startAt                     as startAt,
                toNode                      as endAt,
                visited || toNode || '/'    as visited,
                hops + 1                    as hops,
                pathDistance + distance     as pathDistance
           from paths, edges
          where fromNode == endAt
            and instr(visited, '/' || toNode || '/') == 0
            and instr(visited, '/' || :end || '/') == 0
            and hops < :maxhops
      union all
         select startAt                     as startAt,
                fromNode                    as endAt,
                visited || fromNode || '/'  as visited,
                hops + 1                    as hops,
                pathDistance + distance     as pathDistance
           from paths, edges
          where toNode == endAt
            and not isOneWay
            and instr(visited, '/' || fromNode || '/') == 0
            and instr(visited, '/' || :end || '/') == 0
            and hops < :maxhops
       order by 5
    )
  select *
    from paths
   where startAt == :start
     and endAt == :end
order by pathDistance, hops
;
┌─────────┬───────┬───────────────┬──────┬──────────────┐
│ startAt │ endAt │    visited    │ hops │ pathDistance │
├─────────┼───────┼───────────────┼──────┼──────────────┤
│ 1       │ 6     │ /1/3/2/5/6/   │ 4    │ 140          │
│ 1       │ 6     │ /1/2/5/6/     │ 3    │ 150          │
│ 1       │ 6     │ /1/3/4/5/6/   │ 4    │ 150          │
│ 1       │ 6     │ /1/3/4/6/     │ 3    │ 190          │
│ 1       │ 6     │ /1/2/3/4/5/6/ │ 5    │ 200          │
│ 1       │ 6     │ /1/2/3/4/6/   │ 4    │ 240          │
└─────────┴───────┴───────────────┴──────┴──────────────┘

You could also add "cost" to each edge (eg, minutes to drive) to get an enumeration of the "cost" of each path since the minimum distance might give a different answer that the minimum cost.

(6) By anonymous on 2021-02-01 20:10:54 in reply to 5 [link] [source]

Wow! Amazing!

Your solution deserves a more prominent place than being buried in the chronological heap of the forum archive.

Just one observation - I'd substitute order by 5 by order by pathDistance (never liked ordinal reference to column names!).

Thank you.

(7) By Keith Medcalf (kmedcalf) on 2021-02-01 22:00:14 in reply to 6 [link] [source]

Since the path enumeration is a complete enumeration, the ORDER BY does not really matter. If you leave it out you will get a "breadth first" enumeration (ie, the same as ORDER BY hops).

If you have an ORDER BY then if there is a choice of which path to extend first, it will be the one with the smallest pathDistance (in the case of ORDER BY pathDistance).

However, notwithstanding the search order the result will be a complete enumeration of all paths (and partial paths) meeting the requirements.

(8) By anonymous on 2021-02-01 22:28:56 in reply to 7 [source]

Keith, hats off!

(9) By Mark Lawrence (mark) on 2021-02-02 10:46:51 in reply to 5 [link] [source]

I have SQLite version 3.34.0 2020-09-01 00:26:21 3ca0b7d54d73d... which when running your example says Error: near line 27: circular reference: paths. Am I missing a feature or later commit?

(10.2) By ddevienne on 2021-02-02 10:55:24 edited from 10.1 in reply to 9 [link] [source]

That date looks suspicious, according to https://www.sqlite.org/changes.html

Maybe you're using an arbitrary trunk snapshot, instead of an official release?

(11) By Mark Lawrence (mark) on 2021-02-02 10:55:48 in reply to 10.1 [link] [source]

I compiled this version myself (from trunk?) while tracking down an issue somewhere.

(12) By John Dennis (jdennis) on 2021-02-02 11:36:31 in reply to 9 [link] [source]

I see the same error Error: circular reference: paths with 3.33.0 for Windows, using the executables obtained from the SQLite3 website. I haven't yet downloaded 3.34 but will do so tomorrow.

(13) By anonymous on 2021-02-02 12:56:52 in reply to 10.2 [link] [source]

I can run the script successfully on these 32-bit versions:

Version	Size onDisk
SQLite version 3.33.0 2020-08-14 13:23:32	999,424 bytes
SQLite version 3.34.0 2020-12-01 16:14:00	995,328 bytes
SQLLite version 3.34.1 2021-01-20 14:10:07	995,328 bytes

All above versions downloaded as pre-compiled binaries for Windows.

However, this version:

Version	Size on Disk
SQLite version 3.33.0 2020-08-14 13:23:32	1,495,040 bytes

from the SpatiaLite extension which has the same timestamp as the very first version reports Error: circular reference: paths¹ Note that the size is much bigger.

¹No line reference

(14) By Keith Medcalf (kmedcalf) on 2021-02-02 18:19:54 in reply to 12 [link] [source]

The recursive query has more than one recursive part. Support for that was added in 3.34.0 released 2020-12-01. https://sqlite.org/releaselog/3_34_0.html item 2.

If you want to use an earlier version of SQLite3 that can only use a single recursive part, then you have to remove the isOneWay flag from the table, remove the index, remove the last part of the recursive query, and add an extra row to the edges table for the edge from 3 to 2.

(15) By Keith Medcalf (kmedcalf) on 2021-02-02 18:50:06 in reply to 14 [link] [source]

So here is one that uses a "speed" column so it can optimize "cost" (distance/speed):

create table edges
(
    fromNode    integer not null,
    toNode      integer not null,
    distance    real    not null default 1,
    speed       real    not null default 1,
    isOneWay    integer not null check(isOneWay in (True, False)) default True,
    primary key (fromNode, toNode)
) without rowid;

create unique index backedges on edges (toNode, fromNode, distance, speed) where not isOneWay;

insert into edges(fromNode, toNode, isOneWay, distance, speed) values
    (1, 2, True, 100,  40),
    (1, 3, True,  70,  50),
    (2, 3, False, 20,  40),
    (3, 4, True,  40,  50),
    (2, 5, True,  30,  50),
    (4, 5, True,  20,  40),
    (5, 6, True,  20,  20),
    (4, 6, True,  80, 100);

.parameter init
.parameter set :start 1
.parameter set :end 6
.parameter set :maxHops 10
.parameter set :maxCost 10
.parameter set :maxDistance 250

with paths (startAt, endAt, visited, hops, pathDistance, pathCost)
  as (
         select :start                      as startAt,
                :start                      as endAt,
                '/' || :start || '/'        as visited,
                0                           as hops,
                0                           as pathDistance,
                0                           as pathCost
      union all
         select startAt                     as startAt,
                toNode                      as endAt,
                visited || toNode || '/'    as visited,
                hops + 1                    as hops,
                pathDistance + distance     as pathDistance,
                pathCost + distance / speed as pathCost
           from paths, edges
          where fromNode == endAt
            and instr(visited, '/' || toNode || '/') == 0
            and instr(visited, '/' || :end || '/') == 0
            and hops < :maxHops
            and pathCost < :maxCost
            and pathDistance < :maxDistance
      union all
         select startAt                     as startAt,
                fromNode                    as endAt,
                visited || fromNode || '/'  as visited,
                hops + 1                    as hops,
                pathDistance + distance     as pathDistance,
                pathCost + distance / speed as pathCost
           from paths, edges
          where toNode == endAt
            and not isOneWay
            and instr(visited, '/' || fromNode || '/') == 0
            and instr(visited, '/' || :end || '/') == 0
            and hops < :maxHops
            and pathCost < :maxCost
            and pathDistance < :maxDistance
       order by pathCost
    )
select startAt, endAt, visited, hops, pathDistance, pathCost
  from paths
 where startAt == :start
   and endAt == :end
;

And here is the same thing with the backedges in the edges table and using only a single recursive part:

create table edges
(
    fromNode    integer not null,
    toNode      integer not null,
    distance    real    not null default 1,
    speed       real    not null default 1,
    primary key (fromNode, toNode)
) without rowid;

insert into edges(fromNode, toNode, distance, speed) values
    (1, 2, 100,  40),
    (1, 3,  70,  50),
    (2, 3,  20,  40),
    (3, 2,  20,  40),
    (3, 4,  40,  50),
    (2, 5,  30,  50),
    (4, 5,  20,  40),
    (5, 6,  20,  20),
    (4, 6,  80, 100);

.parameter init
.parameter set :start 1
.parameter set :end 6
.parameter set :maxHops 10
.parameter set :maxCost 10
.parameter set :maxDistance 250

with paths (startAt, endAt, visited, hops, pathDistance, pathCost)
  as (
         select :start                      as startAt,
                :start                      as endAt,
                '/' || :start || '/'        as visited,
                0                           as hops,
                0                           as pathDistance,
                0                           as pathCost
      union all
         select startAt                     as startAt,
                toNode                      as endAt,
                visited || toNode || '/'    as visited,
                hops + 1                    as hops,
                pathDistance + distance     as pathDistance,
                pathCost + distance / speed as pathCost
           from paths, edges
          where fromNode == endAt
            and instr(visited, '/' || toNode || '/') == 0
            and instr(visited, '/' || :end || '/') == 0
            and hops < :maxHops
            and pathCost < :maxCost
            and pathDistance < :maxDistance
       order by pathCost
    )
select startAt, endAt, visited, hops, pathDistance, pathCost
  from paths
 where startAt == :start
   and endAt == :end
;

(16) By anonymous on 2021-02-02 19:22:51 in reply to 15 [link] [source]

I've been studying and trying to understand the original version.

Thanks for this version. It works for me.

On the network, I should be able to :start and :end at any available node. However, when I start at a :start node higher than the :end node, I am not getting any results with either version.

(17.1) By Keith Medcalf (kmedcalf) on 2021-02-02 21:09:14 edited from 17.0 in reply to 16 [link] [source]

The data reflects the nodemap found at https://learnsql.com/blog/get-to-know-the-power-of-sql-recursive-queries/. As you can see there is no path found because none exists. (The the speed data is arbitrary for demonstration purposes.)

(18) By Adrian Ho (lexfiend) on 2021-02-03 08:10:01 in reply to 11 [link] [source]

I think ddevienne's point is that the official release of 3.34.0 was three months after your build, so it's technically not "real" 3.34.0.