Enh. Request : VIEW Definitions to span multiple databases?

(1) By midijohnny on 2022-02-18 12:50:48 [link] [source]

I am trying to work out a way of using multiple DBs in analogous way that you 'import' or 'require' libraries in imperative programming languages. In particular packaging up (maybe complex/long) SQL queries as VIEWS and re-using them across projects.

So: for instance - creating a set of views/tables in a 'util' schema. One limitation seems to be that you cannot span multiple schemas. That is: you cannot build a VIEW in 'util' (say) which references any other schemas.

You can create a view which does this (implicitly - using unqualified schema) - but the view will fail at runtime.

Is there a good reason to disallow this? (Maybe to stop 'schema-injection' attacks? Somebody crafting a bad schema to replace existing views etc?)

I note that it is possible to create a view in the same schema which fails at runtime: so in that sense the 'runtime issue' already exists.

.shell rm -f 'util.db'
.shell rm -f 'main.db'

.open 'main.db'
CREATE TABLE pixel(x,y,colour_id NOT NULL, PRIMARY KEY (x,y) );
INSERT INTO pixel VALUES (0,0,1), (0,1,1), (1,0,2);

ATTACH DATABASE 'util.db' AS util;
CREATE TABLE util.lookup(colour_id,colour_name);
INSERT INTO util.lookup VALUES (0,'red'), (1,'green'), (2,'blue');

SELECT * FROM pixel NATURAL JOIN util.lookup;

-- All works up to here.

-- Attempt 1: define view in UTIL schema; referencing (unqualified) main table + qualified 'util' (i.e. same as view) schema:
-- View is created without error:
CREATE VIEW util.v_lookup AS
SELECT * FROM pixel NATURAL JOIN util.lookup;
-- But fails at runtime:
SELECT * FROM util.v_lookup; -- 'Error: no such table: util.pixel'

-- Attempt 2: define view in MAIN schema (fully qualified, explict 'main')
-- View does not create

-- CREATE VIEW main.v_lookup AS -- view v_lookup cannot reference objects in database util
-- SELECT * FROM pixel NATURAL JOIN util.lookup;

(2.1) By MBL (UserMBL) on 2022-02-18 14:46:33 edited from 2.0 in reply to 1 [link] [source]

When the view in util is called it does not know about the table pixel in schema main; there was no hint about in which schema to find the table.

I added the schema name 'main' to pixel but then I got the more speaking error meassage:

sqlite> create view util.v_lookup as
   ...> select * from main.pixel NATURAL JOIN util.lookup;
Error: in prepare, view v_lookup cannot reference objects in database main (1)
sqlite>

Edit: copy-paste error correction done, schema util, not until.

(3) By midijohnny on 2022-02-18 17:14:31 in reply to 2.1 [link] [source]

Thanks for this - but notice the behaviour is different with regard to defining a view on a non-existent table within the same schema - the system defers the complaint until the view is used.

SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create view my_view as select * from my_table; -- no error
sqlite> select * from my_view;
Error: no such table: main.my_table

My ask here - is for the explanation of the difference in principle and whether or not it could be useful to allow this behaviour to be consistent with different schemas. (As that would allow a convenient method for 'storing' SQL queries in a sort-of modular fashion: maybe it would require a special syntax or pragma to allow this - not sure).

(4) By Simon Slavin (slavin) on 2022-02-18 18:19:10 in reply to 1 [link] [source]

I can't solve your problem, but I can answer a small question. You asked if there's a reason that VIEWs can't cross schema. Some answers we have previously discussed are …

We don't know if the schema is active. Perhaps nothing is attached there. Since the VIEW might be stored in a database, when should an error message be produced ?
Even if something is attached there, it might not have a table/row the VIEW refers to. Since the VIEW might be stored in a database, when should an error message be produced ?
Suppose the VIEW did currently work right, and you use it in a compiled statement. Then you DETACH the schema it refers to. This invalidates the prepared statement, which would make sqlite3_step() crash. So sqlite3_step() must check for DETACH. But this would slow down that function call, which is very frequently used by SQLite. It would make execution slower for everyone, whether they used cross-schema VIEWs or not.

(5) By midijohnny on 2022-02-18 19:31:47 in reply to 4 [link] [source]

That makes sense - and is distinct (I would say) from having a view defined in the same schema - which might result in a runtime error - but we could just declare the database 'broken' at that point.

Whereas (as you point out) - if ATTACH and DETACH are expected to happen, then having views which break upon DETACH would cause that extra work/check/slow-down effect.

I think I will get what I need done by keeping view definitions generic in SQL files - and using external tools to configure/share at build-time , rather than 'baking' in the dependency.

One thing - does SQLite offer a 'USE schema' type command (I couldn't find one)? Why ask that?

If I want a 'library' of views (etc) - that I wish to instantiate in a schema - I want to avoid having to replace schema names in source code - rather use a 'relative' path:

e.g:

ATTACH database 'util.db' AS util;
USE util; -- All subsequent DDL(DML?) will assume 'util' (unless explicitly over-ridden)
CREATE VIEW v_local_name_only[...] (but will use 'util')
USE main; -- back to normal

This might also have overheads and possibly cause confusion - not sure.

(6.1) By midijohnny on 2022-02-19 12:53:17 edited from 6.0 in reply to 5 [link] [source]

Actually : I think this can be done without any 'use' type command. (I haven't tried this yet - but I guess would probably work).

Create all 'library' tables, views (etc) as an attached 'main' DB. Probably use a 'pseudo' schema prefix like 'util' would be sufficient.
Export the SQL- All DDL would be in local-name (not schema.table)

For re-using; just load the SQL from various 'libraries' to build the objects in the new 'main' DB.

I guess what I'm after here really, is a logical schema facility - not necessarily the physical schema. Some way of grouping / naming stuff that have some functional relationship.

(7) By Simon Slavin (slavin) on 2022-02-19 16:57:00 in reply to 6.1 [link] [source]

The schema name doesn' thave to be related to the filename

ATTACH DATABASE filename AS schemaname

So specify the schema name when you ATTACH the database. Pick a set of schema names for the various functions and use them, no matter which files you have ATTACHed.

(8) By midijohnny on 2022-02-19 17:34:48 in reply to 7 [link] [source]

The trouble-is: I need to not only reference the objects in the schema (I'm mostly talking about VIEWs here) - but those VIEWs themselves need a reference to other objects (most probably TABLEs).

So: the idea (I didn't explain very well: probably still not doing so!) above was to do everything at the source-level - and just have a build-process which assembles the database so that everything can be re-used that way.

I guess probably another feasible request for enhancement would be a cross-schema synonym) mechanism: which would need to be tolerant of missing objects at definition-time, only erroring at runtime.

Although this would still have the issue identified in this thread of having to check the validity of those synonyms at runtime - which may have performance implications.

I also did some further experiments: with DETACHING the 'main' schema - this is also disallowed.

The idea I had there was to (temporarily) mount a secondary schema as 'main', then extract (export) the SQL (assuming it is all schema free/local names only) - DETACH that DB , and re-attach (as 'main') the target schema - load and run the SQL.

But it looks like I need to exit sqlite3 tool in between these tasks; so would need some sort (nothing complex probably) of build-process to do this.

(The idea of having a naming convention "util_xxx" say - is separate but related idea).

(18) By Simon Slavin (slavin) on 2022-02-20 15:19:41 in reply to 8 [link] [source]

If those VIEWs reference objects in the same schema as the VIEW, then just do it. That's how SQLite works: a reference without a schema means "in the same schema I'm in".

I don't understand why you're trying to mess with 'main.'. Leave 'main.' alone. It's fundamental part of how SQLite works.

If you need a place which is sometimes the database you have loaded as main and sometimes a different database, just ATTACH the same database as you have opened as 'main.' under a different schema name. There's nothing in the documentation to stop you opening the same database under different schema names at the same time.

(19) By midijohnny on 2022-02-21 11:32:36 in reply to 18 [link] [source]

"There's nothing in the documentation to stop you opening the same database under different schema names at the same time"

Thanks for this as well - I hadn't appreciated that either - could be useful!

sqlite> create table n(n);
sqlite> attach database '/tmp/mydb' as secondary;
sqlite> .databases
main: /tmp/mydb
secondary: /tmp/mydb
sqlite> select * from n;
sqlite> .tables
n            secondary.n
sqlite> insert into n values(1);
sqlite> select * from secondary.n;
1

(9) By midijohnny on 2022-02-19 17:53:57 in reply to 7 [link] [source]

Additionally - I viewed another question on here (about temp/main DB) and that provided an interesting link: https://www.sqlite.org/lang_naming.html

So: it looks like DDL has the idea of a 'search-path' etc - where is will take a local name and look for it in the schemas.

But I guess the rule doesn't apply to DML. (and fair-enough maybe - I'm not complaining - just observing).

I tried the following experiment and confirmed that views can only reference object in their own schema - note in this example I never specific this explicitly - but the system sees the view in schema 'v' and (at runtime) searches for an object 'x' in that schema alone.

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .open v.db -- new DB: 
sqlite> create view v as select * from x; -- There is no table 'x'
sqlite> .quit


$ sqlite3 x.db
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> attach 'v.db' as v;
sqlite> create table x as select 1;
sqlite> select * from v;
Error: no such table: v.x

(10) By Keith Medcalf (kmedcalf) on 2022-02-19 18:23:57 in reply to 9 [link] [source]

A database schema (which corresponds to one database file) can only reference itself. This is because SQLite3 only opens each database file individually.

Unlike other RDBMS systems, which open all the schema's at once, there is no way for SQLite3 to "build" the attachment of multiple database files (as seperate schemata) to a connection by itself -- that is, it would be impossible to maintain the integrity of a database file if it were to be permitted to contain cross-schema schema entries.

Similarly, virtual table, aggregate, and scalar functions can be marked as only being accessible "directly" (by statements issued on the connection) and not indirectly (by being embeded in a database file schema).

The above restrictions are removed for the temp database because anything stored in the temp database schema could only have gotten there because it was put there by the programmer after the connection was opened, and one generally figures that the programmer knows what it is doing. Therefore the above requirements (cross-schema links and the use of directonly functions) are relaxed for the temp schema and only the temp schema.

This should not be necessary, but the case is that if some nutbar somewhere can do something stupid, then they will do something stupid, and the resulting insecurity will be laid at the door of the SQLite3 developers for not protecting the stupid from themselves.

(11.1) By midijohnny on 2022-02-19 19:13:43 edited from 11.0 in reply to 10 [link] [source]

Thanks for this. And actually - I hadn't appreciated that you can reach the main schema via 'temp'. And taking that it into account - I can do exactly what I need to do. I can define my views in a DB; and build them dynamically each time and get to keep them in a separate (i.e. 'temp') schema.

Great - thanks all.

SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .shell rm -rf new.db
sqlite> .open new.db
sqlite> create view temp.my_view as select * from my_table;
sqlite> select * from my_view;
Error: no such table: my_table
sqlite> create table my_table as select 'hello';
sqlite> .tables
my_table      temp.my_view
sqlite> select * from temp.my_view; -- edit;or just 'my_view', schema not needed.
hello
sqlite> .quit

So that's all great - thanks all!

A follow-on question...is there a mechanism in the commandline tool to directly clone or copy on infile db into the temp database? (i.e. not exporting SQL to text file then reading it back in).

(12) By anonymous on 2022-02-19 19:35:01 in reply to 11.1 [link] [source]

A follow-on question...is there a mechanism in the commandline tool to directly clone or copy on infile db into the temp database? (i.e. not exporting SQL to text file then reading it back in).

.restore temp /path/to/database

(14) By Keith Medcalf (kmedcalf) on 2022-02-19 19:48:22 in reply to 12 [link] [source]

That works but I do not know how one would create the database file in the first instance.

It does appear though that the CLI command .save temp file does save the existing temp schema to the file and that .restore temp file puts it back, though the .help documentation for the .save command does not indicate that it can do this ...

(15) By Larry Brasfield (larrybr) on 2022-02-19 20:04:51 in reply to 14 [source]

The .save dot command is no more than an alias resolving to the same action as the better-helped .backup dot command, where this behavior is explained.

I'll see about correcting this in the .help output.

(16) By Keith Medcalf (kmedcalf) on 2022-02-19 20:09:26 in reply to 14 [link] [source]

Note that once upon a time I wrote an application that retrieved historical process data from a facility backup that kept all the historical data archives by year.

The main database was built with the complete schema but no actual data (the data tables were defined, but empty). The application basically attached the databases containing the actual data and then created temp views that 'obscured' the main connection tables with tables from the attached databases.

This procedure allowed each database file to be internally consistent in its own right, but allowed for building a queryable system by re-directing the queries and views from the main database to other attached databases using the built-in processing but without having to know (or care) cross-schema integrity constraints (it also meant that some data, such as the tag list, were duplicated in each database).

(13) By Keith Medcalf (kmedcalf) on 2022-02-19 19:39:03 in reply to 11.1 [link] [source]

A follow-on question...is there a mechanism in the commandline tool to directly clone or copy on infile db into the temp database? (i.e. not exporting SQL to text file then reading it back in).

I do not think so. There may be ways to achieve that end, but I believe that you would have to create the schema in temp by issuing SQL commands directly -- that is the intention of the security protections -- to ensure that such features (if you cann them that) can only be used "directly" by the programmer/user deliberately and not from some other source.

For the CLI there is no reason that one could not create a file of SQL commands and then .read them to build the connection structure you want. The overall point is that the configuration needs to be created explicitly and cannot be implicit.

(17) By midijohnny on 2022-02-20 13:21:12 in reply to 13 [link] [source]

Thanks all for this. I have something that will probably work for me now. I can create stuff in memory (and without worrying about schema prefix), save it to a file, and then restore as a 'main'.

Here's a test script for reference in case it is useful for others:

.version
.head on
.mode column
.echo on

.shell rm -rf mem.db
-- List tables/db
.databases
.tables
-- Create temp objects
CREATE TEMPORARY TABLE mem(n);
INSERT INTO mem VALUES (1),(2);
CREATE TEMPORARY VIEW v_mem AS SELECT * FROM mem UNION SELECT n*n FROM mem;
-- Check temp objects (note: no schema prefix needed - good)
SELECT * FROM mem;
SELECT * FROM v_mem;
-- List tables/db
.databases
.tables
-- Save temp object to file
.backup temp mem.db
-- Clear temp objects. (Is there is better of doing this globally?)
DROP TABLE temp.mem;
DROP VIEW temp.v_mem;
-- Open file into main
.open mem.db
-- List tables/db
.databases
.tables
-- Check main objects 
SELECT * FROM mem;
SELECT * FROM v_mem;

Output:

$ sqlite3 < create_db.sql
SQLite 3.31.1 2020-01-27 19:55:54 3bfa9cc97da10598521b342961df8f5f68c7388fa117345eeb516eaa837balt1
zlib version 1.2.11
gcc-9.3.0

.shell rm -rf mem.db
-- List tables/db
.databases
main: 
.tables
-- Create temp objects
CREATE TEMPORARY TABLE mem(n);
INSERT INTO mem VALUES (1),(2);
CREATE TEMPORARY VIEW v_mem AS SELECT * FROM mem UNION SELECT n*n FROM mem;
-- Check temp objects (note: no schema prefix needed - good)
SELECT * FROM mem;
n         
----------
1         
2         
SELECT * FROM v_mem;
n         
----------
1         
2         
4         
-- List tables/db
.databases
main: 
temp: 
.tables
temp.mem    temp.v_mem
-- Save temp object to file
.backup temp mem.db
-- Clear temp objects. (Is there is better of doing this globally?)
DROP TABLE temp.mem;
DROP VIEW temp.v_mem;
-- Open file into main
.open mem.db
-- List tables/db
.databases
main: /home/sqlite/mem.db
.tables
mem    v_mem
-- Check main objects 
SELECT * FROM mem;
n         
----------
1         
2         
SELECT * FROM v_mem;
n         
----------
1         
2         
4