SQLite Forum

Lack of abstraction ability in SQLite
Login

Lack of abstraction ability in SQLite

(1) By Matt (mattkuine) on 2020-10-14 08:58:29 [link]

SQLite's query language is so powerful it's even Turing-complete with recursive common table expressions. However it's completely lacking in abstraction power. 

 - Recursive common table expressions (CTEs) but no recursive top-level tables.
 - There is a way to add custom functions via the C API but no way to write custom functions in SQL (to reuse commonly used code, etc.) or a built-in procedural language.
 - There are virtual tables but no way to create writable views (creating triggers over views is impossible)
 - There is a bytecode engine with powerful features such as coroutines but it lacks a stack (to have reentrancy, continuations, and abstractions) and is not exposed as an API to applications.
 - There is no way to create a "virtual database" that handles reads and writes its own way.

The C API is lacking as well: 

 - There is no way to create and register new storage engines. I can only write new VFS layers that allow SQLite to write databases in the SQLite3 B-tree format to new file systems even though the SQLite bytecode refers not to B-tree leaves, branches and nodes but to SQL database tables and rows. This makes it impossible for SQLite to read from and write to CSVs, Excel files, etc. without the special "virtual table" feature. (Ideally, the "virtual table" feature shouldn't be special.)
 - SQLite doesn't support the `CREATE PROCEDURE` syntax even just to pass the code to an external interpreter to run.

I think adding these features to SQLite, which is already modular by design, isn't difficult. SQLite already implements many programming features such as streaming logic, relational logic, authorization, table-valued functions, etc. and I think it can be made into a general-purpose, multi-paradigm (stack based, functional, logic, procedural, object-oriented, etc.) programming environment and operating system for developing portable applications that run anywhere if these features are added.

SQLite is a modular, well-designed, extensible data storage and processing library and if it is a little bit more modular and extensible it would take over the world.

(2) By Kees Nuyt (knu) on 2020-10-14 13:37:08 in reply to 1 [link]

>  - There is a way to add custom functions via the C API but no way to write custom functions in SQL (to reuse commonly used code, etc.) or a built-in procedural language.

Database procedures are especially useful in client/server database models, where they avoid the dataflow bottleneck between client and server by running procedures in the server. Dataflow between a host program and the SQLite library is orders of magnitude faster.

Think of SQLite not as a replacement for Oracle but as a [replacement for fopen()](https://www.sqlite.org/about.html) .

Reuse of commonly used code can easily be obtained by using functions or macros in the host language, or by using some other macro processor (m4, anyone?).

> - There are virtual tables but no way to create writable views (creating triggers over views is impossible)

You can create INSTEAD OF triggers on views and make them writable that way.

> - There is a bytecode engine with powerful features such as coroutines but it lacks a stack (to have reentrancy, continuations, and abstractions) and is not exposed as an API to applications.

It is good enough to serve its sole purpose: a database engine.

It is, and should be, a black box, only exposed through higher abstractions like the C-API and SQL.

> - There is no way to create a "virtual database" that handles reads and writes its own way.

Couldn't that be achieved with virtual tables, and a :memory: database for just the schema? 

>The C API is lacking as well: 
>
> - There is no way to create and register new storage engines. ... This makes it impossible for SQLite to read from and write to CSVs, Excel files, etc. without the special "virtual table" feature. (Ideally, the "virtual table" feature shouldn't be special.)

IMHO virtual tables are not more special than alternative storage engines.
And good luck implementing an Excel storage engine.
SQLite can be used the other way around, as a storage engine for a spreadsheet program, e.g. with an ODBC driver.

> - SQLite doesn't support the `CREATE PROCEDURE` syntax even just to pass the code to an external interpreter to run.

SQLite intents to replace fopen() in order to manage a structured data storage, accessible with SQL. As a database engine library, the expectation is that all procedural code is handled by the host program.

> I think adding these features to SQLite, which is already modular by design, isn't difficult. ...

I think the strength of SQLite is that it is faithful to its scope and only implements features that can be maintained by the small team.

> SQLite is a modular, well-designed, extensible data storage and processing library

I fully agree.

> and if it is a little bit more modular and extensible it would take over the world.

It already has [taken over the world](https://www.sqlite.org/index.html). Anyone is free to fork the code and add features, and people have done that to some extent, but the results were never as successful as core SQLite itself.

However, I understand your enthusiasm about the concept, quality of implementation, and excellent support for SQLite, and your desire to apply that quality to a bigger scope.

Just my EUR 0.02

~~~
-- 
Regards,
Kees Nuyt
~~~

(3) By Gunter Hick (gunter_hick) on 2020-10-14 13:51:33 in reply to 2

I have too agree with Kees Nuyt here. I am quite comforatble and successful in adding "storage providers" as virtual table implementations for stuff as diverse as Faircom CTree files, a proprietary memory section based data dictionary, elements of a linux queue and whatever else we need here.

Currently I am experimenting with a VFS that allows an SQLite database file to reside in a shared memory section.

There have been several attempts to save SQLite bytecode in lieu of stored procedures, but I am not aware that any of these have resulted in anything useful. I don't expect them to either, as SQLite bytecode is specifically generated for the current combination of schema and statement and needs to be recompiled as soon as the schema of the affected tables is changed.

(4) By Ryan Smith (cuz) on 2020-10-14 13:58:32 in reply to 1 [link]

> ... and if it is a little bit more modular and extensible it would take over the world.

Been there, done that, got the T-shirt.

I don't disagree with any point, and might add that all of the points have been made in one way or another, but possibly several posters, over time. They are definitely not wrong.

Possibly two of the biggest advantages of SQLite are: 
 
- 1. The actual "lite"-ness, and
- 2. The local-ness of data.

Note that Both these points have severe drawbacks, especially when measured against other databases, but that's ok, it's drawbacks we embrace because it allows the advantages that come with the same.

Put differently: SQLite can go places and do things none of the other engines can, and yes, by that same token there are many things it will never do as well as even a mediocre client-server system. It's a different focus, by design.

Having said that, you are still not wrong, a lot of the described items could certainly make it into SQLite with good effect. Some of which may start to encroach on the "Lite"-ness, but I think the biggest reason why lots of it are not implemented is simply a question of time vs. need vs. utility.

Nobody questions the usefulness of a Helicopter, but there is a reason not every household has one.

(5) By anonymous on 2020-10-15 07:32:02 in reply to 1 [link]

> - Recursive common table expressions (CTEs) but no recursive top-level tables.

I doubt that recursive top-level tables are useful, but you can use views if needed.

> - There is a way to add custom functions via the C API but no way to write custom functions in SQL (to reuse commonly used code, etc.) or a built-in procedural language.

For stored procedures, you could instead use views with INSTEAD OF triggers; this is what I have done. A "CREATE PROCEDURE" command is unnecessary. (Note that despite what the documentation says, WITH can be used in a trigger as long as it does not occur as the first word in a command, and I find this feature useful).

For functions, you could implement this in an application code too (by parsing the first few words of a SQL code and checking if they are "CREATE FUNCTION", and then handling it by yourself). This is not possible to implement in an extension, though; to allow it in extensions would require a "syntax hook" (which would just pass any SQL code with syntax errors to the hook function that you defined, if there is one; the extension code needs to do anything else to handle it).

For a somewhat different reason, the ability to define "macro functions" would also be useful; a C API to do this may be be given arguments whose values are known at parsing time (basically, only literal values), and then returns a string with a SQL expression with host parameters; the arguments to the function are then substituted where the host parameters appear in the returned code, and then that expression is used instead of the original function call when the statement is compiled. (For example, if you write `F(X+Y,Z)` and the macro function `F` is defined as `(?1*?2)` then it is just as though you had written `((X+Y)*Z)` instead. This example isn't very useful, but if your macro uses the date/time functions, LIKELIHOOD function, expanding to multiple terms in a WHERE clause, etc, then it becomes much more useful.)

> - There are virtual tables but no way to create writable views (creating triggers over views is impossible)

Creating triggers over views is not impossible. You can use INSTEAD OF triggers to make writable views.

(Still, I don't like that you cannot use the ALTER TABLE command to rename views. I once tried to fix the code to allow this, and it seemed to work, but I have not thoroughly tested this.)

> - There is a bytecode engine with powerful features such as coroutines but it lacks a stack (to have reentrancy, continuations, and abstractions) and is not exposed as an API to applications.

There are good reasons to not expose it as an API to applications. (I have an idea to "kind of" expose it though: Make it a test control (which, like all test controls, are subject to change), and provide macros with the major and minor VDBE version numbers. If anyone uses this feature, they have no support, and if an upgrade of SQLite breaks their program (the macros I mentioned would make this detectable at compile time), that is their own fault for using this feature, and the writer of the program that used this feature now has to fix it themself, since it is not a problem with SQLite itself; the SQLite developers will not fix it.)

The features that the bytecode engine doesn't have are things that it doesn't need, so there is no point to add it.

> - There is no way to create a "virtual database" that handles reads and writes its own way.
> - There is no way to create and register new storage engines. I can only write new VFS layers that allow SQLite to write databases in the SQLite3 B-tree format to new file systems even though the SQLite bytecode refers not to B-tree leaves, branches and nodes but to SQL database tables and rows. This makes it impossible for SQLite to read from and write to CSVs, Excel files, etc. without the special "virtual table" feature. (Ideally, the "virtual table" feature shouldn't be special.)

These two are similar. I think SQLite4 allows custom storage engines, but SQLite3 does not, and perhaps this should be corrected (eventually).

Note that a VFS could construct a database file itself (or parse any writes to it by itself, although only after a transaction has been committed), since the file format is public, although this is really messy, and it is often easier to use a virtual table or a temporary database.

(Actually, there are many deficiencies in the virtual table mechanism too, although they have made many improvements to it over time.)