select * from t; -- but don't want virtual columns

(1.1) By curmudgeon on 2020-09-12 12:31:10 edited from 1.0 [link] [source]

I'm sure I saw a thread on this just after generated columns were added to sqlite but I've been unable to find it. Has anyone found a way round this that doesn't involve supplying the names of all non-virtual columns? There's a lot of convenience lost when a virtual column is added to a table as * can no longer be used in select => insert, create table as etc.

Is there not a case for a new ^ command (e.g. 'select # from t') that only returns non-virtual columns? Or maybe a virtual table that does the same?

(2) By Simon Slavin (slavin) on 2020-09-12 13:11:17 in reply to 1.1 [link] [source]

The standard answer on this is that you shouldn't be using SELECT * at all in production code.

That shortcut can be useful in quick scripts and debugging, but has nasty results when used in a program you're not going to look at for years. The most obvious is that if you come back to an old project and add an extra column to a table, you then have to change every SELECT *.

(3) By L Carl (lcarlp) on 2020-09-12 15:27:36 in reply to 2 [link] [source]

Except for a few unusual cases, I agree that select * should be forbidden in production code, but it is extremely useful for interactively exploring data, debugging, reverse-engineering etc. And, I’ve often wished for a way to select all columns except for one or two. In the past, I’ve created scripts that generate a list of all the columns in a table or view (from the data dictionary) that I can quickly edit to make the query I want. But, that slows me down.

(4) By curmudgeon on 2020-09-13 09:33:03 in reply to 2 [link] [source]

Is that not a bad example Simon? Why would you have to change the 'select *'?

Suppose you had a backup for a table with 50 columns 
insert into bkup select * from tbl where ....;
would surely be easier to maintain than the alternative. It would save having to change the backup sql every time you decided to restructure the tables.

(5) By Warren Young (wyoung) on 2020-09-13 11:28:15 in reply to 4 [link] [source]

This argument is nearly a FAQ in SQL, so there’s not much point rehashing it here.

Since finding the many places select * has been argued against is difficult due to the way search engines handle phrases and wildcards, I dug up a good discussion for you here. There’s a lot more on that topic if you want to do some more digging.

(7) By curmudgeon on 2020-09-13 16:46:39 in reply to 5 [link] [source]

Thanks Warren. I read that thread but I still don't see an argument against having the # option even if it should only be used for development purposes. Nor has anyone address my backup example or the difficulty in creating a duplicate table with 'create table as insert * from ..'.

Its simple enough to get a list of the cols ('select group_concat(name,', ') from pragma_table_info(?1);') but you've then got to remove the virtual cols (assuming you can recall them without looking up the table DDL) and then paste the resulting string into another query which will then look long & ugly. I'll be able to find a way of automating it but I wouldn't fancy describing the steps involved to a newbie evaluating sqlite. Having said that I admit I couldn't find any sql software that allowed 'excluding cols' from *.

(8) By Kees Nuyt (knu) on 2020-09-13 20:50:28 in reply to 7 [link] [source]

Try table_xinfo()

SELECT * FROM pragma_table_xinfo('tablename')

The virtual column gets attribute 'hidden'.

Example

sqlite3 test.db \
"CREATE TABLE t1 ( \
id INTEGER PRIMARY KEY NOT NULL, \
a INTEGER, \
b INTEGER, \
c INTEGER GENERATED ALWAYS AS (a * b) VIRTUAL
)" \
".mode column" \
". echo on" \
"SELECT * FROM pragma_table_xinfo('t1')"
cid  name  type                      notnull  dflt_value  pk  hidden
---  ----  ------------------------  -------  ----------  --  ------
0    id    INTEGER                   1                    1   0
1    a     INTEGER                   0                    0   0
2    b     INTEGER                   0                    0   0
3    c     INTEGER GENERATED ALWAYS  0                    0   2

-- 
Regards,
Kees Nuyt

(9) By Keith Medcalf (kmedcalf) on 2020-09-13 21:06:10 in reply to 7 [link] [source]

Actually, it would be:

  select group_concat(name, ', ') 
    from pragma_table_xinfo 
   where schema == :schema
     and arg == :tablename
     and hidden == 0
order by cid
;

Which will return you the comma separated list of non-hidden columns for the table :tablename in schema :schema. Your application can then construct an SQL statement that works with columns that are not hidden. Using the correct introspection pragma/virtual table is key.

Once you have that list of column names constructing an appropriate SQL statement in the application is trivial:

'insert into bcc (%s) select %s from cc' % (columns, columns)

will evaluate to the necessary SQL to insert the non-hidden columns from table cc into the same named columns in table bcc.

I do not think there would be any problem explaining that to anyone at all once you point out the correct method of getting the applicable column names.

(23) By luuk on 2020-09-15 06:20:59 in reply to 9 [link] [source]

But "The order of the concatenated elements is arbitrary." in the paragraph under "group_concat(X),group_concat(X,Y)" on https://www.sqlite.org/lang_aggfunc.html

That will hurt harder than the 'problem' which is discussed here.....

(24) By curmudgeon on 2020-09-15 08:47:37 in reply to 23 [link] [source]

You're right. It should probably be

  select group_concat(name, ', ') from
  (select name from pragma_table_xinfo 
  where schema == :schema and arg == :tablename and hidden == 0 order by cid);

although I'd like Keith to confirm that.

(25) By Keith Medcalf (kmedcalf) on 2020-09-15 09:52:48 in reply to 24 [link] [source]

group_concat will will work in "visitation order". The order by clause imposes "presentation order" of the results.

sqlite> create table x(a,b,c,d,e,f);
sqlite> select group_concat(name, ', ') from pragma_table_info where arg='x' order by cid asc;
┌──────────────────────────┐
│ group_concat(name, ', ') │
├──────────────────────────┤
│ a, b, c, d, e, f         │
└──────────────────────────┘
sqlite> select group_concat(name, ', ') from pragma_table_info where arg='x' order by cid desc;
┌──────────────────────────┐
│ group_concat(name, ', ') │
├──────────────────────────┤
│ a, b, c, d, e, f         │
└──────────────────────────┘
sqlite> select group_concat(name, ', ') from (select name from pragma_table_info where arg='x' order by cid asc);
┌──────────────────────────┐
│ group_concat(name, ', ') │
├──────────────────────────┤
│ a, b, c, d, e, f         │
└──────────────────────────┘
sqlite> select group_concat(name, ', ') from (select name from pragma_table_info where arg='x' order by cid desc);
┌──────────────────────────┐
│ group_concat(name, ', ') │
├──────────────────────────┤
│ f, e, d, c, b, a         │
└──────────────────────────┘

So applying "presentation order" does not affect "visitation order" to the members of the group unless the "visitation" is the result of applying "presentation order" to the rows being visited.

Without the subselect being ordered the result is one row and "cid" has one value per group (it is a bare column of an aggregate even though it is not output in the select list, it must still be computed as if it were). Hence the one row generated by the select will be the same notwithstanding any order by.

However, in the subselect the "presentation order" of the rows of names is affected by the order by so the "visitation order" of the rows in the aggregate will be that "presentation order".

Of course, if you want the results in "order by cid" then you need do nothing since that is the "presentation order" of the virtual table anyway -- but that might be viewed as an "implementation detail" although I should think that it is unlikely to change.

(10) By Keith Medcalf (kmedcalf) on 2020-09-13 22:09:38 in reply to 7 [link] [source]

Note that you can do something like the following to create the appropriate view and do it entirely in SQL (requires the eval extension):

SQLite version 3.34.0 2020-09-11 22:07:03
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table x (x, y as (x+1), z, z1 as (z+1) stored);
sqlite> select 'create view main.vw_x as select ' || (select group_concat(name, ', ') from pragma_table_xinfo where arg='x' and schema='main' and hidden=0 order by cid) || ' from main.x';
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 'create view main.vw_x as select ' || (select group_concat(name, ', ') from pragma_table_xinfo where arg='x' and schema='main' and hidden=0 order by cid) || ' from main.x' │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ create view main.vw_x as select x, z from main.x                                                                                                                            │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
sqlite> select eval((select 'create view main.vw_x as select ' || (select group_concat(name, ', ') from pragma_table_xinfo where arg='x' and schema='main' and hidden=0 order by cid) || ' from main.x'));
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ eval((select 'create view main.vw_x as select ' || (select group_concat(name, ', ') from pragma_table_xinfo where arg='x' and schema='main' and hidden=0 order by cid) || ' from main.x')) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                                                                                                                            │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
sqlite> .schema main.*
CREATE TABLE x (x, y as (x+1), z, z1 as (z+1) stored);
CREATE VIEW vw_x as select x, z from main.x
/* vw_x(x,z) */;

(11) By curmudgeon on 2020-09-14 06:49:38 in reply to 10 [link] [source]

Thanks Keith (and Kees). I don't know how I managed to miss the hidden column. That eval certainly simplifies what I had in mind.

(6) By Warren Young (wyoung) on 2020-09-13 11:32:02 in reply to 1.1 [link] [source]

Create a VIEW explicitly naming the columns you want to consume. Then do SELECT * on the view.

(This doesn’t contradict my post #5 in this thread, since you’re explicitly naming your columns in the first step.)

(12) By curmudgeon on 2020-09-14 06:52:21 in reply to 1.1 [link] [source]

While we're on the subject can anyone explain to me what the purpose of a stored virtual column is? I don't really see how it would be different from an ordinary column.

(13) By Keith Medcalf (kmedcalf) on 2020-09-14 07:27:15 in reply to 12 [link] [source]

Consider:

create table t
(
  id integer primary key,
  ...
  last_updated as (datetime()) stored
);

Then the value of t.last_update will always contain the datetime on which each record was inserted or last updated. If the generated datetime was not stored then it would be computed each time the record is accessed which would make it not be the last_updated datetime but rather the last accessed datetime (that is, it would be the current now).

(14) By curmudgeon on 2020-09-14 09:06:36 in reply to 13 [source]

I get that difference Keith but in what way would it be different to last_updated as a non-virtual column? I'm assuming it would take up the same space in the db so what is the purpose of STORED?

PS I obviously missed the hidden column because I was using  table_info rather than table_xinfo. I just noticed that table_info doesn't even list the virtual columns. I don't know how I missed that.

(15) By Keith Medcalf (kmedcalf) on 2020-09-14 09:32:06 in reply to 14 [link] [source]

SQLite version 3.34.0 2020-09-11 22:07:03
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table ts(x, last_update as (datetime()) stored);
sqlite> insert into ts values (1, datetime());
Error: table ts has 1 columns but 2 values were supplied
sqlite> insert into ts values (1);
sqlite> insert into ts values (2);
sqlite> select * from ts;
┌───┬───────────────────────────┐
│ x │        last_update        │
├───┼───────────────────────────┤
│ 1 │ 2020-09-14 09:09:35.532 Z │
│ 2 │ 2020-09-14 09:09:50.556 Z │
└───┴───────────────────────────┘
sqlite> update ts set last_update=datetime();
Error: cannot UPDATE generated column "last_update"
sqlite> update ts set x=3 where x=1;
sqlite> select * from ts;
┌───┬───────────────────────────┐
│ x │        last_update        │
├───┼───────────────────────────┤
│ 3 │ 2020-09-14 09:10:10.539 Z │
│ 2 │ 2020-09-14 09:09:50.556 Z │
└───┴───────────────────────────┘
sqlite> create table tv(x, last_update as (datetime()) virtual);
sqlite> insert into tv values (1, datetime());
Error: table tv has 1 columns but 2 values were supplied
sqlite> insert into tv values (1);
sqlite> insert into tv values (2);
sqlite> select * from tv;
┌───┬───────────────────────────┐
│ x │        last_update        │
├───┼───────────────────────────┤
│ 1 │ 2020-09-14 09:11:32.773 Z │
│ 2 │ 2020-09-14 09:11:32.773 Z │
└───┴───────────────────────────┘
sqlite> update tv set x=3 where x=1;
sqlite> select * from tv;
┌───┬───────────────────────────┐
│ x │        last_update        │
├───┼───────────────────────────┤
│ 3 │ 2020-09-14 09:11:55.549 Z │
│ 2 │ 2020-09-14 09:11:55.549 Z │
└───┴───────────────────────────┘
sqlite> update tv set last_update=datetime();
Error: cannot UPDATE generated column "last_update"
sqlite>

See the difference?

When a GENERATED COLUMN is STORED the value is stored in the field of the record when when the record is "written" (inserted or updated) -- it is basically a "default" for which you cannot specify a value -- and when the record is retrieved the stored value is returned. It takes up storage space in the record payload.

When a GENERATED COLUMN is VIRTUAL the value is not stored in the table but is rather generated VIRTUALLY (computed) when the record is retrieved. It does not take up space in the record payload.

So a STORED virtual column is a column that is computed and then stored as part of the record, just like a default. Except, unlike a default, you cannot override it with user provided data.

(16) By Keith Medcalf (kmedcalf) on 2020-09-14 09:42:52 in reply to 14 [link] [source]

Also, you cannot use expressions in the unique constraint of a table definition, but you can use generated columns.

sqlite> create table t(x, unique(x/10));
Error: expressions prohibited in PRIMARY KEY and UNIQUE constraints
sqlite> create table t(x, y as (x/10), unique(y));
sqlite> insert into t values (1);
sqlite> insert into t values (2);
Error: UNIQUE constraint failed: t.y
sqlite> insert into t values (10);

Theoretically you could use STORED generated columns in a primary key however SQLite3 presently does not permit that (it prohibits all generated columns from primary keys).

(17) By curmudgeon on 2020-09-14 10:04:18 in reply to 16 [link] [source]

Got it now Keith, thanks.

(18) By Keith Medcalf (kmedcalf) on 2020-09-14 10:18:12 in reply to 17 [link] [source]

The basic difference is when the value of the generated column is computed.

For GENERATED AS ... VIRTUAL the computation is done when the row is retrieved so the value does not need to be stored in the row.

For GENERATED AS ... STORED the computation is done each time the row is stored or updated and therefore the result of that computation must be stored in the row for subsequent retrieval.

This is perhaps the simplest way to describe the distinction.

(19) By Warren Young (wyoung) on 2020-09-14 13:52:22 in reply to 12 [link] [source]

It's a form of memoization: given a deterministic expression for an oft-needed result, it's better to calculate it once and store it, then retrieved the cached copy as long as the inputs don't change, as opposed to repeatedly recalculating it.

When those inputs are also in SQLite, then stored virtual columns give you memoization at the DB layer, so you don't have to do it higher up in the application layer. Coupled with application-defined SQLite functions, you can even do this for expressions that SQLite cannot calculate using only its built-in operators and functions — e.g. is this the record of a blue-chip customer — as long as the functions involved are all deterministic.

Additionally, it solves one of the two hard problems in computer science.

Stored virtual columns are very good things.

(21) By curmudgeon on 2020-09-14 15:14:42 in reply to 19 [link] [source]

Thanks Warren.

(22.1) By Keith Medcalf (kmedcalf) on 2020-09-14 21:02:51 edited from 22.0 in reply to 19 [link] [source]

Example:

SQLite version 3.34.0 2020-09-14 08:14:15
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table x(x, last_update as (datetime()) stored, update_sid as (usersid()) stored);
sqlite> insert into x values (1);
sqlite> select * from x;
┌───┬───────────────────────────┬────────────────────────────────────────────────┐
│ x │        last_update        │                   update_sid                   │
├───┼───────────────────────────┼────────────────────────────────────────────────┤
│ 1 │ 2020-09-14 21:01:32.964 Z │ S-1-5-21-2707818823-1216488348-3746780667-1001 │
└───┴───────────────────────────┴────────────────────────────────────────────────┘
sqlite> select * from x;
┌───┬───────────────────────────┬────────────────────────────────────────────────┐
│ x │        last_update        │                   update_sid                   │
├───┼───────────────────────────┼────────────────────────────────────────────────┤
│ 1 │ 2020-09-14 21:01:32.964 Z │ S-1-5-21-2707818823-1216488348-3746780667-1001 │
└───┴───────────────────────────┴────────────────────────────────────────────────┘

Whenever a row is added or updated in table x the date of last update and the security identifier performing that update are recorded and cannot be fiddled (usersid() is a UDF that retrieves the Windows Security Identifier of the current/active process authentication token).

(20.1) By Simon Slavin (slavin) on 2020-09-14 13:59:58 edited from 20.0 in reply to 12 [link] [source]

Deleted