SQLite Forum


50 most recent forum posts by user dvdraymond

15:46 Reply: Is it possible to fix a more than 1G corrupted SQLite file with only the first 16K content corrupted? (artifact: 7de4b52ee4 user: dvdraymond)
I'm curious how you know only the first 16k was corrupted.

The schema is always on the first page (or at least starts there), and the next 3 pages could in theory be anything.

If those pages are garbage, you can still recover all the individual records that are completely stored on pages other than the first 4 there. Any data in those first 4 pages though is of course gone.

Because the schema is corrupted, you won't necessarily know what table each record belongs to. But you can put them in big groups and say look,
we know everything in group A was in the same table,
we know everything in group B was in the same table,
we know everything in group C was in the same table.
But we don't know which tables A, B and C were, and they might even be the same one.

If you know what your schema looked like, and all of your tables had different numbers of fields, then it's nice and easy to say ohh, these records were 12 fields long, they must have been from table X.

But if you had multiple tables with the same number of fields it can be harder to determine what was what. Maybe you yourself can figure it out based on the contents and what type of data was in each field, maybe not.
Maybe you can look at the rowids in groups A and B and see that they overlap, and thus be sure they're from different tables, or maybe they won't overlap and you won't know.

13:24 Reply: Case when (artifact: 081d83bc24 user: dvdraymond)
The || thing is a binary operator. It's just like multiplication *, division /  etc. in that it needs something on both sides of it to mean anything.

There's nothing on the left hand side of the operator here

You have mismatched parentheses. You have 2 more open parentheses than you have close parentheses.

You're using CASE... but don't have THEN or END in there anywhere. Again, look at how to use CASE: https://www.sqlite.org/lang_expr.html#the_case_expression

= can't be used to test against multiple things. It is, again, a binary operator. It takes what's immediately on the left, immediately on the right, compares them and returns the result. To compare 1 thing on the left against multiple things on the right use IN:  https://www.sqlite.org/lang_expr.html#the_in_and_not_in_operators
(which I guess is technically still a binary operator comparing 1 thing on the left with 1 list on the right...)

substr(...) IN ('1', '2', '3')

substr(...) = '1','2','3','Q1'
is like saying
(substr(...) = '1'), '2', '3', 'Q1'
true, '2', '3', 'Q1'
It doesn't mean anything.
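
A quick sketch in Python's sqlite3 (table, column, and data are all made up) showing the IN form doing the one-against-many comparison:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# hypothetical one-column table of code strings
con.execute("create table codes (c text)")
con.executemany("insert into codes values (?)", [("1A",), ("4B",), ("2C",)])

# substr(c, 1, 1) IN (...) compares the single value on the left
# against each member of the list on the right
rows = con.execute(
    "select c from codes where substr(c, 1, 1) in ('1', '2', '3')"
).fetchall()
# rows holds the codes whose first character is 1, 2 or 3
```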
13:03 Reply: Case when (artifact: ae28e9b548 user: dvdraymond)

My bet is that their dates are like 1/2/2022, and not like 01/02/2022, but as has been pointed out, they aren't showing us their data.

19:00 Reply: Case when (artifact: e8046d3240 user: dvdraymond)

Not much of that is valid SQL.

The format for a CASE statement is shown in the Expression page

Single quotes make string literals. You need double quotes to quote an identifier like a column name.

String concatenation is 2 pipes, not 1

Core Functions

There is no right() function, but you can use a negative second argument to substr()

For the month lookup you'll use substr() and instr(), something like substr("accounting period", 1, instr("accounting period", '/') - 1)
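Something like this in Python's sqlite3 (table name and sample dates made up) shows that substr/instr combination pulling out the month:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# hypothetical ledger table; "accounting period" holds text like '3/31/2024'
con.execute('create table ledger ("accounting period" text)')
con.executemany("insert into ledger values (?)",
                [("3/31/2024",), ("12/1/2024",)])

# everything before the first '/' is the month
months = con.execute(
    '''select substr("accounting period", 1,
              instr("accounting period", '/') - 1) from ledger'''
).fetchall()
# months comes back as text: '3' and '12'
```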

15:40 Reply: feature request: vacuum into without indexes (artifact: 3af80049ee user: dvdraymond)
> So I guess `INSERT INTO ... SELECT * FROM ...;` is optimized in a certain way?

There is indeed an optimization for that. Here's part of the comment on it from the source code since I'm not sure if it's documented on the web page anywhere.

** Attempt the transfer optimization on INSERTs of the form
**     INSERT INTO tab1 SELECT * FROM tab2;
** The xfer optimization transfers raw records from tab2 over to tab1.
** Columns are not decoded and reassembled, which greatly improves
** performance.  Raw index records are transferred in the same way.
** The xfer optimization is only attempted if tab1 and tab2 are compatible.
** There are lots of rules for determining compatibility - see comments
** embedded in the code for details.


** This optimization is particularly useful at making VACUUM run faster.
16:49 Reply: Unable to read while database is locked (artifact: 3fe1aa5b4d user: dvdraymond)

SQLite can only ever have 1 writer at a time. In order to have concurrent readers while that 1 writer is active you have to use WAL mode
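
A minimal sketch of switching to WAL mode (file path is a throwaway temp location; WAL doesn't apply to :memory: databases):

```python
import sqlite3
import tempfile
import os

# WAL mode needs a file-backed database
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)

# the pragma returns the mode actually in effect
mode = con.execute("pragma journal_mode=WAL").fetchone()[0]
# mode is now 'wal'; other connections can keep reading while this one writes
```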

18:38 Reply: SELECT with LIMIT OFFSET clause return unexpected result (with sample db and query) (artifact: 195588b84d user: dvdraymond)

> Did I misunderstand the query?

Yes. It being a sub-query is part of whatever the bug is.

I can add that I'm seeing what they're seeing with the 3.37.2 CLI

14:27 Reply: autovacuum in sqlite VS autovacuum in PG (artifact: f24c56d390 user: dvdraymond)

I believe your understanding is correct.

My limited understanding of Postgres is that it keeps multiple copies of each record. Each copy has metadata that says I am valid from transaction A through transaction B. It does that because older transactions still need to see the table as it was when they started even if a newer transaction comes along and makes changes. So Postgres autovacuum comes along and says "ok, you were valid from A through D, and the oldest running transaction is F, so no one needs you any more, you can be removed from the file." Or "You were valid from D through G, but the oldest running transaction is F, so F might still need you and we have to leave you in the file."

The way SQLite handles that is that in rollback journal mode you can't make changes until all other transactions complete. In WAL mode you can't checkpoint a page in the WAL back to the main file until all transactions older than that change are complete. So in WAL mode you'll have your multiple copies of a page, but they'll be in the WAL and not the main file.

SQLite's autovacuum is pretty much what you said. It's there to reclaim unused pages and keep the file size smaller by shuffling pages from the end to free pages earlier in the file.

16:33 Reply: How to merge two or more tables that have the same structure? (artifact: bb0ca9f424 user: dvdraymond)
You should be able to order the sub query then, right?

update t set value1 = (select value1 from t as t2 where t2.time < t.time and t2.value1 is not null order by t2.time desc limit 1) where value1 is null;
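
A quick check of that fill-forward update in Python's sqlite3 (table layout and values made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (time, value1)")
con.executemany("insert into t values (?, ?)",
                [(1, "a"), (2, None), (3, "b"), (4, None)])

# for each NULL row, pull the most recent earlier non-NULL value
con.execute("""
    update t set value1 = (select value1 from t as t2
                           where t2.time < t.time and t2.value1 is not null
                           order by t2.time desc limit 1)
    where value1 is null""")
rows = con.execute("select time, value1 from t order by time").fetchall()
# the NULLs at times 2 and 4 get filled from times 1 and 3
```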
13:16 Reply: ROWID primary key works slower than index (artifact: 53a09f7a9f user: dvdraymond)

I can't answer all of your questions, but in this case the blob size vs the covering index is important. In the main B tree each record has the rowid and the blob for every value in the tree. With a 1K blob then you're only gonna get 3 records per 4k page (or use overflow pages) for the main storage.

Conversely the index on id only stores id, it doesn't store the blob in the index. Since it's only storing one integer for each record then you're gonna store thousands per page.

Since your query doesn't use any of the other fields, it doesn't have to go to the main table to do the query, it can get everything it needs from the nice compact index, which in this specific case means it's fetching a thousand times fewer pages from the database to complete it. If your query involved any of the other fields in the table, then the extra index on ID is going to be slower and not of any use.
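
You can see the planner making that choice with EXPLAIN QUERY PLAN (table and index names here are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# hypothetical table: a small id column next to a wide blob column
con.execute("create table t (id integer, data blob)")
con.execute("create index t_id_idx on t (id)")

# the query only needs id, so it can be answered from the index alone
plan = con.execute(
    "explain query plan select id from t where id > 100"
).fetchall()
detail = plan[0][-1]  # the plan text mentions a COVERING INDEX
```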

18:39 Reply: I need a little help with UPDATE? (artifact: 1bb7b4bdd9 user: dvdraymond)
Well, you haven't told us what you wanted it to do, or why you think there's a problem. So we have to guess on everything.

For starters as to what's wrong: "s" isn't defined anywhere in there.
14:59 Reply: binary .dump format (artifact: d287c1fe01 user: dvdraymond)
I'm gonna risk showing my low knowledge and ask something.

For a binary dump of the data, why not just make a straight copy of all the leaf table b-tree pages? (I'm sticking my fingers in my ears and ignoring WITHOUT ROWID tables for now)

Go through each of the tables, and dump out the leaf pages in btree order (not original physical page order). If a leaf has overflow pages, maybe put them in order right after the leaf they overflow from.

We'll be storing sqlite_master, so a loader tool will have the whole schema. All the interior btree pages and all index pages (again, ignoring WITHOUT ROWID for now) are just there for quick search and wouldn't have had any data that needed storing. And when restoring a dump we're just gonna visit each record once and don't need quick searches.

Re-creating the tree part shouldn't be too bad if you have all the leaves, right? You could either decode them one at a time and do a bunch of inserts to the new file, or in a fancier move copy them all at once to the new file, then re-create the interior pages with something akin to create index. The only copied pages you'd have to make updates to would be for new root page numbers in sqlite_master, and overflow page pointers. Right? You'd probably just need a new header to list the start and end pages of each table in the dump.

Just storing the original leaf pages as-is would also let you re-use a bunch of already written code I'd bet.

Admittedly it wouldn't really work for comparing dumps of equivalent data, as how things are distributed between pages and on each page could be different. But as far as a "binary .dump format", why not?

Anyway, let me know how insane I'm talking. And apologies if it really is insane.
13:36 Reply: Views, Casting vs and converters (artifact: 3a773c65a1 user: dvdraymond)
I believe when you set detect_types to both PARSE_DECLTYPES and PARSE_COLNAMES it prioritizes the column names first, then the declared types.

Your declared type of my_col is INTEGER, and you only declared a converter for "BOOLEAN". So:

my_col as "my_col float [FLOAT]"
Because of the column name it looks for a converter for "FLOAT" and doesn't find it.
Since "my_col" is a bare column it checks for a converter for "INTEGER" and doesn't find it.
So it returns what it got from the query, which is 1

CAST(my_col as FLOAT) as [my_col cast float]
The column name "my_col cast float" doesn't conform to the "name [type]" format, so no check is done for a converter on the column name.
CAST(my_col as FLOAT) is not a bare column any more, so there is no declared type for it to look up a converter for.
So it returns what it got from the query, which is my_col cast as a float, or 1.0

my_col as "my_col bol [BOOLEAN]"
The column name is of the correct format, so it checks for a converter for "BOOLEAN" and finds it.
It passes what it got from the query (1) to your registered function for "BOOLEAN" and gets back True

CAST(my_col as BOOLEAN) as [my_col cast bol]
The column name "my_col cast bol" doesn't conform to the "name [type]" format, so no check is done for a converter on the column name.
CAST(my_col as BOOLEAN) isn't a bare column any more, so there is no declared type for it to look up a converter for.
So it returns what it got from the query, which is my_col cast to the SQLite type of "BOOLEAN", which following the rules results in numeric affinity. 1 as numeric is still 1, so that's what you get back.

So the important thing for you is that for PARSE_DECLTYPES when you do...
cast(my_col as SOMETHING)
...that "SOMETHING" is not carried through as the declared type. PARSE_DECLTYPES only works if it's selecting just a bare column from a table.
And for PARSE_COLNAMES, your column names need to follow the format of "name [type]"
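
A minimal sketch of those two rules (the BOOLEAN converter and the table are made up for illustration):

```python
import sqlite3

# register a converter for the hypothetical type name "BOOLEAN"
sqlite3.register_converter("BOOLEAN", lambda b: b != b"0")

con = sqlite3.connect(
    ":memory:",
    detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES)
con.execute("create table t (my_col INTEGER)")
con.execute("insert into t values (1)")

# the [BOOLEAN] in the column name is what triggers the converter...
(via_colname,) = con.execute(
    'select my_col as "my_col bol [BOOLEAN]" from t').fetchone()
# ...while cast(my_col as BOOLEAN) is no longer a bare column,
# so no converter runs and the plain integer comes back
(via_cast,) = con.execute(
    "select cast(my_col as BOOLEAN) from t").fetchone()
```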
13:19 Reply: STRICT tables and UNIQUE indices (artifact: 146daf1138 user: dvdraymond)
How about adding in another column used as a boolean that says whether you have an actual value for that?

Where column "b" is your column needing the "non value" value (for this example -1), then how about something like:

create table foo (
    a not null,
    b not null,
    b_is_unknown not null
        check (b_is_unknown in (0, 1)),
    check (case when b_is_unknown then b = -1 end)
);
create unique index foo_idx on foo (a, b, b_is_unknown);

sqlite> insert into foo values (1, 2, 0); -- normal value
sqlite> insert into foo values (1, 2, 0); -- can't duplicate normal values
Error: UNIQUE constraint failed: foo.a, foo.b, foo.b_is_unknown
sqlite> insert into foo values (1, 2, 1); -- using bad value for unknown
Error: CHECK constraint failed: case when b_is_unknown then b = -1 end
sqlite> insert into foo values (1, -1, 1); -- proper unknown value
sqlite> insert into foo values (1, -1, 0); -- -1 is an actual value here
sqlite> insert into foo values (1, -1, 1); -- can't submit 2 unknown b's with the same a
Error: UNIQUE constraint failed: foo.a, foo.b, foo.b_is_unknown
sqlite> insert into foo values (1, 2, -1); -- bad value for b_is_unknown flag
Error: CHECK constraint failed: b_is_unknown in (0, 1)
14:54 Reply: Possible bug: Unexpected result using correlated aggregate subquery (artifact: 1d6f05c554 user: dvdraymond)
I mean, you have to admit that it does seem a little weird that an aggregate function in the sub-query makes the outer query an aggregate.

SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table x (a);
sqlite> create table y (b);
sqlite> insert into x values (1), (2);
sqlite> insert into y values (7), (8), (9);
sqlite> select (select group_concat(y.b) from y) from x; --returns 2 rows
sqlite> select (select group_concat(5) from y) from x; --returns 2 rows
sqlite> select (select group_concat(x.a) from y) from x; --now only returns 1 row

sqlite> select (select count(*) from (select group_concat(y.b) from y)) from x;
sqlite> select (select count(*) from (select group_concat(5) from y)) from x;
sqlite> select (select count(*) from (select group_concat(x.a) from y)) from x;
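
To see the row counts concretely, here's the same setup as the transcript above driven from Python's sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table x (a);
    create table y (b);
    insert into x values (1), (2);
    insert into y values (7), (8), (9);
""")

# group_concat over y's own column: the subquery aggregates y,
# and the outer query stays a plain two-row select
n1 = len(con.execute(
    "select (select group_concat(y.b) from y) from x").fetchall())

# group_concat over the outer column x.a: the aggregate attaches to
# the outer query, collapsing it to a single row
n2 = len(con.execute(
    "select (select group_concat(x.a) from y) from x").fetchall())
```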
14:21 Reply: Possible bug: Unexpected result using correlated aggregate subquery (artifact: 85a3337362 user: dvdraymond)
> sqlite> CREATE TABLE x AS SELECT 1 a;

Note that the table "y" has no column "a", only column "b"

sqlite> SELECT (SELECT group_concat(a) FROM y) unexpected, -- expected 1,1
   ...> FROM x;

The subquery (SELECT group_concat(a) FROM y) here returns 2 rows of 1 column, where the value of that one column is '1' for both rows, because "a" is referring to the column in x, and not to anything in y. Since this subquery is being used to provide a single column value, only the first of those rows is being used, and the rest discarded.

To get your expected '1,1' you would either have to make it something like
sqlite> select (select group_concat(a) from (select a from y)), ... from x;

if you did indeed mean column "a", or if you mentioned the wrong column and really wanted "b", then

sqlite> select (select group_concat(b) from y), ... from x;
15:20 Reply: Most performant way to test for having 0, 1 or more rows in a table (artifact: 9d87c828fc user: dvdraymond)
You can see the limiting in the EXPLAIN output, as it's more verbose than the shorter, more readable EXPLAIN QUERY PLAN output.

SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table t (k integer primary key not null, data blob);
sqlite> .eqp full
sqlite> select (select count(*) from (select 1 from t limit 2)) = 1;
   |  `--SCAN t
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     28    0                    0   Start at 28
1     Integer        22    3     0                    0   r[3]=22; return address
2     Once           0     22    0                    0
3     Null           0     4     4                    0   r[4..4]=NULL; Init subquery result
4     InitCoroutine  5     13    5                    0   SUBQUERY 1
5     Integer        2     6     0                    0   r[6]=2; LIMIT counter
6     OpenRead       1     2     0     0              0   root=2 iDb=0; t
7     Rewind         1     12    0                    0
8       Integer        1     7     0                    0   r[7]=1
9       Yield          5     0     0                    0
10      DecrJumpZero   6     12    0                    0   if (--r[6])==0 goto 12
11    Next           1     8     0                    1
12    EndCoroutine   5     0     0                    0
13    Integer        1     8     0                    0   r[8]=1; LIMIT counter
14    Null           0     9     9                    0   r[9..9]=NULL
15    InitCoroutine  5     0     5                    0
16      Yield          5     19    0                    0   next row of subquery_1
17      AggStep        0     0     9     count(0)       0   accum=r[9] step(r[0])
18    Goto           0     16    0                    0
19    AggFinal       9     0     0     count(0)       0   accum=r[9] N=0
20    Copy           9     4     0                    0   r[4]=r[9]
21    DecrJumpZero   8     22    0                    0   if (--r[8])==0 goto 22
22    Return         3     0     0                    0
23    Integer        1     1     0                    0   r[1]=1
24    Eq             10    26    4                    64  if r[4]==r[10] goto 26
25    ZeroOrNull     4     1     10                   0   r[1] = 0 OR NULL
26    ResultRow      1     1     0                    0   output=r[1]
27    Halt           0     0     0                    0
28    Transaction    0     0     1     0              1   usesStmtJournal=0
29    Integer        1     10    0                    0   r[10]=1
30    Goto           0     1     0                    0
12:49 Reply: Avoiding recursive CTE materialisation (artifact: 6523441ecc user: dvdraymond)
Perhaps something like this? (Haven't tested so might not work)

with recursive parents (id, parentId, name, found_what_i_want) as (
    select *, false from export where id = 60363923
    union all
    select export.*, export.id > 60363923 as found_what_i_want
    from export inner join parents on export.id = parents.parentId
    where not found_what_i_want
)
select id, parentId, name from parents where found_what_i_want;
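
Here's a quick check of the idea against a made-up export table (the ids and the parent chain are entirely hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# hypothetical export table: (id, parentId, name)
con.execute("create table export (id, parentId, name)")
con.executemany("insert into export values (?, ?, ?)", [
    (60363923, 2, "start"),
    (2, 99999999, "mid"),
    (99999999, None, "top"),
])

# walk up the parent chain, stopping once an id above the target is found
rows = con.execute("""
    with recursive parents (id, parentId, name, found_what_i_want) as (
        select *, false from export where id = 60363923
        union all
        select export.*, export.id > 60363923 as found_what_i_want
        from export inner join parents on export.id = parents.parentId
        where not found_what_i_want
    )
    select id, parentId, name from parents where found_what_i_want
""").fetchall()
# only the ancestor that tripped the flag comes back
```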
17:05 Reply: LEFT AND RIGHT OUTER JOINS problem (artifact: 9387707b37 user: dvdraymond)

SELECT ..., SUM(o.quantity)*o.price

You've got an aggregate function in there, but no GROUP BY. So it's using all the rows as your group, and treating the rest of the columns as bare columns, which it's taking from one of the rows "at random" (not actually random, but that's another discussion)

You need to either group by something for which you want info on the groups... or get rid of the aggregate functions.
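
For example, with a GROUP BY added (table and values made up), sum() works per group instead of over the whole table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# hypothetical order-lines table
con.execute("create table o (item, quantity, price)")
con.executemany("insert into o values (?, ?, ?)", [
    ("pen", 2, 1.5), ("pen", 3, 1.5), ("ink", 1, 4.0),
])

# grouping by item makes sum() aggregate per item instead of over all rows
rows = con.execute("""
    select item, sum(quantity) * price
    from o group by item order by item
""").fetchall()
```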

13:40 Reply: SQLite irrespective query (artifact: 99678ffcb5 user: dvdraymond)
HAVING is only for GROUP BY queries, where you're applying the filter to the resulting groups, and not to the rows before they're grouped.

In this case you can just write it like you have the restrictions in English.
the shoppers who joined the company after 01-01-2020 or are a woman

FROM shoppers               -- the shoppers
WHERE                       -- who
date_joined >= '2020-01-01' -- joined the company after 01-01-2020
OR                          -- or
gender = 'F'                -- are a woman

Also, aliases should not be in single quotes, that's for string literals. Either use no quotes or double quotes to show it's an identifier. SQLite will probably let you do it, but it's being kind.

... AS Age
... AS "Age"
but not
... AS 'Age'
13:20 Reply: problem report: incorrect join behavior (artifact: d730d754d7 user: dvdraymond)

See this optimization

A LEFT JOIN can sometimes be converted into an ordinary JOIN if there are terms in the WHERE clause that guarantee that the two joins will give identical results. In particular, if any column in the right-hand table of the LEFT JOIN must be non-NULL in order for the WHERE clause to be true, then the LEFT JOIN is demoted to an ordinary JOIN.

In your case the ON condition only references the left table, so there's no point in it being an outer join, and thus it treats it as an inner join.

13:01 Reply: How to submit changes (artifact: b7d7244259 user: dvdraymond)

If I understand you correctly, you're saying you tried to commit your own code to the master SQLite repository?

SQLite is open source, but not open-contribute. You can suggest changes here in the forums for example, but only the select few actual members of the team can commit code.

If it's "just" spelling issues (in the code and not the docs) then don't fret if they say no. I remember someone submitted a thing to remove all the trailing white space for example, and the decision was that was going to explode and cause noise in deltas with no actual benefit to the binaries. So someone looking for what changed from version A to B would see 1,000 lines of code changed... but none of those code changes would actually affect how anything worked.

12:44 Reply: operators IS TRUE and IS FALSE missing from diagrams and table plus possible evaluation errors (artifact: a148f53ae6 user: dvdraymond)
> Well it turns out that "is false" and "is not false" do not need to be special

SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select '' is false;
sqlite> select '' is (false);
sqlite> select '' is (+false);
sqlite> select false;
sqlite> select '' is 0;
sqlite> select '' is (0);
sqlite> select '' is (+0);
13:48 Reply: cycle detection in CHECK constraint with recursive CTE (artifact: a184978699 user: dvdraymond)

> Most DBMS's implement immediate Foreign-Keys by default, unlike SQLite.

> With deferred FKs, the only mode in SQLite...

No, immediate is the default and you have to specify "deferrable initially deferred" to get deferred FKs.

Deferred Foreign Key Constraints

(Or I'm just misunderstanding what you're saying)

19:43 Reply: Unexpected return from the NULL column (artifact: 3b70b2ac1d user: dvdraymond)

So we're saying in the original version, that since v3 is calculated as NULL, that the "v3 NOT IN..." should therefore evaluate to NULL, and thus the record should not pass the WHERE clause. But somehow it is.

18:23 Reply: Should SQLite be able to optimize this query? (artifact: 3344977a85 user: dvdraymond)

My hunch is that you'll find that exact case with group_concat() a lot in the wild because we can't do ordered aggregates.

Adding in the ability to do

group_concat(word, ',' order by word)

would allow people to fix it, but then they'd be fixing it because it broke in the meantime.

12:56 Reply: Feature request: add index info to WITH clause table (artifact: 47e359f7ec user: dvdraymond)

I know SQLite has logic where it can create Automatic Indexes to help with queries. Perhaps if that was combined with AS MATERIALIZED hints to let it know it's ok to consider making an index on the materialization of the with statement?

Or is that already the case?

19:19 Reply: can i custom index blob data? (artifact: a524364a3a user: dvdraymond)

Remember that you can make Indexes on expressions, so if the way you're pulling out stuff from the blob is something that can be expressed in SQL with deterministic functions, then you can index on it.

13:23 Reply: How to handle collation version changes (artifact: 9edc83b98c user: dvdraymond)

> Compared to PRAGMA integrity_check, does PRAGMA quick_check skip checking the validity of the sort order per the collation? (Is skipping this step one of the things that makes it "quick"?)

Yes. From the docs https://www.sqlite.org/pragma.html#pragma_quick_check

The pragma is like integrity_check except that it does not verify UNIQUE constraints and does not verify that index content matches table content. By skipping UNIQUE and index consistency checks, quick_check is able to run faster. PRAGMA quick_check runs in O(N) time whereas PRAGMA integrity_check requires O(NlogN) time where N is the total number of rows in the database. Otherwise the two pragmas are the same.

I tested it by inserting into a text field all the capitalization variants of 'ABC', then using pragma writable_schema to change it to text collate nocase. After closing and re-opening the database quick_check returned ok, but integrity_check found errors.
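
A trivial sanity check of the two pragmas from Python (on a healthy database both just return ok; quick_check only skips the UNIQUE/index-vs-table verification):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (a text collate nocase unique)")
con.execute("insert into t values ('abc')")

# both pragmas report 'ok' when nothing is wrong
quick = con.execute("pragma quick_check").fetchone()[0]
full = con.execute("pragma integrity_check").fetchone()[0]
```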

14:36 Reply: Shortcut to change a non changing update or upsert in a noop (artifact: 16d7383a0c user: dvdraymond)

I think the main reason changes() counts those rows is for triggers. That way "on update" triggers will still fire even if all the old and new fields are the same for the record.

14:07 Reply: Shortcut to change a non changing update or upsert in a noop (artifact: a3edf4ec1f user: dvdraymond)

Granted it's not the most thorough of tests, but in Windows, when I run an update query that doesn't end up changing any values, then the Modified and Accessed times of the file are not updated, despite changes() saying something was changed. When I do an update that does change values, then the Modified and Accessed times on the file do get updated. So I'd say yes, there is some optimization there. Whether it depends on the VFS... I haven't a clue.

18:09 Reply: Duplicated rows (and missing rows) when SELECTing with ORDER BY (artifact: 195d361259 user: dvdraymond)

What does pragma integrity_check; return?

15:28 Reply: Slow insert with prepare/step (artifact: c6759f6f31 user: dvdraymond)

> So I just need to do separate execs for BEGIN and END before and after, is it the right way to do it?

That's my understanding anyway. Give it a whirl and let us know if that does indeed solve your issue.

14:15 Reply: Slow insert with prepare/step (artifact: 1db30fbdd7 user: dvdraymond)

> There are two functions below doing the same thing. On my machine Test1 takes microseconds to complete, Test2 ~ 17-20 seconds. Why is Test2 so painfully slow? I expected it to be faster than Test1.

1 transaction in the first vs 1,000 in the second?

12:14 Reply: Feature request: Stored Procedures (artifact: cd05f203c9 user: dvdraymond)

> I'll give you just one scenario to help you see that views and triggers are not always enough.

How are views and triggers not enough for this?

SQLite version 3.34.0 2020-12-01 16:14:00
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

sqlite> create table people (pid, name, address);

sqlite> create table phones (pid, phone);

sqlite> create view add_person as select pid, name, address, phone from people inner join phones using (pid);

sqlite> create trigger add_person_trg instead of insert on add_person begin insert into people values (new.pid, new.name, new.address); insert into phones values (new.pid, new.phone); end;

sqlite> insert into add_person values (1, 'First Last', 'Home', '555-1212');

sqlite> select * from people;
1|First Last|Home

sqlite> select * from phones;
1|555-1212

15:15 Reply: How to relate a row with more than one topic? See example (artifact: 314bfa4f70 user: dvdraymond)

> And which column format does SQLite not support?

Don't know about MYSQL, but in PostgreSQL for example you could have an array field to hold all the topics

create table quotes (
    quote_id integer not null primary key,
    quote text not null,
    topic_ids integer[]
);
insert into quotes values (1, 'A successful man is a fall in love man.', array[1, 2, 3]);

But the only advantage of that would be 1 less table and a lower overall record count, which doesn't really help anything. You couldn't make them official foreign keys, updates would be harder, indexing going from topics to quotes would be harder, etc.

So, as people have recommended, it would be a lot better to go the normal route:

create table quotes_to_topics (
    quote_id int not null references quotes,
    topic_id int not null references topics,
    primary key (quote_id, topic_id)
) without rowid;
create index topics_to_quotes_idx on quotes_to_topics (topic_id, quote_id);

16:21 Reply: Is there a way to optimize this UPDATE command to run faster? (artifact: 90f7c9ed54 user: dvdraymond)

param_id doesn't seem to be showing up anywhere in your update query. Is that right, or should there also be a bit saying the param_id's are equal? If you've got more than one param_id per object_id then you're potentially messing things up I believe.

15:01 Reply: How can I merge results from two SELECTs? (artifact: 80c56da25f user: dvdraymond)

If the real schema is as simple as the example you could use SQLite's bare columns in aggregates to get the highest snapshot for each parameter.

Something like:

select max(snapshot_id) as snapshot_id, param_id, param_value
from SnapshotParameters
group by param_id
order by param_id;
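
A quick sketch of that bare-column behavior (sample data made up): with a lone max() aggregate, the other selected columns come from the row that held the max.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "create table SnapshotParameters (snapshot_id, param_id, param_value)")
con.executemany("insert into SnapshotParameters values (?, ?, ?)", [
    (1, 10, "old-a"), (2, 10, "new-a"),
    (1, 20, "old-b"), (3, 20, "new-b"),
])

# param_value is a bare column, but with a single max() it is taken
# from the max-snapshot row, giving the latest value per param_id
rows = con.execute("""
    select max(snapshot_id) as snapshot_id, param_id, param_value
    from SnapshotParameters group by param_id order by param_id
""").fetchall()
```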

15:11 Edit reply: ORDER BY out of order. Is this a bug or expected? (artifact: fe70cc3e9a user: dvdraymond)

I believe order by is supposed to look at the column aliases first, yes. I can't point you to anything specific that says it though.

Don't forget you can order by something that's not in the result columns. So you can explicitly say the column you want to order by.

sqlite> select printf('%8.2f', n) as n from t order by t.n desc;
│    n     │
│   500.00 │
│  -574.41 │

15:10 Reply: ORDER BY out of order. Is this a bug or expected? (artifact: 713d1aa50d user: dvdraymond)

I believe order by is supposed to look at the column aliases first, yes. I can't point you to anything specific that says it though.

Don't forget you can order by something that's not in the result columns. So you can explicitly say the column you want to order by.

│    n     │
│   500.00 │
│  -574.41 │

13:33 Reply: Fast way to insert rows in SQLite (artifact: db7d038eb9 user: dvdraymond)

For the Python library, adding in "isolation_level = None" in the connection call gets rid of all the automatic transaction handling and leaves it in autocommit mode/full manual transaction control.
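
For example (in-memory database, made-up table), with isolation_level=None each statement commits on its own and BEGIN/COMMIT become fully manual:

```python
import sqlite3

con = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
con.execute("create table t (a)")
con.execute("insert into t values (1)")
in_txn_auto = con.in_transaction   # no implicit transaction was opened

# transactions are now entirely up to you
con.execute("begin")
con.execute("insert into t values (2)")
in_txn_manual = con.in_transaction  # open until the explicit commit
con.execute("commit")
count = con.execute("select count(*) from t").fetchone()[0]
```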

12:58 Reply: Fast way to insert rows in SQLite (artifact: f832398c19 user: dvdraymond)

Not sure that it would help at all for SQLite, but one example from remote DBMS inserts is using multiple records in your values statement.

So instead of...

INSERT INTO user VALUES (NULL,?,?,?)

...binding 3 values for 1 record per INSERT statement

You might try doing 10 at a time:

INSERT INTO user VALUES (NULL,?,?,?), (NULL,?,?,?), (NULL,?,?,?), (NULL,?,?,?)... x10

...binding 30 values for 10 records per INSERT statement
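
In Python's sqlite3 that might look like this (table layout and data hypothetical): build the multi-row VALUES statement and flatten the bindings to match.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table user (id integer primary key, a, b, c)")

records = [(i, i * 2, i * 3) for i in range(10)]  # 10 rows, 3 values each

# one statement, ten rows: 10 value groups, 30 bound parameters total
sql = "insert into user values " + ", ".join(["(NULL,?,?,?)"] * 10)
flat = [v for rec in records for v in rec]
con.execute(sql, flat)
count = con.execute("select count(*) from user").fetchone()[0]
```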

14:02 Reply: Table Btree and Index Btree (artifact: 8cc2c69629 user: dvdraymond)

Rowid tables use table b-trees. "without rowid" tables and all indexes use index b-trees.

From the Database File Format page

A b-tree page is either a table b-tree page or an index b-tree page. All pages within each complete b-tree are of the same type: either table or index. There is one table b-tree in the database file for each rowid table in the database schema, including system tables such as sqlite_schema. There is one index b-tree in the database file for each index in the schema, including implied indexes created by uniqueness constraints. There are no b-trees associated with virtual tables. Specific virtual table implementations might make use of shadow tables for storage, but those shadow tables will have separate entries in the database schema. WITHOUT ROWID tables use index b-trees rather than table b-trees, so there is one index b-tree in the database file for each WITHOUT ROWID table. The b-tree corresponding to the sqlite_schema table is always a table b-tree and always has a root page of 1. The sqlite_schema table contains the root page number for every other table and index in the database file.

13:47 Reply: Why do I have a `SQLITE_AUTOINDEX` index for my primary key? Is the table itself not an index on the PK? (artifact: f16f8ed866 user: dvdraymond)

In the second schema the hash value of base32 is unique instead.

CREATE INDEX hash_base32 ON hash(base32)

There "hash" is the table name and "base32" is the field name.

It's not "hash" as a function name and "base32" as an argument.
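To make the distinction concrete, here's a sketch (this hash/base32 schema is my guess at what the thread's second schema looks like): the UNIQUE constraint gets its own automatic index alongside the explicit one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hash (id INTEGER PRIMARY KEY, base32 TEXT UNIQUE)")
con.execute("CREATE INDEX hash_base32 ON hash(base32)")

# PRAGMA index_list shows both the explicit index (origin 'c' = CREATE INDEX)
# and the automatic sqlite_autoindex created to enforce UNIQUE (origin 'u').
for row in con.execute("PRAGMA index_list(hash)"):
    print(tuple(row))
```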


17:29 Reply: Where is the temporary database file stored? (artifact: bfd1f0c044 user: dvdraymond)

Temporary Files Used By SQLite Section 5 has the info I think you're looking for.

Temporary File Storage Locations

The directory or folder in which temporary files are created is determined by the OS-specific VFS.

On unix-like systems, directories are searched in the following order:

    1. The directory set by PRAGMA temp_store_directory or by the sqlite3_temp_directory global variable
    2. The SQLITE_TMPDIR environment variable
    3. The TMPDIR environment variable
    4. /var/tmp
    5. /usr/tmp
    6. /tmp
    7. The current working directory (".") 

The first of the above that is found to exist and have the write and execute bits set is used. The final "." fallback is important for some applications that use SQLite inside of chroot jails that do not have the standard temporary file locations available.

On Windows systems, folders are searched in the following order:

    1. The folder set by PRAGMA temp_store_directory or by the sqlite3_temp_directory global variable
    2. The folder returned by the GetTempPath() system interface. 

SQLite itself does not pay any attention to environment variables in this case, though presumably the GetTempPath() system call does. The search algorithm is different for CYGWIN builds. Check the source code for details.
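A sketch of steering the unix-side search order from Python, assuming a build that consults SQLITE_TMPDIR; note the variable must be set before the first temp file is needed:

```python
import os
import sqlite3
import tempfile

# Point SQLite's temp file search at our own directory (unix-like builds).
tmpdir = tempfile.mkdtemp()
os.environ["SQLITE_TMPDIR"] = tmpdir

con = sqlite3.connect(":memory:")
con.execute("PRAGMA temp_store = FILE")  # force temp b-trees to disk
con.execute("CREATE TABLE t (x)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])

# A big ORDER BY with no usable index can spill a sorter file into tmpdir.
first = con.execute("SELECT x FROM t ORDER BY x DESC").fetchone()
print(first)  # (9999,)
```

Don't expect to see the files by listing the directory: on unix SQLite unlinks its temp files immediately after opening them.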
14:14 Reply: SQLite doesn't use indexes for bitwise clauses? (artifact: ff2610b7d0 user: dvdraymond)

How would an index be helpful in processing a "x & ? = 0" WHERE clause term? What algorithm do you propose?

Wouldn't you then only need to execute the test once per each unique value of x?

Go through the index in order; when x changes, test the new x. Then you can either say "include every row until x changes" or "ignore every row until x changes".

On a simple test like this it might not save much, but the more expensive the test and the fewer distinct values of x there are, the more useful that would be.

(That or I'm missing something obvious)
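The proposed walk could be sketched like this (purely illustrative, not anything SQLite actually does):

```python
from itertools import groupby

# Walk (x, rowid) pairs in index order, evaluate the (possibly expensive)
# test once per distinct x, then include or skip the whole run of rows.
def filter_by_distinct_x(index_rows, test):
    out = []
    for x, group in groupby(index_rows, key=lambda r: r[0]):
        if test(x):               # one evaluation per distinct x
            out.extend(group)     # include every row until x changes
        # else: ignore every row until x changes
    return out

rows = [(0, 1), (0, 2), (1, 3), (2, 4), (2, 5), (3, 6)]
kept = filter_by_distinct_x(rows, lambda x: x & 1 == 0)  # "x & ? = 0" style
print(kept)  # [(0, 1), (0, 2), (2, 4), (2, 5)]
```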

17:58 Reply: how DISTINCT remove the duplicate rows? (artifact: bd2555df20 user: dvdraymond)

In 2.4 of the SELECT documentation there's this:

4 . Removal of duplicate rows (DISTINCT processing).

One of the ALL or DISTINCT keywords may follow the SELECT keyword in a simple SELECT statement. If the simple SELECT is a SELECT ALL, then the entire set of result rows are returned by the SELECT. If neither ALL or DISTINCT are present, then the behavior is as if ALL were specified. If the simple SELECT is a SELECT DISTINCT, then duplicate rows are removed from the set of result rows before it is returned. For the purposes of detecting duplicate rows, two NULL values are considered to be equal. The usual rules apply for selecting a collation sequence to compare text values.

So it looks like it uses =/IS to compare them. And since 1 = 1.0 they're considered the same. Which one you get back is probably which one came first as it was traversing the values.
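A quick demonstration of that collapse (my own toy query, not from the thread):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# 1 (integer) and 1.0 (real) compare equal, so DISTINCT keeps only one row;
# which representation survives depends on traversal order.
rows = con.execute(
    "SELECT DISTINCT v FROM (SELECT 1 AS v UNION ALL SELECT 1.0)"
).fetchall()
print(len(rows))  # 1
```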

13:01 Reply: What does, if many threads and/or processes need to write the database at the same instant, mean? (artifact: cc6a7f156e user: dvdraymond)

PRAGMA journal_mode = WAL;

The script gets called in a loop from just two different machines on the network.

Just to make sure I'm understanding this right, each of those two machines is accessing its own separate database. Right? Because WAL mode doesn't work across different machines.

Or is that perhaps part of your problem :)

17:58 Reply: What indexes help Window Functions (artifact: 543442cd01 user: dvdraymond)

Maybe "one index per table instance in the query" is more accurate?

What we're saying is that if you have a table t with fields a and b, with an index on a and an index on b, then in...

select * from t where a = 5 or b = 7;

Then SQLite will use only one of those two indexes, or do a full table scan. It will not use the A index for "where a = 5", use the B index for "or b = 7" and then de-duplicate the results.

If you have

select * from t as t1 join t as t2 on t1.id1 = t2.id2 where t1.a = 12 or t2.b = 42;

then again for t1 it will use one index or the full table, and for t2 it will use one index or the full table. It can't use an index on t2.id2 to speed up the ON clause and use an index on t2.b to speed up the WHERE and do some cross comparing. It can only pick one.

...or something like that.

14:09 Reply: Help rewriting a query to match the speed of a 2nd query (artifact: bec274b1de user: dvdraymond)

but is there a way to rewrite the second without the duplication of IN groups?

If you mean just without having to type/bind the values twice, then how about

with ProjIDs as (values ('PR0000014888'), ...)
select ...
where b.ProjID in ProjIDs
and a.ProjID in ProjIDs
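A runnable sketch of that pattern from Python, with an invented two-table schema (a table name can be the right-hand side of IN, and a CTE counts), binding each ProjID only once:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (ProjID TEXT, info TEXT);
    CREATE TABLE b (ProjID TEXT, info TEXT);
    INSERT INTO a VALUES ('PR1', 'a1'), ('PR2', 'a2');
    INSERT INTO b VALUES ('PR1', 'b1'), ('PR3', 'b3');
""")

ids = ["PR1", "PR2"]
placeholders = ", ".join(["(?)"] * len(ids))
sql = f"""
    WITH ProjIDs(id) AS (VALUES {placeholders})
    SELECT a.info, b.info
    FROM a JOIN b ON a.ProjID = b.ProjID
    WHERE a.ProjID IN ProjIDs
      AND b.ProjID IN ProjIDs
"""
rows = con.execute(sql, ids).fetchall()
print(rows)  # [('a1', 'b1')]
```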