SQLite Forum

Timeline
Login

20 most recent forum posts

2020-08-08
14:57 Reply: SQLite 3.33.0 beta-1 (artifact: a49eff2c1d user: kmedcalf)

This allows a faster check if you are interested just in whether the database file does, or does not, have faults.

This is incorrect. If the database has no errors in it, then specifying a value of N (any value) does not make any difference in the length of time that will be consumed to determine that no errors exist.

The value of N limits the number of errors reported, so in a database that contains multiple errors the value of N does not have any effect on how long it takes to check that database for errors, merely once having found that there are errors, how many of those errors are reported.

While it is quite possibly true that for a database containing 47 errors which takes 5 minutes to scan and detect all of them, that you might be done reporting 5 errors after only 3 minutes, the number N does not "allow a faster check" -- it merely sets the number of error messages after which one "throws up hands in disgust".

12:56 Reply: SQLite 3.33.0 beta-1 (artifact: ca25ee2328 user: slavin)

Item 3, in the documentation refers to https://www.sqlite.org/draft/pragma.html#pragma_integrity_check. Because the features of that PRAGMA were introduced 'backwards', the documentation looks a little strange. I came up with this, though there may be problems with that too.


integrity_check(TABLENAME) checks for errors only in the named table and indexes of that table.

integrity_check with no arguments does the same as iterating through all tables using integrity_check(TABLENAME) but also detects all the other errors listed above.

integrity_check(N) does the same as the option with no arguments, but stops checking once N errors have been found. This allows a faster check if you are interested just in whether the database file does, or does not, have faults.


Can the function described in 7b please delete the dodgy sqlite_stat1 table, unless the database has been opened with r/o permission ?

In addition can ANALYZE delete stat1, stat2, stat3 and stat4, before it starts compiling new tables ? My logic is that if the programmer has specifically chosen to use ANALYZE they are explicitly stating that they don't care about any old data.

00:40 Reply: How SQLite decides if a subquery is correlated or not? (artifact: bb40fcf396 user: nalzok)

Thanks, so the takeaway is that SQL is agnostic to (non-)determinism, and its decisions are based solely on if there is any dependency. This is a little surprising to me, but I guess that's how things are specified, so I need to obey the rule :)

2020-08-07
20:20 Edit: SQLite 3.33.0 beta-1 (artifact: 3b87070608 user: drh)

The 3.33.0 release of SQLite is schedule to occur in about two weeks. Now would be a great time for you to download the latest prerelease snapshot and see if you can break it. We'd prefer that you report bugs before the release rather than afterwards.

Other links:

20:19 Post: SQLite 3.33.0 beta-1 (artifact: 7552dc1150 user: drh)

The 3.33.0 release of SQLite is schedule to occur in about two weeks. Now would be a great time for you to download the latest prerelease snapshot and see if you can break it. We'd prefer that you report bugs before the release rather than afterwards.

Other links:

16:01 Reply: Error al recuperar o leer en base de datos a traves de Nombre (artifact: fcd4a8488a user: kmedcalf)

Use bound parameters. replace this:

miCursor.execute("SELECT * FROM DATOSEQUIPOS WHERE EQUIPO=" + miEquipo.get())

with this

miCursor.execute("SELECT * FROM DATOSEQUIPOS WHERE EQUIPO=?", (miEquipo.get(),))

This is because text fields must be properly quoted when you "construct" your SQL statement. Dynamic construction of SQL is fraught with peril.

https://xkcd.com/327/

15:51 Edit reply: How SQLite decides if a subquery is correlated or not? (artifact: 61f44033b3 user: kmedcalf)
x in lucky_numbers

is not a subquery. This actually means, where lucky_numbers is a table-like object:

x in (select luckynumber from lucky_numbers)

and the subquery is independent of the value of x. It is not a correlated subquery.

Similarly the other query is not correlated, even though it must be evaluated for each row of the outer table, it does not depend on a value of that outer table and is therefore independent, not correlated.

It is up to the planner to decide whether an "independent" query will be run more than once. If the results of such a query are non-deterministic, you have created yourself a problem because the planner assumes that the result of a query is deterministic and stable. If it does not hold that property then you run into the problem you have observed (not to mention that the entire concept of a "computer" is dependent on the assumption of determinism -- that doing exactly the same thing again with the same inputs will achieve the same output).

It you violate this fundamental design principal of the universe then all hell breaks loose.

15:48 Reply: How SQLite decides if a subquery is correlated or not? (artifact: 76c0c37c95 user: kmedcalf)
x in lucky_numbers

is not a subquery. This actually means, where luck_numbers is a table-like object:

x in (select luckynumber from luck_numbers)

and the subquery is independent of the value of x. It is not a correlated subquery.

Similarly the other query is not correlated, even though it must be evaluated for each row of the outer table, it does not depend on a value of that outer table and is therefore independent, not correlated.

It is up to the planner to decide whether an "independent" query will be run more than once. If the results of such a query are non-deterministic, you have created yourself a problem because the planner assumes that the result of a query is deterministic and stable. If it does not hold that property then you run into the problem you have observed (not to mention that the entire concept of a "computer" is dependent on the assumption of determinism -- that doing exactly the same thing again with the same inputs will achieve the same output).

It you violate this fundamental design principal or the universe then all hell breaks loose.

15:29 Reply: How SQLite decides if a subquery is correlated or not? (artifact: f4ab8cb1e5 user: kmedcalf)

A subquery is correlated where it contains a reference to the parent row -- that is, it requires evaluation for each row in the outer query.

select a,
       (select b from c),
  from d;

the subquery is not correlated -- its result is independent of the parent.

select a,
       (select b from c where g == a)
  from d;

the subquery is correlated because it must be evaluated separately for each row in d because the subquery is "correlated" to the value of "a" in the outer (containing) query.

So a subquery is either independent or it is correlated and telling the difference is trivial.

12:55 Edit reply: Error al recuperar o leer en base de datos a traves de Nombre (artifact: 1ab845a9f7 user: drh)

Moderator's note:

Answers should be in English. An English translation of the original follows:


Hello, I have created an application with Tkinter in Python using a Sqlite database. When searching for Equipo, I get this error:

   sqlite3.OperationalError: no such column: Vara

If the value of Equipo is a number, then reading of the database works, but if it has letters, it does not. The code I have used is the following:

  def leer():
    miConexion=sqlite3.connect("Equipos")
    miCursor=miConexion.cursor()
    miCursor.execute("SELECT * FROM DATOSEQUIPOS WHERE EQUIPO=" + miEquipo.get())
    elusuario=miCursor.fetchall()
    for usuario in elusuario:
      miID.set(usuario[0])
      miEquipo.set(usuario[1])
      miN_Serie.set(usuario[2])
      miMAC.set(usuario[3])
      miIP.set(usuario[4])
      miWIFI.set(usuario[5])
      miVSM.set(usuario[6])
      miRoseta.set(usuario[7])
      textoComentario.insert(1.0, usuario[8])
    miConexion.commit()

Can you help me? Thanks in advance!

12:51 Reply: Error al recuperar o leer en base de datos a traves de Nombre (artifact: a8dbb1a451 user: drh)

Moderator's note:

Answers should be in English. An English translation of the original follows:


Hello, I have created an application with Tkinter in Python using a Sqlite database. When searching for Equipment, I get this error:

   sqlite3.OperationalError: no such column: Vara

If the name of the Team is a number, then reading of the database works, but if it has letters, it does not. The code I have used is the following:

  def leer():
    miConexion=sqlite3.connect("Equipos")
    miCursor=miConexion.cursor()
    miCursor.execute("SELECT * FROM DATOSEQUIPOS WHERE EQUIPO=" + miEquipo.get())
    elusuario=miCursor.fetchall()
    for usuario in elusuario:
      miID.set(usuario[0])
      miEquipo.set(usuario[1])
      miN_Serie.set(usuario[2])
      miMAC.set(usuario[3])
      miIP.set(usuario[4])
      miWIFI.set(usuario[5])
      miVSM.set(usuario[6])
      miRoseta.set(usuario[7])
      textoComentario.insert(1.0, usuario[8])
    miConexion.commit()

Can you help me? Thanks in advance!

12:31 Post: Error al recuperar o leer en base de datos a traves de Nombre (artifact: af7b679058 user: anonymous)

Hola, he creado una aplicacion con Tkinter en Python usando una base de datos Sqlite. al realizar la busqueda por Equipo, me sale este error:

sqlite3.OperationalError: no such column: Vara

Si el nombre del Equipo es un numero, me hace bien la operación de lectura de la base de datos, pero si tiene letras no. El código que he usado es el siguiente:

def leer():

miConexion=sqlite3.connect("Equipos")

miCursor=miConexion.cursor()

miCursor.execute("SELECT * FROM DATOSEQUIPOS WHERE EQUIPO=" + miEquipo.get())

elusuario=miCursor.fetchall()

for usuario in elusuario:
    miID.set(usuario[0])
    miEquipo.set(usuario[1])
    miN_Serie.set(usuario[2])
    miMAC.set(usuario[3])
    miIP.set(usuario[4])
    miWIFI.set(usuario[5])
    miVSM.set(usuario[6])
    miRoseta.set(usuario[7])
    textoComentario.insert(1.0, usuario[8])

miConexion.commit()

¿Me podeis ayudar?

Gracias de antemano.

07:10 Reply: Need help building a custom amalgamation (artifact: 322f1ca6f7 user: lexfiend)

Yes, but there's no guarantee that a random 32-bit libsqlite.so would actually work on your remote machine. For starters, Linux runs on more 32-bit architectures than just Intel x86, especially in the embedded arena.

Even if it's the same architecture, a library built on your typical Linux distro probably won't work on a musl-based remote OS.

So unless you're planning to drop Yet Another SQLite Library on your remote machine, you really should be using what's already available. In fact, if your remote machine is significantly different architecturally from your local one, you should be building your program on the remote box for maximum success.

03:30 Post: How SQLite decides if a subquery is correlated or not? (artifact: 75a466bf37 user: nalzok)

As a background, the question stems from my observation that a query that once took ~30s finishes in ~0.05s after I replaced an EXISTS with IN in the WHERE clause. I have constructed the following example to reproduce the issue to you. While this thread is self-contained, you are more than welcome to check out this related question on DBA.SE and this previous thread.

Assume that we have a list of natural numbers running from 0 to 999.

CREATE TABLE IF NOT EXISTS natural_numbers AS
WITH RECURSIVE natural_numbers(x) AS (
    SELECT 0
    UNION
    SELECT x + 1
    FROM natural_numbers
    LIMIT 1000
)
SELECT x
FROM natural_numbers;

Now, I want to draw 5 out of the 1000 numbers to be "lucky numbers", and count how many numbers are lucky. What I did is

WITH lucky_numbers(lucky_number) AS (
    SELECT x
    FROM natural_numbers
    ORDER BY RANDOM()
    LIMIT 5
)
SELECT SUM(x IN lucky_numbers)
FROM natural_numbers;

and the query plan is

QUERY PLAN
|--SCAN TABLE natural_numbers
`--LIST SUBQUERY 2
   |--SCAN TABLE natural_numbers
   `--USE TEMP B-TREE FOR ORDER BY

While subquery 1 lucky_numbers is non-correlated, I'm surprised to see that subquery 2 (x IN lucky_numbers) is not considered a correlated subquery. As pointed out by @drh earlier,

[T]he programmer cannot make any assumptions about whether or not the results of a (non-correlated) subquery are cached and reused or if the subquery is run multiple times.

If (x IN lucky_numbers) is non-correlated, then the optimizer might decide to return the same value for all x, which would be blatantly wrong. I'm wondering how SQLite decides if a subquery is correlated or not?


For comparison, here is an alternative query

WITH lucky_numbers(lucky_number) AS (
    SELECT x
    FROM natural_numbers
    ORDER BY RANDOM()
    LIMIT 5
)
SELECT SUM(EXISTS(
        SELECT 1
        FROM lucky_numbers
        WHERE x = lucky_number
))
FROM natural_numbers;

Its query plan correctly identifies subquery 2 (SELECT 1 FROM lucky_numbers WHERE x = lucky_number) as a correlated scalar subquery.

QUERY PLAN
|--SCAN TABLE natural_numbers
`--CORRELATED SCALAR SUBQUERY 2
   |--CO-ROUTINE 1
   |  |--SCAN TABLE natural_numbers
   |  `--USE TEMP B-TREE FOR ORDER BY
   `--SCAN SUBQUERY 1

Thus, while lucky_numbers may still be cached, it is guaranteed that (SELECT 1 FROM lucky_numbers WHERE x = lucky_number) will be executed for each x.

In addition, the bundled query explainer in PyCharm Professional indicates that with IN the subquery is only once (see red circle), whereas with EXISTS it's executed repeatedly. I'm not sure if that's related to whether the subquery is considered correlated or not.

02:56 Reply: Query Optimizer Changes Meaning of Query (artifact: d545422fd9 user: nalzok)

In other words, the programmer cannot make any assumptions about whether or not the results of a (non-correlated) subquery are cached and reused or if the subquery is run multiple times

Thanks for the great explanation! I actually have a follow-up question about how SQLite decides if a subquery is correlated or not, but let's discuss that in a new thread.

2020-08-06
21:48 Reply: Query Optimizer Changes Meaning of Query (artifact: 7e7e007d28 user: drh)

The query

   SELECT x FROM natural_numbers ORDER BY random() LIMIT 50;

Returns a table of 50 distinct random numbers. The question then is, if that query is really a subquery in a larger SQL statement, and the result set is used multiple times within the outer query, is the SQL engine compelled to rerun the subquery, or is it allowed to cache the results of the first run and reuse them.

I contend that the query planner is free to do it either way. It can either cache the results of the first run and reuse them. Or it can rerun the query. Or (if it wanted to) it could cache the results and use them two or three times and then rerun the query and use those results two or three times, then run the subquery again, and so forth.

In other words, the programmer cannot make any assumptions about whether or not the results of a (non-correlated) subquery are cached and reused or if the subquery is run multiple times. The SQL engine is free to do whatever it wants. Any query that depends on whether or not subquery results are cached returns unpredictable output. It is akin to running a query without an ORDER BY clause - the engine is free to return the results in any order it wants. Just because it returns the results in the order you want today does not mean it will continue to do so tomorrow.

Constructing a query like this that depends on whether or not subquery results are cached is like writing a multiple-threaded program that depends on the order in which the threads are executed. The output can vary from one run to the next. Don't do that.

I checked with the PostgreSQL developers and am told that PostgreSQL works the same way. The equivalent query in PostgreSQL might cache and reuse the subquery results, or it might rerun the subquery. You never know.

As currently implemented, SQLite only caches the subquery results if it wants to build an automatic index. But that decision might change tomorrow, so you cannot depend on it.

19:06 Reply: Capitalize first letters only (artifact: 23e885a770 user: anonymous)

Thanks for the SQLitespeed pointer. Yes, that was a one-off job (for now).

Unfortunately, the app has too many issues to replace my other browser (sqlitebrowser), like throwing errors when there should't be any (even after updating its sqlite3.dll with mine to be one same page), or not updating the view after running an UPDATE SQL.

But, I somehow managed to complete the job eventually. So, thanks.

17:16 Reply: Need help building a custom amalgamation (artifact: 64e365b038 user: anonymous)

If the only thing you want to do is add a trigger to the database schema, then I suppose it might work to lock the database, create a record for the trigger, add it to the btree of the schema table, increment the schema cookie number and file change counter, and then unlock it. But another question is, do you need WAL mode?

17:08 Reply: Query Optimizer Changes Meaning of Query (artifact: 1a8029eb18 user: anonymous)

It should be possible to make the query optimizer to automatically figure out whether or not a view is deterministic. However, this doesn't work for virtual tables, unless it is enhanced by adding a SQLITE_INDEX_SCAN_DETERMINISTIC flag (or a SQLITE_VTAB_DETERMINISTIC option).

(I think there are other messages on this forum describing proposed improvement of virtual table.)

15:22 Reply: fts5 does not treat left half ring character as token (artifact: 8e9223df80 user: anonymous)
I think the unicode61 tokenizer just strips marks from letters. The left half-ring character is a letter, not a mark, so it is left as part of the token. I have no idea if Unicode has it right or wrong, but it is what it is. You may want to process your special cases before feeding to FTS5. Do you know how does this work in Solr for example?

Martin
More ↓