SQLite Forum

Timeline
Login

50 forum posts occurring on or before 2021-10-22 18:48:42.

More ↑
2021-10-22
18:48 Post: Documentation bug regarding max SQL length (artifact: ba0e448646 user: anonymous)

Hi, We recently saw SQLite error code regarding queries being too large. On checking the error code page, the docs mention limit as 1,000,000 (https://www.sqlite.org/rescode.html#toobig)

However the actual limit is 1,000,000,000 as documented here: https://www.sqlite.org/limits.html#max_sql_length, and in the code

https://sqlite.org/src/file?name=src/sqliteLimit.h&ci=trunk

/*
** The maximum length of a single SQL statement in bytes.
**
** It used to be the case that setting this value to zero would
** turn the limit off.  That is no longer true.  It is not possible
** to turn this limit off.
*/
#ifndef SQLITE_MAX_SQL_LENGTH
# define SQLITE_MAX_SQL_LENGTH 1000000000
#endif

Wanted to confirm that this indeed is just a documentation issue, and I'm not misreading the default value.

Regards, Kanishka

14:07 Reply: Truncated Words bug with fts5vocab (artifact: b8a691b233 user: slavin)

The stemmer deliberately changes the word you supply into something else. Stemmers you might see in action right now might produce something that looks like the original word, but the stem is an internal representation, for use by the computer only. When doing your programming you should imagine that the stemmer takes the word 'something' and turns it into '174JGS72'.

You need to keep both strings of text: the original, as supplied by your program/user, and whatever your stemmer produces. They are not the same thing, and if you want them both you have to store them both.

11:36 Reply: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: db3ea1f39c user: anonymous)

Do not concatenate with + but use double pipe

10:27 Reply: Truncated Words bug with fts5vocab (artifact: 14a95a1b5b user: occultus73)
As it happens, for my project I'm not actually using Porter's stemmer, but rather the custom Snowball stemmer in a modified build of FTS5. It seems that this too has the same feature in this instance.

The problem for me here is I was using the fts5vocab table to top up the Spellfix1 vocabulary table, as per the documentation - although technically there, fts4aux is specified, but I understand fts5vocab is a replacement for that.

As it is, Spellfix1 is "correcting" misspellings to these roots as you say, which isn't really what one expects of a spellfixer. That said, seems as I spellfix search text in order to improve querying against the fts tables, and the fts tables will necessarily understand these roots, then perhaps spell correcting to a root is relatively harmless.
07:18 Edit reply: Inconsistency in BETWEEN Query (artifact: e2b50d334d user: ddevienne)

And if your integers are even bigger, there's always the decimal extension.

You can't use indexes anymore, although you could still use a real column
that approximates your large integers for indexing, and apply the decimal_cmp
on the subset returned via the real index. You'd have to be careful of course,
since the real approximation could be above or below the integer, and that
complexity is probably necessary only for very large tables, as SQLite will be
fast enough even with full-scans in most cases most likely. FWIW. --DD

07:18 Edit reply: Inconsistency in BETWEEN Query (artifact: f7c6ea4065 user: ddevienne)

And if your integers are even bigger, there's always the decimal etension.

You can't use indexes anymore, although you could still use a real column
that approximates your large integers for indexing, and apply the decimal_cmp
on the subset returned via the real index. You'd have to be careful of course,
since the real approximation could be above or below the integer, and that
complexity is probably necessary only for very large tables, as SQLite will be
fast enough even with full-scans in most cases most likely. FWIW. --DD

07:15 Reply: Inconsistency in BETWEEN Query (artifact: 26b7ee18e6 user: ddevienne)

And if you integers are even bigger, there's always the decimal etension.

You can't use indexes anymore, although you could still use a real column
that approximate your large integers for indexing, and apply the decimal_cmp
on the subset returned via the real index. You'd have to be careful of course,
since the real approximation could be above or below the integer, and that
complexity is probably necessary only for very large tables, as SQLite will be
fast enough even with full-scans in most cases most likely. FWIW. --DD

07:04 Reply: Inconsistency in BETWEEN Query (artifact: 83f15a9a51 user: stephan)

For SQLite the integer can be up to 8 bytes and will casting as "INTEGER" solve the problem without overflow?

You can ask the foremost expert yourself: sqlite. If it behaves that way today you can be 99.99% certain it will behave that way in 10 years (the project places tremendous value on backwards compatibility).

sqlite> select cast(4011110000001 as integer);
4011110000001
sqlite> select cast(401111000000100 as integer);
401111000000100
sqlite> select cast(4011110000001000 as integer);
4011110000001000
sqlite> select cast(40111100000010000 as integer);
40111100000010000
sqlite> select cast(401111000000100000 as integer);
401111000000100000
sqlite> select cast(4011110000001000000 as integer);
4011110000001000000
sqlite> select cast(40111100000010000000 as integer);
9223372036854775807
06:59 Reply: Inconsistency in BETWEEN Query (artifact: 38c1428c5c user: anonymous)

For SQLite the integer can be up to 8 bytes and will casting as "INTEGER" solve the problem without overflow?

My Query will be;

SELECT * FROM my_table WHERE 4011110000001 BETWEEN CAST(col_a AS INTEGER) AND CAST(col_b AS INTEGER)

06:50 Reply: Inconsistency in BETWEEN Query (artifact: 5530f0cb55 user: stephan)

I can't use "int" because of the length of the numbers. If I cast the query as "long", can you guarantee that all query ranges will run without errors?

sqlite does not distinguish between "long" and "int". See:

https://www.sqlite.org/datatype3.html

06:40 Reply: Inconsistency in BETWEEN Query (artifact: 42cbaee951 user: anonymous)

I can't use "int" because of the length of the numbers. If I cast the query as "long", can you guarantee that all query ranges will run without errors?

My Query will be; SELECT * FROM my_table WHERE 4011110000001 BETWEEN CAST(col_a AS LONG) AND CAST(col_b AS LONG)

00:14 Edit reply: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: 7f1cf4791a user: kmedcalf)

Neither LEFT( or RIGHT( are valid syntax.

LEFT(APN_D, 3) could be SUBSTRING(APN_D, 1, 3)
RIGHT(APN_D, 3) could be SUBSTRING(APN_D, 7, 3) (or perhaps SUBSTRING(APN_D, -3))

Alternatively, create the SQL functions LEFT and RIGHT.

00:08 Edit reply: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: c198f66f99 user: kmedcalf)
>sqlite
SQLite version 3.37.0 2021-10-20 17:10:36
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table apn(apn_d text not null);
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN LEFT(APN_D, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + RIGHT(APN_D, 3) ELSE APN_D end from apn;
Error: near "(": syntax error
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN SUBSTRING(APN_D, 1, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + RIGHT(APN_D, 3) ELSE APN_D end from apn;
Error: near "(": syntax error
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN SUBSTRING(APN_D, 1, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + SUBSTRING(APN_D,7,3) ELSE APN_D end from apn;
sqlite> 
00:07 Reply: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: d857d0d3cc user: kmedcalf)
>sqlite
SQLite version 3.37.0 2021-10-20 17:10:36
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table apn(apn_d text mot null);
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN LEFT(APN_D, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + RIGHT(APN_D, 3) ELSE APN_D end from apn;
Error: near "(": syntax error
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN SUBSTRING(APN_D, 1, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + RIGHT(APN_D, 3) ELSE APN_D end from apn;
Error: near "(": syntax error
sqlite> select CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN SUBSTRING(APN_D, 1, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + SUBSTRING(APN_D,7,3) ELSE APN_D end from apn;
sqlite> 
2021-10-21
23:32 Reply: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: 8b1051f010 user: kmedcalf)

Neither LEFT( or RIGHT( are valid syntax.

LEFT(APN_D, 3) could be SUBSTRING(APN_D, 1, 3)
RIGHT(APN_D, 3) could be SUBSTRING(APN_D, 7, 3)

Alternatively, create the SQL functions LEFT and RIGHT.

22:06 Post: SELECT STATEMENT FOR OGR2OGR IN GIS (artifact: 5adbcde819 user: anonymous)

I seek the syntax to create APN numbers that are text strings with and without dashes eg XXX-XXX-XXX & XXXXXXXXX with a CASE statement?

CASE WHEN APN_D NOT LIKE '%-%' AND APN_D NOT LIKE '' THEN LEFT(APN_D, 3) + '-' + SUBSTRING(APN_D, 3, 3) + '-' + SUBSTRING(APN_D, 6, 3) + '-' + RIGHT(APN_D, 3) ELSE APN_D

THe error "near "(": syntax error" any assistance is MUCH appreciated.

17:28 Reply: Truncated Words bug with fts5vocab (artifact: fb1e6b3c25 user: drh)

The Porter Stemmer algorithm does that. It is a feature, not a bug. The idea of a "stemmer" is to map words that share a common root into the same base form, so that they appear to the search algorithm as the same word.

The Porter Stemmer algorithm (named for its inventor, Martin Porter), only works for English. But it has been widely used for over 4 decades and works fairly well within its domain.

If you don't want to use the Porter Stemmer, leave off the "porter" keyword when you declare the FTS5 table.

17:11 Reply: Column names and aliases in expressions (artifact: a09ec9b2d0 user: lcarlp)

I'm sure this is documented better somewhere, but bare column names are resolved in a manner similar to other scoped programming languages. If for example, you refer to a bare column name in a subquery, it will first try to resolve it using a table in that subquery and if none of those tables have such a column, it will next look at the nearest "enclosing" query... and so on up to the outermost query.

Note that if you add a new column to an existing table, you can, therefore, drastically change the meaning of an existing query. Hence, I would recommend using bare column names only in very simple queries. If the queries start to get more complex, it is better to qualify the column names. Furthermore, since the same table is likely to be referenced more than once in a complex query, it is often necessary to use table aliases when qualifying column names.

As other posters have said, you'll get an error if a bare column name is ambiguous, e.g., if the same column name occurs in more than one table in the same query. However, the real danger is when it is clear to the SQL Engine which column is being referenced, but not clear to you, the developer! I've been burned by this a few times.

16:59 Post: Truncated Words bug with fts5vocab (artifact: 5618045371 user: occultus73)
Hi there; new to the forum.

Can anyone recreate this bug I have found with FTS5?

Steps:

1. Load FTS5 Extension, if not included in the build.

2. Create a normal table with at least two columns called 'name' and 'keywords':

CREATE TABLE content (name TEXT, keywords TEXT);

3. Create a fts table based on that table:

CREATE VIRTUAL TABLE content_fts
USING fts5(name, keywords, content='content', tokenize='porter ascii');

4. Create a fts5vocab table based on that in turn:

CREATE VIRTUAL TABLE content_vocab
USING fts5vocab('content_fts','row');

5. Add a trigger to insert data into the fts table automatically:

CREATE TRIGGER content_ai AFTER INSERT ON content
BEGIN
	INSERT INTO content_fts (name, keywords) 
	VALUES (new.name, new.keywords); 
END;

6. Insert data containing the words 'Sample' and 'Wallpaper' into the table:

INSERT INTO content VALUES ('Sample Gif', 'content,sample,gif');
INSERT INTO content VALUES ('Sample Wallpaper', 'content,sample,wallpaper');

7. Query the content_vocab table,to see 'sample' and 'wallpaper' are truncated:

SELECT * FROM content_vocab

content	2	2
gif	1	2
sampl	2	4
wallpap	1	2
15:06 Reply: Inconsistency in BETWEEN Query (artifact: 0eaf895da7 user: gunter_hick)
You need to read up on data types is SQLite and the concept of "affinity".

See https://sqlite.org/datatype3.html#sorting_grouping_and_compound_selects

In section 4.2. Type Conversions Prior To Comparison, the second bullet is "If one operand has TEXT affinity and the other has no affinity, then TEXT affinity is applied to the other operand."

So you are performing a string (lexical) comparison of (duplicate digits removed for clarity) "40100" <= "4011" <= "40199", which is obviously true; because "4011" comes after any string beginning with "4010" and before any string beginning with "4019".
14:43 Reply: Inconsistency in BETWEEN Query (artifact: 9deb80dec8 user: larrybr)

If you expect your text values to be compared as integers, you need to write something like CAST( my_text_resembling_a_number AS INT ) for each such value. The inequality operators (which set includes BETWEEN) will not do it for you.

13:49 Reply: Inconsistency in BETWEEN Query (artifact: 09df8cd9e2 user: anonymous)

Yes col_a and col_b is TEXT.

13:40 Reply: Inconsistency in BETWEEN Query (artifact: 1f9ff2c06c user: curmudgeon)

What is the type of col_a and col_b. It looks as if the BETWEEN is treating the 3 values as TEXT.

12:38 Post: Inconsistency in BETWEEN Query (artifact: e7c9c8b21b user: anonymous)
Hello I am working on an Android SQLite project and can not find a way to get right value, The program takes an input value from the user and searches it in the database, I want to do a query like; if the input value is 3; look through the col_a and col_b if the value is between col_a and col_b prints the col_c value "AA" and col_d value "BB". Like col_a < 3 <col_b prints AA, BB. If col_a < 7 < col_b prints CC, DD.

I am doing a query with the "BETWEEN" but it returns an inconsistent value.

My Table;
ID  |        col_a        |        col_b        |  col_c  |  col_d 
--------------------------------------------------------------------
1   |    4011110000000    |    4011119999999    |   AA    |   BB   
--------------------------------------------------------------------
2   |    4022220000000    |    4022229999999    |   CC    |   DD    
--------------------------------------------------------------------

My Query;
SELECT * FROM my_table WHERE 401111000001 BETWEEN col_a AND col_b

You can see the values in column col_a and col_b is 13 digits long and the query value is 12 digits long. The query returns the item with the ID:1. If I put a 13 digits long value(4011110000001) it also returns the row with ID:1. The problem is 12 digits long value is not in between among the values to be checked.

Android Java Code;
String sql = "SELECT * FROM my_table WHERE ? BETWEEN col_a  AND col_b";
Cursor cursor = db.rawQuery(sql, new String[] {String.valueOf(x)});

if (cursor.moveToNext()) {
    String element1 = cursor.getString(cursor.getColumnIndexOrThrow("col_c"));
    String element2 = cursor.getString(cursor.getColumnIndexOrThrow("col_d"));
    String element3 = cursor.getString(cursor.getColumnIndexOrThrow("ID"));
    cursor.close();
    Log.d(""," " +element1);
    Log.d(""," " +element2);
    Log.d(""," " +element3);
}
02:38 Edit reply: I need a little help with UPDATE? (artifact: 35117e44c3 user: kmedcalf)

Note that this will replace ALL station names in t1 with whatever they are in "s" for the same lat-long

and any location not found will have the station name set to null.

Either of the following will fix that (assuming that you do not want not found station names set to null.

UPDATE t1
   SET start_station_name = coalesce((SELECT s.start_station_name
                                        FROM s
                                       WHERE start_lat = t1.start_lat
                                         AND start_lng = t1.start_lng
                                     ), start_station_name)
;
-- or --
UPDATE t1
   SET start_station_name = s.start_station_name
  FROM s
 WHERE s.start_lat = t1.start_lat
   AND s.start_lng = t1.start_lng
;
02:32 Reply: I need a little help with UPDATE? (artifact: de7f14b20f user: kmedcalf)

Note that this will replace ALL station names in t1 with whatever they are in "s" for the same lat-long

and any location not found will have the station name set to null.

00:17 Reply: How do i submit a bug report (artifact: cc490e5902 user: larrybr)

CSR:

The "lemon" program converts a structured description of a grammar which has actions associated with specified grammar constructs into a C function which is able to parse an instance of the grammar and perform the actions corresponding to various constructs in that instance.

The C function so generated from SQLite's SQL grammar is used in SQLite to process SQL passed into the prepare_statement() APIs.

Of course, the grammar description which Lemon is called upon to convert into a parser when SQLite is built does not contain anything like what you or other "Security Researchers" devise to expose so-called vulnerabilities in the lemon parser generator.

If you really want to learn about lemon, you can peruse The Lemon Parser Generator to your heart's content. This document can be found under "Lemon" in the website keyword index. I urge you to consult that index first when you have questions regarding SQLite. Much effort has gone into keeping the online docs current and accurate, so it should be your first source of answers. (And if something you cannot find there ought to be there, that is a fact worth bringing to the dev team's attention.)

2021-10-20
23:13 Reply: How do i submit a bug report (artifact: 41e35286df user: anonymous)

Thank you very much for the info. I do not know a whole lot regarding how sqlite uses lemon and i have also contacted the creators of lemon. I just thought i would point it out (regardless of the criticality of the vulnerability because i came across it and felt obligated. Could you please go into more detail on how sqlite uses lemon to generate C code?

Thanks, Cyber Security Researcher

19:13 Reply: I need a little help with UPDATE? (artifact: dcce7c2652 user: cuz)

David Raymond is right, we need a lot more info to offer anything resembling "wise" advice.

If I have to guess, your most immediate solution to just make it work, regardless of how well it functions (and assuming the "missing" s table in your query is actually called "s"), would be to do:

UPDATE t1 SET t1.start_station_name = (
    SELECT s.start_station_name
      FROM s
     WHERE s.start_lat = t1.start_lat AND s.start_lng = t1.start_lng
    )
;

Note that this will replace ALL station names in t1 with whatever they are in "s" for the same lat-long. Please back up your files before doing this, and give us much more information for more useful suggestions.

Note Also - if lat and lng are REAL (Floating point) fields, the equality check x.lat = y.lat is not guaranteed to work due to small possible differences in float values that seem equal to the naked eye.

18:39 Reply: I need a little help with UPDATE? (artifact: 1bb7b4bdd9 user: dvdraymond)
Well, you haven't told us what you wanted it to do, or why you think there's a problem. So we have to guess on everything.

For starters as to what's wrong: "s" isn't defined anywhere in there.
18:26 Edit reply: INDEX usage on inequality and IN (artifact: dd71bd32c5 user: casaderobison)

Others have already mentioned analyze so I won't repeat that here.

One other complication though for using an index when using inequality matching is "where is the inequality in the set". Using your example, let's say type has four possible values, 1, 2, 3, and 4.

SELECT * FROM table WHERE type != 1
SELECT * FROM table WHERE type != 2

The first of these example queries will return all rows where type is 2, 3, or 4, which would be a contiguous set of index entries.

The second of the examples will return all rows where type is 1, 3, or 4, which would be two sets of index entries (get all rows that match type 1, skip all rows that match type 2, then get all rows that match type 3 & 4). Certainly this can be done, but it complicates the query planner based on a number of variables, especially if using prepared statements.

One way to rewrite the query:

SELECT * FROM table WHERE type < 2 OR type > 2

That might better indicate to the query planner that it can divide it up into two subqueries. Or perhaps even better:

SELECT * FROM table WHERE type < 2
UNION ALL
SELECT * FROM table WHERE type > 2

There are other possibilities that have been recommended in other responses.

In the end, you can't assume that the query plan is legitimate unless you are running it against the actual data. It is not enough to simply have a schema and explain a query plan, as the SQLite query planner takes a lot more information into account.

18:25 Reply: INDEX usage on inequality and IN (artifact: 2811f9486e user: casaderobison)

Others have already mentioned analyze so I won't repeat that here.

One other complication though for using an index when using inequality matching is "where is the inequality in the set". Using your example, let's say type has four possible values, 1, 2, 3, and 4.

SELECT * FROM table WHERE type != 1 SELECT * FROM table WHERE type != 2

The first of these example queries will return all rows where type is 2, 3, or 4, which would be a contiguous set of index entries.

The second of the examples will return all rows where type is 1, 3, or 4, which would be two sets of index entries (get all rows that match type 1, skip all rows that match type 2, then get all rows that match type 3 & 4). Certainly this can be done, but it complicates the query planner based on a number of variables, especially if using prepared statements.

One way to rewrite the query:

SELECT * FROM table WHERE type < 2 OR type > 2

That might better indicate to the query planner that it can divide it up into two subqueries. Or perhaps even better:

SELECT * FROM table WHERE type < 2 UNION ALL SELECT * FROM table WHERE type > 2

There are other possibilities that have been recommended in other responses.

In the end, you can't assume that the query plan is legitimate unless you are running it against the actual data. It is not enough to simply have a schema and explain a query plan, as the SQLite query planner takes a lot more information into account.

18:12 Post: I need a little help with UPDATE? (artifact: e89a8dff6c user: davidjackson)

Ok, I think I am close. What is the correct code for my update query?

UPDATE t1 SET t1.start_station_name = s.start_station_name WHERE s.start_lat = t1.start_lat AND s.start_lng = t1.start_lng

TX, David

15:01 Reply: INDEX usage on inequality and IN (artifact: 76ff3e4a5c user: cuz)

That is called "Cardinality", which in math terms is the size of a set, and in RDBMS terms is the size of the set of possible values in a column.

i.e. A table with 1,000,000 rows with only 4 possible values in Column A is said to have a Cardinality of 4 for Column A. By contrast if Column B was declared Unique and indeed contained all values, its cardinality would be 1 million.

Most SQL engines use exactly this cardinality figure to guess what would be best to do, use the index or not. The problem is they cannot magically "know" what the cardinality of data is, they have to really at some point look through each row in a table and for each column (or at least Indexed column) to calculate the cardinality. in SQLite this is done with:
ANALYZE table_name;

Based on the info stored when ANALYZE is run (which, depending on the compile choices, may store much more than just cardinality), the query planner is well equipped to make such decisions as you suggest above, and is more likely to pick a good index.

If you did not ANALYZE the table, but have your own personal ideas/beliefs about how well the content in a table conforms to the query filtering, you could hint this to the query planner using "likelihood()" - which allows you to say whether the result of a comparison is more likely, or less likely, to be TRUE.

Simon is correct though, the ideas around this and the decision by the query planner to use or not to use a specific index, is non-trivial and after many years of hard work, it still can easily guess/assume wrong, so it really pays to provide hints or analyze tables for anything more than a simple lookup.

13:48 Reply: BUG carray.c memcpy() buffer overflow (artifact: 191ad0797f user: drh)

Now fixed on trunk. Thanks for the bug report.

13:30 Post: BUG carray.c memcpy() buffer overflow (artifact: 48e525b266 user: Kaktusbot)

Function in carray.c

413 SQLITE_API int sqlite3_carray_bind

There is memcpy() in case of data type != CARRAY_TEXT which implies that sz is size of data type and nData is number of array entries.

467 memcpy(pNew->aData, aData, sz*nData);

But in this section earlier sz was already multiplied by size of data type.

433 sqlite3_int64 sz = nData;
434 switch( mFlags & 0x03 ){
435   case CARRAY_INT32:   sz *= 4;              break;
436   case CARRAY_INT64:   sz *= 8;              break;
437   case CARRAY_DOUBLE:  sz *= 8;              break;
438   case CARRAY_TEXT:    sz *= sizeof(char*);  break;
439 }

And in fact it was properly handled on another line but not on the 467 memcpy one

446 pNew->aData = sqlite3_malloc64( sz );

This leads to memcpy read and write overflows which makes my app crash spontaneously.

After I've changed line 467 as showed below crashes stopped.

467 memcpy(pNew->aData, aData, sz);

12:18 Edit reply: INDEX usage on inequality and IN (artifact: 844bbb2ca8 user: slavin)

You have 1,000,000 rows, and only 4 possible values. You are searching for all rows other than those with 1 of those 4 values. The column is indexed, so an index is available.

Under those circumstances the index may be useful. SQLite would figure that out automatically, and you don't need to write any special SQL to make it happen. If SQLite doesn't take advantage of the index, it's because it has figured out that reading the whole table and skipping the rows where type=1 is faster than using the index. (Please note that this is more complicated than you think, since SQLite has to retrieve information from the table for the rows you selected, so it has to read the table anyway.)

If you want to be sure SQLite has the best information to choose the fastest methods, put realistic data in the table and use the ANALYZE command:

https://www.sqlite.org/lang_analyze.html

You can use the command just once. The results are stored in the database and, unless the character of your data changes radically, you don't need to ANALYZE again.

12:17 Reply: INDEX usage on inequality and IN (artifact: 567ef6bfa4 user: slavin)

You have 1,000,000 rows, and only 4 possible values. You are searching for all rows other than those with 1 of those 4 values. The column is indexed, so an index is available.

Under those circumstances the index may be useful. SQLite would figure that out automatically, and you don't need to write any special SQL to make it happen. If SQLite doesn't take advantage of the index, it's because it has figured out that reading the whole table and skipping the rows where type!=1 is faster than using the index. (Please note that this is more complicated than you think, since SQLite has to retrieve information from the table for the rows you selected, so it has to read the table anyway.)

If you want to be sure SQLite has the best information to choose the fastest methods, put realistic data in the table and use the ANALYZE command:

https://www.sqlite.org/lang_analyze.html

You can use the command just once. The results are stored in the database and, unless the character of your data changes radically, you don't need to ANALYZE again.

12:11 Reply: Serialize database to text (artifact: cd4f68bc07 user: sergeylapin)

Thanks!

11:41 Reply: Serialize database to text (artifact: a4b831eb85 user: itroot)

https://www.sqlite.org/cli.html#converting_an_entire_database_to_an_ascii_text_file

sqlite> .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE test(info TEXT);
INSERT INTO test VALUES('info1');
COMMIT;
sqlite> 
11:26 Reply: How do i submit a bug report (artifact: d04f5871e3 user: drh)

Lemon is a command-line tool. Nobody runs Lemon who does not already have full shell access on the target machine. So memory safety issues in Lemon don't really matter that much.

Lemon generates code that is used by SQLite. But Lemon is not itself part of SQLite. If you find faults in the code that Lemon generates, that is an issue. But faults in Lemon itself are scarcely a concern.

10:56 Reply: INDEX usage on inequality and IN (artifact: 882fa9fdf4 user: anonymous)

Another example is with LIKE operator if you do WHERE LIKE 'TEST%' explainer tells you will use index WHERE NOT LIKE 'TEST%' just scan table

but you could end up in an example like that in the NOT LIKE statement gives you less result than the first

09:37 Post: Serialize database to text (artifact: 088037a20a user: sergeylapin)

Hi, I recently read https://sqlite.org/whynotgit.html article and agree a lot with points made there. However, I am not ready to make a migration from git to fossil, yet. Currently, main obstacle for me in sqlite-git integration is diffing, for some reason binary is always new from git point of view. This issues is described really well in this article - https://ongardie.net/blog/sqlite-in-git.

Diego Ongaro, suggest to implement driver to serialize/deserialize on commit sqlite db, which is what I am doing right now. Unfortunately, I have failed to find how to serialize to sqlite to op log.

Can somebody help me with that?

import sqlite3 from 'sqlite3';
import path from 'path';
import fs from 'fs';
import { execSync } from 'child_process';
import { root } from '../core';

let log: string[] = []
describe('sqlite driver', () => {
  it('unpack a to text', () => {
    log = [
      "CREATE TABLE test(info TEXT)",
      "INSERT INTO test (info) VALUES ('info1')"
    ]
    // how to get these ^^^ ?
  });

  it('pack a previously unpacked to text to b', () => {
    if (fs.existsSync(path.join(__dirname, 'b.db'))) {
      fs.unlinkSync(path.join(__dirname, 'b.db'))
    }

    let bDb = new sqlite3.Database(path.join(__dirname, 'b.db'));

    bDb.serialize(() => {
      for (let line of log) {
        bDb.run(line);
      }
    });
    expect(execSync(`cd ${root} && git diff --name-only`).toString()).toBe('');
  });

  it('copy b to c and check that git does not have changes', () => {
    execSync(`cp ${__dirname}/b.db ${__dirname}/c.db`).toString();
    expect(execSync(`cd ${root} && git diff --name-only`).toString()).toBe('');
  });
});
06:52 Reply: How do i submit a bug report (artifact: 86b3a5c93d user: ddevienne)

Well, Lemon is not critical, in the sense that it is run only at build-time,
not runtime, and thus any vulnerability you may find is not essential to SQLite.

That's different about the C code Lemon generates, which IS used in SQLite.
So given how busy DRH is, you may find your reports not quickly acted on I suspect, if at all.

Also note that Lemon's purpose is to serve SQLite primarily.
Richard has made Lemon fixes reported/provided on the list several times,
even though those fixes didn't affect SQLite's own use-case with Lemon.
Still, support on Lemon cannot be considered in the same category as SQLite's.

My $0.02, from a non-team member.

06:16 Reply: INDEX usage on inequality and IN (artifact: 51a888675d user: anonymous)

Ok understand, but suppose you have 1.000.000 records and only 4 possible values for type. So basically you have 25% of type=1 25% type=2 and so on... So if I have an index on type and I do WHERE type!=1 the index can help me to skip 25% of records, is it correct? Otherwise how can I rewrite my query to benefit from index?

02:00 Reply: How do i submit a bug report (artifact: 09083e25e5 user: stephan)

How do i submit a bug report

Such reports should be sent directly to the project lead, Richard Hipp:

https://sqlite.org/support.html

01:31 Post: How do i submit a bug report (artifact: 1da3fa19b8 user: anonymous)

Hello. I am a cyber security student. I have found multiple memory corruption vulnerabilities within the lemon LALR parser resulting in controlling multiple registers. I believe that a return to libc exploit is likely to be possible and i would like info on how to properly report this and go through responsible disclosure.

I have done a lot of crash analysis in gdb and I have screenshots and further documentation that the vulnerability does indeed exist.

Thanks, Cyber Security Researcher.

2021-10-19
23:24 Reply: can I insert an 8GB file into sqlite? (artifact: 20ef934fda user: KevinYouren)

Andrew,

is there any more progress with your requirement?

The largest user type file I have is panlex_lite.zip which is 2.7G,

and it contains a 7.2 GB sqlite database.

15:37 Edit reply: INDEX usage on inequality and IN (artifact: 49c57f4368 user: casaderobison)

Imagine you have a table with 1000000 rows. The id column is guaranteed to be unique. So when you search for a specific id value, SQLite knows that it can perform a search in logarithmic time by using the index to find the one true record.

When you search for all rows that don't match a single id value, SQLite knows that you will match either all rows or all but one row. Thus the advantage to using the index does not exist when your query will return N or N-1 rows.

The same holds true for the type column without the uniqueness guarantee. The query plan might be different based on statistics, but at this point in time that is the best information that SQLite has available.

15:37 Reply: INDEX usage on inequality and IN (artifact: 7b8214a5e1 user: casaderobison)

Imagine you have a table with 1000000 rows. The id column is guaranteed to be unique. So when you search for a specific id value, SQLite knows that it can perform a search in logarithmic time by using the index to find the one true record.

When you search for all rows that don't match a single id value, knows that you will match either all rows or all but one row. Thus the advantage to using the index does not exist when your query will return N or N-1 rows.

The same holds true for the type column without the uniqueness guarantee. The query plan might be different based on statistics, but at this point in time that is the best information that SQLite has available.

More ↓