SQLite Forum

Timeline

50 events by user casaderobison occurring on or before 2021-08-02 22:49:30.

2021-08-02
22:49 Reply: null character sorts greater than 0x7f and less than 0x80 artifact: 0755290000 user: casaderobison

+1

16:42 Edit reply: null character sorts greater than 0x7f and less than 0x80 artifact: d8ae3db1ad user: casaderobison

This is because UTF-8 in SQLite is actually "a stream of bytes". If you put "garbage" into a UTF-8 field (such as the common "Modified UTF-8", which encodes NUL as C0 80), SQLite will dutifully store it and return it to you as long as you ask for it in UTF-8. Or for that matter, you can store any encoding that does not embed a null in a UTF-8 text field; you are just responsible for knowing what to do with it in the end. Such is the nature of SQLite text fields.

One could write a user-defined function, say "valid_utf8", that verifies a field is truly UTF-8 and use it in CHECK constraints to guarantee that no ill-formed UTF-8 is injected. At this point SQLite couldn't (so it seems to me) change how it stores UTF-8 text fields without breaking someone.
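If it helps, here is a minimal sketch of that idea using Python's built-in sqlite3 module; the function and table names are invented for illustration. Note that SQLite only accepts deterministic functions inside CHECK constraints, hence `deterministic=True` (Python 3.8+):

```python
import sqlite3

def valid_utf8(value):
    """Return 1 if value is well-formed UTF-8, else 0."""
    if isinstance(value, str):
        return 1  # the driver already decoded it successfully
    try:
        value.decode("utf-8", errors="strict")  # strict decode rejects C0 80
        return 1
    except (UnicodeDecodeError, AttributeError):
        return 0

conn = sqlite3.connect(":memory:")
# Must be registered before the schema referencing it is created/parsed.
conn.create_function("valid_utf8", 1, valid_utf8, deterministic=True)
conn.execute("CREATE TABLE t(x TEXT CHECK (valid_utf8(x)))")

conn.execute("INSERT INTO t VALUES (?)", ("hello",))  # accepted
try:
    # "Modified UTF-8" NUL, bound as a blob here for demonstration
    conn.execute("INSERT INTO t VALUES (?)", (b"\xc0\x80",))
except sqlite3.IntegrityError:
    pass  # rejected by the CHECK constraint
```

One wrinkle: any connection that opens this database later must also register valid_utf8, or statements touching the table will fail with "no such function".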

16:39 Edit reply: null character sorts greater than 0x7f and less than 0x80 artifact: 99412b9ade user: casaderobison

This is because UTF-8 in SQLite is actually "a stream of bytes". If you put "garbage" into a UTF-8 field (such as the common "Modified UTF-8", which encodes NUL as C0 80), SQLite will dutifully store it. Or for that matter, you can store any encoding that does not embed a null in a UTF-8 text field; you are just responsible for knowing what to do with it in the end. Such is the nature of SQLite text fields.

One could write a user-defined function, say "valid_utf8", that verifies a field is truly UTF-8 and use it in CHECK constraints to guarantee that no ill-formed UTF-8 is injected. At this point SQLite couldn't (so it seems to me) change how it stores UTF-8 text fields without breaking someone.

16:38 Reply: null character sorts greater than 0x7f and less than 0x80 artifact: 34b2862d6f user: casaderobison

This is because UTF-8 in SQLite is actually "a stream of bytes". If you put "garbage" into a UTF-8 field (such as the common "Modified UTF-8", which encodes NUL as C0 80), SQLite will dutifully store it. Or for that matter, you can store any encoding that does not embed a null in a UTF-8 text field; you are just responsible for knowing what to do with it in the end. Such is the nature of SQLite text fields.

One could write a user-defined function, say "valid_utf8", that verifies a field is truly UTF-8 and use it in CHECK constraints to guarantee that no ill-formed UTF-8 is injected. At this point SQLite couldn't (so it seems to me) change how it stores UTF-8 text fields without breaking someone.

2021-07-26
03:33 Reply: sqlite3_carray_bind and constness artifact: 53a426cf75 user: casaderobison
2021-07-14
21:08 Reply: How much would you trust a page-level checksum ? artifact: 585a771655 user: casaderobison

I don't know enough about exactly how SQLite does the field comparison. Presumably the existing data is in the cache, so it can do a byte-for-byte comparison. If it is a "typical string" (smallish) or an integer or double, then comparing the fields is probably much faster. If the field is a large text or blob, then perhaps comparing the previously computed hash of a block in the cache to a freshly computed hash of the to-be-written block would be faster.

This introduces the added complexity of what to do with a really large text or blob that spans two or more pages.

Ultimately, the size of the changes is going to be the driver of whether page based hashing or field based comparisons are more efficient, and the tipping point will depend on the complexity of the hashing algorithm used.

2021-07-06
18:28 Reply: A format string vulnerability in tool used to help build SQLite's TCL extension on Windows artifact: 40a5b32984 user: casaderobison

+1

2021-07-03
18:50 Edit reply: cannot create tables in sqlitestudio than other apps can see artifact: a2b549b9d7 user: casaderobison

A full path would have a number of folder names in it. Even if the backslashes were being eaten there would be many characters between the 'C:' and the filename.

Correct. My comment was based on an earlier comment where it was said that the root drive was being used, hence only needing to have a single unescaped backslash character.

Your comment is exactly correct, a fully qualified path needs to be used (I think) to resolve this problem. I was only pointing out that the particular test case was trying to use the root directory, which would not need a long path, and would likely result in a failure to open because of an inability (by default) to create and / or write to the root directory of the C drive in Windows 10.

18:50 Edit reply: cannot create tables in sqlitestudio than other apps can see artifact: c198fac550 user: casaderobison

A full path would have a number of folder names in it. Even if the backslashes were being eaten there would be many characters between the 'C:' and the filename.

Correct. My comment was based on an earlier comment where it was said that the root drive was being used, hence only needing to have a single unescaped backslash character.

Your comment is exactly correct, a fully qualified path needs to be used (I think) to resolve this problem. I was only pointing out that the particular test case was trying to use the root drive, which would not need a long path, and would likely result in a failure to open because of an inability (by default) to create and / or write to the root directory of the C drive in Windows 10.

18:48 Reply: cannot create tables in sqlitestudio than other apps can see artifact: 9ec639f643 user: casaderobison

Correct. My comment was based on an earlier comment where it was said that the root drive was being used, hence only needing to have a single unescaped backslash character.

Your comment is exactly correct, a fully qualified path needs to be used (I think) to resolve this problem. I was only pointing out that the particular test case was trying to use the root drive, which would not need a long path, and would likely result in a failure to open because of an inability (by default) to create and / or write to the root directory of the C drive in Windows 10.

18:38 Reply: The Untold Story of SQLite artifact: ae40dd844f user: casaderobison

Regarding the "git is the perfect vcs for linux" ... It is a very diplomatically worded statement, and perhaps drh believes it. My first thought upon hearing it was not "fossil would be better" or what have you, but "git is the adequate vcs for linux".

00:08 Reply: Integrated user management for a group of repositories artifact: c534ca9666 user: casaderobison

The ultimate reason is "no one has felt strongly enough about having this feature up to this point in time to contribute it."

By and large fossil is used by individual projects. Just because someone is interested in SQLite doesn't mean they are interested in fossil or pikchr, or any other combination.

What you are looking for is a first-class citizen of GitHub or Chisel App, which is great for people who want to offload their project to a third party. For those who don't want to do that, there isn't such a feature yet.

Whether or not your questions motivate someone to create "social project functionality" above and beyond what fossil already has (which has apparently satisfied most people up to now) remains to be seen.

2021-07-02
19:58 Reply: cannot create tables in sqlitestudio than other apps can see artifact: 1a5a7513af user: casaderobison

I believe a full path is being specified but the backslash character is being eaten by markdown because it is not escaped.

2021-07-01
18:42 Reply: cannot create tables in sqlitestudio than other apps can see artifact: 551c7a2698 user: casaderobison

Windows 10 does not allow create and/or write access to the root of the C drive by default. Thus if you try to open a database in the root as a normal user, it will appear to succeed (because SQLite doesn't actually open the file right away, deferring that until a subsequent access of the file). Only once you try to do something that accesses the database will it open the file and the journal, and since you don't have create access in the root by default, the open fails.

2021-06-29
17:51 Reply: Best practice: Save partial dates artifact: 426e23f666 user: casaderobison

You beat me by two minutes. And expressed it much better than I did.

I mean, "+1"

17:49 Delete reply: Best practice: Save partial dates artifact: 854e4335b6 user: casaderobison
Deleted
17:48 Reply: Best practice: Save partial dates artifact: 61851930c4 user: casaderobison

Not knowing more about your use case, I just thought I'd throw out a possible alternative that might work better for some use cases.

YYYY-MM-DD but instead of leaving unknown values empty, use a sentinel value. So XXXX-XX-XX if nothing is known, or ????-??-??, or something like that. Then fill in the portions that are known.

In this way, you can use simple substrings for the individual components, and you theoretically have a format that automatically aligns, making it easier to view a column of such values.

It will take more space in the database. If space is a concern where there might be many partial dates, then this might not be a suitable choice.

Just something to consider.
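A quick sketch of that layout using Python's built-in sqlite3 module (table and column names invented for illustration), showing how substr() pulls out the individual components:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events(d TEXT)")
# Fully known, year+month only, and year only -- all the same width.
conn.executemany("INSERT INTO events VALUES (?)",
                 [("1987-06-21",), ("1990-11-XX",), ("2003-XX-XX",)])

# SQLite's substr() is 1-based: chars 1-4 year, 6-7 month, 9-10 day.
rows = conn.execute(
    "SELECT substr(d,1,4) AS y, substr(d,6,2) AS m, substr(d,9,2) AS dd "
    "FROM events ORDER BY d").fetchall()
for y, m, dd in rows:
    print(y, m, dd)
```

A pleasant side effect of the 'X' sentinel is that it sorts after the digits, so within a given year the partially known dates group after the fully known ones under ordinary text ordering.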

2021-06-18
20:19 Reply: Division by decimals wrongly returning a rounded int result (works fine when casting to real) artifact: 19569c70bb user: casaderobison

I think the issue is that Django abstracts away the database access completely so that it is not a concern of the programmer, so you don't really have the option of telling it what you want it to do with this or that database engine.

2021-06-14
13:23 Edit reply: SQLite error (7): failed to HeapReAlloc 465625658 bytes (8), heap=7A20000 artifact: ee08318645 user: casaderobison

I had a program some years ago that did a very similar pattern, and the problem is that that version of FTS needs to create some really large blobs when merging FTS nodes. I don't remember all the details, but my solution at the time was to hand shard the data to keep the size of FTS tables smaller, then run multiple queries to gather the same data.

http://sqlite.1065341.n5.nabble.com/Segfault-during-FTS-index-creation-from-huge-data-td81465.html seems to be a link to a six year old thread on the same or very similar / adjacent topic.

13:23 Reply: SQLite error (7): failed to HeapReAlloc 465625658 bytes (8), heap=7A20000 artifact: 2728d9471a user: casaderobison

I had a program some years ago that did a very similar pattern, and the problem is that that version of FTS needs to create some really large blobs when merging FTS nodes. I don't remember all the details, but my solution at the time was to hand shard the data to keep the size of FTS tables smaller, then run multiple queries to gather the same data.

http://sqlite.1065341.n5.nabble.com/Segfault-during-FTS-index-creation-from-huge-data-td81465.html seems to be a link to a six year old thread on the same or very similar / adjacent topic.

2021-05-31
02:15 Reply: SQLITE_IOERR on `pragma journal_mode=wal;` artifact: dbf77f0266 user: casaderobison

I was hedging because it seems to me that if multiple processes open a remote database in WAL mode from a single host, and no other connections are open, then they'll all be sharing the same memory. That might "work" for sufficiently fuzzy values of "work".

But yes, given that remote file systems are never recommended, your point is well made.

01:29 Reply: SQLITE_IOERR on `pragma journal_mode=wal;` artifact: 8185b5d049 user: casaderobison

Is there any chance the beta user is trying to access the file over a network file system? I know the message states it uses EXT4, but if it is a typical system, there are any number of other file systems that might be mounted. Without knowing the fstab and path to the file, EXT4 isn't a guarantee.

If there is a chance of remote file system usage, that could be a problem. I don't know that it would return that exact error message, but WAL especially is intended to be used exclusively by processes on a single host due to shared memory usage.

2021-05-18
01:14 Reply: memory vs mmap artifact: 5ec1dd8821 user: casaderobison

I don't know enough about the implementation details, but I was trying to show you a big reason why an identical code path in SQLite accessing "simple raw pointers" could have wildly varying execution speeds due to the decisions made in implementing the underlying mapping object.

It is my experience that an mmap-using application will typically create one big mapping object for an entire file, then any access to that will access the underlying memory object. However, I believe that due to the design decisions when mmap was first added to SQLite, it is only used for reading (not writing) data (which is your use case) as long as the file is under some certain size. Subsequently, I seem to recall that problems were discovered with one or more platforms' mmap implementations and it was "deprecated" (that might not be the right word, but it seems right).

The page cache may perform one big allocation and then access individual pages within that, but there is no requirement that the pages be contiguous. An mmap object will necessarily be a contiguous chunk of the file.

So effectively you have one big mmap object that represents a sequential range of pages of the file, or you have a page cache which represents many potentially random pages in many page size chunks. This means that the method of accessing a memory DB will be of necessity very different than the code that accesses a mmap version, even though they both use pointers.
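For what it's worth, the read-only memory mapping mentioned above is controlled by PRAGMA mmap_size, and the value you request is silently capped by a compile-time limit (some builds disable mmap entirely, in which case the pragma reports 0). A small check, using a throwaway file database since mmap doesn't apply to :memory::

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t(x)")

# Request a 256 MiB mapping; SQLite caps this at its compile-time maximum.
conn.execute("PRAGMA mmap_size=268435456")
effective = conn.execute("PRAGMA mmap_size").fetchone()[0]
print(effective)  # whatever this build actually granted
conn.close()
```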

00:26 Reply: memory vs mmap artifact: d3012a5759 user: casaderobison

As has been written, there are many ways the mmap can be implemented. One model might be:

Allocate one page of physical memory to the mapped file object. Any time you access a different address in the mapped file object, it will write that page back to disk and load a new page from disk to physical memory.

Another might be:

Allocate one page of physical memory for each page of the mapped file object. Do not preload any data. As access is made to each page of the mapped file, a page fault is generated that will read the data from the file to physical memory.

These will both have radically different performance profiles. The first will be thrashing that one page repeatedly (assuming one is accessing random locations within the mapped file object). The other will "quickly" allocate memory, but it will slow down reading random pages from the file as one continues to access random locations.

Ultimately, a bespoke algorithm that knows a lot about how the data will be accessed will potentially be more efficient than one that must handle generic workflows.

We don't know exactly how mmap is implemented, and it has to work regardless of page visitation order.

In the case of creating a memory database, it will allocate enough space for all the pages of the database and never need to fault a page to load it from disk (except in the case of swapping, perhaps, if too much memory has been allocated).

In no way can you ever assume that accessing malloced pages is ever going to be exactly the same as mmap, as there is a lot more at play in creating memory map objects.
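The second model above (one big mapping, pages faulted in on demand) is what POSIX mmap typically gives you. A toy illustration with Python's mmap module, unrelated to SQLite itself:

```python
import mmap
import os
import tempfile

# Write a 40 KiB file, then map the whole thing read-only in one shot.
fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789" * 4096)
os.close(fd)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touching an offset faults in just that page; no explicit read() calls.
    first = mm[:10]
    last = mm[-10:]
    mm.close()
os.remove(path)
print(first, last)
```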

2021-05-13
09:53 Edit reply: How to convert am/pm time to 24hrs time format artifact: c07e6cb3a7 user: casaderobison

Though I will extend my "intuition" by this example:

12:00:01 pm
12:00:00.1 pm
12:00:00.01 pm
12:00:00.001 pm
12:00:00.0001 pm
12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

Given that "0.999..." is exactly equal to "1.0" (https://en.wikipedia.org/wiki/0.999...) it is also intuitive to me that "11:59:59.999... am" is equal to "12:00:00 pm".

But we're talking about time standards that are not universal constants, which is why we have leap days, leap seconds, daylight time, summer time, standard time, all sorts of different legal standards for them by various governments, time zones, epochs. None of it makes "sense" in that regard.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between "11P" and "1A" just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:41 Edit reply: How to convert am/pm time to 24hrs time format artifact: 75ce300a98 user: casaderobison

Though I will extend my "intuition" by this example:

12:00:01 pm
12:00:00.1 pm
12:00:00.01 pm
12:00:00.001 pm
12:00:00.0001 pm
12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

Given that "0.999..." is exactly equal to "1.0" (https://en.wikipedia.org/wiki/0.999...) it is also intuitive to me that "11:59:59.999... am" is equal to "12:00:00 pm".

But we're talking about time standards that are not universal constants, which is why we have leap days, leap seconds, daylight time, summer time, standard time, all sorts of different legal standards for them by various governments, time zones, epochs. None of it makes "sense" in that regard.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between the two just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:35 Reply: How to convert am/pm time to 24hrs time format artifact: 057b5fd250 user: casaderobison

Though I will extend my "intuition" by this example:

12:00:01 pm
12:00:00.1 pm
12:00:00.01 pm
12:00:00.001 pm
12:00:00.0001 pm
12:00:00.00[...]001 pm

This leads me to the intuitive understanding that as we get closer and closer and closer to exactly 12 noon from the right side, we are on the pm side of the line, so I think the 2008 change to make 12 pm be noon was the right move.

"11:59:59 am {tick} 12:00:00 am {tick} 12:00:01 pm" hurts my OCD.

Ultimately I agree with the idea that "exactly" noon cannot be AM or PM by the very meaning of the words. Some years ago I worked on software for a company that supported radio station programming. In building a format clock, I had a list of hours of the day that were labeled "1P", "2P", ... "11P", and I used "12M" as the hour that came between the two just to avoid the ambiguity. "12P" or "12N" doesn't have as much of a problem in my mind since it occurs in the middle of the day, but it is always confusing trying to talk to people about which day "12A" belongs to (but really it is the same problem when referring to "12M"). Until 2019 ISO 8601 allowed using 24:00:00 to refer to the end of the day but now that is forbidden.

09:21 Reply: How to convert am/pm time to 24hrs time format artifact: f0080df80a user: casaderobison

All true. This is why I labelled it "intuition" instead of "logic". I really didn't mean to come off as talking down about others' views (though clearly they are wrong when they do not conform to mine). #haha

In my mind, it makes more sense that the "am pm" designation goes with the hour of the day. That being said, it is already non-intuitive that 12 maps to 0 and 12 and 24.

Intuition and logic are different things, even though people tend to conflate them. Hence my sincere thanks for pointing out that my intuition doesn't match others and thus that I need to be more precise when communicating the "round hours" of "exactly" noon and midnight.

05:03 Reply: Unexpected output from the SELECT statement artifact: c5382cf1b0 user: casaderobison

If you type these statements:

  explain select * from v0 where (v1=20) or (v1=10 and v2=10);

  explain select * from v0 where (v1=20) union select * from v0 where (v1=10 and v2=10);

You'll see that it generates very different VM programs for the first than the last. The second explain output clearly shows that v2 is checked, but not in the first.

Someone with more knowledge about the code generator will have to speak as to why the code generator works that way at present and what was necessary to eventually fix it.
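A sketch of that experiment with Python's built-in sqlite3 module; the v0 schema here is just a stand-in guessed from the thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v0(v1 INT, v2 INT)")

# EXPLAIN returns the generated VDBE program, one opcode per row.
plan_or = conn.execute(
    "EXPLAIN SELECT * FROM v0 WHERE (v1=20) OR (v1=10 AND v2=10)").fetchall()
plan_union = conn.execute(
    "EXPLAIN SELECT * FROM v0 WHERE (v1=20) "
    "UNION SELECT * FROM v0 WHERE (v1=10 AND v2=10)").fetchall()

# The two statements compile to quite different VM programs.
print(len(plan_or), len(plan_union))
```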

03:02 Reply: How to convert am/pm time to 24hrs time format artifact: 46b5f1f98e user: casaderobison

That is an interesting article. I had no idea there was that level of "confusion".

To me, it seems intuitive that 12:00:01 am is one second after midnight, and 12:00:01 pm is one second after noon. By extension, it would be silly to consider 12:00:00 am as being offset by more than one second from 12:00:01 am. Never underestimate the ability of people to foul things up.

I do agree that when using a 12-hour clock in prose, it is probably best to use "midnight" or "noon" instead of "12 am" or "12 pm". Given the unlikely nature of any event happening exactly at the stroke of the instant of midnight or noon (instead of being offset by at least some fraction of a second), I feel people infer a precision by the term "12 am" or "12 pm" that doesn't exist, just as they infer a precision in binary floating-point arithmetic.

Thanks for sharing that. It's good to know government and legal sources don't understand (or haven't historically understood) it "intuitively" the way I do so that I can stop assuming it is as obvious as I think it is.

2021-05-06
23:25 Reply: "Office Space"... I'm losing all those pennies artifact: 352ae63a98 user: casaderobison

I suspect the discrepancy has to do with the digits of precision that float has. Float has 24 bits of precision (23 for subnormals), which amounts to about 7.22 decimal digits. MariaDB is undoubtedly going the extra mile and performing a rounding step internally before the final conversion to integer. This might be done via an IEEE 754 rounding mode, or it could be something they do in a final analysis step.

Whereas MariaDB is a dedicated process that knows its only reason for existence is to work with a database, it is free to make all the decisions about how to utilize the floating-point environment.

SQLite is but one part of a larger process, and defers those decisions to the host process. Depending on what platform you are on, it might be possible to set an IEEE 754 rounding mode that gives you the same behavior. Alternatively, you could do something like:

SELECT CAST(CAST('16.15' AS FLOAT)*100+0.5 AS INT)

Or use ROUND as was suggested previously.
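To make the arithmetic concrete (SQLite stores REALs as 64-bit doubles, so '16.15' is really 16.1499999999999985...; CAST to INT truncates toward zero rather than rounding):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Plain truncation loses the penny: 16.15*100 == 1614.9999999999998
truncated = conn.execute(
    "SELECT CAST(CAST('16.15' AS FLOAT)*100 AS INT)").fetchone()[0]
# Adding 0.5 before truncating recovers it for positive values.
nudged = conn.execute(
    "SELECT CAST(CAST('16.15' AS FLOAT)*100+0.5 AS INT)").fetchone()[0]
# ROUND() does the same job and also handles negative values correctly.
rounded = conn.execute(
    "SELECT CAST(ROUND(16.15*100) AS INT)").fetchone()[0]

print(truncated, nudged, rounded)  # 1614 1615 1615
```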

The long story short is: If all variables are the same, you will get the same answer on both platforms. If anything varies, then you will not.

What platform are you using? Operating system, compiler, runtime library, and versions.

2021-05-04
10:09 Edit reply: "Office Space"... I'm losing all those pennies artifact: c89aaaad86 user: casaderobison

That is part of the "lite" in SQLite. In the beginning there was the C-provided floating-point functionality, which served as the first-order approximation of a decimal mode. It was "lite" to use existing functionality rather than implement its own decimal numeric processing.

SQLite has never claimed to be conformant to any particular SQL standard. Few database systems fully conform to any standard.

Fortunately, there are solutions to the missing functionality for those who require it, though it is still in a non-standard format (using functions instead of operators, for example).

10:06 Reply: "Office Space"... I'm losing all those pennies artifact: 551c8c66a2 user: casaderobison

That is part of the "lite" in SQLite. In the beginning there was the C-provided floating-point functionality, which served as the first-order approximation of a decimal mode. It was "lite" to use existing functionality rather than implement its own decimal numeric processing.

SQLite has never claimed to be conformant to any particular SQL standard. Few database systems fully conform to any standard.

Fortunately, others have provided solutions to the missing functionality for those who require it.

01:47 Reply: License of SQLite pikchr RR diagrams artifact: 5194dba69c user: casaderobison

My concern was more to do with SQLite documentation. I know the compilable source code is in the public domain, but I also know there are copyright-protected portions of the project, such as SEE and certain portions of the testing infrastructure.

I just wanted to be extra sure I wasn't taking that which was not available, and I would never expect the license of pikchr to govern every image it creates.

2021-05-03
18:18 Reply: License of SQLite pikchr RR diagrams artifact: b09c446f1c user: casaderobison

Thank you.

18:09 Post: License of SQLite pikchr RR diagrams artifact: ce848e8bbb user: casaderobison

I'm finally working to finish a project I started long ago, and one of the things it "needs" is pretty railroad diagrams to illustrate the syntax / grammar of the "language". Pikchr is of course the perfect solution to this.

Is the SQLite documentation (including pikchr code) public domain or is there any problem using and adapting those "examples" for my own use?

2021-04-13
01:30 Reply: Please avoid all use of the "LL" suffix for long-long integer literals artifact: a0bea06001 user: casaderobison

Or use a macro to qualify them on a platform basis:

#if defined(_MSC_VER)
#define LL(x) (x##I64)
#else
#define LL(x) (x##LL)
#endif

if (somevar == LL(123456789012)) { somecode }
2021-04-07
16:08 Reply: open db in read only; explicitly store diff as WAL ? artifact: a39339732b user: casaderobison

I think it has been determined that SQLite doesn't do what you're asking for. One possibility that comes to mind would be to write your own VFS that treats the source database as read-only and "magically" writes all changes to a delta file of some sort. A separate utility could then merge the delta file back into the original database.

2021-03-11
02:59 Edit reply: import mishandles table name artifact: e3252f6a6b user: casaderobison

You opened an in-memory database. That database is by definition transient (aka temporary) but it is known as "main".

So why not just import into a table named summary? Then if you want to do some manipulation of the data to go into a persistent file, just attach that file with a specified name:

attach 'somefile.db' as nottemp;

Then you can do all the data manipulation you want from the temporary database named main into the persistent database named nottemp.

insert into nottemp.sometable
select some-list-of-columns
from main.summary
where some-list-of-conditions

I used that very technique for a data import project last year at work, importing awful looking CSV into a memory-based database named main, then transformed the data into a nicer format in tables in a database that I attached from a file.
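The attach-and-transform flow looks roughly like this via Python's built-in sqlite3 module (file, table, and column names invented; the actual CSV .import step would be done in the CLI):

```python
import os
import sqlite3
import tempfile

dest = os.path.join(tempfile.mkdtemp(), "somefile.db")

conn = sqlite3.connect(":memory:")  # this transient database is "main"
conn.execute("CREATE TABLE summary(name TEXT, qty INT)")
conn.executemany("INSERT INTO summary VALUES (?, ?)",
                 [("widget", 3), ("gadget", 0)])
conn.commit()  # ATTACH is not allowed inside an open transaction

conn.execute("ATTACH ? AS nottemp", (dest,))
conn.execute("CREATE TABLE nottemp.sometable(name TEXT, qty INT)")
# Transform/filter while copying from the transient db to the file db.
conn.execute("INSERT INTO nottemp.sometable "
             "SELECT name, qty FROM main.summary WHERE qty > 0")
conn.commit()
conn.close()

check = sqlite3.connect(dest)
print(check.execute("SELECT * FROM sometable").fetchall())
check.close()
```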

02:58 Reply: import mishandles table name artifact: 3282a9b78a user: casaderobison

You opened an in-memory database. That database is by definition transient (aka temporary) but it is known as "main".

So why not just import into a table named summary? Then if you want to do some manipulation of the data to go into a persistent file, just attach that file with a specified name:

attach 'somefile.db' as nottemp;

Then you can do all the data manipulation you want from the temporary database named main into the persistent database named nottemp.

insert into nottemp.sometable
select some-list-of-columns
from main.summary
where some-list-of-conditions

I used that very technique for a data import project last year at work, importing CSV into a memory-based database named main into tables in a database that I attached from a file.

2021-03-10
02:57 Reply: Type of the column artifact: 9e08293836 user: casaderobison

It's in the header file, but it looks like this:

#ifdef SQLITE_INT64_TYPE
  typedef SQLITE_INT64_TYPE sqlite_int64;
# ifdef SQLITE_UINT64_TYPE
    typedef SQLITE_UINT64_TYPE sqlite_uint64;
# else
    typedef unsigned SQLITE_INT64_TYPE sqlite_uint64;
# endif
#elif defined(_MSC_VER) || defined(__BORLANDC__)
  typedef __int64 sqlite_int64;
  typedef unsigned __int64 sqlite_uint64;
#else
  typedef long long int sqlite_int64;
  typedef unsigned long long int sqlite_uint64;
#endif
typedef sqlite_int64 sqlite3_int64;
typedef sqlite_uint64 sqlite3_uint64;

So the exact incantation to get 64 bit integers depends on the environment, and can be customized by defining SQLITE_INT64_TYPE and (optionally) SQLITE_UINT64_TYPE macros.

2021-03-09
15:51 Reply: Type of the column artifact: 0058b3b6be user: casaderobison

Other answers are excellent but maybe this will be helpful to you (or someone).

SQLITE_INTEGER doesn't tell you anything about the size of the integer. Only that it is integral. It could be stored in one of several ways, from 8- up to 64-bit (though not all variants are used internally; there is no such thing as a 10-bit integer within the SQLite storage model, for example).

If you truly don't know in advance what size of data you expect to get back from the database, you must assume it is the full 64-bit and use the int64 API.

If you want to know "in advance" what size of an integer you will need, you'll need something like the following pseudo-code:

#define EXTENDED_INT8    -1
#define EXTENDED_INT16   -2
#define EXTENDED_INT32   -3
#define EXTENDED_INT64   -4

#define EXTENDED_BETWEEN(v, l, h) ((l <= v) && (v <= h))

int extended_column_type(sqlite3_stmt* pStmt, int iCol)
{
  int type = sqlite3_column_type(pStmt, iCol);
  if (type == SQLITE_INTEGER)
  {
    sqlite3_int64 value = sqlite3_column_int64(pStmt, iCol);
    if (EXTENDED_BETWEEN(value, -128, 127))
      type = EXTENDED_INT8;
    else if (EXTENDED_BETWEEN(value, -32768, 32767))
      type = EXTENDED_INT16;
    else if (EXTENDED_BETWEEN(value, -2147483648, 2147483647))
      type = EXTENDED_INT32;
    else
      type = EXTENDED_INT64;
  }
  return type;
}

I've not tried to compile that so it might not be perfect, and there are other ways to do it which might be "better" based on some criteria, but I hope it is illustrative of how you can build your own type checker that provides finer granularity than what SQLite offers by default.

2021-02-28
20:29 Reply: Ability to: SELECT * EXCEPT(some,columns) artifact: af461a9804 user: casaderobison

Then a single string parameter with a quoted identifier list could be used. I think it is doable in some way within the context of the system without adding it to core SQLite.

2021-02-26
21:08 Reply: Ability to: SELECT * EXCEPT(some,columns) artifact: 11a3c0c693 user: casaderobison

I think the problem is that the column list as it exists is simply a concatenation of columns that are desired. A possibly qualified asterisk says "add all the columns". The proposal says "now go back and remove some of the columns I told you to add previously". While SQL engines tend to do this in the order the columns are defined in the DDL, I don't know that this is a hard requirement.

(Perhaps there should be a reverse_wildcard_columns pragma like there is a reverse_unordered_selects pragma for those who want to ensure they aren't depending on an implementation detail. I'm not recommending it, just a thought that occurred to me.)

There certainly are many things a SQL engine could do to make things easier for the interactive user, but SQLite is not an inherently interactive user tool. That it can be used that way is incidental to its embedded nature.

Anyway, I think enhancing the select list to allow column removal is not a great idea. Not that my opinion matters in the end.

If anything, maybe what would have value would be a table-valued function that takes a query followed by a list of columns to exclude from the final list. That could be done with SQL as it exists today (in as much as SQLite supports table-valued functions). The implementation of such is left as an exercise for the reader.

2021-02-20
03:52 Reply: I made an ico file to the tool sqlite3.exe artifact: 6ab0ca589b user: casaderobison

Good point. Gmail previews it, though only after filtering the data, harvesting anything that can be mined for profit, etc.

00:03 Reply: I made an ico file to the tool sqlite3.exe artifact: c5b2312549 user: casaderobison

Though it does show up in the email announcement of the post, so that's cool.

2021-02-10
17:29 Reply: Can parts of a database be protected for distribution? artifact: 51a8b46116 user: casaderobison

Depending on just how you want to structure it, this is possible.

The first thought that comes to my mind (which doesn't mean it is a great idea) is to use asymmetric encryption of the data. Your program which writes the sensitive portions of the database includes the private key and encrypts the fields that should not be editable. The program which reads the sensitive portions of the database decrypts with the public key. Any non-sensitive data need not use any encryption.

Of course, this makes the data opaque to SQL as well, so if you want to be able to make queries based on the contents of the encrypted data, you are out of luck. Unless you build the decryption function into SQL instead of just doing that in your program logic.

The next idea, which would be less complicated, would be to add a column with a hashed value of the data, then check to ensure the data matches the corresponding hash. This will be easier for the user to circumvent but adds a tiny level of intricacy above and beyond simple SQL queries.
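A minimal sketch of that hash-column idea, assuming a simple FNV-1a hash and hypothetical column names ("payload" with a companion "payload_hash"); neither the algorithm nor the names are anything SQLite prescribes:

```c
#include <stdint.h>

/* FNV-1a 64-bit hash of a NUL-terminated string. Any hash works here;
** FNV-1a is just a small, dependency-free illustration. */
uint64_t fnv1a64(const char *s)
{
  uint64_t h = 1469598103934665603ULL;   /* FNV offset basis */
  while (*s)
  {
    h ^= (unsigned char)*s++;
    h *= 1099511628211ULL;               /* FNV prime */
  }
  return h;
}

/* On write: store fnv1a64(payload) in the hash column.
** On read: recompute and compare; a mismatch means the row was edited
** outside the authoring program. */
int payload_is_intact(const char *payload, uint64_t stored_hash)
{
  return fnv1a64(payload) == stored_hash;
}
```

As noted above, a user who understands the scheme can recompute the hash themselves, so this only raises the bar slightly above plain SQL edits.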

My last idea, for now, would be to write a custom VFS for SQLite that stores data in a non-standard way that makes it unusable with off the shelf SQLite compatible tools.

2021-02-01
20:46 Reply: sqlite3_exec stepping through multiple queries artifact: 552acbf30d user: casaderobison

If you included a "sentinel query" between the two, it could help you identify the transition (as long as you know the sentinel uses a different schema than the other queries). For example:

select * from mytable where gender = 'M'; select 'end-1' as EOQ; select * from mytable where gender = 'F'; select 'end-2' as EOQ;

Just a thought. One could just as easily issue two separate sqlite3_exec calls to know when one is done and another starts.
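A sketch of how the sqlite3_exec callback might watch for those sentinel rows. The function matches the callback signature sqlite3_exec expects, but the state struct and the "EOQ" column name are assumptions carried over from the example above:

```c
#include <string.h>

/* State threaded through sqlite3_exec's fourth argument. */
typedef struct {
  int query_index;   /* which query's rows we are currently seeing */
} ExecState;

/* Matches the callback signature sqlite3_exec expects:
** int (*)(void*, int, char**, char**). A row whose only column is
** named "EOQ" came from one of the sentinel SELECTs, so bump the
** counter instead of treating it as data. */
int row_callback(void *pArg, int nCol, char **azVal, char **azColName)
{
  ExecState *state = (ExecState *)pArg;
  if (nCol == 1 && strcmp(azColName[0], "EOQ") == 0)
  {
    state->query_index++;   /* sentinel row: the previous query ended */
    return 0;
  }
  /* Ordinary data row: azVal[] belongs to query state->query_index. */
  (void)azVal;
  return 0;
}
```

You would pass `&state` as the fourth argument to sqlite3_exec and read `state.query_index` inside the callback to know which query produced the current row.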

2021-01-28
03:13 Edit reply: No more moderation? artifact: 217a2f92d5 user: casaderobison

It is simply to decrease friction. Many people object to having to provide their email and create or reuse a password for yet another forum, especially if they don't plan to be active. The point of the forum is to allow reports of defects (among other things), and the consensus seems to be that many reports would go unreported if there was any more friction than already exists.

03:12 Reply: No more moderation? artifact: b9a739f0a2 user: casaderobison

It is simply to decrease friction. Many people object to having to provide their email and create or reuse a password for yet another forum, especially if they don't plan to be active. The point of the forum is to allow reports of defects (among other things), and the consensus seems to be that many reports would go unreported if there was any more friction than already exists.
