SQLite User Forum

Few potential issues found in code
Login

Few potential issues found in code

(1) By Ales Nezbeda (anezbeda) on 2024-08-08 10:28:08 [source]

Hi, during static analysis, there were 4 issues that seem to not be false positives.

I have taken care to not burden you with issues that are false positives, so I honestly feel the issues I will mention are an actual issues. I see that most upstreams do not care about static analysis, so I tried to filter the issues it found as much as possible.

Issue #1: Possible overflow in ext/fts3/fts3_snippet.c L:401

We check for truncation on L:393 and allow only max int value - 1. Problem is that pIter->nSnippet is capped at values -64 to +64, that also means that, to my knowledge, on the L:401 iStart = iEnd - pIter->nSnippet + 1; we could have pIter->nSnippet in negative numbers and the iStart and consecutively pIter->iCurrent could overflow. As far as I know, pPhrase->iHead which is assigned to iEnd depends on an external data, so I don't know if it's safe to assume that the L:401 can't overflow.

Issue #2: va_end is not called before returning from a function in ext/recover/sqlite3recover.c L:366

va_end is only called in case there are arguments available, checked by the if statement, but based on the docs https://en.cppreference.com/w/c/variadic/va_end, not calling va_end before returning from function where va_start was called is an undefined behavior.

Issue #3: Accessing freed memory and double free in ext/fts5/fts5_expr.c freeing on L:2410 and accessing on L:2430

This is a more complicated issue and requires larger change to fix. Test suite doesn't trigger this issue, as a very specific state has to happen. It is completely possible that this can not happen in normal circumstances, but I haven't found anything that would prove this in code.

The problem is that the function fts5ExprAddChildren that does the freeing of the variables pLeft and pRight is not behaving the same with the contents of the argument at all times. There are two ways the function will handle copying data. In one case it will just assign the pointer from aforementioned variables to another variable, in other it will copy all contents of the variable to another one and then frees the pointer. Deleting the free here will result in mem leak.

It then checks whether we added more children than allowed with the SQLITE_FTS5_MAX_EXPR_DEPTH macro. If so, we end up in the part of the code where the suspected double free from SAST is - SAST thinks that the function sqlite3Fts5ParseNodeFree(pLeft) and sqlite3Fts5ParseNodeFree(pRight) will cause double free. It actually isn't doing the double free - it would eventually, but first it tries to free all the children, so it tries to access the variable and that causes the SQLite to segfault, since it accesses already freed memory.

This only happens when two if statements both return true at the same time. First, in a function fts5ExprAddChildren expression p->eType!=FTS5_NOT && pSub->eType==p->eType must be true. This will result in copying the memory of the pSub argument, rather than just assigning that pointer to somewhere else, and freeing the pSub argument. After that, in a function, sqlite3Fts5ParseNode expression pRet->iHeight>SQLITE_FTS5_MAX_EXPR_DEPTH must be true. This will result in an error state where we do the freeing part at the end of the function, which will segfault.

This combination of both expressions returning true doesn't happen in the test suite. But both expressions do return true in some of the test cases, but never together at the same time. So this segfault is replicatable for example by adding this code to the fts5ExprAddChildren function at the end of the code in the if statement (after p>nChild += pSub>nChild) and running the test suite as it will force both expressions to return true eventually.

p->iHeight = 88888;
return 1; 
NOTE: As I previously stated, I have no idea whether this could actually happen in a real life environment, so this is a very crude way of replicating this issue, that might be completely detached from reality.

This will force the 'if' expression checking whether we added too many children to always return true.

One of the ways of fixing this is changing the function fts5ExprAddChildren from void to int, and returning different values based on whether we have copied the data and freed them, or not. We also remove the free from this function.

We then add an else statement for the 'if' checking the children count, where we free the pLeft and pRight variables if needed. So instead of freeing in the fts5ExprAddChildren function, we free in the sqlite3Fts5ParseNode function, but only after we check whether we do not have too many children. If we do have we won't free it as the offending function, that would normally cause a segmentation fault, sqlite3Fts5ParseNodeFree would do that for us on the final complete cleanup.

Issue #4: Potentially invalid index in ext/expert/sqlite3expert.c L:1517

There is an assert asserting that iSlot<=p->nSlot. By SQLite docs you can depend on assert being true, but this assert is only checking for index being too large, but not that it is > 0. To my knowledge, the iSlot variable is sourced from external (user?) input so it could also be negative, which would be an invalid index. How do we even know that it is iSlot<=p->nSlot, what's the meaning of assert in this case. Or did I get the code wrong, and it is not just sourced directly from external source like user input, but actually checked somewhere else. Also isn't the iSlot<=p->nSlot supposed to be iSlot<p->nSlot as nSlot seems to be a number of slots, so I would think the last index in &p->aSlot would be p->nSlot - 1.

Thank you for clarifying/addressing these issues. For issue 3 I do have a patch prepared, but since you do not accept contributions by default, I won't post the code here, since it would just complicate your work (in case it is an actual problem) since you would have to come up with a different fix to not use code you are not completely sure is public domain. Since I am a new SQLite maintainer for Fedora, I was thinking whether it would be possible to submit the 'affidavit' you mention at https://sqlite.org/copyright.html so in future cases I would be able to post PRs. If you have any document prepared it would have to go through Red Hat Legal team first, but I couldn't find any document like this on SQLite website, so I thought it would be easier to ask first here whether you even allow this.

Thank you very much for reading this long post

(2.1) By Stephan Beal (stephan) on 2024-08-08 10:57:01 edited from 2.0 in reply to 1 [link] [source]

L:401 ... L:366 ...

For future reference: when referring to the source code, always provide the hash of the checkin you're referring to, as without that the line numbers may well not match what we're looking at.

va_end is not called before ...

That one is now resolved. The others will require closer evaluation. Thank you for providing such detailed analysis!

(3) By Stephan Beal (stephan) on 2024-08-08 11:02:11 in reply to 1 [link] [source]

PS:

I won't post the code here, since it would just complicate your work (in case it is an actual problem) since you would have to come up with a different fix to not use code you are not completely sure is public domain.

It's okay to post patches here. Though we cannot take them verbatim, they are often treated as a proof-of-concept and reimplemented.

Since I am a new SQLite maintainer for Fedora, I was thinking whether it would be possible to submit the 'affidavit' you mention at https://sqlite.org/copyright.html so in future cases I would be able to post PRs. If you have any document prepared it would have to go through Red Hat Legal team first, but I couldn't find any document like this on SQLite website, so I thought it would be easier to ask first here whether you even allow this.

Please contact the project lead directly about that: drh at this domain.

(4) By Ales Nezbeda (anezbeda) on 2024-08-08 11:23:08 in reply to 2.1 [link] [source]

Oh, right, I didn't realize that, my apologies. Just to clarify now, all line numbers are referencing code from published sqlite-src-3460000.zip. So published code version 3.46.0 (so as far as I know 96c92aba00c8375bc32fafcdf12429c58bd8aabfcadab6683e35bbb9cdebf19e should be the hash of checkin)

(5) By Ales Nezbeda (anezbeda) on 2024-08-08 11:32:12 in reply to 3 [link] [source]

Alright, thank you very much for the clarification. I will contact the project lead at the provided address.

Here is the fix that should resolve the issue 3. It does pass all tests when checked both via the reproducer I mentioned and without the reproducer. There probably is a better way, but from my testing it does work. At least I think it could work as a proof of concept as you said.

From: Ales Nezbeda <anezbeda@redhat.com>
Date: Wed, 7 Aug 2024 09:56:46 +0200
Subject: [PATCH] Fix possible segmentation fault and double free in
 sqlite3Fts5ParseNode

---
 ext/fts5/fts5_expr.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/ext/fts5/fts5_expr.c b/ext/fts5/fts5_expr.c
index 05c1b59..b6c7428 100644
--- a/ext/fts5/fts5_expr.c
+++ b/ext/fts5/fts5_expr.c
@@ -2258,19 +2258,19 @@ static void fts5ExprAssignXNext(Fts5ExprNode *pNode){
   }
 }
 
-static void fts5ExprAddChildren(Fts5ExprNode *p, Fts5ExprNode *pSub){
+static int fts5ExprAddChildren(Fts5ExprNode *p, Fts5ExprNode *pSub){
   int ii = p->nChild;
   if( p->eType!=FTS5_NOT && pSub->eType==p->eType ){
     int nByte = sizeof(Fts5ExprNode*) * pSub->nChild;
     memcpy(&p->apChild[p->nChild], pSub->apChild, nByte);
     p->nChild += pSub->nChild;
-    sqlite3_free(pSub);
   }else{
     p->apChild[p->nChild++] = pSub;
   }
   for( ; ii<p->nChild; ii++){
     p->iHeight = MAX(p->iHeight, p->apChild[ii]->iHeight + 1);
   }
+  return p->eType!=FTS5_NOT && pSub->eType==p->eType;
 }
 
 /*
@@ -2407,8 +2407,8 @@ Fts5ExprNode *sqlite3Fts5ParseNode(
             }
           }
         }else{
-          fts5ExprAddChildren(pRet, pLeft);
-          fts5ExprAddChildren(pRet, pRight);
+          int cLRet = fts5ExprAddChildren(pRet, pLeft);
+          int cRRet = fts5ExprAddChildren(pRet, pRight);
           if( pRet->iHeight>SQLITE_FTS5_MAX_EXPR_DEPTH ){
             sqlite3Fts5ParseError(pParse, 
                 "fts5 expression tree is too large (maximum depth %d)", 
@@ -2416,6 +2416,9 @@ Fts5ExprNode *sqlite3Fts5ParseNode(
             );
             sqlite3_free(pRet);
             pRet = 0;
+          } else {
+            if (cLRet) sqlite3_free(pLeft);
+            if (cRRet) sqlite3_free(pRight);
           }
         }
       }
-- 
2.45.2

(6) By anonymous on 2024-08-08 14:44:17 in reply to 2.1 [link] [source]

Wouldn't it make more sense to move the va_start inside the if (zFmt) rather than move the va_end outside it?

(7) By Stephan Beal (stephan) on 2024-08-08 15:15:00 in reply to 6 [link] [source]

Wouldn't it make more sense to move ...

Possibly so, but it's really a case of the proverbial "six one way and half-a-dozen the other."

The only places that function is used with a falsy zFmt are allocation errors and a serious internal misuse error (which "cannot happen") when calling sqlite3_bind_value(). As those will never be called during normal operation, that if() block will, in effect, always be entered, so there would seem to be little benefit now to moving the va_list handling into that block.

(8) By Dan Kennedy (dan) on 2024-08-08 15:29:44 in reply to 1 [link] [source]

Thanks for looking into all these.

Issue #1: Possible overflow in ext/fts3/fts3_snippet.c L:401

We check for truncation on L:393 and allow only max int value - 1. Problem is that pIter->nSnippet is capped at values -64 to +64,

I think it's guaranteed to be >=0. Because here:

https://sqlite.org/src/info/610328fe12?ln=1484

Will add an assert().

Issue #3: Accessing freed memory and double free in ext/fts5/fts5_expr.c freeing on L:2410 and accessing on L:2430

I don't think it can, because the case in fts5ExprAddChildren() that does call sqlite3_free does not increase the depth of the expression. Therefore the if(...) condition in the caller cannot be true.

But it's not really good code either. So we'll change it to make it clear that ownership of the children passes to the parent once fts5ExprAddChildren() is called and they are deleted as part of the parent (pRet) in cases where fts5ExprAddChildren() has been called.

Issue #4: Potentially invalid index in ext/expert/sqlite3expert.c L:1517

The assert() is incorrect as you say - should be assert( iSlot<p->nSlot );. Will fix that.

That SQL function is only available to code inside sqlite3expert.c though, so it's safe enough to assert() things about the arguments passed to it.

Thanks again,

Dan.

(9) By Ales Nezbeda (anezbeda) on 2024-08-09 08:53:21 in reply to 8 [link] [source]

I think it's guaranteed to be >=0. Because here:

https://sqlite.org/src/info/610328fe12?ln=1484

You are completely right, I was even checking the if statement to make sure the nFToken can't be outside the -64 - +64 range on the true side, but I completely missed that we make it a positive number if it is negative. My apologies.

I don't think it can, because the case in fts5ExprAddChildren() that does call sqlite3_free does not increase the depth of the expression.

Good, I mean I wasn't sure if it did or not, but if it doesn't, that's perfect, and it's another false positive.

Thank you for the code change, it really helps as overall the SQLite codebase is really readable so you are doing a great job.

The assert() is incorrect as you say - should be assert( iSlot

nSlot );. Will fix that.

Shouldn't we also assert assert( iSlot>=0 ); to prevent accidental negative invalid index? It probably shouldn't ever happen by mistake, but I think it might clarify the code.

Also, I am glad for the clarification for whether we can depend just on assert here. I was confused where the argv is coming from.