SQLite Forum

xBestIndex constraints: discern between JOIN and regular constraints
Login

xBestIndex constraints: discern between JOIN and regular constraints

(1) By Gunter Hick (gunter_hick) on 2021-06-14 11:12:40 [link] [source]

with SQLite 3.24.0 (sorry, no newer version in production here yet) I have two virtual tables corresponding to the following table definitions:

CREATE TABLE privilege (
module_def_no INTEGER,
user_person_id INTEGER,
module_action_no INTEGER,
read_access_allowed INTEGER,
write_access_allowed INTEGER,
UNIQUE (module_def_no, user_person_id, module action no);


CREATE TABLE assmnt(
module_def_no INTEGER,
state_def_no INTEGER,
module_action_no INTEGER,
state_transition_no INTEGER,
entity_id INTEGER,
UNIQUE (module_def_no, module_action_no,state_def_no,state_transition_no));


When preparing the following statement


SELECT ... FROM privilege p, assmnt a
<C1> WHERE a.state_def_no=13
<C2>  AND a.module_def_no=14076
<C3>  AND p.module_def_no=a.module_def_no
<C4>  AND p.user_person_id=1002
<C5>  AND p.module_action_no=a.module_action_no
<C6>  AND p.write_access_allowed = 1
<C7>  AND a.module_action_no NOT IN (12);


the BestIndex function are called thus (with answers shown):

Note: Constraint information is formatted "C" for constraint, field name, operation : argvIndex, "O" if Omit flags is set
Note: Return information is formatted "R" for result, index name, (index number), cost, rows, "Unique" if Unique flag is set

Call #1

with all four fields of privilege table available, i.e. RHS of the JOIN,
Bestindex responds with the fields of the index in index order and result is unique

2021-06-14 11:19:19.215: B  BestIndex for table privilege
2021-06-14 11:19:19.215: Constraints
2021-06-14 11:19:19.216: C  module_def_no                   EQ  : 1 O
2021-06-14 11:19:19.216: C  user_person_id                  EQ  : 2 O
2021-06-14 11:19:19.216: C  module_action_no                EQ  : 3 O
2021-06-14 11:19:19.216: C  write_access_allowed            EQ  : 0
2021-06-14 11:19:19.216: R  privilege_idx    (1) Cost   15.39 Rows 1 Unique

Call #2

with only the fields available to the LHS of the JOIN being usable,
BestIndex responds with a full table scan returning all rows

2021-06-14 11:19:19.216: B  BestIndex for table privilege
2021-06-14 11:19:19.216: Constraints
2021-06-14 11:19:19.216: C (module_def_no                   EQ) : 0
2021-06-14 11:19:19.216: C  user_person_id                  EQ  : 0
2021-06-14 11:19:19.216: C (module_action_no                EQ) : 0
2021-06-14 11:19:19.216: C  write_access_allowed            EQ  : 0
2021-06-14 11:19:19.216: R  privilege_idx    (1) Cost 43061.39 Rows 43046

Call #3

with all 5 fields available, i.e. the RHS of the JOIN,
BestIndex responds with a partial key scan using the first three index fields

     B  BestIndex for table assmnt
     Constraints
<C1> C  state_def_no                    EQ  : 3 O
<C2> C  module_def_no                   EQ  : 1 O
<C7> C  module_action_no                NE  : 0
<C5> C  module_action_no                EQ  : 2 O
<C3> C  module_def_no                   EQ  : 0
     R  assmnt_idx (1) Cost   14.34 Rows 5

Call #4

with only the fields available to the LHS of the JOIN being usable,
BestINdex responds with a partial key scan using the first field

2021-06-14 11:19:19.216: B  BestIndex for table assmnt
2021-06-14 11:19:19.216: Constraints
2021-06-14 11:19:19.216: C  state_def_no                    EQ  : 0
2021-06-14 11:19:19.216: C  module_def_no                   EQ  : 1 O
2021-06-14 11:19:19.216: C  module_action_no                NE  : 0
2021-06-14 11:19:19.216: C (module_action_no                EQ) : 0
2021-06-14 11:19:19.216: C (module_def_no                   EQ) : 0
2021-06-14 11:19:19.216: R  assmnt_idx (1) Cost  137.34 Rows 128

this resulta in the

QUERY PLAN
|--SCAN TABLE assmnt VIRTUAL TABLE INDEX 1:
`--SCAN TABLE privilege VIRTUAL TABLE INDEX 1:

How can the BestIndex function determine if a given constraint comes from a "result set limiting" constraint or from a "join constraint"?
In the case of module_action_no, looking at the operation (EQ vs. NE) is sufficient to ensure that the value passed to the Filter function comes from the JOIN.
In the case of module_def_no, the case is not so clear; indeed, if forced to use the "wrong" query plan, I can see that SQLite is checking p.module_def_no against the constant  (addr 37,10,11) and passing a copy of the constant to the Filter function (addr 12..17) instead of a copy of the field value. 

addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     35    0                    00  NULL
1     VOpen          0     0     0     vtab:DD205A0   00  NULL
2     VOpen          1     0     0     vtab:DD221F0   00  NULL
3     Integer        1     1     0                    00  NULL
4     Integer        0     2     0                    00  NULL
5     VFilter        0     34    1                    00  NULL
6       VColumn        0     1     3                    00  NULL
7       Ne             4     33    3     (BINARY)       53  NULL
8       VColumn        0     4     5                    00  NULL
9       Ne             6     33    5     (BINARY)       54  NULL
10      VColumn        0     0     7                    00  NULL
11      Ne             8     33    7     (BINARY)       53  NULL
12      Integer        14076  11    0                    00  NULL
13      VColumn        0     2     12                   00  NULL
14      Integer        13    13    0                    00  NULL
15      Integer        1     9     0                    00  NULL
16      Integer        3     10    0                    00  NULL
17      VFilter        1     33    9                   00  NULL
18        VColumn        1     0     14                   00  NULL
19        Ne             14    32    7     (BINARY)       53  NULL
20        VColumn        1     2     15                   00  NULL
21        Eq             16    32    15    (BINARY)       53  NULL
22        Concat         15    22    21                   00  NULL
23        Concat         22    21    17                   00  NULL
24        Copy           15    21    0                    00  NULL
25        Function0      0     21    18    hex(1)         01  NULL
26        VColumn        0     2     23                   00  NULL
27        Concat         23    22    21                   00  NULL
28        Concat         22    21    19                   00  NULL
29        Copy           23    21    0                    00  NULL
30        Function0      0     21    20    hex(1)         01  NULL
31        ResultRow      17    4     0                    00  NULL
32      VNext          1     18    0                    00  NULL
33    VNext          0     6     0                    00  NULL
34    Halt           0     0     0                    00  NULL
35    Integer        1002  4     0                    00  NULL
36    Integer        1     6     0                    00  NULL
37    Integer        14076  8     0                    00  NULL
38    Integer        12    16    0                    00  NULL
39    String8        0     22    0     *              00  NULL
40    Goto           0     1     0                    00  NULL

(2) By Richard Hipp (drh) on 2021-06-14 12:24:00 in reply to 1 [source]

How can the BestIndex function determine if a given constraint comes from a "result set limiting" constraint or from a "join constraint"?

I'm not sure what those two categories of constraint mean. SQLite certainly doesn't make any such distinctions anywhere internally, as far as I know.

(3) By Gunter Hick (gunter_hick) on 2021-06-14 14:17:27 in reply to 2 [link] [source]

A "result set limiting" constraint associates a field with a constant (or a variable), whereas a "join constraint" defines a relation between two fields of the tables being joined.

There are three (not including syntax variations) logically equivalent ways to express an equijoin between two tables limited to a certain value of the field used:

  1. T1.a == x AND T1.a == T2.a
  2. T2.a == x AND T1.a == T2.a
  3. T1.a == x AND T2.a == x

Form 1 favors T1 in the outer loop Form 2 favors T2 in the outer loop Form 3 is indifferent

Combining Form 1 with a cross join with T2 in the outer loop makes SQLite check T2.a == x inside the full table scan loop instead of asking for a partial key scan of the same condition.