/ Check-in [908daaa9]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Disable the OR optimization if it would conflict with column affinity coercions. Ticket #2249. Additional cleanup and testing of the OR optimization. (CVS 3658)
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 908daaa9ab86e0bd1da6d0807d6aaba240c3cee0
User & Date: drh 2007-02-23 23:13:34
Context
2007-02-24
11:52
Make sure that integer values are converted to real when pulled from a REAL table column by GROUP BY. Ticket #2251. Also make sure default values are correctly expanded. There may be other places in the code where this issue comes up - we need to look further. (CVS 3659) check-in: e11bbf17 user: drh tags: trunk
2007-02-23
23:13
Disable the OR optimization if it would conflict with column affinity coercions. Ticket #2249. Additional cleanup and testing of the OR optimization. (CVS 3658) check-in: 908daaa9 user: drh tags: trunk
14:20
Clarifications to the datatype3.html document. (CVS 3657) check-in: 4692a85e user: drh tags: trunk
Changes
Hide Diffs Side-by-Side Diffs Ignore Whitespace Patch

Changes to src/where.c.

    12     12   ** This module contains C code that generates VDBE code used to process
    13     13   ** the WHERE clause of SQL statements.  This module is reponsible for
    14     14   ** generating the code that loops through a table looking for applicable
    15     15   ** rows.  Indices are selected and used to speed the search when doing
    16     16   ** so is applicable.  Because this module is responsible for selecting
    17     17   ** indices, you might also think of this module as the "query optimizer".
    18     18   **
    19         -** $Id: where.c,v 1.237 2007/02/06 13:26:33 drh Exp $
           19  +** $Id: where.c,v 1.238 2007/02/23 23:13:34 drh Exp $
    20     20   */
    21     21   #include "sqliteInt.h"
    22     22   
    23     23   /*
    24     24   ** The number of bits in a Bitmask.  "BMS" means "BitMask Size".
    25     25   */
    26     26   #define BMS  (sizeof(Bitmask)*8)
................................................................................
   578    578   ** a join, then transfer the appropriate markings over to derived.
   579    579   */
   580    580   static void transferJoinMarkings(Expr *pDerived, Expr *pBase){
   581    581     pDerived->flags |= pBase->flags & EP_FromJoin;
   582    582     pDerived->iRightJoinTable = pBase->iRightJoinTable;
   583    583   }
   584    584   
          585  +#if !defined(SQLITE_OMIT_OR_OPTIMIZATION) && !defined(SQLITE_OMIT_SUBQUERY)
          586  +/*
          587  +** Return TRUE if the given term of an OR clause can be converted
          588  +** into an IN clause.  The iCursor and iColumn define the left-hand
          589  +** side of the IN clause.
          590  +**
          591  +** The context is that we have multiple OR-connected equality terms
          592  +** like this:
          593  +**
          594  +**           a=<expr1> OR  a=<expr2> OR b=<expr3>  OR ...
          595  +**
          596  +** The pOrTerm input to this routine corresponds to a single term of
          597  +** this OR clause.  In order for the term to be a condidate for
          598  +** conversion to an IN operator, the following must be true:
          599  +**
          600  +**     *  The left-hand side of the term must be the column which
          601  +**        is identified by iCursor and iColumn.
          602  +**
          603  +**     *  If the right-hand side is also a column, then the affinities
          604  +**        of both right and left sides must be such that no type
          605  +**        conversions are required on the right.  (Ticket #2249)
          606  +**
          607  +** If both of these conditions are true, then return true.  Otherwise
          608  +** return false.
          609  +*/
          610  +static int orTermIsOptCandidate(WhereTerm *pOrTerm, int iCursor, int iColumn){
          611  +  int affLeft, affRight;
          612  +  assert( pOrTerm->eOperator==WO_EQ );
          613  +  if( pOrTerm->leftCursor!=iCursor ){
          614  +    return 0;
          615  +  }
          616  +  if( pOrTerm->leftColumn!=iColumn ){
          617  +    return 0;
          618  +  }
          619  +  affRight = sqlite3ExprAffinity(pOrTerm->pExpr->pRight);
          620  +  if( affRight==0 ){
          621  +    return 1;
          622  +  }
          623  +  affLeft = sqlite3ExprAffinity(pOrTerm->pExpr->pLeft);
          624  +  if( affRight!=affLeft ){
          625  +    return 0;
          626  +  }
          627  +  return 1;
          628  +}
          629  +
          630  +/*
          631  +** Return true if the given term of an OR clause can be ignored during
          632  +** a check to make sure all OR terms are candidates for optimization.
          633  +** In other words, return true if a call to the orTermIsOptCandidate()
          634  +** above returned false but it is not necessary to disqualify the
          635  +** optimization.
          636  +**
          637  +** Suppose the original OR phrase was this:
          638  +**
          639  +**           a=4  OR  a=11  OR  a=b
          640  +**
          641  +** During analysis, the third term gets flipped around and duplicate
          642  +** so that we are left with this:
          643  +**
          644  +**           a=4  OR  a=11  OR  a=b  OR  b=a
          645  +**
          646  +** Since the last two terms are duplicates, only one of them
          647  +** has to qualify in order for the whole phrase to qualify.  When
          648  +** this routine is called, we know that pOrTerm did not qualify.
          649  +** This routine merely checks to see if pOrTerm has a duplicate that
          650  +** might qualify.  If there is a duplicate that has not yet been
          651  +** disqualified, then return true.  If there are no duplicates, or
          652  +** the duplicate has also been disqualifed, return false.
          653  +*/
          654  +static int orTermHasOkDuplicate(WhereClause *pOr, WhereTerm *pOrTerm){
          655  +  if( pOrTerm->flags & TERM_COPIED ){
          656  +    /* This is the original term.  The duplicate is to the left had
          657  +    ** has not yet been analyzed and thus has not yet been disqualified. */
          658  +    return 1;
          659  +  }
          660  +  if( (pOrTerm->flags & TERM_VIRTUAL)!=0
          661  +     && (pOr->a[pOrTerm->iParent].flags & TERM_OR_OK)!=0 ){
          662  +    /* This is a duplicate term.  The original qualified so this one
          663  +    ** does not have to. */
          664  +    return 1;
          665  +  }
          666  +  /* This is either a singleton term or else it is a duplicate for
          667  +  ** which the original did not qualify.  Either way we are done for. */
          668  +  return 0;
          669  +}
          670  +#endif /* !SQLITE_OMIT_OR_OPTIMIZATION && !SQLITE_OMIT_SUBQUERY */
   585    671   
   586    672   /*
   587    673   ** The input to this routine is an WhereTerm structure with only the
   588    674   ** "pExpr" field filled in.  The job of this routine is to analyze the
   589    675   ** subexpression and populate all the other fields of the WhereTerm
   590    676   ** structure.
   591    677   **
................................................................................
   712    798       WhereClause sOr;
   713    799       WhereTerm *pOrTerm;
   714    800   
   715    801       assert( (pTerm->flags & TERM_DYNAMIC)==0 );
   716    802       whereClauseInit(&sOr, pWC->pParse, pMaskSet);
   717    803       whereSplit(&sOr, pExpr, TK_OR);
   718    804       exprAnalyzeAll(pSrc, &sOr);
   719         -    assert( sOr.nTerm>0 );
          805  +    assert( sOr.nTerm>=2 );
   720    806       j = 0;
   721    807       do{
          808  +      assert( j<sOr.nTerm );
   722    809         iColumn = sOr.a[j].leftColumn;
   723    810         iCursor = sOr.a[j].leftCursor;
   724    811         ok = iCursor>=0;
   725    812         for(i=sOr.nTerm-1, pOrTerm=sOr.a; i>=0 && ok; i--, pOrTerm++){
   726    813           if( pOrTerm->eOperator!=WO_EQ ){
   727    814             goto or_not_possible;
   728    815           }
   729         -        if( pOrTerm->leftCursor==iCursor && pOrTerm->leftColumn==iColumn ){
          816  +        if( orTermIsOptCandidate(pOrTerm, iCursor, iColumn) ){
   730    817             pOrTerm->flags |= TERM_OR_OK;
   731         -        }else if( (pOrTerm->flags & TERM_COPIED)!=0 ||
   732         -                    ((pOrTerm->flags & TERM_VIRTUAL)!=0 &&
   733         -                     (sOr.a[pOrTerm->iParent].flags & TERM_OR_OK)!=0) ){
          818  +        }else if( orTermHasOkDuplicate(&sOr, pOrTerm) ){
   734    819             pOrTerm->flags &= ~TERM_OR_OK;
   735    820           }else{
   736    821             ok = 0;
   737    822           }
   738    823         }
   739         -    }while( !ok && (sOr.a[j++].flags & TERM_COPIED)!=0 && j<sOr.nTerm );
          824  +    }while( !ok && (sOr.a[j++].flags & TERM_COPIED)!=0 && j<2 );
   740    825       if( ok ){
   741    826         ExprList *pList = 0;
   742    827         Expr *pNew, *pDup;
   743    828         Expr *pLeft = 0;
   744    829         for(i=sOr.nTerm-1, pOrTerm=sOr.a; i>=0 && ok; i--, pOrTerm++){
   745    830           if( (pOrTerm->flags & TERM_OR_OK)==0 ) continue;
   746    831           pDup = sqlite3ExprDup(pOrTerm->pExpr->pRight);

Changes to test/where2.test.

     8      8   #    May you share freely, never taking more than you give.
     9      9   #
    10     10   #***********************************************************************
    11     11   # This file implements regression tests for SQLite library.  The
    12     12   # focus of this file is testing the use of indices in WHERE clauses
    13     13   # based on recent changes to the optimizer.
    14     14   #
    15         -# $Id: where2.test,v 1.10 2006/11/06 15:10:06 drh Exp $
           15  +# $Id: where2.test,v 1.11 2007/02/23 23:13:34 drh Exp $
    16     16   
    17     17   set testdir [file dirname $argv0]
    18     18   source $testdir/tester.tcl
    19     19   
    20     20   # Build some test data
    21     21   #
    22     22   do_test where2-1.0 {
................................................................................
   215    215     } {99 6 10000 10006 sort t1 i1w}
   216    216   }
   217    217   
   218    218   # Verify that OR clauses get translated into IN operators.
   219    219   #
   220    220   set ::idx {}
   221    221   ifcapable subquery {set ::idx i1w}
   222         -do_test where2-6.1 {
          222  +do_test where2-6.1.1 {
   223    223     queryplan {
   224    224       SELECT * FROM t1 WHERE w=99 OR w=100 ORDER BY +w
          225  +  }
          226  +} [list 99 6 10000 10006 100 6 10201 10207 sort t1 $::idx]
          227  +do_test where2-6.1.2 {
          228  +  queryplan {
          229  +    SELECT * FROM t1 WHERE 99=w OR 100=w ORDER BY +w
   225    230     }
   226    231   } [list 99 6 10000 10006 100 6 10201 10207 sort t1 $::idx]
   227    232   do_test where2-6.2 {
   228    233     queryplan {
   229    234       SELECT * FROM t1 WHERE w=99 OR w=100 OR 6=w ORDER BY +w
   230    235     }
   231    236   } [list 6 2 49 51 99 6 10000 10006 100 6 10201 10207 sort t1 $::idx]
................................................................................
   253    258   do_test where2-6.6 {
   254    259     queryplan {
   255    260       SELECT b.* FROM t1 a, t1 b
   256    261        WHERE a.w=1 AND (b.z=10 OR a.y=b.z OR b.z=10)
   257    262        ORDER BY +b.w
   258    263     }
   259    264   } [list 1 0 4 4 2 1 9 10 sort a i1w b $::idx]
          265  +
          266  +# Ticket #2249.  Make sure the OR optimization is not attempted if
          267  +# comparisons between columns of different affinities are needed.
          268  +#
          269  +do_test where2-6.7 {
          270  +  execsql {
          271  +    CREATE TABLE t2249a(a TEXT UNIQUE);
          272  +    CREATE TABLE t2249b(b INTEGER);
          273  +    INSERT INTO t2249a VALUES('0123');
          274  +    INSERT INTO t2249b VALUES(123);
          275  +  }
          276  +  queryplan {
          277  +    -- Because a is type TEXT and b is type INTEGER, both a and b
          278  +    -- will attempt to convert to NUMERIC before the comparison.
          279  +    -- They will thus compare equal.
          280  +    --
          281  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a=b;
          282  +  }
          283  +} {123 0123 nosort t2249b {} t2249a {}}
          284  +do_test where2-6.9 {
          285  +  queryplan {
          286  +    -- The + operator removes affinity from the rhs.  No conversions
          287  +    -- occur and the comparison is false.  The result is an empty set.
          288  +    --
          289  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a=+b;
          290  +  }
          291  +} {nosort t2249b {} {} sqlite_autoindex_t2249a_1}
          292  +do_test where2-6.9.2 {
          293  +  # The same thing but with the expression flipped around.
          294  +  queryplan {
          295  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE +b=a
          296  +  }
          297  +} {nosort t2249b {} {} sqlite_autoindex_t2249a_1}
          298  +do_test where2-6.10 {
          299  +  queryplan {
          300  +    -- Use + on both sides of the comparison to disable indices
          301  +    -- completely.  Make sure we get the same result.
          302  +    --
          303  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE +a=+b;
          304  +  }
          305  +} {nosort t2249b {} t2249a {}}
          306  +do_test where2-6.11 {
          307  +  # This will not attempt the OR optimization because of the a=b
          308  +  # comparison.
          309  +  queryplan {
          310  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a=b OR a='hello';
          311  +  }
          312  +} {123 0123 nosort t2249b {} t2249a {}}
          313  +do_test where2-6.11.2 {
          314  +  # Permutations of the expression terms.
          315  +  queryplan {
          316  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE b=a OR a='hello';
          317  +  }
          318  +} {123 0123 nosort t2249b {} t2249a {}}
          319  +do_test where2-6.11.3 {
          320  +  # Permutations of the expression terms.
          321  +  queryplan {
          322  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE 'hello'=a OR b=a;
          323  +  }
          324  +} {123 0123 nosort t2249b {} t2249a {}}
          325  +do_test where2-6.11.4 {
          326  +  # Permutations of the expression terms.
          327  +  queryplan {
          328  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a='hello' OR b=a;
          329  +  }
          330  +} {123 0123 nosort t2249b {} t2249a {}}
          331  +do_test where2-6.12 {
          332  +  # In this case, the +b disables the affinity conflict and allows
          333  +  # the OR optimization to be used again.  The result is now an empty
          334  +  # set, the same as in where2-6.9.
          335  +  queryplan {
          336  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a=+b OR a='hello';
          337  +  }
          338  +} {nosort t2249b {} {} sqlite_autoindex_t2249a_1}
          339  +do_test where2-6.12.2 {
          340  +  # In this case, the +b disables the affinity conflict and allows
          341  +  # the OR optimization to be used again.  The result is now an empty
          342  +  # set, the same as in where2-6.9.
          343  +  queryplan {
          344  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a='hello' OR +b=a;
          345  +  }
          346  +} {nosort t2249b {} {} sqlite_autoindex_t2249a_1}
          347  +do_test where2-6.12.3 {
          348  +  # In this case, the +b disables the affinity conflict and allows
          349  +  # the OR optimization to be used again.  The result is now an empty
          350  +  # set, the same as in where2-6.9.
          351  +  queryplan {
          352  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE +b=a OR a='hello';
          353  +  }
          354  +} {nosort t2249b {} {} sqlite_autoindex_t2249a_1}
          355  +do_test where2-6.13 {
          356  +  # The addition of +a on the second term disabled the OR optimization.
          357  +  # But we should still get the same empty-set result as in where2-6.9.
          358  +  queryplan {
          359  +    SELECT * FROM t2249b CROSS JOIN t2249a WHERE a=+b OR +a='hello';
          360  +  }
          361  +} {nosort t2249b {} t2249a {}}
          362  +
          363  +# Variations on the order of terms in a WHERE clause in order
          364  +# to make sure the OR optimizer can recognize them all.
          365  +do_test where2-6.20 {
          366  +  queryplan {
          367  +    SELECT * FROM t2249a x CROSS JOIN t2249a y WHERE x.a=y.a
          368  +  }
          369  +} {0123 0123 nosort x {} {} sqlite_autoindex_t2249a_1}
          370  +do_test where2-6.21 {
          371  +  queryplan {
          372  +    SELECT * FROM t2249a x CROSS JOIN t2249a y WHERE x.a=y.a OR y.a='hello'
          373  +  }
          374  +} {0123 0123 nosort x {} {} sqlite_autoindex_t2249a_1}
          375  +do_test where2-6.22 {
          376  +  queryplan {
          377  +    SELECT * FROM t2249a x CROSS JOIN t2249a y WHERE y.a=x.a OR y.a='hello'
          378  +  }
          379  +} {0123 0123 nosort x {} {} sqlite_autoindex_t2249a_1}
          380  +do_test where2-6.23 {
          381  +  queryplan {
          382  +    SELECT * FROM t2249a x CROSS JOIN t2249a y WHERE y.a='hello' OR x.a=y.a
          383  +  }
          384  +} {0123 0123 nosort x {} {} sqlite_autoindex_t2249a_1}
   260    385   
   261    386   # Unique queries (queries that are guaranteed to return only a single
   262    387   # row of result) do not call the sorter.  But all tables must give
   263    388   # a unique result.  If any one table in the join does not give a unique
   264    389   # result then sorting is necessary.
   265    390   #
   266    391   do_test where2-7.1 {