SQLite

Check-in [1fa40a78fe]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Consider doing a partial table scan to fulfill an IN operator rather than using an index. Try to pick the plan with the lowest cost.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | in-scan-vs-index
Files: files | file ages | folders
SHA3-256: 1fa40a78fef4516c39b217bff67efe7e7d2077cca00aae0ef5c2c9cff94f008b
User & Date: drh 2018-06-08 18:22:10.089
Context
2018-06-08
19:54
Merge the btreeNext() assertion bug fix from trunk. (check-in: 11bd66e090 user: drh tags: in-scan-vs-index)
18:22
Consider doing a partial table scan to fulfill an IN operator rather than using an index. Try to pick the plan with the lowest cost. (check-in: 1fa40a78fe user: drh tags: in-scan-vs-index)
2018-06-07
18:13
The IN-early-out optimization: When doing a look-up on a multi-column index and an IN operator is used on a column other than the left-most column, then if no rows match against the first IN value, check to make sure there exist rows that match the columns to the right before continuing with the next IN value. (check-in: 09fffbdf9f user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to src/where.c.
2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473






























2474
2475
2476
2477
2478
2479
2480
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;
      pNew->wsFlags |= WHERE_COLUMN_IN;
      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
        ** However, the nIn multiplier should only be applied once, not once
        ** for each such term. The following loop checks that pTerm is the
        ** first such term in use, and sets nIn back to 0 if it is not. */
        for(i=0; i<pNew->nLTerm-1; i++){
          if( pNew->aLTerm[i] && pNew->aLTerm[i]->pExpr==pExpr ) nIn = 0;
        }
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }






























    }else if( eOp & (WO_EQ|WO_IS) ){
      int iCol = pProbe->aiColumn[saved_nEq];
      pNew->wsFlags |= WHERE_COLUMN_EQ;
      assert( saved_nEq==pNew->u.btree.nEq );
      if( iCol==XN_ROWID 
       || (iCol>=0 && nInMul==0 && saved_nEq==pProbe->nKeyCol-1)
      ){







|



















>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
2462
2463
2464
2465
2466
2467
2468
2469
2470
2471
2472
2473
2474
2475
2476
2477
2478
2479
2480
2481
2482
2483
2484
2485
2486
2487
2488
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
2505
2506
2507
2508
2509
2510
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;
      LogEst M, logK;
      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
        ** However, the nIn multiplier should only be applied once, not once
        ** for each such term. The following loop checks that pTerm is the
        ** first such term in use, and sets nIn back to 0 if it is not. */
        for(i=0; i<pNew->nLTerm-1; i++){
          if( pNew->aLTerm[i] && pNew->aLTerm[i]->pExpr==pExpr ) nIn = 0;
        }
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }
      /* Let:
      **   N = the total number of rows in the table
      **   K = the number of entries on the right-hand side of the IN operator
      **   M = the number of rows in the table that match terms to the 
      **       to the left in the same index.  If the IN operator is on
      **       the left-most index column, M==N.
      **
      ** Given the definitions above, it is better to omit the IN operator
      ** from the index lookup and instead do a scan of the M elements,
      ** testing each scanned row against the IN operator separately, if:
      **
      **        M*log(K) < K*log(N)
      **
      ** Our estimates for M, K, and N might be inaccurate, so we build in
      ** a safety margin of 2 (LogEst: 10) that favors using the IN operator
      ** with the index, as using an index has better worst-case behavior.
      */
      M = pProbe->aiRowLogEst[saved_nEq+1];
      logK = sqlite3LogEst(nIn);
      if( M + logK + 10 < nIn + rLogSize ){
        WHERETRACE(0x40,
          ("IN operator costs more than scan on column %d of \"%s\" (%d<%d)\n",
           saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
        continue;
      }else{
        WHERETRACE(0x40,
          ("IN operator cheaper than scan on column %d of \"%s\" (%d>=%d)\n",
           saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
      }
      pNew->wsFlags |= WHERE_COLUMN_IN;
    }else if( eOp & (WO_EQ|WO_IS) ){
      int iCol = pProbe->aiColumn[saved_nEq];
      pNew->wsFlags |= WHERE_COLUMN_EQ;
      assert( saved_nEq==pNew->u.btree.nEq );
      if( iCol==XN_ROWID 
       || (iCol>=0 && nInMul==0 && saved_nEq==pProbe->nKeyCol-1)
      ){