/ Check-in [31e480f6]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Improved comments an presentation for the recent IN operator decision improvement.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: 31e480f68dfd887cfd9114f9f9fec53d751d3ecc27e36f55c0166b51b2fbb08c
User & Date: drh 2018-06-09 14:13:46
Context
2018-06-09
16:49
Slightly smaller and faster code by encapsulating wal-index hash table location information in a separate WalHashLoc object rather than passing around the various elements as separate variables. check-in: 538a365b user: drh tags: trunk
14:13
Improved comments an presentation for the recent IN operator decision improvement. check-in: 31e480f6 user: drh tags: trunk
02:49
Performance improvement to sqlite3WhereExprUsage(). check-in: fd093413 user: drh tags: trunk
Changes
Hide Diffs Unified Diffs Show Whitespace Changes Patch

Changes to src/where.c.

2447
2448
2449
2450
2451
2452
2453
2454
2455
2456
2457
2458
2459
2460
2461
....
2468
2469
2470
2471
2472
2473
2474

2475
2476
2477
2478
2479
2480
2481
....
2489
2490
2491
2492
2493
2494
2495

2496
2497
2498
2499
2500
2501
2502
2503
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;
      LogEst M, logK;
      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
................................................................................
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }
      if( pProbe->hasStat1 ){

        /* Let:
        **   N = the total number of rows in the table
        **   K = the number of entries on the RHS of the IN operator
        **   M = the number of rows in the table that match terms to the 
        **       to the left in the same index.  If the IN operator is on
        **       the left-most index column, M==N.
        **
................................................................................
        ** a safety margin of 2 (LogEst: 10) that favors using the IN operator
        ** with the index, as using an index has better worst-case behavior.
        ** If we do not have real sqlite_stat1 data, always prefer to use
        ** the index.
        */
        M = pProbe->aiRowLogEst[saved_nEq];
        logK = estLog(nIn);

        if( M + logK + 10 < nIn + rLogSize ){
          WHERETRACE(0x40,
            ("Scan preferred over IN operator on column %d of \"%s\" (%d<%d)\n",
             saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
          continue;
        }else{
          WHERETRACE(0x40,
            ("IN operator preferred on column %d of \"%s\" (%d>=%d)\n",







<







 







>







 







>
|







2447
2448
2449
2450
2451
2452
2453

2454
2455
2456
2457
2458
2459
2460
....
2467
2468
2469
2470
2471
2472
2473
2474
2475
2476
2477
2478
2479
2480
2481
....
2489
2490
2491
2492
2493
2494
2495
2496
2497
2498
2499
2500
2501
2502
2503
2504
        || (pNew->wsFlags & WHERE_COLUMN_NULL)!=0 
        || (pNew->wsFlags & WHERE_COLUMN_IN)!=0 
        || (pNew->wsFlags & WHERE_SKIPSCAN)!=0 
    );

    if( eOp & WO_IN ){
      Expr *pExpr = pTerm->pExpr;

      if( ExprHasProperty(pExpr, EP_xIsSelect) ){
        /* "x IN (SELECT ...)":  TUNING: the SELECT returns 25 rows */
        int i;
        nIn = 46;  assert( 46==sqlite3LogEst(25) );

        /* The expression may actually be of the form (x, y) IN (SELECT...).
        ** In this case there is a separate term for each of (x) and (y).
................................................................................
      }else if( ALWAYS(pExpr->x.pList && pExpr->x.pList->nExpr) ){
        /* "x IN (value, value, ...)" */
        nIn = sqlite3LogEst(pExpr->x.pList->nExpr);
        assert( nIn>0 );  /* RHS always has 2 or more terms...  The parser
                          ** changes "x IN (?)" into "x=?". */
      }
      if( pProbe->hasStat1 ){
        LogEst M, logK, safetyMargin;
        /* Let:
        **   N = the total number of rows in the table
        **   K = the number of entries on the RHS of the IN operator
        **   M = the number of rows in the table that match terms to the 
        **       to the left in the same index.  If the IN operator is on
        **       the left-most index column, M==N.
        **
................................................................................
        ** a safety margin of 2 (LogEst: 10) that favors using the IN operator
        ** with the index, as using an index has better worst-case behavior.
        ** If we do not have real sqlite_stat1 data, always prefer to use
        ** the index.
        */
        M = pProbe->aiRowLogEst[saved_nEq];
        logK = estLog(nIn);
        safetyMargin = 10;  /* TUNING: extra weight for indexed IN */
        if( M + logK + safetyMargin < nIn + rLogSize ){
          WHERETRACE(0x40,
            ("Scan preferred over IN operator on column %d of \"%s\" (%d<%d)\n",
             saved_nEq, pProbe->zName, M+logK+10, nIn+rLogSize));
          continue;
        }else{
          WHERETRACE(0x40,
            ("IN operator preferred on column %d of \"%s\" (%d>=%d)\n",