SQLite

Check-in [090304b8]
Overview
Comment: In the Bloom filter optimization, hash all strings and blobs into the same value, because we do not know if two different strings might compare equal even if they have different byte sequences, due to collating functions. Formerly, the hash of a string or blob was just its length. This could all be improved. Fix for the issue reported by forum post 0846211821.
SHA3-256: 090304b870419acb5b05205a07fc75830b556928149f76a843cda526f77a6fc0
User & Date: drh 2023-02-28 14:28:54
Context
2023-03-09
16:04 In the Bloom filter optimization, hash all strings and blobs into the same value, because we do not know if two different strings might compare equal even if they have different byte sequences, due to collating functions. Formerly, the hash of a string or blob was just its length. This could all be improved. (check-in: cc8a0ee4 user: drh tags: branch-3.41)
2023-02-28
18:06 Only use a Bloom filter on an automatic index if one or more of the key columns in the index can take on non-TEXT values. (check-in: 5916705c user: drh tags: trunk)
14:28 In the Bloom filter optimization, hash all strings and blobs into the same value, because we do not know if two different strings might compare equal even if they have different byte sequences, due to collating functions. Formerly, the hash of a string or blob was just its length. This could all be improved. Fix for the issue reported by forum post 0846211821. (check-in: 090304b8 user: drh tags: trunk)
13:46 When an automatic index creates a Bloom filter, show that in the EXPLAIN QUERY PLAN output. (check-in: d7b2ac1c user: drh tags: trunk)
Changes
Changes to src/vdbe.c.
Before:
  for(i=pOp->p3, mx=i+pOp->p4.i; i<mx; i++){
    const Mem *p = &aMem[i];
    if( p->flags & (MEM_Int|MEM_IntReal) ){
      h += p->u.i;
    }else if( p->flags & MEM_Real ){
      h += sqlite3VdbeIntValue(p);
    }else if( p->flags & (MEM_Str|MEM_Blob) ){
      h += p->n;
      if( p->flags & MEM_Zero ) h += p->u.nZero;
    }
  }
  return h;
}

/*
** Return the symbolic name for the data type of a pMem

After:
  for(i=pOp->p3, mx=i+pOp->p4.i; i<mx; i++){
    const Mem *p = &aMem[i];
    if( p->flags & (MEM_Int|MEM_IntReal) ){
      h += p->u.i;
    }else if( p->flags & MEM_Real ){
      h += sqlite3VdbeIntValue(p);
    }else if( p->flags & (MEM_Str|MEM_Blob) ){
      /* no-op */
    }
  }
  return h;
}

/*
** Return the symbolic name for the data type of a pMem
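
The effect of the change can be illustrated with a minimal sketch. It is not SQLite code: the `Val` type and the functions `hash_old`, `hash_new`, and `rtrim_equal` are hypothetical stand-ins that mirror only the integer and text branches of the loop above, and `rtrim_equal` approximates an rtrim-style collation by ignoring trailing spaces.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a VDBE register value.
** This is NOT SQLite's Mem structure. */
typedef enum { VAL_INT, VAL_TEXT } ValKind;
typedef struct {
  ValKind kind;
  int64_t i;       /* used when kind==VAL_INT */
  const char *z;   /* used when kind==VAL_TEXT (NUL-terminated) */
} Val;

/* Old behavior: a text value contributes its byte length. */
static uint64_t hash_old(const Val *a, int n){
  uint64_t h = 0;
  for(int k=0; k<n; k++){
    if( a[k].kind==VAL_INT ){
      h += (uint64_t)a[k].i;
    }else{
      h += strlen(a[k].z);
    }
  }
  return h;
}

/* New behavior: a text value contributes nothing, so every
** text value hashes identically. */
static uint64_t hash_new(const Val *a, int n){
  uint64_t h = 0;
  for(int k=0; k<n; k++){
    if( a[k].kind==VAL_INT ){
      h += (uint64_t)a[k].i;
    }
  }
  return h;
}

/* Equality under an rtrim-like collation (assumption: trailing
** spaces are ignored, everything else compares bytewise). */
static int rtrim_equal(const char *x, const char *y){
  size_t nx = strlen(x), ny = strlen(y);
  while( nx>0 && x[nx-1]==' ' ) nx--;
  while( ny>0 && y[ny-1]==' ' ) ny--;
  return nx==ny && memcmp(x, y, nx)==0;
}
```

Under these assumptions, 'b' and 'b ' compare equal in `rtrim_equal` yet receive different values from `hash_old` (lengths 1 and 2), which is the mismatch the check-in comment describes. With `hash_new` both hash to the same value, at the cost that text keys no longer help the filter discriminate.
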
Changes to test/bloom1.test.
Before:
  |--CO-ROUTINE transit
  |  |--SETUP
  |  |  `--SEARCH objs USING COVERING INDEX objs_cspo (o=? AND p=?)
  |  `--RECURSIVE STEP
  |     |--SCAN transit
  |     `--SEARCH objs USING COVERING INDEX objs_cspo (o=? AND p=?)
  `--SCAN transit
}

finish_test

After:
  |--CO-ROUTINE transit
  |  |--SETUP
  |  |  `--SEARCH objs USING COVERING INDEX objs_cspo (o=? AND p=?)
  |  `--RECURSIVE STEP
  |     |--SCAN transit
  |     `--SEARCH objs USING COVERING INDEX objs_cspo (o=? AND p=?)
  `--SCAN transit
}

# 2023-02-28
# https://sqlite.org/forum/forumpost/0846211821
#
# Bloom filter gives an incorrect result if the collating sequence is
# anything other than binary.
#
reset_db
do_execsql_test 3.1 {
  CREATE TABLE t0(x TEXT COLLATE rtrim);
  INSERT INTO t0(x) VALUES ('a'), ('b'), ('c');
  CREATE VIEW v0(y) AS SELECT DISTINCT x FROM t0;
  SELECT count(*) FROM t0, v0 WHERE x='b ';
} 3

finish_test
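
Why a length-based hash can make a Bloom filter drop matching rows, as in test 3.1, can be sketched with a toy single-hash filter. Everything here is an assumption made for illustration: the 64-bit `Filter`, the `TextHash` function pointer, and the helper `probe_after_loading_abc` are hypothetical and do not reflect how SQLite stores or probes its filters.

```c
#include <stdint.h>
#include <string.h>

#define FILTER_BITS 64

/* Hash function applied to a text key. */
typedef uint64_t (*TextHash)(const char *);

/* Old scheme: the hash of a string is its length. */
static uint64_t text_hash_length(const char *z){ return strlen(z); }

/* New scheme: every string hashes to the same value. */
static uint64_t text_hash_constant(const char *z){ (void)z; return 0; }

/* Toy Bloom filter: one 64-bit word and a single hash function. */
typedef struct {
  uint64_t bits;
  TextHash hash;
} Filter;

static void filter_add(Filter *f, const char *z){
  f->bits |= (uint64_t)1 << (f->hash(z) % FILTER_BITS);
}

/* Returns 0 only when the key is "definitely absent". */
static int filter_may_contain(const Filter *f, const char *z){
  return (int)((f->bits >> (f->hash(z) % FILTER_BITS)) & 1);
}

/* Load the three rows from test 3.1 and probe with one key. */
static int probe_after_loading_abc(TextHash hash, const char *probe){
  Filter f = {0, hash};
  filter_add(&f, "a");
  filter_add(&f, "b");
  filter_add(&f, "c");
  return filter_may_contain(&f, probe);
}
```

With `text_hash_length`, the three one-byte keys set only bit 1, so probing with 'b ' (length 2) reports "definitely absent" even though an rtrim collation treats 'b ' and 'b' as equal. With `text_hash_constant` the probe passes the filter and the decision is left to the real comparison.
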