Index: tool/mklsmapi.tcl ================================================================== --- tool/mklsmapi.tcl +++ tool/mklsmapi.tcl @@ -182,11 +182,11 @@ puts $document_preamble puts "
If required, it is possible to configure LSM to use external data compression and/or encryption functions to transform data before it is stored in the database file. -
Say something about the difference in performance characteristics -between a b-tree and whatever it is LSM is. Link the performance graphs page. - +
Many database systems that support range queries, including SQLite 3, Berkeley DB and Kyoto Cabinet, are +based on a b-tree data +structure or variant thereof. A b-tree structure minimizes the number of +disk sectors that must be read from disk when searching the database for a +specific key. However, b-tree implementations usually suffer from poor write +localization - updating the contents of a b-tree often involves modifying the +contents of nodes scattered throughout the database file. If the database is +stored on a spinning disk (HDD), then the disk heads must be moved before +writing non-contiguous sector, which is extremely slow. If the database is +stored on solid state storage (SDD) a similar phenomena is encountered due +to the large erase-block sizes. In general, writing to a series of contiguous +disk sectors is orders of magnitude faster than updating to the same number +of disk sectors scattered randomly throughout a large file. Additionally, +b-tree structures are prone to fragmentation, reducing the speed of range +queries. + +
Todo: Should have references for the claims above. + +
Also, fix the link in the next paragraph to point to something more +specific. + +
LSM uses a different data structure that makes the +following performance tradeoffs relative to a b-tree: + +
In other words, all things considered equal, writing to an LSM database +should be very fast and scanning through large ranges of keys should also +perform well, but searching the database for specific keys may be slightly +slower than when using a b-tree based system. Additionally, avoiding random +writes in favour of largely contiguous updates can significantly reduce the +wear on SSD or flash memory devices. + +
Although it has quite different features to LSM in other respects, +LevelDB makes similar performance tradeoffs. + +
Benchmark test results for LSM are available here. Todo: Link to a page +with performance graphs here +
LSM is not currently built or distributed independently. Instead, it is part of the SQLite4 library. To use LSM in an application, the application