Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Improvements to lsmusr.wiki. |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
e47b5e3ae69139d5ec261348717d52e5 |
User & Date: | dan 2012-11-14 18:23:17.784 |
Context
2012-11-14
| ||
20:09 | Updates to lsmusr.wiki. check-in: 1ea9187820 user: dan tags: trunk | |
18:23 | Improvements to lsmusr.wiki. check-in: e47b5e3ae6 user: dan tags: trunk | |
2012-11-13
| ||
20:16 | Further documentation updates. check-in: 414ed6da4e user: dan tags: trunk | |
Changes
Changes to src/lsm.h.
︙ | ︙ | |||
35 36 37 38 39 40 41 42 43 44 45 46 47 48 | /* Candidate values for the 3rd argument to lsm_env.xLock() */ #define LSM_LOCK_UNLOCK 0 #define LSM_LOCK_SHARED 1 #define LSM_LOCK_EXCL 2 /* ** Run-time environment used by LSM */ struct lsm_env { int nByte; /* Size of this structure in bytes */ int iVersion; /* Version number of this structure (1) */ /****** file i/o ***********************************************/ void *pVfsCtx; | > > | 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | /* Candidate values for the 3rd argument to lsm_env.xLock() */ #define LSM_LOCK_UNLOCK 0 #define LSM_LOCK_SHARED 1 #define LSM_LOCK_EXCL 2 /* ** CAPI: Database Runtime Environment ** ** Run-time environment used by LSM */ struct lsm_env { int nByte; /* Size of this structure in bytes */ int iVersion; /* Version number of this structure (1) */ /****** file i/o ***********************************************/ void *pVfsCtx; |
︙ | ︙ |
Changes to www/lsmapi.wiki.
︙ | ︙ | |||
13 14 15 16 17 18 19 20 21 22 23 24 25 26 | <p> This page contains the LSM API Reference Manual. It is intended to complement the <a href=lsmusr.wiki>LSM User Manual</a>. <h1>LSM API Topics</h1> <ol> <li><a href="#lsm" style=text-decoration:none>LSM Error Codes</a> <li><a href="#creating" style=text-decoration:none>Creating and Destroying Database Connection Handles</a> <li><a href="#connecting" style=text-decoration:none>Connecting to a Database</a> <li><a href="#obtaining" style=text-decoration:none>Obtaining pointers to databases environments</a> <li><a href="#configuring" style=text-decoration:none>Configuring a database connection.</a> <li><a href="#compression" style=text-decoration:none>Compression and/or Encryption Hooks</a> <li><a href="#allocating" style=text-decoration:none>Allocating and Freeing Memory</a> | > | 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | <p> This page contains the LSM API Reference Manual. It is intended to complement the <a href=lsmusr.wiki>LSM User Manual</a>. <h1>LSM API Topics</h1> <ol> <li><a href="#database" style=text-decoration:none>Database Runtime Environment</a> <li><a href="#lsm" style=text-decoration:none>LSM Error Codes</a> <li><a href="#creating" style=text-decoration:none>Creating and Destroying Database Connection Handles</a> <li><a href="#connecting" style=text-decoration:none>Connecting to a Database</a> <li><a href="#obtaining" style=text-decoration:none>Obtaining pointers to databases environments</a> <li><a href="#configuring" style=text-decoration:none>Configuring a database connection.</a> <li><a href="#compression" style=text-decoration:none>Compression and/or Encryption Hooks</a> <li><a href="#allocating" style=text-decoration:none>Allocating and Freeing Memory</a> |
︙ | ︙ | |||
64 65 66 67 68 69 70 71 72 73 74 75 76 77 | <span style=display:block;float:left;width:35ex><a href=#lsm_open>lsm_open</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_rollback>lsm_rollback</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_tree_size>lsm_tree_size</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_work>lsm_work</a></span> <br style=clear:both> <h1 style=clear:both>All LSM API Types</h1> <span style=display:block;float:left;width:35ex><a href=#lsm_compress>lsm_compress</a></span> <br style=clear:both> <h1>All LSM API Constants</h1> <span style=display:block;float:left;width:35ex><a href=#LSM_BUSY>LSM_BUSY</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CANTOPEN>LSM_CANTOPEN</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_AUTOCHECKPOINT>LSM_CONFIG_AUTOCHECKPOINT</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_AUTOWORK>LSM_CONFIG_AUTOWORK</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_BLOCK_SIZE>LSM_CONFIG_BLOCK_SIZE</a></span> | > | 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 | <span style=display:block;float:left;width:35ex><a href=#lsm_open>lsm_open</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_rollback>lsm_rollback</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_tree_size>lsm_tree_size</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_work>lsm_work</a></span> <br style=clear:both> <h1 style=clear:both>All LSM API Types</h1> <span style=display:block;float:left;width:35ex><a href=#lsm_compress>lsm_compress</a></span> <span style=display:block;float:left;width:35ex><a href=#lsm_env>lsm_env</a></span> <br style=clear:both> <h1>All LSM API Constants</h1> <span style=display:block;float:left;width:35ex><a href=#LSM_BUSY>LSM_BUSY</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CANTOPEN>LSM_CANTOPEN</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_AUTOCHECKPOINT>LSM_CONFIG_AUTOCHECKPOINT</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_AUTOWORK>LSM_CONFIG_AUTOWORK</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_CONFIG_BLOCK_SIZE>LSM_CONFIG_BLOCK_SIZE</a></span> |
︙ | ︙ | |||
96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 | <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_LOG_STRUCTURE>LSM_INFO_LOG_STRUCTURE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_NREAD>LSM_INFO_NREAD</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_NWRITE>LSM_INFO_NWRITE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_PAGE_ASCII_DUMP>LSM_INFO_PAGE_ASCII_DUMP</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_PAGE_HEX_DUMP>LSM_INFO_PAGE_HEX_DUMP</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_IOERR>LSM_IOERR</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_MISUSE>LSM_MISUSE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_NOMEM>LSM_NOMEM</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_OK>LSM_OK</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_PROTOCOL>LSM_PROTOCOL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_FULL>LSM_SAFETY_FULL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_NORMAL>LSM_SAFETY_NORMAL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_OFF>LSM_SAFETY_OFF</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_EQ>LSM_SEEK_EQ</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_GE>LSM_SEEK_GE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_LE>LSM_SEEK_LE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_LEFAST>LSM_SEEK_LEFAST</a></span> <br style=clear:both> <h2 id=lsm>LSM Error Codes<a id=LSM_OK></a><a id=LSM_ERROR></a><a id=LSM_BUSY></a><a id=LSM_NOMEM></a><a id=LSM_IOERR></a><a id=LSM_CORRUPT></a><a id=LSM_FULL></a><a id=LSM_CANTOPEN></a><a id=LSM_PROTOCOL></a><a id=LSM_MISUSE></a></h2> <verbatim>#define LSM_OK 0 #define LSM_ERROR 1 #define LSM_BUSY 5 #define LSM_NOMEM 7 #define LSM_IOERR 10 #define LSM_CORRUPT 11 | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 | <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_LOG_STRUCTURE>LSM_INFO_LOG_STRUCTURE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_NREAD>LSM_INFO_NREAD</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_NWRITE>LSM_INFO_NWRITE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_PAGE_ASCII_DUMP>LSM_INFO_PAGE_ASCII_DUMP</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_INFO_PAGE_HEX_DUMP>LSM_INFO_PAGE_HEX_DUMP</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_IOERR>LSM_IOERR</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_MISUSE>LSM_MISUSE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_MUTEX_GLOBAL>LSM_MUTEX_GLOBAL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_MUTEX_HEAP>LSM_MUTEX_HEAP</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_NOMEM>LSM_NOMEM</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_OK>LSM_OK</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_PROTOCOL>LSM_PROTOCOL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_FULL>LSM_SAFETY_FULL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_NORMAL>LSM_SAFETY_NORMAL</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SAFETY_OFF>LSM_SAFETY_OFF</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_EQ>LSM_SEEK_EQ</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_GE>LSM_SEEK_GE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_LE>LSM_SEEK_LE</a></span> <span style=display:block;float:left;width:35ex><a href=#LSM_SEEK_LEFAST>LSM_SEEK_LEFAST</a></span> <br style=clear:both> <h2 id=database>Database Runtime Environment<a id=lsm_env></a><a id=LSM_MUTEX_GLOBAL></a><a id=LSM_MUTEX_HEAP></a></h2> <verbatim>struct lsm_env { int nByte; /* Size of this structure in bytes */ int iVersion; /* Version number of this structure (1) */ /****** file i/o ***********************************************/ void *pVfsCtx; int (*xFullpath)(lsm_env*, const char *, char *, int *); int (*xOpen)(lsm_env*, const char *, lsm_file **); int (*xRead)(lsm_file *, lsm_i64, void *, int); int (*xWrite)(lsm_file *, lsm_i64, void *, int); int (*xTruncate)(lsm_file *, lsm_i64); int (*xSync)(lsm_file *); int (*xSectorSize)(lsm_file *); int (*xRemap)(lsm_file *, lsm_i64, void **, lsm_i64*); int (*xFileid)(lsm_file *, void *pBuf, int *pnBuf); int (*xClose)(lsm_file *); int (*xUnlink)(lsm_env*, const char *); int (*xLock)(lsm_file*, int, int); int (*xShmMap)(lsm_file*, int, int, void **); void (*xShmBarrier)(void); int (*xShmUnmap)(lsm_file*, int); /****** memory allocation ****************************************/ void *pMemCtx; void *(*xMalloc)(lsm_env*, int); /* malloc(3) function */ void *(*xRealloc)(lsm_env*, void *, int); /* realloc(3) function */ void (*xFree)(lsm_env*, void *); /* free(3) function */ sqlite4_size_t (*xSize)(lsm_env*, void *); /* xSize function */ /****** mutexes ****************************************************/ void *pMutexCtx; int (*xMutexStatic)(lsm_env*,int,lsm_mutex**); /* Obtain a static mutex */ int (*xMutexNew)(lsm_env*, lsm_mutex**); /* Get a new dynamic mutex */ void (*xMutexDel)(lsm_mutex *); /* Delete an allocated mutex */ void (*xMutexEnter)(lsm_mutex *); /* Grab a mutex */ int (*xMutexTry)(lsm_mutex *); /* Attempt to obtain a mutex */ void (*xMutexLeave)(lsm_mutex *); /* Leave a mutex */ int (*xMutexHeld)(lsm_mutex *); /* Return true if mutex is held */ int (*xMutexNotHeld)(lsm_mutex *); /* Return true if mutex not held */ /****** other ****************************************************/ int (*xSleep)(lsm_env*, int microseconds); /* New fields may be added in future releases, in which case the ** iVersion value will increase. */ }; #define LSM_MUTEX_GLOBAL 1 #define LSM_MUTEX_HEAP 2 </verbatim> <p>Run-time environment used by LSM Values that may be passed as the second argument to xMutexStatic. <h2 id=lsm>LSM Error Codes<a id=LSM_OK></a><a id=LSM_ERROR></a><a id=LSM_BUSY></a><a id=LSM_NOMEM></a><a id=LSM_IOERR></a><a id=LSM_CORRUPT></a><a id=LSM_FULL></a><a id=LSM_CANTOPEN></a><a id=LSM_PROTOCOL></a><a id=LSM_MISUSE></a></h2> <verbatim>#define LSM_OK 0 #define LSM_ERROR 1 #define LSM_BUSY 5 #define LSM_NOMEM 7 #define LSM_IOERR 10 #define LSM_CORRUPT 11 |
︙ | ︙ |
Changes to www/lsmusr.wiki.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | <title>LSM Users Guide</title> <nowiki> <h2>Table of Contents</h2> <div id=start_of_toc></div> <a href=#introduction_to_lsm style=text-decoration:none>1. Introduction to LSM</a><br> <a href=#using_lsm_in_applications style=text-decoration:none>2. Using LSM in Applications </a><br> <a href=#basic_usage style=text-decoration:none>3. Basic Usage</a><br> <a href=#opening_and_closing_database_connections style=text-decoration:none>3.1. Opening and Closing Database Connections </a><br> <a href=#writing_to_a_database style=text-decoration:none>3.2. Writing to a Database </a><br> <a href=#reading_from_a_database style=text-decoration:none>3.3. Reading from a Database </a><br> <a href=#database_transactions_and_mvcc style=text-decoration:none>3.4. Database Transactions and MVCC </a><br> | > | | > > > > > > > > > > > > > > > > > > | < | | > > > | | | | | | < > > | | < < < | | | > | > > | | | | | | | | | < | | | > | > | | | | | | | | | | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 | <title>LSM Users Guide</title> <nowiki> <h2>Table of Contents</h2> <div id=start_of_toc></div> <a href=#introduction_to_lsm style=text-decoration:none>1. Introduction to LSM</a><br> <a href=#using_lsm_in_applications style=text-decoration:none>2. Using LSM in Applications </a><br> <a href=#basic_usage style=text-decoration:none>3. Basic Usage</a><br> <a href=#opening_and_closing_database_connections style=text-decoration:none>3.1. Opening and Closing Database Connections </a><br> <a href=#writing_to_a_database style=text-decoration:none>3.2. Writing to a Database </a><br> <a href=#reading_from_a_database style=text-decoration:none>3.3. Reading from a Database </a><br> <a href=#database_transactions_and_mvcc style=text-decoration:none>3.4. Database Transactions and MVCC </a><br> <a href=#data_durability style=text-decoration:none>4. Data Durability </a><br> <a href=#compressed_and_encrypted_databases style=text-decoration:none>5. Compressed and Encrypted Databases </a><br> <a href=#performance_tuning style=text-decoration:none>6. Performance Tuning</a><br> <a href=#architectural_overview style=text-decoration:none>6.1. Architectural Overview </a><br> <a href=#work_and_checkpoint_scheduling style=text-decoration:none>6.2. Work and Checkpoint Scheduling </a><br> <a href=#automatic_work_and_checkpoint_scheduling style=text-decoration:none>6.2.1. Automatic Work and Checkpoint Scheduling</a><br> <a href=#explicit_work_and_checkpoint_scheduling style=text-decoration:none>6.2.2. Explicit Work and Checkpoint Scheduling</a><br> <a href=#compulsary_work_and_checkpoint_scheduling style=text-decoration:none>6.2.3. Compulsary Work and Checkpoint Scheduling</a><br> <a href=#database_optimization style=text-decoration:none>6.3. Database Optimization</a><br> <a href=#other_parameters style=text-decoration:none>6.4. Other Parameters </a><br> <div id=end_of_toc></div> <h2>Overview</h2> <p>This document describes the LSM embedded database library and use thereof. It is intended to be part user-manual and part tutorial. It is intended to to complement the <a href=lsmapi.wiki>LSM API reference manual</a>. <p>The <a href=#introduction_to_lsm>first section</a> of this document contains a description of the LSM library and its features. <a href=#using_lsm_in_applications>Section 2</a> describes how to use LSM from within a C or C++ application (how to compile and link LSM, what to #include etc.). The <a href=#basic_usage>third section</a> describes the essential APIs that applications use to open and close database connections, and to read from and write to databases. <p>The three sections described above contain all the information required to create applications that use LSM. The remaining sections discuss more specialized topics. <a href=#data_durability>Section 4</a> discusses the configuration parameter that influences transaction durability (the guarantees offered with respect to recently committed transactions if a power failure occurs). <a href=#compressed_and_encrypted_databases>Section 5</a> explains the interface provided by LSM that allow external data compression and/or encryption functions to be used to create compressed and/or encrypted databases. <i>Todo: Clarify exactly what section 6 is and link it to here</i>. <h1 id=introduction_to_lsm>1. Introduction to LSM</h1> <p>LSM is an embedded database library for key-value data, roughly similar in scope to <a href="http://www.oracle.com/technetwork/products/berkeleydb/overview/index.html">Berkeley DB</a>, <a href="http://code.google.com/p/leveldb/">LevelDB</a> or <a href="http://fallabs.com/kyotocabinet/">KyotoCabinet</a>. Both keys and values are specified and stored as byte arrays. Duplicate keys are not supported. Keys are always sorted in memcmp() order. LSM supports the following operations for the manipulation and query of database data: <ul> <li> Writing a new key and value into the database. <li> Deleting an existing key from the database. <li> Deleting a range of keys from the database. <li> Querying the database for a specific key. <li> Iterating through a range of database keys (either forwards or backwards). </ul> <p>Other salient features are: <ul> <li><p>LSM supports a <b>single-writer/multiple-reader MVCC</b> based transactional concurrency model. SQL style nested sub-transactions are supported. Clients may concurrently access a single LSM database from within a single or multiple application processes. <li><p>An entire LSM database is stored in a <b>single file on disk</b>. <li><p>Data <b>durability in the face of application or power failure</b>. LSM may optionally use a write-ahead log file when writing to the database to ensure committed transactions are not lost if an application or power failure occurs. <li>LSM <b>may be configured to use external data compression and/or encryption routines</b> to create and access compressed and/or encrypted databases. </ul> <p>Many database systems that support range queries, including <a href=http://www.sqlite.org>SQLite 3</a>, Berkeley DB and Kyoto Cabinet, are based on one of many variants of the <a href="http://en.wikipedia.org/wiki/B-tree">b-tree data structure</a>. B-trees are attractive because a b-tree structure minimizes the number of disk sectors that must be read from disk when searching the database for a specific key. However, b-tree implementations usually suffer from poor write localization - updating the contents of a b-tree often involves modifying the contents of nodes scattered throughout the database file. If the database is stored on a spinning disk (HDD), then the disk heads must be moved before writing non-contiguous sector, which is extremely slow. If the database is stored on solid state storage (SDD) a similar phenomena is encountered due to the large erase-block sizes. In general, writing to a series of contiguous disk sectors is orders of magnitude faster than updating to the same number of disk sectors scattered randomly throughout a large file. Additionally, b-tree structures are prone to fragmentation, reducing the speed of range queries. <p><i>Todo: Should have references for the claims above.</i> <p><i>Also, fix the link in the next paragraph to point to a description of the log-structured-merge tree within lsm.wiki (or its successor).</i> <p>LSM uses a <a href=lsm.wiki>different data structure</a> that makes the following performance tradeoffs relative to a b-tree: <ul> <li> A very large percentage of the disk sectors modified when writing to the database are contiguous. Additionally, in many cases the total number of sectors written to disk is reduced. This makes writing to an LSM database much faster than the equivalent b-tree. <li> LSM databases do not suffer from fragmentation to the same degree as b-trees. This means that the performance of large range queries does not degrade as the database is updated as it may with a b-tree. <li> It is accepted that under some circumstances searching an LSM database for a given key will involve examining more disk sectors than it would with a b-tree. In terms of disk sectors accessed when searching a database of size N, both b-trees and LSM provide O(log(N)) efficiency, but the base of the logarithm is generally larger for a b-tree than for LSM. </ul> <p>In other words, writing to an LSM database should be very fast and scanning through large ranges of keys should also perform well, but searching the database for specific keys may be slightly slower than when using a b-tree based system. Additionally, avoiding random writes in favour of largely contiguous updates (as LSM does) can significantly reduce the wear on SSD or flash memory devices. <p>Although it has quite different features to LSM in other respects, LevelDB makes similar performance tradeoffs. <p>Benchmark test results for LSM are <a href=#>available here</a>. <i>Todo: Fix this link to point to a page with performance graphs.</i> <h1 id=using_lsm_in_applications>2. Using LSM in Applications </h1> <p>LSM is not currently built or distributed independently. Instead, it is part of the SQLite4 library. To use LSM in an application, the application links against libsqlite4 and includes the header file "lsm.h" in any files that access the LSM API. <p><i>Pointer to build instructions for sqlite4</i> <h1 id=basic_usage>3. Basic Usage</h1> <h2 id=opening_and_closing_database_connections>3.1. Opening and Closing Database Connections </h2> <p>Opening a connection to a database is a two-step process. The <a href=lsmapi.wiki#lsm_new>lsm_new()</a> function is used to create a new database handle, and the <a href=lsmapi.wiki#lsm_open>lsm_open()</a> function is used to connect an existing database handle to a database on disk. This is because some database connection properties may only be configured before the database is opened. In that case, one or more calls to the <a href=lsmapi.wiki#lsm_config>lsm_config()</a> method are made between the calls to lsm_new() and lsm_open(). <p>The functions are defined as follows: <verbatim> int lsm_new(lsm_env *env, lsm_db **pDb); int lsm_open(lsm_db *db, const char *zFile); </verbatim> <p>Like most lsm_xxx() functions that return type int (the exception is <a href=lsmapi.wiki#lsm_csr_valid>lsm_csr_valid()</a>), both of the above return LSM_OK (0) if successful, or an <a href=lsmapi.wiki#LSM_ERROR>LSM error code</a> otherwise. The first argument to lsm_new() may be passed either a pointer to a <a href=lsmapi.wiki#lsm_env>database environment object</a> or NULL. Almost all applications should pass NULL. A database environment object allows the application to supply custom implementations of the various operating system calls that LSM uses to read and write files, allocate heap memory, and coordinate between multiple application threads and processes. This is normally only required if LSM is being used on a platform that is not supported by default. Passing NULL instructs the library to use the default implementations of all these things, which is usually the right thing to do. The second argument to lsm_new() is an output variable. Assuming the call is successful, *pDb is set to point to the new database handle before returning. <p>The first argument passed to lsm_open() must be an existing database handle. The second is the name of the database file to connect to. Once lsm_open() has been successfully called on a database handle, it can not be called again on the same handle. Attempting to do so is an LSM_MISUSE error. <p>For example, to create a new handle and connect it to database "test.db" on disk: <verbatim> int rc; lsm_db *db; /* Allocate a new database handle */ rc = lsm_new(0, &db); if( rc!=LSM_OK ) exit(1); /* Connect the database handle to database "test.db" */ rc = lsm_open(db, "test.db"); if( rc!=LSM_OK ) exit(1); </verbatim> <p>A database connection can be closed using the lsm_close() function. Calling lsm_close() disconnects from the database (assuming lsm_open() has been successfully called) and deletes the handle itself. Attempting to use a database handle after it has been passed to lsm_close() results in undefined behaviour (likely a segfault). <verbatim> rc = lsm_close(db); </verbatim> <p>It is important that lsm_close() is called to close all database handles created with lsm_new(), particularly if the connection has written to the database. If an application that has written to a database exits without closing its database connection, then subsequent clients may have to run "database recovery" when they open the database, making the lsm_open() call less responsive. Additionally, not matching each successful lsm_new() call with a call to lsm_close() is a resource leak. <p>Counter-intuitively, an lsm_close() call may fail. In this case the database handle is not closed, so if the application exits it invites the "database recovery" performance problem mentioned above. The usual reason for an lsm_close() call failing is that the database handle has been used to create <a href=lsmapi.wiki#lsm_csr_open>database cursors</a> that have not been closed. Unless all database cursors are closed before lsm_close() is called, it fails with an LSM_BUSY error and the database handle is not closed. <h2 id=writing_to_a_database>3.2. Writing to a Database </h2> <p>Three API functions are used to write to the database: <ul> <li> <b>lsm_insert()</b>: insert a new key/value pair into the database, |
︙ | ︙ | |||
207 208 209 210 211 212 213 214 215 216 | lsm_delete(db, "c", 1); lsm_delete_range(db, "c", 1, "f", 1); lsm_delete(db, "f", 1); </verbatim> <h2 id=reading_from_a_database>3.3. Reading from a Database </h2> <verbatim> lsm_csr *csr; | > > > > > > > > > > > | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | > > > > > > > > > > > > > > | > > > > > > > > > > > > > > > > > > > > | > > > > > > > > > > > > > > > > > > > > > > > > > > > | > > > > > > > > > > > > > > > | > > > > > > > > > > | 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 | lsm_delete(db, "c", 1); lsm_delete_range(db, "c", 1, "f", 1); lsm_delete(db, "f", 1); </verbatim> <h2 id=reading_from_a_database>3.3. Reading from a Database </h2> <p>All data read from an LSM database is read via a cursor handle. Cursor handles are opened using the <a href=lsmapi.wiki#lsm_csr_open>lsm_csr_open()</a> API, as follows: <verbatim> lsm_csr *csr; rc = lsm_csr_open(db, &csr); </verbatim> <p>Once an application has finished using a database cursor, it must be closed using the <a href=lsmapi.wiki#lsm_csr_close>lsm_csr_close()</a> API. The lsm_csr_close() function does not return any value. It cannot fail. <verbatim> lsm_csr_close(csr); </verbatim> <p>Database cursors support the following functions for positioning the cursor: <ul> <li> <b>lsm_csr_seek()</b> - move the cursor to point to a nominated database key. <li> <b>lsm_csr_first()</b> - move the cursor to point to the first entry in the database (the one with the smallest key). <li> <b>lsm_csr_last()</b> - move the cursor to point to the last entry in the database (the one with the largest key). <li> <b>lsm_csr_next()</b> - move the cursor to point to the next entry in the the database. <li> <b>lsm_csr_prev()</b> - move the cursor to point to the previous entry in the database. </ul> <p>Once a cursor has been positioned, it supports the following functions for retrieving the details of the current entry: <ul> <li> <b>lsm_csr_valid()</b> - determine whether or not the cursor currently points to a valid entry. <li> <b>lsm_csr_key()</b> - retrieve the key associated with the database entry the cursor points to. <li> <b>lsm_csr_value()</b> - retrieve the value associated with the database entry the cursor points to. <li> <b>lsm_csr_cmp()</b> - compare a key supplied by the application with the key associated with the entry the cursor points to. </ul> <p>The following example demonstrates using the <a href=lsmapi.wiki#lsm_csr_seek>lsm_csr_seek()</a> function to search the database for a specified key, <a href=lsmapi.wiki#lsm_csr_valid>lsm_csr_valid()</a> to check if the search was successful, and <a href=lsmapi.wiki#lsm_csr_value>lsm_csr_value()</a> to retrieve the value associated with the key within the database. <verbatim> rc = lsm_csr_seek(csr, "b", 1, LSM_SEEK_EQ); if( lsm_csr_valid(csr) ){ const void *pVal; int nVal; rc = lsm_csr_value(csr, &pVal, &nVal); if( rc==LSM_OK ){ /* pVal now points to a buffer nVal bytes in size containing the ** value associated with database key "b". */ } } </verbatim> <p> The example code below iterates forwards through all entries (in key order, from smallest to largest) in the database. Function <a href=lsmapi.wiki#lsm_csr_first>lsm_csr_first()</a> is used to position the cursor to point to the first entry in the database, and <a href=lsmapi.wiki#lsm_csr_next>lsm_csr_next()</a> is used to advance to the next entry. After lsm_csr_next() is called to advance past the final entry in the database, the cursor is left pointing to no entry at all, lsm_csr_valid() returns 0, and the loop is finished. API function <a href=lsmapi.wiki#lsm_csr_key>lsm_csr_key()</a> is used to retrieve the key associated with each database entry visited. <verbatim> for(rc = lsm_csr_first(csr); lsm_csr_valid(csr); rc = lsm_csr_next(csr)){ const void *pKey; int nKey; const void *pVal; int nVal; rc = lsm_csr_key(csr, &pKey, &nKey); if( rc==LSM_OK ) rc = lsm_csr_value(csr, &pVal, &nVal); if( rc!=LSM_OK ) break; /* At this point pKey points to the current key (size nKey bytes) and ** pVal points to the corresponding value (size nVal bytes). */ } </verbatim> <p> The example code above could be modified to iterate backwards through the entries in the database (again in key order, but this time from largest to smallest) by replacing the call to lsm_csr_first() with <a href=lsmapi.wiki#lsm_csr_last>lsm_csr_last()</a> and the call to lsm_csr_next() with <a href=lsmapi.wiki#lsm_csr_prev>lsm_csr_prev()</a>. <p>The signature of lsm_csr_seek() is: <verbatim> int lsm_csr_seek(lsm_cursor *csr, const void *pKey, int nKey, int eSeek); </verbatim> <p>The second and third arguments passed to lsm_csr_seek() define the key to search the database for (pKey must point to the buffer containing the nKey byte key when this function is called). Assuming no error occurs, if there an entry with the requested key is present in the database, the cursor is left pointing to it. Otherwise, if no such entry is present, the final position of the cursor depends on the value passed as the fourth parameter to lsm_csr_seek(). Valid values for the fourth parameter to lsm_csr_seek() are: <dl> <dt> LSM_SEEK_EQ <dd> <p style=margin-top:0> In this case, if the specified key is not present in the database, the cursor is not left pointing to any database entry (i.e. calling lsm_csr_valid() returns 0). <dt> LSM_SEEK_LE <dd> <p style=margin-top:0> If the specified key is not present in the database and the fourth argument to lsm_csr_seek() is LSM_SEEK_LE (Less than or Equal), the cursor is left pointing to the database entry with the largest key that is less than the specified key. Or, if there are no entries in the database with keys smaller than the specified key, the cursor is left pointing to no entry at all. <dt> LSM_SEEK_GE <dd> <p style=margin-top:0> If the specified key is not present in the database and the fourth argument to lsm_csr_seek() is LSM_SEEK_GE (Greater than or Equal), the cursor is left pointing to the database entry with the smallest key that is greater than the specified key. Or, if there are no entries in the database with keys larger than the specified key, the cursor is left pointing to no entry at all. <dd> <p style=margin-top:0> </dl> <p> Calls made to lsm_csr_seek() with LSM_SEEK_EQ as the final argument are slightly more efficient than those made specifying LSM_SEEK_LE or LSM_SEEK_GE. So to retrieve a specific entry from a database, LSM_SEEK_EQ should be preferred. The other two values are primarily useful for implementing range queries. For example, to iterate backwards through all keys from "ggg" to "cc", inclusive: <verbatim> for(rc = lsm_csr_seek(csr, "ggg", 3, LSM_SEEK_LE); lsm_csr_valid(csr); rc = lsm_csr_prev(csr)){ const void *pKey; int nKey; const void *pVal; int nVal; int res; /* Compare the key that the cursor currently points to with "cc". If ** the cursor key is less than "cc", break out of the loop. */ rc = lsm_csr_cmp(csr, "cc", 2, &res); if( rc!=LSM_OK || res<0 ) break; rc = lsm_csr_key(csr, &pKey, &nKey); if( rc==LSM_OK ) rc = lsm_csr_value(csr, &pVal, &nVal); if( rc!=LSM_OK ) break; /* At this point pKey points to the current key (size nKey bytes) and ** pVal points to the corresponding value (size nVal bytes). */ } </verbatim> <p>In the example code above, the call to lsm_csr_seek() positions the cursor to point to the entry with key "ggg", if it exists, or to the largest entry in the database with a key smaller than "ggg" if such a key can be found, or to EOF otherwise. The lsm_csr_prev() call advances the cursor to the next entry in the database file (in key order from largest to smallest), and the lsm_csr_valid() call returns 0 to break out of the loop once the cursor is advance past the entry with the smallest key in the database. So on its own, the "for" statement serves to iterate the cursor in reverse order through all keys in the database less than or equal to "ggg". <p>The call to <a href=lsmapi.wiki#lsm_csr_cmp>lsm_csr_cmp()</a> call in the body of the loop is used to enforce the lower bound (keys >= "cc") on the range query by breaking out of the loop if an entry with a key smaller than "cc" is ever visited. lsm_csr_cmp() has the following signature: <verbatim> int lsm_csr_cmp(lsm_cursor *csr, const void *pKey, int nKey, int *piRes); </verbatim> <p> When lsm_csr_cmp() is called, the key specified by the second and third arguments (pKey and nKey) is compared to the database key that the cursor currently points to. Assuming no error occurs, depending on whether or not the cursors key is less than, equal to, or greater than the specified key, *piRes is set to a value less than, equal to, or greater than zero before returning. In other words: <pre><i> *piRes = (cursors key) - (specified key) </i></pre> <h2 id=database_transactions_and_mvcc>3.4. Database Transactions and MVCC </h2> <p>LSM supports a single-writer/multiple-reader <a href=http://en.wikipedia.org/wiki/Multiversion_concurrency_control>MVCC</a> based transactional concurrency model. This is the same model that SQLite supports in <a href="http://www.sqlite.org/wal.html">WAL mode</a>. |
︙ | ︙ | |||
423 424 425 426 427 428 429 | lsm_delete(db, "k", 1); lsm_rollback(db, 2); lsm_delete(db, "m", 1); lsm_commit(db, 0); </verbatim> | | | 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 | lsm_delete(db, "k", 1); lsm_rollback(db, 2); lsm_delete(db, "m", 1); lsm_commit(db, 0); </verbatim> <h1 id=data_durability>4. Data Durability </h1> <p>The value of the configuration parameter LSM_CONFIG_SAFETY determines how often data is synced to disk by the LSM library. This is an important tradeoff - syncing less often can lead to orders of magnitude better performance, but also exposes the application to the risk of partial or total data loss in the event of a power failure; |
︙ | ︙ | |||
616 617 618 619 620 621 622 | two or more existing segments, the in-memory snapshot is updated immediately. This is the snapshot that database clients use when querying or otherwise operating on the database. <p> At any point after the in-memory snapshot has been updated, the in-memory snapshot may be written into the database file header. This is known as "checkpointing" the database. Depending on the value of the | | | | | | | | 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 | two or more existing segments, the in-memory snapshot is updated immediately. This is the snapshot that database clients use when querying or otherwise operating on the database. <p> At any point after the in-memory snapshot has been updated, the in-memory snapshot may be written into the database file header. This is known as "checkpointing" the database. Depending on the value of the <a href=lsmapi.wiki#LSM_CONFIG_SAFETY>LSM_CONFIG_SAFETY</a> parameter, it may be necessary to ensure that all segments referenced by the snapshot have been synced to disk (safely stored on the persistent media such that they will not be lost if a power failure occurs) before doing so. It is not necessary for every version of the in-memory snapshot to be checkpointed. The in-memory snapshot may be modified multiple times between checkpoints. <p> Because a checkpointer process is often required to sync the database file before updating the database header, "checkpointing" often appears to be the costliest part of transfering data to the database file, at least in terms of wall-clock time. |
︙ | ︙ | |||
1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 | <ul> <li> LSM_CONFIG_MMAP <li> LSM_CONFIG_MULTIPLE_PROCESSES <li> LSM_CONFIG_USE_LOG </ul> </i> | > > | 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 | <ul> <li> LSM_CONFIG_MMAP <li> LSM_CONFIG_MULTIPLE_PROCESSES <li> LSM_CONFIG_USE_LOG </ul> </i> |