Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Updates to lsmusr.wiki.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 1ea91878201f0998b35a6843da7c9e4c51bceb3d
User & Date: dan 2012-11-14 20:09:24.942
Context
2012-11-15
14:19
Add words to lsmusr.wiki. check-in: 2077c9d152 user: dan tags: trunk
2012-11-14
20:09
Updates to lsmusr.wiki. check-in: 1ea9187820 user: dan tags: trunk
18:23
Improvements to lsmusr.wiki. check-in: e47b5e3ae6 user: dan tags: trunk
Changes
Unified Diff Ignore Whitespace Patch
Changes to www/lsmusr.wiki.
1
2
3
4
5
6
7
8

9
10
11
12
13
14
15
16
17
18
19
20
21
22

23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42

<title>LSM Users Guide</title>
<nowiki>

<h2>Table of Contents</h2>





<div id=start_of_toc></div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#introduction_to_lsm style=text-decoration:none>1. Introduction to LSM</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#using_lsm_in_applications style=text-decoration:none>2. Using LSM in Applications </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#basic_usage style=text-decoration:none>3. Basic Usage</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#opening_and_closing_database_connections style=text-decoration:none>3.1. Opening and Closing Database Connections </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#writing_to_a_database style=text-decoration:none>3.2. Writing to a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#reading_from_a_database style=text-decoration:none>3.3. Reading from a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_transactions_and_mvcc style=text-decoration:none>3.4. Database Transactions and MVCC </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#data_durability style=text-decoration:none>4. Data Durability </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compressed_and_encrypted_databases style=text-decoration:none>5. Compressed and Encrypted Databases </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#performance_tuning style=text-decoration:none>6. Performance Tuning</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#architectural_overview style=text-decoration:none>6.1. Architectural Overview </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#work_and_checkpoint_scheduling style=text-decoration:none>6.2. Work and Checkpoint Scheduling </a><br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#automatic_work_and_checkpoint_scheduling style=text-decoration:none>6.2.1. Automatic Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#explicit_work_and_checkpoint_scheduling style=text-decoration:none>6.2.2. Explicit Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compulsary_work_and_checkpoint_scheduling style=text-decoration:none>6.2.3. Compulsary Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_optimization style=text-decoration:none>6.3. Database Optimization</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#other_parameters style=text-decoration:none>6.4. Other Parameters </a><br>

<div id=end_of_toc></div>

<h2>Overview</h2>

<p>This document describes the LSM embedded database library and use thereof. 
It is intended to be part user-manual and part tutorial. It is intended to
to complement the <a href=lsmapi.wiki>LSM API reference manual</a>.

<p>The <a href=#introduction_to_lsm>first section</a> of this document contains
a description of the LSM library and its features. 
<a href=#using_lsm_in_applications>Section 2</a> describes how to use LSM from
within a C or C++ application (how to compile and link LSM, what to #include
etc.). The <a href=#basic_usage>third section</a> describes the essential APIs
that applications use to open and close database connections, and to read from








>












|
|
>
|
|
|
|
<







|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28

29
30
31
32
33
34
35
36
37
38
39
40
41
42
43

<title>LSM Users Guide</title>
<nowiki>

<h2>Table of Contents</h2>





<div id=start_of_toc></div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#introduction_to_lsm style=text-decoration:none>1. Introduction to LSM</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#using_lsm_in_applications style=text-decoration:none>2. Using LSM in Applications </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#basic_usage style=text-decoration:none>3. Basic Usage</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#opening_and_closing_database_connections style=text-decoration:none>3.1. Opening and Closing Database Connections </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#writing_to_a_database style=text-decoration:none>3.2. Writing to a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#reading_from_a_database style=text-decoration:none>3.3. Reading from a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_transactions_and_mvcc style=text-decoration:none>3.4. Database Transactions and MVCC </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#data_durability style=text-decoration:none>4. Data Durability </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compressed_and_encrypted_databases style=text-decoration:none>5. Compressed and Encrypted Databases </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#performance_tuning style=text-decoration:none>6. Performance Tuning</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#performance_related_configuration_options style=text-decoration:none>6.1. Performance Related Configuration Options </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#using_worker_threads_or_processes style=text-decoration:none>6.2. Using Worker Threads or Processes </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#architectural_overview style=text-decoration:none>6.2.1. Architectural Overview </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#automatic_work_and_checkpoint_scheduling style=text-decoration:none>6.2.2. Automatic Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#explicit_work_and_checkpoint_scheduling style=text-decoration:none>6.2.3. Explicit Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compulsary_work_and_checkpoint_scheduling style=text-decoration:none>6.2.4. Compulsary Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_file_optimization style=text-decoration:none>6.3. Database File Optimization</a><br>


<div id=end_of_toc></div>

<h2>Overview</h2>

<p>This document describes the LSM embedded database library and use thereof. 
It is intended to be part user-manual and part tutorial. It is intended to
complement the <a href=lsmapi.wiki>LSM API reference manual</a>.

<p>The <a href=#introduction_to_lsm>first section</a> of this document contains
a description of the LSM library and its features. 
<a href=#using_lsm_in_applications>Section 2</a> describes how to use LSM from
within a C or C++ application (how to compile and link LSM, what to #include
etc.). The <a href=#basic_usage>third section</a> describes the essential APIs
that applications use to open and close database connections, and to read from
766
767
768
769
770
771
772










































































































773
774
775
776
777
778
779
780
<p><i>Maybe there should be a way to register a mismatch-handler callback.
Otherwise, applications have to handle LSM_MISMATCH everywhere...
</i>


<h1 id=performance_tuning>6. Performance Tuning</h1>











































































































<h2 id=architectural_overview>6.1. Architectural Overview </h2>

<p> The LSM library implements two separate data structures that are used 
together to store user data. When the database is queried, the library 
actually runs parallel queries on both of these data stores and merges the
results together to return to the user. The data structures are:

<ul>







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|







767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
<p><i>Maybe there should be a way to register a mismatch-handler callback.
Otherwise, applications have to handle LSM_MISMATCH everywhere...
</i>


<h1 id=performance_tuning>6. Performance Tuning</h1>

<p> This section describes the various measures that can be taken in order to
fine-tune LSM in order to improve performance in specific circumstances.
Sub-section 6.1 identifies the 
<a href=#performance_related_configuration_options> configuration
parameters</a> that can be used to influence database performance. 
Sub-section 6.2 discusses methods for shifting the time-consuming processes of
actually writing and syncing the database file to 
<a href=#using_worker_threads_or_processes>background threads or processes</a> 
in order to make writing to the database more responsive. Finally, 6.
3 introduces "<a href=#database_file_optimization>database optimization</a>"
- the process of reorganizing a database file internally so that it is as small
as possible and optimized for search queries.

<h2 id=performance_related_configuration_options>6.1. Performance Related Configuration Options </h2>

<p>The options in this section all take integer values. They may be both
set and queried using the <a href=lsmapi.wiki#lsm_config>lsm_config()</a>
function. To set an option to a value, lsm_config() is used as follows:

<verbatim>
  /* Set the LSM_CONFIG_AUTOFLUSH option to 1MB */
  int iVal = 1 * 1024 * 1024;
  rc = lsm_config(db, LSM_CONFIG_AUTOFLUSH, &iVal);
</verbatim>

<p>In order to query the current value of an option, the initial value of
the parameter (iVal in the example code above) should be set to a negative
value. Or any other value that happens to be out of range for the parameter -
negative values just happen to be out of range for all integer lsm_config()
parameters.

<verbatim>
  /* Set iVal to the current value of LSM_CONFIG_AUTOFLUSH */
  int iVal = -1;
  rc = lsm_config(db, LSM_CONFIG_AUTOFLUSH, &iVal);
</verbatim>

<dl>
  <dt> <a href=lsmapi.wiki#LSM_CONFIG_MMAP>LSM_CONFIG_MMAP</a>
  <dd> <p style=margin-top:0>
    This option may be set to either 1 (true) or 0 (false). If it is set to
    true and LSM is running on a system with a 64-bit address space, the
    entire database file is memory mapped. Or, if it is false or LSM is 
    running in a 32-bit address space, data is accessed using ordinary
    OS file read and write primitives. Memory mapping the database file
    can significantly improve the performance of read operations, as database 
    pages do not have to be copied from operating system buffers into user 
    space buffers before they can be examined. 

    <p>This option can only be set before lsm_open() is called on the database
    connection.

    <p>The default value is 1 (true).

  <dt> <a href=lsmapi.wiki#LSM_CONFIG_MULTIPLE_PROCESSES>LSM_CONFIG_MULTIPLE_PROCESSES</a>
  <dd> <p style=margin-top:0>
    This option may also be set to either 1 (true) or 0 (false). If it is
    set to 0, then the library assumes that all database clients are located 
    within the same process (have access to the same memory space). Assuming
    this means the library can avoid using OS file locking primitives to lock
    the database file, which speeds up opening and closing read and write
    transactions. 

    <p>This option can only be set before lsm_open() is called on the database
    connection.

    <p>The default value is 1 (true).

  <dt> <a href=lsmapi.wiki#LSM_CONFIG_USE_LOG>LSM_CONFIG_USE_LOG</a>
  <dd> <p style=margin-top:0>
    This is another option may also be set to either 1 (true) or 0 (false). 
    If it is set to false, then the library does not write data into the
    database log file. This makes writing faster, but also means that if
    an application crash or power failure occurs, it is very likely that
    any recently committed transactions will be lost.

    <p>If this option is set to true, then an application crash cannot cause
    data loss. Whether or not data loss may occur in the event of a power
    failure depends on the value of the <a href=#data_durability>
    LSM_CONFIG_SAFETY</a> parameter.

    <p>This option can only be set if the connection does not currently have
    an open write transaction.

    <p>The default value is 1 (true).

  <dt> <a href=lsmapi.wiki#LSM_CONFIG_AUTOFLUSH>LSM_CONFIG_AUTOFLUSH</a>
  <dd> <p style=margin-top:0>

  <dt> <a href=lsmapi.wiki#LSM_CONFIG_AUTOCHECKPOINT>LSM_CONFIG_AUTOCHECKPOINT</a>
  <dd> <p style=margin-top:0>

</dl>

<h2 id=using_worker_threads_or_processes>6.2. Using Worker Threads or Processes </h2>

<p><i>Todo: Fix the following </p>

<p>The section above describes the three stages of transfering data written
to the database from the application to persistent storage. A "writer" 
client writes the data into the in-memory tree and log file. Later on a 
"worker" client flushes the data from the in-memory tree to a new segment
in the the database file. Additionally, a worker client must periodically
merge existing database segments together to prevent them from growing too
numerous.

<h3 id=architectural_overview>6.2.1. Architectural Overview </h3>

<p> The LSM library implements two separate data structures that are used 
together to store user data. When the database is queried, the library 
actually runs parallel queries on both of these data stores and merges the
results together to return to the user. The data structures are:

<ul>
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
database file header (to checkpoint the database). 
</table>

<p>The tasks associated with each of the locks above may be performed
concurrently by multiple database connections, located either in the same
application process or different processes.

<h2 id=work_and_checkpoint_scheduling>6.2. Work and Checkpoint Scheduling </h2>

<p>The section above describes the three stages of transfering data written
to the database from the application to persistent storage. A "writer" 
client writes the data into the in-memory tree and log file. Later on a 
"worker" client flushes the data from the in-memory tree to a new segment
in the the database file. Additionally, a worker client must periodically
merge existing database segments together to prevent them from growing too
numerous.

<h3 id=automatic_work_and_checkpoint_scheduling>6.2.1. Automatic Work and Checkpoint Scheduling</h3>

<p>By default, database "work" (the flushing and merging of segments, performed
by clients holding the WORKER lock) and checkpointing are scheduled and
performed automatically from within calls to "write" API functions. The 
"write" functions are:

<ul>







<
<
<
<
<
<
<
<
<
<
|







1010
1011
1012
1013
1014
1015
1016










1017
1018
1019
1020
1021
1022
1023
1024
database file header (to checkpoint the database). 
</table>

<p>The tasks associated with each of the locks above may be performed
concurrently by multiple database connections, located either in the same
application process or different processes.











<h3 id=automatic_work_and_checkpoint_scheduling>6.2.2. Automatic Work and Checkpoint Scheduling</h3>

<p>By default, database "work" (the flushing and merging of segments, performed
by clients holding the WORKER lock) and checkpointing are scheduled and
performed automatically from within calls to "write" API functions. The 
"write" functions are:

<ul>
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
than zero, after performing database work, the library automatically checks
how many bytes of raw data have been written to the database file since the
last checkpoint (by any client, not just by the current client). If this
value is greater than the value of the LSM_CONFIG_AUTOCHECKPOINT parameter,
a checkpoint is attempted. It is not an error if the attempt fails because the
CHECKPOINTER lock cannot be obtained.

<h3 id=explicit_work_and_checkpoint_scheduling>6.2.2. Explicit Work and Checkpoint Scheduling</h3>

<p>The alternative to automatic scheduling of work and checkpoint operations
is to explicitly schedule them. Possibly in a background thread or dedicated
application process. In order to disable automatic work, a client must set
the LSM_CONFIG_AUTOWORK parameter to zero. This parameter is a property of
a database connection, not of a database itself, so it must be cleared
separately by all processes that may write to the database. Otherwise, they







|







1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
than zero, after performing database work, the library automatically checks
how many bytes of raw data have been written to the database file since the
last checkpoint (by any client, not just by the current client). If this
value is greater than the value of the LSM_CONFIG_AUTOCHECKPOINT parameter,
a checkpoint is attempted. It is not an error if the attempt fails because the
CHECKPOINTER lock cannot be obtained.

<h3 id=explicit_work_and_checkpoint_scheduling>6.2.3. Explicit Work and Checkpoint Scheduling</h3>

<p>The alternative to automatic scheduling of work and checkpoint operations
is to explicitly schedule them. Possibly in a background thread or dedicated
application process. In order to disable automatic work, a client must set
the LSM_CONFIG_AUTOWORK parameter to zero. This parameter is a property of
a database connection, not of a database itself, so it must be cleared
separately by all processes that may write to the database. Otherwise, they
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
  rc = lsm_info(db, LSM_INFO_TREE_SIZE, &nOld, &nLive);
</verbatim>

<verbatim>
  int lsm_flush(lsm_db *db);
</verbatim>

<h3 id=compulsary_work_and_checkpoint_scheduling>6.2.3. Compulsary Work and Checkpoint Scheduling</h3>

<p>Apart from the scenarios described above, there are two there are two 
scenarios where database work or checkpointing may be performed automatically,
regardless of the value of the LSM_CONFIG_AUTOWORK parameter.

<ul>
  <li> When closing a database connection, and 







|







1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
  rc = lsm_info(db, LSM_INFO_TREE_SIZE, &nOld, &nLive);
</verbatim>

<verbatim>
  int lsm_flush(lsm_db *db);
</verbatim>

<h3 id=compulsary_work_and_checkpoint_scheduling>6.2.4. Compulsary Work and Checkpoint Scheduling</h3>

<p>Apart from the scenarios described above, there are two there are two 
scenarios where database work or checkpointing may be performed automatically,
regardless of the value of the LSM_CONFIG_AUTOWORK parameter.

<ul>
  <li> When closing a database connection, and 
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193





1194
1195
1196
1197









1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
</ul>

<p>Finally, regardless of age, a database is limited to a maximum of 64
segments in total. If an attempt is made to flush an in-memory tree to disk
when the database already contains 64 segments, two or more existing segments
must be merged together before the new segment can be created.

<h2 id=database_optimization>6.3. Database Optimization</h2>

<p>Database optimization transforms the contents of database file so that
the following are true:

<ul>
  <li> All database content is stored in a single segment.





  <li> The database file contains no (or as little as possible) free space.
       In other words, it is no larger than required to contain the single
       segment.
</ul>










<p>In order to optimize the database, lsm_work() should be called repeatedly
with the nMerge argument set to 1 until it returns without writing any data
to the database file. For example:

<verbatim>
  int nWrite;
  int rc;
  do {
    rc = lsm_work(db, 1, 2*1024*1024, &nWrite);
  }while( rc==LSM_OK && nWrite>0 );
</verbatim>

<p>When optimizing the database as above, the LSM_CONFIG_AUTOCHECKPOINT
parameter should be set to a non-zero value, or otherwise lsm_checkpoint()
should be called periodically. Otherwise, no checkpoints will be performed,
preventing the library from reusing any space occupied by old segments even
after their content has been merged into the new segment. The result - a
database file that is optimized, except that it is up to twice as large as
it otherwise would be.

<h2 id=other_parameters>6.4. Other Parameters </h2>

<i>
<p>Mention other configuration options that can be used to tune performance
here.

<ul>
  <li> LSM_CONFIG_MMAP
  <li> LSM_CONFIG_MULTIPLE_PROCESSES
  <li> LSM_CONFIG_USE_LOG
</ul>

</i>












|





|
>
>
>
>
>
|



>
>
>
>
>
>
>
>
>













|
|
|
|
|
|
<

<

<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
<
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327

1328

1329
















</ul>

<p>Finally, regardless of age, a database is limited to a maximum of 64
segments in total. If an attempt is made to flush an in-memory tree to disk
when the database already contains 64 segments, two or more existing segments
must be merged together before the new segment can be created.

<h2 id=database_file_optimization>6.3. Database File Optimization</h2>

<p>Database optimization transforms the contents of database file so that
the following are true:

<ul>
  <li> <p>All database content is stored in a single 
       <a href=#architectural_overview>segment</a>. This makes the
       database effectively equivalent to an optimally packed b-tree stucture
       for search operations - minimizing the number of disk sectors that need
       to be visted when searching the database.

  <li> <p>The database file contains no (or as little as possible) free space.
       In other words, it is no larger than required to contain the single
       segment.
</ul>

<p><i> Should we add a convenience function lsm_optimize() that does not 
return until the database is completely optimized? One that more or less does
the same as the example code below and deals with the AUTOCHECKPOINT issue?
This would help with this user manual if nothing else, as it means a method
for database optimization can be presented without depending on the previous
section.

</i>

<p>In order to optimize the database, lsm_work() should be called repeatedly
with the nMerge argument set to 1 until it returns without writing any data
to the database file. For example:

<verbatim>
  int nWrite;
  int rc;
  do {
    rc = lsm_work(db, 1, 2*1024*1024, &nWrite);
  }while( rc==LSM_OK && nWrite>0 );
</verbatim>

<p>When optimizing the database as above, either the LSM_CONFIG_AUTOCHECKPOINT
parameter should be set to a non-zero value or lsm_checkpoint() should be
called periodically. Otherwise, no checkpoints will be performed, preventing
the library from reusing any space occupied by old segments even after their
content has been merged into the new segment. The result - a database file that
is optimized, except that it is up to twice as large as it otherwise would be.