SQLite4
Check-in [71b26d318d]
Not logged in

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Add table of contents to lsmusr.wiki.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 71b26d318dfeaba5ee975890f63b86b9ccdcbede
User & Date: dan 2012-11-13 14:03:23
Context
2012-11-13
18:44
Add lsmapi.wiki. And the script that generates it from lsm.h - tool/mklsmapi.tcl. check-in: 2377f4f991 user: dan tags: trunk
14:03
Add table of contents to lsmusr.wiki. check-in: 71b26d318d user: dan tags: trunk
2012-11-12
20:19
Fix small issues in lsmusr.wiki. check-in: 3904797435 user: dan tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Added tool/fixnumbers.tcl.





















































































































































>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74

if {[llength $argv]!=1} { 
  puts stderr "Usage: $argv0 FILENAME"
  exit -1
}
set zFile [lindex $argv 0]
if {![file exists $zFile]} {
  puts stderr "No such file: $zFile"
  exit -1
}
set fd [open $zFile]
set out ""

set idx(1) 0
set idx(2) 0
set idx(3) 0

set iSeenToc 0
set before_toc ""
set after_toc ""
set out ""

while {![eof $fd]} {
  set line [gets $fd]

  if {[string trim $line]=="<div id=start_of_toc></div>"} {
    if {$iSeenToc!=0} {error "start_of_toc mismatch"}
    set iSeenToc 1
    set before_toc $out
    set out ""
    continue
  }
  if {[string trim $line]=="<div id=end_of_toc></div>"} {
    if {$iSeenToc!=1} {error "end_of_toc mismatch"}
    set iSeenToc 2
    continue
  }
  if {$iSeenToc==1} continue

  if {[regexp {<h([123]).*>} $line -> iLevel]} {
    if {$iLevel==1 || $idx(1)>0} {
      incr idx($iLevel)
      for {set i [expr $iLevel+1]} {$i < 4} {incr i} { set idx($i) 0 }
      set num ""
      for {set i 1} {$i <= $iLevel} {incr i} { append num "$idx($i)." }

      regsub -all {<[^>]*>} $line "" tag
      regsub -all {[0123456789.]} $tag "" tag
      set tag [string map {" " _} [string trim [string tolower $tag]]]

      set pattern [subst -nocom {<h$iLevel[^>]*>[0123456789. ]*}]
      regsub $pattern $line "<h$iLevel id=$tag>$num " line

      regsub -all {<[^>]*>} $line "" text
      set margin [string repeat {&nbsp;} [expr $iLevel * 6]]
      append toc "${margin}<a href=#$tag style=text-decoration:none>$text</a><br>\n"
    }
  }
  append out "$line\n"
}
if {$iSeenToc!=2} {error "No toc divs..."}
set after_toc $out



puts $before_toc
puts "<div id=start_of_toc></div>"
puts $toc
puts "<div id=end_of_toc></div>"
puts $after_toc




Changes to www/lsmusr.wiki.

1
2
3
4

5






















6
7
8



9
10
11
12
13
14
15
..
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
..
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
...
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
...
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
...
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
...
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
...
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
...
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
...
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
...
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
...
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
...
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
...
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933


934
935
936
937

<title>LSM Users Guide</title>
<nowiki>

























<h1>1. Overview</h1>

<p>This page describes the LSM embedded database library and usage thereof. 




<p>LSM is an embedded database library for key-value data, roughly similar
in scope to
<a href="http://www.oracle.com/technetwork/products/berkeleydb/overview/index.html">Berkeley DB</a>, 
<a href="http://code.google.com/p/leveldb/">LevelDB</a> or
<a href="http://fallabs.com/kyotocabinet/">KyotoCabinet</a>.
Both keys and
................................................................................
compression and/or encryption functions to transform data before it is
stored in the database file.

<p><i>Say something about the difference in performance characteristics 
between a b-tree and whatever it is LSM is. Link the performance graphs page.
</i>

<h1>2. Using the LSM Library </h1>

<p>LSM is not currently built or distributed independently. Instead, it
is part of the SQLite4 library. To use LSM in an application, the application
links against libsqlite4 and includes the header file "lsm.h" in any files
that access the LSM API.

<p><i>Pointer to build instructions for sqlite4</i>

<h1>3. Basic Usage</h1>

<h2>3.1 Opening and Closing Database Connections </h2>

<p>Opening a connection to a database is a two-step process.

<verbatim>
  int rc;
  lsm_db *db;

................................................................................
  if( rc!=LSM_OK ) exit(1);
</verbatim>

<verbatim>
  rc = lsm_close(db);
</verbatim>

<h2>3.2 Writing to a Database </h2>

<p>Three API functions are used to write to the database:

<ul>
  <li> <b>lsm_insert()</b>: insert a new key/value pair into the database,
       overwriting any existing entry with the same key.
  <li> <b>lsm_delete()</b>: remove a specific key from the database.
................................................................................
<verbatim>
  /* Should be checking return codes! */
  lsm_delete(db, "c", 1);
  lsm_delete_range(db, "c", 1, "f", 1);
  lsm_delete(db, "f", 1);
</verbatim>

<h2>3.3 Reading from a Database </h2>

<verbatim>
  lsm_csr *csr;

  rc = lsm_csr_open(db, &csr);
</verbatim>

................................................................................
  }
</verbatim>

<verbatim>
  lsm_csr_close(csr);
</verbatim>

<h2 id=robustness>3.4 Database Robustness </h2>

<p>The value of the configuration parameter LSM_CONFIG_SAFETY determines
how often data is synced to disk by the LSM library. This is an important
tradeoff - syncing less often can lead to orders of magnitude better
performance, but also exposes the application to the risk of partial or total
data loss in the event of a power failure;

................................................................................
  /* At this point, variable iSafety is set to the currently configured value
  ** of the LSM_CONFIG_SAFETY parameter (either 0, 1 or 2).  */
</verbatim>

<p>The lsm_config() function may also be used to configure other database
connection parameters.  

<h2>3.5 Database Transactions and MVCC </h2>

<p>LSM supports a single-writer/multiple-reader 
<a href=http://en.wikipedia.org/wiki/Multiversion_concurrency_control>MVCC</a>
based transactional concurrency model. This is the same model that SQLite
supports in <a href="http://www.sqlite.org/wal.html">WAL mode</a>.

<p>A read-transaction must be opened in order to read from the database. 
................................................................................
  lsm_delete(db, "k", 1);
  lsm_rollback(db, 2);
  lsm_delete(db, "m", 1);
  lsm_commit(db, 0);
  
</verbatim>

<h1>4. Compressed and Encrypted Databases </h1>

<p>LSM does not provide built-in methods for creating encrypted or compressed
databases. Instead, it allows the user to provide hooks to call external
functions to compress and/or encrypt data before it is written to the database
file, and to decrypt and/or uncompress data as it is read from the database
file.

................................................................................
fails (no transaction or database cursor is opened).

<p><i>Maybe there should be a way to register a mismatch-handler callback.
Otherwise, applications have to handle LSM_MISMATCH everywhere...
</i>


<h1>5. Performance Tuning</h1>

<h2>5.1 Architectural Overview </h2>

<p> The LSM library implements two separate data structures that are used 
together to store user data. When the database is queried, the library 
actually runs parallel queries on both of these data stores and merges the
results together to return to the user. The data structures are:

<ul>
................................................................................
database file header (to checkpoint the database). 
</table>

<p>The tasks associated with each of the locks above may be performed
concurrently by multiple database connections, located either in the same
application process or different processes.

<h2>5.2 Work and Checkpoint Scheduling </h2>

<p>The section above describes the three stages of transfering data written
to the database from the application to persistent storage. A "writer" 
client writes the data into the in-memory tree and log file. Later on a 
"worker" client flushes the data from the in-memory tree to a new segment
in the the database file. Additionally, a worker client must periodically
merge existing database segments together to prevent them from growing too
numerous.

<h3>5.2.1 Automatic Work and Checkpoint Scheduling</h3>

<p>By default, database "work" (the flushing and merging of segments, performed
by clients holding the WORKER lock) and checkpointing are scheduled and
performed automatically from within calls to "write" API functions. The 
"write" functions are:

<ul>
................................................................................
<p>If auto-work is enabled, the library periodically checks the state of the
database file to see if there exist N or more segments with the same age
value A, where N is the value assigned to the LSM_CONFIG_AUTOMERGE parameter.
If so, work is done to merge all such segments with age=A into a new, larger
segment assigned age=A+1. At present, "periodically" as used above means 
roughly once for every 32KB of data (including overhead) written to the
in-memory tree. The merge operation is not necessarily completed within 
the same call to a write API in which it is started (this would result in
blocking the writer thread for too long in many cases - in large databases
segments may grow to be many GB in size). At present, the amount of data
written by a single auto-work operation is roughly 32KB multiplied by the
number of segments in the database file. This formula may change - the point 
is that the library attempts to limit the amount of data written in order to
avoid blocking the writer thread for too long within a single API call.

<p>Each time a transaction is committed in auto-work mode, the library checks
to see if there exists an "old" in-memory tree (see the LSM_CONFIG_AUTOFLUSH
option above). If so, it attempts to flush it to disk immediately. Unlike
merges of existing segments, the entire in-memory tree must be flushed to
disk before control is returned to the user. It is not possible to
incrementally flush an in-memory tree in the same ways as it is possible to
................................................................................
than zero, after performing database work, the library automatically checks
how many bytes of raw data have been written to the database file since the
last checkpoint (by any client, not just by the current client). If this
value is greater than the value of the LSM_CONFIG_AUTOCHECKPOINT parameter,
a checkpoint is attempted. It is not an error if the attempt fails because the
CHECKPOINTER lock cannot be obtained.

<h3>5.2.2 Explicit Work and Checkpoint Scheduling</h3>

<p>The alternative to automatic scheduling of work and checkpoint operations
is to explicitly schedule them. Possibly in a background thread or dedicated
application process. In order to disable automatic work, a client must set
the LSM_CONFIG_AUTOWORK parameter to zero. This parameter is a property of
a database connection, not of a database itself, so it must be cleared
separately by all processes that may write to the database. Otherwise, they
................................................................................
  rc = lsm_info(db, LSM_INFO_TREE_SIZE, &nOld, &nLive);
</verbatim>

<verbatim>
  int lsm_flush(lsm_db *db);
</verbatim>

<h3>5.2.3 Compulsary Work and Checkpoint Scheduling</h3>

<p>Apart from the scenarios described above, there are two there are two 
scenarios where database work or checkpointing may be performed automatically,
regardless of the value of the LSM_CONFIG_AUTOWORK parameter.

<ul>
  <li> When closing a database connection, and 
................................................................................
</ul>

<p>Finally, regardless of age, a database is limited to a maximum of 64
segments in total. If an attempt is made to flush an in-memory tree to disk
when the database already contains 64 segments, two or more existing segments
must be merged together before the new segment can be created.

<h2>5.3 Database Optimization</h2>

<p>Database optimization transforms the contents of database file so that
the following are true:

<ul>
  <li> All database content is stored in a single segment.
  <li> The database file contains no (or as little as possible) free space.
................................................................................
parameter should be set to a non-zero value, or otherwise lsm_checkpoint()
should be called periodically. Otherwise, no checkpoints will be performed,
preventing the library from reusing any space occupied by old segments even
after their content has been merged into the new segment. The result - a
database file that is optimized, except that it is up to twice as large as
it otherwise would be.

<h2>5.4 Other Parameters </h2>

<i>
<p>Mention other configuration options that can be used to tune performance
here.

<ul>
  <li> LSM_CONFIG_MMAP
  <li> LSM_CONFIG_MULTIPLE_PROCESSES
  <li> LSM_CONFIG_USE_LOG
</ul>

</i>










>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|

|
>
>
>







 







|








|

|







 







|







 







|







 







|







 







|







 







|







 







|

|







 







|









|







 







|
|
|
|
|
|
|







 







|







 







|







 







|







 







|












>
>




1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
..
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
...
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
...
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
...
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
...
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
...
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
...
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
...
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
...
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
...
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
...
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
...
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
...
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965

<title>LSM Users Guide</title>
<nowiki>

<h2>Table of Contents</h2>


<div id=start_of_toc></div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#overview_of_lsm style=text-decoration:none>1. Overview of LSM</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#using_lsm_in_applications style=text-decoration:none>2. Using LSM in Applications </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#basic_usage style=text-decoration:none>3. Basic Usage</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#opening_and_closing_database_connections style=text-decoration:none>3.1. Opening and Closing Database Connections </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#writing_to_a_database style=text-decoration:none>3.2. Writing to a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#reading_from_a_database style=text-decoration:none>3.3. Reading from a Database </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_robustness style=text-decoration:none>3.4. Database Robustness </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_transactions_and_mvcc style=text-decoration:none>3.5. Database Transactions and MVCC </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compressed_and_encrypted_databases style=text-decoration:none>4. Compressed and Encrypted Databases </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#performance_tuning style=text-decoration:none>5. Performance Tuning</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#architectural_overview style=text-decoration:none>5.1. Architectural Overview </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#work_and_checkpoint_scheduling style=text-decoration:none>5.2. Work and Checkpoint Scheduling </a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#automatic_work_and_checkpoint_scheduling style=text-decoration:none>5.2.1. Automatic Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#explicit_work_and_checkpoint_scheduling style=text-decoration:none>5.2.2. Explicit Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#compulsary_work_and_checkpoint_scheduling style=text-decoration:none>5.2.3. Compulsary Work and Checkpoint Scheduling</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#database_optimization style=text-decoration:none>5.3. Database Optimization</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href=#other_parameters style=text-decoration:none>5.4. Other Parameters </a><br>

<div id=end_of_toc></div>

<h2>Overview</h2>

<p>This page describes the LSM embedded database library and use thereof. 
It is intended to be part user-manual and part tutorial.

<h1 id=overview_of_lsm>1. Introduction to LSM</h1>

<p>LSM is an embedded database library for key-value data, roughly similar
in scope to
<a href="http://www.oracle.com/technetwork/products/berkeleydb/overview/index.html">Berkeley DB</a>, 
<a href="http://code.google.com/p/leveldb/">LevelDB</a> or
<a href="http://fallabs.com/kyotocabinet/">KyotoCabinet</a>.
Both keys and
................................................................................
compression and/or encryption functions to transform data before it is
stored in the database file.

<p><i>Say something about the difference in performance characteristics 
between a b-tree and whatever it is LSM is. Link the performance graphs page.
</i>

<h1 id=using_lsm_in_applications>2. Using LSM in Applications </h1>

<p>LSM is not currently built or distributed independently. Instead, it
is part of the SQLite4 library. To use LSM in an application, the application
links against libsqlite4 and includes the header file "lsm.h" in any files
that access the LSM API.

<p><i>Pointer to build instructions for sqlite4</i>

<h1 id=basic_usage>3. Basic Usage</h1>

<h2 id=opening_and_closing_database_connections>3.1. Opening and Closing Database Connections </h2>

<p>Opening a connection to a database is a two-step process.

<verbatim>
  int rc;
  lsm_db *db;

................................................................................
  if( rc!=LSM_OK ) exit(1);
</verbatim>

<verbatim>
  rc = lsm_close(db);
</verbatim>

<h2 id=writing_to_a_database>3.2. Writing to a Database </h2>

<p>Three API functions are used to write to the database:

<ul>
  <li> <b>lsm_insert()</b>: insert a new key/value pair into the database,
       overwriting any existing entry with the same key.
  <li> <b>lsm_delete()</b>: remove a specific key from the database.
................................................................................
<verbatim>
  /* Should be checking return codes! */
  lsm_delete(db, "c", 1);
  lsm_delete_range(db, "c", 1, "f", 1);
  lsm_delete(db, "f", 1);
</verbatim>

<h2 id=reading_from_a_database>3.3. Reading from a Database </h2>

<verbatim>
  lsm_csr *csr;

  rc = lsm_csr_open(db, &csr);
</verbatim>

................................................................................
  }
</verbatim>

<verbatim>
  lsm_csr_close(csr);
</verbatim>

<h2 id=database_robustness>3.4. Database Robustness </h2>

<p>The value of the configuration parameter LSM_CONFIG_SAFETY determines
how often data is synced to disk by the LSM library. This is an important
tradeoff - syncing less often can lead to orders of magnitude better
performance, but also exposes the application to the risk of partial or total
data loss in the event of a power failure;

................................................................................
  /* At this point, variable iSafety is set to the currently configured value
  ** of the LSM_CONFIG_SAFETY parameter (either 0, 1 or 2).  */
</verbatim>

<p>The lsm_config() function may also be used to configure other database
connection parameters.  

<h2 id=database_transactions_and_mvcc>3.5. Database Transactions and MVCC </h2>

<p>LSM supports a single-writer/multiple-reader 
<a href=http://en.wikipedia.org/wiki/Multiversion_concurrency_control>MVCC</a>
based transactional concurrency model. This is the same model that SQLite
supports in <a href="http://www.sqlite.org/wal.html">WAL mode</a>.

<p>A read-transaction must be opened in order to read from the database. 
................................................................................
  lsm_delete(db, "k", 1);
  lsm_rollback(db, 2);
  lsm_delete(db, "m", 1);
  lsm_commit(db, 0);
  
</verbatim>

<h1 id=compressed_and_encrypted_databases>4. Compressed and Encrypted Databases </h1>

<p>LSM does not provide built-in methods for creating encrypted or compressed
databases. Instead, it allows the user to provide hooks to call external
functions to compress and/or encrypt data before it is written to the database
file, and to decrypt and/or uncompress data as it is read from the database
file.

................................................................................
fails (no transaction or database cursor is opened).

<p><i>Maybe there should be a way to register a mismatch-handler callback.
Otherwise, applications have to handle LSM_MISMATCH everywhere...
</i>


<h1 id=performance_tuning>5. Performance Tuning</h1>

<h2 id=architectural_overview>5.1. Architectural Overview </h2>

<p> The LSM library implements two separate data structures that are used 
together to store user data. When the database is queried, the library 
actually runs parallel queries on both of these data stores and merges the
results together to return to the user. The data structures are:

<ul>
................................................................................
database file header (to checkpoint the database). 
</table>

<p>The tasks associated with each of the locks above may be performed
concurrently by multiple database connections, located either in the same
application process or different processes.

<h2 id=work_and_checkpoint_scheduling>5.2. Work and Checkpoint Scheduling </h2>

<p>The section above describes the three stages of transfering data written
to the database from the application to persistent storage. A "writer" 
client writes the data into the in-memory tree and log file. Later on a 
"worker" client flushes the data from the in-memory tree to a new segment
in the the database file. Additionally, a worker client must periodically
merge existing database segments together to prevent them from growing too
numerous.

<h3 id=automatic_work_and_checkpoint_scheduling>5.2.1. Automatic Work and Checkpoint Scheduling</h3>

<p>By default, database "work" (the flushing and merging of segments, performed
by clients holding the WORKER lock) and checkpointing are scheduled and
performed automatically from within calls to "write" API functions. The 
"write" functions are:

<ul>
................................................................................
<p>If auto-work is enabled, the library periodically checks the state of the
database file to see if there exist N or more segments with the same age
value A, where N is the value assigned to the LSM_CONFIG_AUTOMERGE parameter.
If so, work is done to merge all such segments with age=A into a new, larger
segment assigned age=A+1. At present, "periodically" as used above means 
roughly once for every 32KB of data (including overhead) written to the
in-memory tree. The merge operation is not necessarily completed within 
a single call to a write API (this would result in blocking the writer thread
for too long in many cases - in large databases segments may grow to be many GB
in size). Currently, the amount of data written by a single auto-work
operation is roughly 32KB multiplied by the number of segments in the database
file. This formula may change - the point is that the library attempts to limit
the amount of data written in order to avoid blocking the writer thread for too
long within a single API call.

<p>Each time a transaction is committed in auto-work mode, the library checks
to see if there exists an "old" in-memory tree (see the LSM_CONFIG_AUTOFLUSH
option above). If so, it attempts to flush it to disk immediately. Unlike
merges of existing segments, the entire in-memory tree must be flushed to
disk before control is returned to the user. It is not possible to
incrementally flush an in-memory tree in the same ways as it is possible to
................................................................................
than zero, after performing database work, the library automatically checks
how many bytes of raw data have been written to the database file since the
last checkpoint (by any client, not just by the current client). If this
value is greater than the value of the LSM_CONFIG_AUTOCHECKPOINT parameter,
a checkpoint is attempted. It is not an error if the attempt fails because the
CHECKPOINTER lock cannot be obtained.

<h3 id=explicit_work_and_checkpoint_scheduling>5.2.2. Explicit Work and Checkpoint Scheduling</h3>

<p>The alternative to automatic scheduling of work and checkpoint operations
is to explicitly schedule them. Possibly in a background thread or dedicated
application process. In order to disable automatic work, a client must set
the LSM_CONFIG_AUTOWORK parameter to zero. This parameter is a property of
a database connection, not of a database itself, so it must be cleared
separately by all processes that may write to the database. Otherwise, they
................................................................................
  rc = lsm_info(db, LSM_INFO_TREE_SIZE, &nOld, &nLive);
</verbatim>

<verbatim>
  int lsm_flush(lsm_db *db);
</verbatim>

<h3 id=compulsary_work_and_checkpoint_scheduling>5.2.3. Compulsary Work and Checkpoint Scheduling</h3>

<p>Apart from the scenarios described above, there are two there are two 
scenarios where database work or checkpointing may be performed automatically,
regardless of the value of the LSM_CONFIG_AUTOWORK parameter.

<ul>
  <li> When closing a database connection, and 
................................................................................
</ul>

<p>Finally, regardless of age, a database is limited to a maximum of 64
segments in total. If an attempt is made to flush an in-memory tree to disk
when the database already contains 64 segments, two or more existing segments
must be merged together before the new segment can be created.

<h2 id=database_optimization>5.3. Database Optimization</h2>

<p>Database optimization transforms the contents of database file so that
the following are true:

<ul>
  <li> All database content is stored in a single segment.
  <li> The database file contains no (or as little as possible) free space.
................................................................................
parameter should be set to a non-zero value, or otherwise lsm_checkpoint()
should be called periodically. Otherwise, no checkpoints will be performed,
preventing the library from reusing any space occupied by old segments even
after their content has been merged into the new segment. The result - a
database file that is optimized, except that it is up to twice as large as
it otherwise would be.

<h2 id=other_parameters>5.4. Other Parameters </h2>

<i>
<p>Mention other configuration options that can be used to tune performance
here.

<ul>
  <li> LSM_CONFIG_MMAP
  <li> LSM_CONFIG_MULTIPLE_PROCESSES
  <li> LSM_CONFIG_USE_LOG
</ul>

</i>