Documentation Source Text

Changes On Branch jsonb
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Changes In Branch jsonb Excluding Merge-Ins

This is equivalent to a diff from 150e715cda to f6fad8006b

2024-01-09
20:02
Typo fixes. (check-in: a808fb2778 user: drh tags: branch-3.44)
2023-12-05
19:57
Merge the JSONB enhancements. (check-in: b9c149e41a user: drh tags: trunk)
16:07
Further tweaks for the JSONB documentation. (Closed-Leaf check-in: f6fad8006b user: drh tags: jsonb)
15:39
Updates to JSONB documentation. (check-in: 83e6af3514 user: drh tags: jsonb)
14:43
Merge the latest trunk changes into the jsonb branch. Update with the latest documentation about JSONB. (check-in: 3b673fad86 user: drh tags: jsonb)
2023-12-02
22:40
Enhance the download page. Fix a typo on lang_attach.html (check-in: 150e715cda user: drh tags: trunk)
2023-11-24
11:48
Fix the 3.44.2 hash in chronology.tcl. (check-in: 9780baadfd user: drh tags: trunk)

Changes to pages/changes.in.

19
20
21
22
23
24
25
















26
27
28
29
30
31
32
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  if {[regexp {\(([0-9.]+)\)} $date all vers]} {
    set xrefChng($vers) $nChng
  }
  incr nChng
}

















chng {2023-11-24 (3.44.2)} {
<li> Fix a mistake in the [CLI] that was introduced by the fix (item 15 above) in 3.44.1.
<li> Fix a problem in FTS5 that was discovered during internal fuzz testing only
     minutes after the 3.44.1 release was tagged.
<li> Fix incomplete assert() statements that the fuzzer discovered the day after
     the previous release.







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
  set aChng($nChng) [list $date $desc $options]
  set xrefChng($date) $nChng
  if {[regexp {\(([0-9.]+)\)} $date all vers]} {
    set xrefChng($vers) $nChng
  }
  incr nChng
}

chng {2024-01-31 (3.45.0)} {
<li> Enhancements to the [JSON SQL functions]:
    <ol type="a">
    <li> All JSON functions are rewritten to use a new internal parse tree
         format called [JSONB].  The new parse-tree format is serializable 
         and hence can be stored in the database to avoid unnecessary parsing.
    <li> New versions of JSON-generating functions generate binary JSONB instead
         of JSON text.
    <li> The [json_valid()] function adds a second parameter that is a bitmask
         used to determine what it means for the first argument to be "well-formed".
    </ol>
<p><b>Hashes:</b>
<li>SQLITE_SOURCE_ID: <i>pending</i>
<li>SHA3-256 for sqlite3.c: <i>pending</i>
}

chng {2023-11-24 (3.44.2)} {
<li> Fix a mistake in the [CLI] that was introduced by the fix (item 15 above) in 3.44.1.
<li> Fix a problem in FTS5 that was discovered during internal fuzz testing only
     minutes after the 3.44.1 release was tagged.
<li> Fix incomplete assert() statements that the fuzzer discovered the day after
     the previous release.

Changes to pages/chronology.tcl.

9
10
11
12
13
14
15

16
17
18
19
20
21
22
# Data returned by below proc is used by wrap.tcl for its dateof:?
# tag substitution and pages/chronology.in for its machinations.

# Return list of lists, each a 4-tuple: uuid date vers vnum
proc chronology_info {} {
  set rv [list]
  foreach line [split {

ebead0e723|2023-11-24|Version 3.44.2
d295f48e8f|2023-11-22|Version 3.44.1
17129ba1ff|2023-11-01|Version 3.44.0
e17a3dc693|2023-10-10|Version 3.43.2
2d3a40c05c|2023-09-11|Version 3.43.1
0f80b798b3|2023-08-24|Version 3.43.0
831d0fb283|2023-05-16|Version 3.42.0







>







9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# Data returned by below proc is used by wrap.tcl for its dateof:?
# tag substitution and pages/chronology.in for its machinations.

# Return list of lists, each a 4-tuple: uuid date vers vnum
proc chronology_info {} {
  set rv [list]
  foreach line [split {
xxxxxxxxxx|pending|Version 3.45.0
ebead0e723|2023-11-24|Version 3.44.2
d295f48e8f|2023-11-22|Version 3.44.1
17129ba1ff|2023-11-01|Version 3.44.0
e17a3dc693|2023-10-10|Version 3.43.2
2d3a40c05c|2023-09-11|Version 3.43.1
0f80b798b3|2023-08-24|Version 3.43.0
831d0fb283|2023-05-16|Version 3.42.0

Changes to pages/index.in.

38
39
40
41
42
43
44
45
46
47
48
49
50
51
52

<p>
SQLite [https://sqlite.org/src|source code]
is in the [public-domain] and is free to
everyone to use for any purpose.

<h3>Latest Release</h3>
<a href="releaselog/3_44_2.html">Version 3.44.2</a> ([dateof:3.44.2]).
<a class="button" href="download.html">Download</a>
<a class="button" href="chronology.html">Prior Releases</a>

<div class="mobileonly">
<h3>Common Links</h3>
<tcl>common_links</tcl>
</div>







|







38
39
40
41
42
43
44
45
46
47
48
49
50
51
52

<p>
SQLite [https://sqlite.org/src|source code]
is in the [public-domain] and is free to
everyone to use for any purpose.

<h3>Latest Release</h3>
<a href="releaselog/3_45_0.html">Version 3.45.0</a> ([dateof:3.45.0]).
<a class="button" href="download.html">Download</a>
<a class="button" href="chronology.html">Prior Releases</a>

<div class="mobileonly">
<h3>Common Links</h3>
<tcl>common_links</tcl>
</div>

Changes to pages/json1.in.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44




45
46
47




48
49
50
51
52
53
54
55
56
57
58
59
60
61





62
63
64
65
66
67
68




69
70
71
72




73
74
75
76
77




78
79
80
81




82
83
84
85




86
87
88
89
90
91




92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115






116
117
118
119
120






121
122
123
124
125
126
127
<title>JSON Functions And Operators</title>
<tcl>hd_keywords json1 {the json1 extension} {JSON SQL functions}</tcl>

<table_of_contents>

<h1>Overview</h1>
<p>
By default, SQLite supports sixteen functions and two operators for
dealing with JSON values.  There are also two [table-valued functions]
that can be used to decompose a JSON string.

<p>
There are 15 scalar functions and operators:

<ol>
<tcl>
set tabcnt 0
proc tabentry {fx desc lnk} {
  global tabcnt
  incr tabcnt
  hd_puts "<li value='$tabcnt'>\n"
  set fx [string trim $fx]
  set hlink "<a href='#$lnk'>"
  regsub -all {^json\(} $fx "${hlink}json</a>(" fx
  regsub -all {(json_[a-z_]+)} $fx "$hlink\\1</a>" fx
  regsub -all {(value[1-9]?|path|label[1-9]?)} $fx "<i>\\1</i>" fx
  regsub -all {\((json)} $fx "(<i>\\1</i>" fx
  hd_puts "$fx\n"
  # hd_puts "[string trim $desc]\n"
  hd_puts "</li>\n\n"
}
proc tabentryop {lhs op rhs lnk} {
  global tabcnt
  incr tabcnt
  hd_puts "<li value='$tabcnt'>\n"
  hd_puts "<i>$lhs</i> <a href='#$lnk'>$op</a> <i>$rhs</i>\n"
  hd_puts "</li>\n\n"
}
set tsctr 100

tabentry {json(json)} {
  Validate and minify a JSON string
} jmini





tabentry {json_array(value1,value2,...)} {
  Return a JSON array holding the function arguments.
} jarray





tabentry {json_array_length(json)<br>json_array_length(json,path)} {
  Return the number of elements in the JSON array identified by
  the arguments.
} jarraylen

tabentry {json_error_position(json)} {
  Return the position of the first syntax error in the input JSON, or
  return 0 if the input is well-formed.
} jerr

tabentry {json_extract(json,path,...)} {
  Extract values or subcomponents from a JSON string.
} jex






tabentryop json -> path jptr
tabentryop json ->> path jptr

tabentry {json_insert(json,path,value,...)} {
  Insert values into a JSON string without overwriting existing values.
} jins





tabentry {json_object(label1,value1,...)} {
  Construct and return a new JSON object based on the arguments.
} jobj





tabentry {json_patch(json1,json2)} {
  Apply the [https://tools.ietf.org/html/rfc7396|RFC-7396] MergePatch
  algorithm.
} jpatch





tabentry {json_remove(json,path,...)} {
  Remove the specified values from a JSON string.
} jrm





tabentry {json_replace(json,path,value,...)} {
  Update existing values within a JSON string.
} jrepl





tabentry {json_set(json,path,value,...)} {
  Insert or replace values in a JSON string.  Overwrite existing elements
  or create new entries in the JSON string for elements that do not 
  previously exist.
} jset





tabentry {json_type(json)<br>json_type(json,path)} {
  Return the type of a JSON string or subcomponent.
} jtype

tabentry {json_valid(json)} {
  Return true (1) if the input text is a valid JSON string
} jvalid

tabentry {json_quote(value)} {
  Convert an SQL value (a number or a string) into its corresponding JSON
  representation.
} jquote
</tcl>
</ol>

<p>There are two [aggregate SQL functions]:

<ol>
<tcl>
tabentry {json_group_array(value)} {
  Return a JSON array composed of all <i>value</i> elements 
  in the aggregation.
} jgrouparray







tabentry {json_group_object(label,value)} {
  Return a JSON object composed of all <i>name</i> and <i>value</i> pairs
  in the aggregation.
} jgroupobject






</tcl>
</ol>

<p>The two [table-valued functions] are:

<ol>
<tcl>







|




|










|
|



















>
>
>
>

|

>
>
>
>



|




|



|

>
>
>
>
>







>
>
>
>




>
>
>
>





>
>
>
>




>
>
>
>




>
>
>
>






>
>
>
>





|
|









|







>
>
>
>
>
>





>
>
>
>
>
>







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
<title>JSON Functions And Operators</title>
<tcl>hd_keywords json1 {the json1 extension} {JSON SQL functions}</tcl>

<table_of_contents>

<h1>Overview</h1>
<p>
By default, SQLite supports twenty-nine functions and two operators for
dealing with JSON values.  There are also two [table-valued functions]
that can be used to decompose a JSON string.

<p>
There are 25 scalar functions and operators:

<ol>
<tcl>
set tabcnt 0
proc tabentry {fx desc lnk} {
  global tabcnt
  incr tabcnt
  hd_puts "<li value='$tabcnt'>\n"
  set fx [string trim $fx]
  set hlink "<a href='#$lnk'>"
  regsub -all {^(jsonb?)\(} $fx "${hlink}\\1</a>(" fx
  regsub -all {(jsonb?_[a-z_]+)} $fx "$hlink\\1</a>" fx
  regsub -all {(value[1-9]?|path|label[1-9]?)} $fx "<i>\\1</i>" fx
  regsub -all {\((json)} $fx "(<i>\\1</i>" fx
  hd_puts "$fx\n"
  # hd_puts "[string trim $desc]\n"
  hd_puts "</li>\n\n"
}
proc tabentryop {lhs op rhs lnk} {
  global tabcnt
  incr tabcnt
  hd_puts "<li value='$tabcnt'>\n"
  hd_puts "<i>$lhs</i> <a href='#$lnk'>$op</a> <i>$rhs</i>\n"
  hd_puts "</li>\n\n"
}
set tsctr 100

tabentry {json(json)} {
  Validate and minify a JSON string
} jmini

tabentry {jsonb(json)} {
  Convert JSON into its binary BLOB encoding
} jminib

tabentry {json_array(value1,value2,...)} {
  Return a JSON array holding the function arguments
} jarray

tabentry {jsonb_array(value1,value2,...)} {
  Return a binary BLOB representation of a JSON array
} jarrayb

tabentry {json_array_length(json)<br>json_array_length(json,path)} {
  Return the number of elements in the JSON array identified by
  the arguments
} jarraylen

tabentry {json_error_position(json)} {
  Return the position of the first syntax error in the input JSON, or
  return 0 if the input is well-formed
} jerr

tabentry {json_extract(json,path,...)} {
  Extract values or subcomponents from a JSON string
} jex

tabentry {jsonb_extract(json,path,...)} {
  Like json_extract() except subcomponents are returned
  in the binary BLOB represention
} jexb

tabentryop json -> path jptr
tabentryop json ->> path jptr

tabentry {json_insert(json,path,value,...)} {
  Insert values into a JSON string without overwriting existing values.
} jins

tabentry {jsonb_insert(json,path,value,...)} {
  Insert values into JSON, returning the binary BLOB representation
} jinsb

tabentry {json_object(label1,value1,...)} {
  Construct and return a new JSON object based on the arguments.
} jobj

tabentry {jsonb_object(label1,value1,...)} {
  Construct and return a new JSON object as a binary BLOB
} jobjb

tabentry {json_patch(json1,json2)} {
  Apply the [https://tools.ietf.org/html/rfc7396|RFC-7396] MergePatch
  algorithm.
} jpatch

tabentry {jsonb_patch(json1,json2)} {
  Like json_patch() except that the result is a binary BLOB
} jpatchb

tabentry {json_remove(json,path,...)} {
  Remove the specified values from a JSON string.
} jrm

tabentry {jsonb_remove(json,path,...)} {
  Remove elements from JSON and return the binary BLOB representation
} jrmb

tabentry {json_replace(json,path,value,...)} {
  Update existing values within a JSON string.
} jrepl

tabentry {jsonb_replace(json,path,value,...)} {
  Update elemenets in JSON and return the binary BLOB representation
} jreplb

tabentry {json_set(json,path,value,...)} {
  Insert or replace values in a JSON string.  Overwrite existing elements
  or create new entries in the JSON string for elements that do not 
  previously exist.
} jset

tabentry {jsonb_set(json,path,value,...)} {
  Like json_set() except that the result is a binary BLOB
} jsetb

tabentry {json_type(json)<br>json_type(json,path)} {
  Return the type of a JSON string or subcomponent.
} jtype

tabentry {json_valid(json)<br>json_valid(json,flags)} {
  Return true (1) if the input text is a valid JSON string or [JSONB] blob.
} jvalid

tabentry {json_quote(value)} {
  Convert an SQL value (a number or a string) into its corresponding JSON
  representation.
} jquote
</tcl>
</ol>

<p>There are four [aggregate SQL functions]:

<ol>
<tcl>
tabentry {json_group_array(value)} {
  Return a JSON array composed of all <i>value</i> elements 
  in the aggregation.
} jgrouparray

tabentry {jsonb_group_array(value)} {
  Return JSON in the binary BLOB representation that consists
  of all <i>value</i> elements 
  in the aggregation.
} jgrouparrayb

tabentry {json_group_object(label,value)} {
  Return a JSON object composed of all <i>name</i> and <i>value</i> pairs
  in the aggregation.
} jgroupobject

tabentry {jsonb_group_object(name,value)} {
  Return JSON in the binary BLOB representation that is an object
  composed from all <i>name</i> and <i>value</i> pairs
  in the aggregation.
} jgroupobjectb
</tcl>
</ol>

<p>The two [table-valued functions] are:

<ol>
<tcl>
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217





































































































218
219
220
221
222
223
224

<h1>Interface Overview</h1>

<p>
SQLite stores JSON as ordinary text.
Backwards compatibility constraints mean that SQLite is only able to
store values that are NULL, integers, floating-point numbers, text,
and BLOBs.  It is not possible to add a sixth "JSON" type.

<p>
SQLite does not (currently) support a binary encoding
of JSON.  Experiments have been unable to find a binary encoding
that is smaller or faster than a plain text encoding.
(The present implementation parses JSON text at over 250 MB/s.)
All JSON functions currently throw an error if any of their
arguments are BLOBs because BLOBs are reserved
for a future enhancement in which BLOBs will store the binary encoding
for JSON.

<h2>JSON arguments</h2>

<p>
For functions that accept JSON as their first argument, that argument
can be a JSON object, array, number, string, or null.  SQLite numeric
values and NULL values are interpreted as JSON numbers and nulls, respectively.
SQLite text values can be understood as JSON objects, arrays, or strings.
If an SQLite text value that is not a well-formed JSON object, array, or
string is passed into JSON function, that function will usually throw
an error.  (Exceptions to this rule are [json_valid()],
[json_quote()], and [json_error_position()].)

<p>
These routines understand all
[https://www.rfc-editor.org/rfc/rfc7159.txt | rfc-7159 JSON syntax]
and also [https://spec.json5.org/ | JSON5 extensions].  JSON text
generated by these routines always strictly conforms to the
[https://json.org|canonical JSON definition] and does not contain any JSON5
or other extensions.  The ability to read and understand JSON5 was added in
version 3.42.0 ([dateof:3.42.0]).
Prior versions of SQLite would only read canonical JSON.







































































































<tcl>hd_fragment jsonpath {JSON paths} {JSON function path arguments}</tcl>
<h2>PATH arguments</h2>

<p>
For functions that accept PATH arguments, that PATH must be well-formed or
else the function will throw an error.







|

<
<
<
<
<
<
<
<
<















|







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







226
227
228
229
230
231
232
233
234









235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365

<h1>Interface Overview</h1>

<p>
SQLite stores JSON as ordinary text.
Backwards compatibility constraints mean that SQLite is only able to
store values that are NULL, integers, floating-point numbers, text,
and BLOBs.  It is not possible to add a new "JSON" type.











<h2>JSON arguments</h2>

<p>
For functions that accept JSON as their first argument, that argument
can be a JSON object, array, number, string, or null.  SQLite numeric
values and NULL values are interpreted as JSON numbers and nulls, respectively.
SQLite text values can be understood as JSON objects, arrays, or strings.
If an SQLite text value that is not a well-formed JSON object, array, or
string is passed into JSON function, that function will usually throw
an error.  (Exceptions to this rule are [json_valid()],
[json_quote()], and [json_error_position()].)

<p>
These routines understand all
[https://www.rfc-editor.org/rfc/rfc8259.txt | rfc-8259 JSON syntax]
and also [https://spec.json5.org/ | JSON5 extensions].  JSON text
generated by these routines always strictly conforms to the
[https://json.org|canonical JSON definition] and does not contain any JSON5
or other extensions.  The ability to read and understand JSON5 was added in
version 3.42.0 ([dateof:3.42.0]).
Prior versions of SQLite would only read canonical JSON.

<tcl>hd_fragment jsonbx {JSONB}</tcl>
<h2>JSONB</h2>

<p>
Beginning with version 3.45.0 ([dateof:3.44.0]), SQLite supports an
alternative binary encoding of JSON which we call "JSONB".  The JSONB
format is stored as a BLOB.  JSONB is analogous in structure to
canonical RFC-8259 text JSON.  JSONB just happens to be slightly more
compact and much easier to parse, so it uses fewer CPU cycles to
process.

<p>
Any SQL function parameter that accepts text JSON as an input will also
accept a BLOB in the JSONB format.  The function will operate the
same in either case, except that it will run faster when
the input is JSONB.

<p>
Most SQL functions that return JSON text have a corresponding function
that returns the equivalent JSONB.  The functions that return JSON
in the text format begin with "<tt>json_</tt>" and functions that
return the JSONB format begin with "<tt>jsonb_</tt>".

<h3>The JSONB format</h3>

<p>
The core idea behind SQLite's JSONB is that each element begins with a header
that includes the size and type of that element.  This makes reading
faster.  For example, when reading
a string literal, it is no longer necessary to search forward looking
for the closing double-quote, reading byte by byte and taking care
to avoid escaped double-quotes.  The size of
the literal is right there in the header, and so the process can jump
ahead to the next element without having to scrutinize each
intervening byte.  Since the size and type of each element is
identified in the header, punctuation characters such as string, object,
and array delimiters and comma and colon separators can all be omitted.
The payload for JSONB is the same as the corresponding text JSON.
The only difference is that JSONB omits punctuation and replaces it with
a header on each element.

<p>
The "JSONB" name is inspired by [https://postgresql.org|PostgreSQL], but the
on-disk format for SQLite's JSONB is not the same as PostgreSQL's.
The two formats have the same name, but are not binary compatible.
The PostgreSQL JSONB format claims to offer O(1)
lookup of elements in objects and arrays.  SQLite's JSONB format makes no
such claim.  SQLite's JSONB has O(N) time complexity for
most operations in SQLite, just like text JSON.  The advantage of JSONB in
SQLite is that it is smaller and faster than text JSON - potentially several
times faster. There is space in the
on-disk JSONB format to add enhancements and future versions of SQLite might
include options to provide O(1) lookup of elements in JSONB, but no such
capability is currently available.

<p>
The SQLite JSONB format is intended to be private to
SQLite and is for use by the built-in SQLite functions only.  The
JSONB format is not intended as an interchange format.  Nevertheless,
JSONB is stored in database files which are intended to
be readable and writable for many decades into the future.
To that end, the JSONB format is well-defined and stable. The separate
[SQLite JSONB format] document provides details of the JSONB format
for the curious reader.

<h3>Handling of malformed JSONB</h3>

<p>
The JSONB that is generated by SQLite will always be well-formed.  If you
treat JSONB as an opaque BLOB that is generated by some JSON functions
and consumed by others, then you will not have any problems.  But
JSONB is just a BLOB, so a mischievious programmer could devise BLOBs
that are similar to JSONB but that are technically malformed.  When
misformatted JSONB is feed into JSON functions, any of the following
might happen:

<ul>
<li><p>
The SQL statement might abort with a "malformed JSON" error.

<li><p>
If the error is in a part of the JSONB that is not required to obtain
the correct answer, then the correct answer might be returned.

<li><p>
A goofy or nonsensical answer might be returned.
</ul>

<p>
The way in which SQLite handles invalid JSONB might change
from one version of SQLite to the next.  The system follows
the garbage-in/garbage-out rule:  If you feed the JSON functions invalid
JSONB, you get back an invalid answer.  If you are in doubt about the
validity of our JSONB, use the [json_valid()] function to verify it.

<p>
The implementation does make this one promise:
Malformed JSONB will never cause a memory
error or similar problem that might lead to a vulnerability.
Invalid JSONB might lead to crazy answers,
or it might cause queries to abort, but it won't cause a crash.

<tcl>hd_fragment jsonpath {JSON paths} {JSON function path arguments}</tcl>
<h2>PATH arguments</h2>

<p>
For functions that accept PATH arguments, that PATH must be well-formed or
else the function will throw an error.
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
<h2>Compatibility</h2>

<p>
The current implementation of this JSON library uses a recursive descent
parser.  In order to avoid using excess stack space, any JSON input that has
more than 1000 levels of nesting is considered invalid.   Limits on nesting
depth are allowed for compatible implementations of JSON by
[https://tools.ietf.org/html/rfc7159#section-9|RFC-7159 section 9].

<tcl>hd_fragment json5 {JSON5} {JSON5 extensions}</tcl>
<h2>JSON5 Extensions</h2>

<p>
Beginning in version 3.42.0 ([dateof:3.42.0]), these routines will
read and interpret input JSON text that includes







|







441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
<h2>Compatibility</h2>

<p>
The current implementation of this JSON library uses a recursive descent
parser.  In order to avoid using excess stack space, any JSON input that has
more than 1000 levels of nesting is considered invalid.   Limits on nesting
depth are allowed for compatible implementations of JSON by
[https://tools.ietf.org/html/rfc8259#section-9|RFC-8259 section 9].

<tcl>hd_fragment json5 {JSON5} {JSON5 extensions}</tcl>
<h2>JSON5 Extensions</h2>

<p>
Beginning in version 3.42.0 ([dateof:3.42.0]), these routines will
read and interpret input JSON text that includes
335
336
337
338
339
340
341
342

343
344
345
346

347
348
349
350
351
352
353
<li> Additional white space characters are allowed.
</ul>

<p>
To convert string X from JSON5 into canonical JSON, invoke
"[json(X)]".  The output of the "[json()]" function will be canonical
JSON regardless of any JSON5 extensions that are present in the input.
For backwards compatibility, the [json_valid(X)] function continues

to report false for inputs that are not canonical JSON, even if the
input is JSON5 that the function is able to understand.  To determine
whether or not an input string is valid JSON5,
use the expression: "[json_error_position(X)]==0".


<p>
These routines understand all of JSON5, plus a little more.
SQLite extends the JSON5 syntax in these two ways:

<ol>
<li><p>







|
>


|
<
>







476
477
478
479
480
481
482
483
484
485
486
487

488
489
490
491
492
493
494
495
<li> Additional white space characters are allowed.
</ul>

<p>
To convert string X from JSON5 into canonical JSON, invoke
"[json(X)]".  The output of the "[json()]" function will be canonical
JSON regardless of any JSON5 extensions that are present in the input.
For backwards compatibility, the [json_valid(X)] function without a
"flags" argument continues
to report false for inputs that are not canonical JSON, even if the
input is JSON5 that the function is able to understand.  To determine
whether or not an input string is valid JSON5, include the 0x02 bit

in the "flags" argument to json_valid:  "<tt>json_valid(X,2)</tt>".

<p>
These routines understand all of JSON5, plus a little more.
SQLite extends the JSON5 syntax in these two ways:

<ol>
<li><p>
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416










417
418
419
420
421
422
423
(with all unnecessary whitespace removed).  If X is not a well-formed
JSON string, then this routine throws an error.

<p>In other words, this function converts raw text that looks like
JSON into actual JSON so that it may be passed into the [value argument]
of some other json function and will be interpreted as JSON rather than
a string.  This function is not appropriate for testing whether or not
a particular string is well-formed JSON - use the [json_valid()] and/or
[json_error_position()] routines below for that task.

<p>If the argument X to json(X) contains JSON objects with duplicate
labels, then it is undefined whether or not the duplicates are
preserved.  The current implementation preserves duplicates.
However, future enhancements
to this routine may choose to silently remove duplicates.

<p>
Example:

<tcl>
jexample \
  {json(' { "this" : "is", "a": [ "test" ] } ')} \
      {'{"this":"is","a":["test"]}'}
</tcl>











<tcl>hd_fragment jarray {json_array SQL function} {json_array}</tcl>
<h2>The json_array() function</h2>

<p>The json_array() SQL function accepts zero or more arguments and
returns a well-formed JSON array that is composed from those arguments.
If any argument to json_array() is a BLOB then an error is thrown.







|
|















>
>
>
>
>
>
>
>
>
>







535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
(with all unnecessary whitespace removed).  If X is not a well-formed
JSON string, then this routine throws an error.

<p>In other words, this function converts raw text that looks like
JSON into actual JSON so that it may be passed into the [value argument]
of some other json function and will be interpreted as JSON rather than
a string.  This function is not appropriate for testing whether or not
a particular string is well-formed JSON - use the [json_valid()] for
that task.

<p>If the argument X to json(X) contains JSON objects with duplicate
labels, then it is undefined whether or not the duplicates are
preserved.  The current implementation preserves duplicates.
However, future enhancements
to this routine may choose to silently remove duplicates.

<p>
Example:

<tcl>
jexample \
  {json(' { "this" : "is", "a": [ "test" ] } ')} \
      {'{"this":"is","a":["test"]}'}
</tcl>

<tcl>hd_fragment jminib {jsonb SQL function} {jsonb}</tcl>
<h2>The jsonb() function</h2>

<p>The jsonb(X) function returns the binary JSONB representation
of the JSON provided as argument X.  An error is raised if X is
TEXT that does not have valid JSON syntax.  If X is a BLOB and
superficially appears to be a well-formed JSONB,
then this routine simply returns a copy of X.  The deep structure
of the JSONB is not validated.

<tcl>hd_fragment jarray {json_array SQL function} {json_array}</tcl>
<h2>The json_array() function</h2>

<p>The json_array() SQL function accepts zero or more arguments and
returns a well-formed JSON array that is composed from those arguments.
If any argument to json_array() is a BLOB then an error is thrown.
437
438
439
440
441
442
443








444
445
446
447
448
449
450
  {json_array(json_array(1,2))} {'[[1,2]]'} \
  {json_array(1,null,'3','[4,5]','{"six":7.7}')} \
      {'[1,null,"3","[4,5]","{\"six\":7.7}"]'} \
  {json_array(1,null,'3',json('[4,5]'),json('{"six":7.7}'))} \
      {'[1,null,"3",[4,5],{"six":7.7}]'}
</tcl>










<tcl>hd_fragment jarraylen {json_array_length SQL function} \
         {json_array_length}</tcl>
<h2>The json_array_length() function</h2>

<p>The json_array_length(X) function returns the number of elements
in the JSON array X, or 0 if X is some kind of JSON value other







>
>
>
>
>
>
>
>







589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
  {json_array(json_array(1,2))} {'[[1,2]]'} \
  {json_array(1,null,'3','[4,5]','{"six":7.7}')} \
      {'[1,null,"3","[4,5]","{\"six\":7.7}"]'} \
  {json_array(1,null,'3',json('[4,5]'),json('{"six":7.7}'))} \
      {'[1,null,"3",[4,5],{"six":7.7}]'}
</tcl>


<tcl>hd_fragment jarrayb {jsonb_array SQL function} {jsonb_array}</tcl>
<h2>The jsonb_array() function</h2>

<p>The jsonb_array() SQL function works just like the [json_array()]
function except that it returns the constructed JSON array in the
SQLite's private JSONB format rather than in the standard
RFC 8259 text format.

<tcl>hd_fragment jarraylen {json_array_length SQL function} \
         {json_array_length}</tcl>
<h2>The json_array_length() function</h2>

<p>The json_array_length(X) function returns the number of elements
in the JSON array X, or 0 if X is some kind of JSON value other
472
473
474
475
476
477
478
479
480
481
482
483
484
485

486
487
488
489
490
491
492

493
494
495
496
497
498
499
<h2>The json_error_position() function</h2>

<p>The json_error_positionf(X) function returns 0 if the input X is a
well-formed JSON or JSON5 string.  If the input X contains one or more
syntax errors, then this function returns the character position of the
first syntax error.  The left-most character is position 1.

<p>This routine is useful for at least two purposes:

<ol>
<li>
<p> To determine is a text string X is valid JSON or JSON5 as understood
    by SQLite, run "<tt>json_error_position(X)==0</tt>".  This is similar
    to [json_valid()] except that json_valid(X) requires X to be strictly

    conforming canonical JSON whereas json_error_position() allows the
    input to contains [JSON5 extensions].
<li>
<p> Use this routine to find the location of a syntax error in a large
    JSON string during interactive debugging, or to generate a better
    error messages for human users.
</ol>


<p>
The json_error_position() function was added with
SQLite version 3.42.0 ([dateof:3.42.0]).


<tcl>hd_fragment jex {json_extract SQL function} {json_extract}</tcl>







|
|
<
<
|
|
<
>
|
<
<
<
<
|
<
>







632
633
634
635
636
637
638
639
640


641
642

643
644




645

646
647
648
649
650
651
652
653
<h2>The json_error_position() function</h2>

<p>The json_error_positionf(X) function returns 0 if the input X is a
well-formed JSON or JSON5 string.  If the input X contains one or more
syntax errors, then this function returns the character position of the
first syntax error.  The left-most character is position 1.

<p>If the input X is a BLOB, then this routine returns 0 if X appears
to be a well-formed JSONB blob.  If the input X is a BLOB that is clearly


not valid JSONB, then some non-zero value is returned.  The positive value
returned by the json_error_position() function with a BLOB input does not

necessarily indicate the position in the BLOB where it deviates from the
JSONB spec.  Note also that json_error_position() does not do a thorough




check of the BLOB and it might miss errors and return 0 even though the

BLOB is not a strictly conforming JSONB.

<p>
The json_error_position() function was added with
SQLite version 3.42.0 ([dateof:3.42.0]).


<tcl>hd_fragment jex {json_extract SQL function} {json_extract}</tcl>
545
546
547
548
549
550
551












552
553
554
555
556
557
558

<center>
<table border=1 cellpadding=3 cellspacing=0>
<tr><th>Operation<th>SQLite Result<th>MySQL Result
<tr><td>json_extract('{"a":null,"b":"xyz"}','$.a')<td>NULL<td>'null'
<tr><td>json_extract('{"a":null,"b":"xyz"}','$.b')<td>'xyz'<td>'"xyz"'
</table></center>













<tcl>hd_fragment jptr {the -> operator} {the -> and ->> operators}</tcl>
<h2>The -> and ->> operators</h2>

<p>Beginning with SQLite version 3.38.0 ([dateof:3.38.0]), the ->
and ->> operators are available for extracting subcomponents of JSON.
The SQLite implementation of -> and ->> strives to be







>
>
>
>
>
>
>
>
>
>
>
>







699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724

<center>
<table border=1 cellpadding=3 cellspacing=0>
<tr><th>Operation<th>SQLite Result<th>MySQL Result
<tr><td>json_extract('{"a":null,"b":"xyz"}','$.a')<td>NULL<td>'null'
<tr><td>json_extract('{"a":null,"b":"xyz"}','$.b')<td>'xyz'<td>'"xyz"'
</table></center>

<tcl>hd_fragment jexb {jsonb_extract SQL function} {jsonb_extract}</tcl>
<h2>The jsonb_extract() function</h2>

<p>
The jsonb_extract() function works the same as the [json_extract()] function,
except in cases where json_extract() would normally return a text
JSON array object, this routine returns the array or object in the
JSONB format.  For the common case where a text, numeric, null, or
boolean JSON element is returned, this routine works exactly the same
as json_extract().


<tcl>hd_fragment jptr {the -> operator} {the -> and ->> operators}</tcl>
<h2>The -> and ->> operators</h2>

<p>Beginning with SQLite version 3.38.0 ([dateof:3.38.0]), the ->
and ->> operators are available for extracting subcomponents of JSON.
The SQLite implementation of -> and ->> strives to be
570
571
572
573
574
575
576





577
578
579
580
581
582
583
JSON representation of that subcomponent and the ->> operator always
returns an SQL representation of that subcomponent.  Thus, these operators
are subtly different from a two-argument [json_extract()] function call.
A call to json_extract() with two arguments will return a JSON representation
of the subcomponent if and only if the subcomponent is a JSON array or
object, and will return an SQL representation of the subcomponent if the
subcomponent is a JSON null, string, or numeric value.






<p>The right-hand operand to the -> and ->> operators can
be a well-formed JSON path expression.  This is the form used by MySQL.
For compatibility with PostgreSQL,
the -> and ->> operators also accept a text label or
integer as their right-hand operand.  If the right operand is a text
label X, then it is interpreted as the JSON path '$.X'.  If the right







>
>
>
>
>







736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
JSON representation of that subcomponent and the ->> operator always
returns an SQL representation of that subcomponent.  Thus, these operators
are subtly different from a two-argument [json_extract()] function call.
A call to json_extract() with two arguments will return a JSON representation
of the subcomponent if and only if the subcomponent is a JSON array or
object, and will return an SQL representation of the subcomponent if the
subcomponent is a JSON null, string, or numeric value.

<p>When the -> operator returns JSON, it always returns the
RFC 8565 text representation of that JSON, not JSONB.  Use the
[jsonb_extract()] function if you need a subcomponent in the
JSONB format.

<p>The right-hand operand to the -> and ->> operators can
be a well-formed JSON path expression.  This is the form used by MySQL.
For compatibility with PostgreSQL,
the -> and ->> operators also accept a text label or
integer as their right-hand operand.  If the right operand is a text
label X, then it is interpreted as the JSON path '$.X'.  If the right
673
674
675
676
677
678
679










680
681
682
683
684
685
686
  {json_set('{"a":2,"c":4}', '$.c', '[97,96]')} \
            {'{"a":2,"c":"[97,96]"}'} \
  {json_set('{"a":2,"c":4}', '$.c', json('[97,96]'))} \
            {'{"a":2,"c":[97,96]}'} \
  {json_set('{"a":2,"c":4}', '$.c', json_array(97,96))} \
            {'{"a":2,"c":[97,96]}'}
</tcl>











<tcl>hd_fragment jobj {json_object SQL function} {json_object}</tcl>
<h2>The json_object() function</h2>

<p>The json_object() SQL function accepts zero or more pairs of arguments
and returns a well-formed JSON object that is composed from those arguments.
The first argument of each pair is the label and the second argument of







>
>
>
>
>
>
>
>
>
>







844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
  {json_set('{"a":2,"c":4}', '$.c', '[97,96]')} \
            {'{"a":2,"c":"[97,96]"}'} \
  {json_set('{"a":2,"c":4}', '$.c', json('[97,96]'))} \
            {'{"a":2,"c":[97,96]}'} \
  {json_set('{"a":2,"c":4}', '$.c', json_array(97,96))} \
            {'{"a":2,"c":[97,96]}'}
</tcl>

<tcl>hd_fragment jinsb {jsonb_insert SQL function} {jsonb_insert}</tcl>
<tcl>hd_fragment jreplb {jsonb_replace SQL function} {jsonb_replace}</tcl>
<tcl>hd_fragment jsetb {jsonb_set SQL function} {jsonb_set}</tcl>
<h2>The jsonb_insert(), jsonb_replace, and jsonb_set() functions</h2>

<p>The jsonb_insert(), jsonb_replace(), and jsonb_set() functions work the
same as [json_insert()], [json_replace()], and [json_set()], respectively,
except that "<tt>jsonb_</tt>" versions return their result in the binary
JSONB format.

<tcl>hd_fragment jobj {json_object SQL function} {json_object}</tcl>
<h2>The json_object() function</h2>

<p>The json_object() SQL function accepts zero or more pairs of arguments
and returns a well-formed JSON object that is composed from those arguments.
The first argument of each pair is the label and the second argument of
704
705
706
707
708
709
710







711
712
713
714
715
716
717
<tcl>
jexample \
  {json_object('a',2,'c',4)} {'{"a":2,"c":4}'} \
  {json_object('a',2,'c','{e:5}')} {'{"a":2,"c":"{e:5}"}'} \
  {json_object('a',2,'c',json_object('e',5))} {'{"a":2,"c":{"e":5}}'}
</tcl>








<tcl>hd_fragment jpatch {json_patch SQL function} {json_patch}</tcl>
<h2>The json_patch() function</h2>

<p>The json_patch(T,P) SQL function runs the
[https://tools.ietf.org/html/rfc7396|RFC-7396] MergePatch algorithm
to apply patch P against input T.  The patched copy of T is returned.








>
>
>
>
>
>
>







885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
<tcl>
jexample \
  {json_object('a',2,'c',4)} {'{"a":2,"c":4}'} \
  {json_object('a',2,'c','{e:5}')} {'{"a":2,"c":"{e:5}"}'} \
  {json_object('a',2,'c',json_object('e',5))} {'{"a":2,"c":{"e":5}}'}
</tcl>

<tcl>hd_fragment jobjb {jsonb_object SQL function} {jsonb_object}</tcl>
<h2>The jsonb_object() function</h2>

<p>
The jsonb_object() function works just like the [jsonb_object()] function
except that the generated object is returned in the binary JSONB format.

<tcl>hd_fragment jpatch {json_patch SQL function} {json_patch}</tcl>
<h2>The json_patch() function</h2>

<p>The json_patch(T,P) SQL function runs the
[https://tools.ietf.org/html/rfc7396|RFC-7396] MergePatch algorithm
to apply patch P against input T.  The patched copy of T is returned.

736
737
738
739
740
741
742








743
744
745
746
747
748
749
         {'{"b":2}'} \
 {json_patch('{"a":1,"b":2}','{"a":9,"b":null,"c":8}')} \
         {'{"a":9,"c":8}'} \
 {json_patch('{"a":{"x":1,"y":2},"b":3}','{"a":{"y":9},"c":8}')} \
         {'{"a":{"x":1,"y":9},"b":3,"c":8}'}
</tcl>









<tcl>hd_fragment jrm {json_remove SQL function} {json_remove}</tcl>
<h2>The json_remove() function</h2>

<p>The json_remove(X,P,...) function takes a single JSON value as its
first argument followed by zero or more path arguments.
The json_remove(X,P,...) function returns
a copy of the X parameter with all the elements 







>
>
>
>
>
>
>
>







924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
         {'{"b":2}'} \
 {json_patch('{"a":1,"b":2}','{"a":9,"b":null,"c":8}')} \
         {'{"a":9,"c":8}'} \
 {json_patch('{"a":{"x":1,"y":2},"b":3}','{"a":{"y":9},"c":8}')} \
         {'{"a":{"x":1,"y":9},"b":3,"c":8}'}
</tcl>


<tcl>hd_fragment jpatchb {jsonb_patch SQL function} {jsonb_patch}</tcl>
<h2>The json_patch() function</h2>

<p>
The jsonb_patch() function works just like the [jsonb_patch()] function
except that the patched JSON is returned in the binary JSONB format.

<tcl>hd_fragment jrm {json_remove SQL function} {json_remove}</tcl>
<h2>The json_remove() function</h2>

<p>The json_remove(X,P,...) function takes a single JSON value as its
first argument followed by zero or more path arguments.
The json_remove(X,P,...) function returns
a copy of the X parameter with all the elements 
771
772
773
774
775
776
777






778
779
780
781
782
783
784
  {json_remove('[0,1,2,3,4]','$[#-1]','$[0]')} {'[1,2,3]'} \
  {json_remove('{"x":25,"y":42}')} {'{"x":25,"y":42}'} \
  {json_remove('{"x":25,"y":42}','$.z')} {'{"x":25,"y":42}'} \
  {json_remove('{"x":25,"y":42}','$.y')} {'{"x":25}'} \
  {json_remove('{"x":25,"y":42}','$')} NULL
</tcl>







<tcl>hd_fragment jtype {json_type SQL function} {json_type}</tcl>
<h2>The json_type() function</h2>

<p>The json_type(X) function returns the "type" of the outermost element
of X.  The json_type(X,P) function returns the "type" of the element
in X that is selected by path P.  The "type" returned by json_type() is
one of the following SQL text values:







>
>
>
>
>
>







967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
  {json_remove('[0,1,2,3,4]','$[#-1]','$[0]')} {'[1,2,3]'} \
  {json_remove('{"x":25,"y":42}')} {'{"x":25,"y":42}'} \
  {json_remove('{"x":25,"y":42}','$.z')} {'{"x":25,"y":42}'} \
  {json_remove('{"x":25,"y":42}','$.y')} {'{"x":25}'} \
  {json_remove('{"x":25,"y":42}','$')} NULL
</tcl>

<tcl>hd_fragment jrmb {jsonb_remove SQL function} {jsonb_remove}</tcl>
<h2>The jsonb_remove() function</h2>
<p>
The jsonb_remove() function works just like the [jsonb_remove()] function
except that the edited JSON result is returned in the binary JSONB format.

<tcl>hd_fragment jtype {json_type SQL function} {json_type}</tcl>
<h2>The json_type() function</h2>

<p>The json_type(X) function returns the "type" of the outermost element
of X.  The json_type(X,P) function returns the "type" of the element
in X that is selected by path P.  The "type" returned by json_type() is
one of the following SQL text values:
804
805
806
807
808
809
810
811







812







813

814











815

























816
817
818
819
820


821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
  {json_type('{"a":[2,3.5,true,false,null,"x"]}','$.a[5]')} 'text' \
  {json_type('{"a":[2,3.5,true,false,null,"x"]}','$.a[6]')} NULL
</tcl>

<tcl>hd_fragment jvalid {json_valid SQL function} {json_valid}</tcl>
<h2>The json_valid() function</h2>

<p>The json_valid(X) function return 1 if the argument X is well-formed







canonical RFC-7159 JSON without any extensions, or return 0 if the







argument X is not well-formed JSON or is JSON that includes

[JSON5 extensions].





































<p>Examples:

<tcl>
jexample \
  {json_valid('{"x":35}')} 1 \


  "json_valid('\173\"x\":35')" 0 \
  {json_valid(NULL)} NULL
</tcl>

<p>Use the expression "[json_error_position(X)]==0" to determine if
a string is well-formed JSON5.  Use the "[json(X)]" routine to convert
JSON5 into canonical JSON.

<tcl>hd_fragment jquote {json_quote SQL function} {json_quote}</tcl>
<h2>The json_quote() function</h2>

<p>The json_quote(X) function converts the SQL value X (a number or a
string) into its corresponding JSON representation.  If X is a JSON value
returned by another JSON function, then this function is a no-op.








|
>
>
>
>
>
>
>
|
>
>
>
>
>
>
>
|
>
|
>
>
>
>
>
>
>
>
>
>
>

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>





>
>




<
<
<
<







1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079




1080
1081
1082
1083
1084
1085
1086
  {json_type('{"a":[2,3.5,true,false,null,"x"]}','$.a[5]')} 'text' \
  {json_type('{"a":[2,3.5,true,false,null,"x"]}','$.a[6]')} NULL
</tcl>

<tcl>hd_fragment jvalid {json_valid SQL function} {json_valid}</tcl>
<h2>The json_valid() function</h2>

<p>The json_valid(X,Y) function return 1 if the argument X is well-formed
JSON, or returns 0 if X is not well-formed.  The Y parameter is an integer
bitmask that defines what is meant by "well-formed".  The following bits
of Y are currently defined:

<ul>
<li> <b>0x01</b> &rarr;
The input is text that strictly complies with canonical RFC-8259 JSON,
without any extensions.
<li> <b>0x02</b> &rarr;
The input is text that is JSON with [JSON5] extensions described above.
<li> <b>0x04</b> &rarr;
The input is a BLOB that superficially appears to be [JSONB].
<li> <b>0x08</b> &rarr;
The input is a BLOB that strictly conforms to the [JSONB format].
</ul>

<p>By combining bits, the following useful values of Y can be derived:

<ul>
<li> <b>1</b> &rarr; X is RFC-8259 JSON text
<li> <b>2</b> &rarr; X is [JSON5] text
<li> <b>4</b> &rarr; X is probably [JSONB]
<li> <b>5</b> &rarr; X is RFC-8259 JSON text or [JSONB]
<li> <b>6</b> &rarr; X is [JSON5] text or [JSONB]
     &larr; <i>This is probably the Y value you want</i>
<li> <b>8</b> &rarr; X is strictly conforming [JSONB]
<li> <b>9</b> &rarr; X is RFC-8259 or strictly conforming [JSONB]
<li> <b>10</b> &rarr; X is JSON5 or strictly conforming [JSONB]
</ul>

<p>The Y parameter is optional. If omitted, it defaults to 1, which means
that the default behavior is to return true only if the input X is
strictly conforming RFC-8259 JSON text without any extensions.  This
makes the one-argument version of json_valid() compatible with older
versions of SQLite, prior to the addition of support for
[JSON5] and [JSONB].

<p>The difference between 0x04 and 0x08 bits in the Y parameter is that
0x04 only examines the outer wrapper of the BLOB to see if it superficially
looks like [JSONB].  This is sufficient for must purposes and is very fast.
The 0x08 bit does a thorough examination of all internal details of the BLOB.
The 0x08 bit takes time that is linear in the size of the X input and is much
slower.  The 0x04 bit is recommended for most purposes.

<p>If you just want to know if a value is a plausible input to one of
the other JSON functions, a Y value of 6 is probably what you want to use.

<p>Any Y value less than 1 or greater than 15 raises an error, for the
latest version of json_valid().  However, future versions of json_valid()
might be enhanced to accept flag values outside of this range, having new
meanings that we have not yet thought of.

<p>If either X or Y inputs to json_valid() are NULL, then the function
returns NULL.

<p>Examples:

<tcl>
jexample \
  {json_valid('{"x":35}')} 1 \
  {json_valid('{x:35}')} 0 \
  {json_valid('{x:35}',6)} 1 \
  "json_valid('\173\"x\":35')" 0 \
  {json_valid(NULL)} NULL
</tcl>





<tcl>hd_fragment jquote {json_quote SQL function} {json_quote}</tcl>
<h2>The json_quote() function</h2>

<p>The json_quote(X) function converts the SQL value X (a number or a
string) into its corresponding JSON representation.  If X is a JSON value
returned by another JSON function, then this function is a no-op.

845
846
847
848
849
850
851




852
853

854
855
856
857
858
859
860


861
862
863
864
865
866
867
</tcl>

<tcl>
hd_fragment jgrouparray {json_group_array SQL function} \
   {json_group_array}
hd_fragment jgroupobject {json_group_object SQL function} \
   {json_group_object}




</tcl>
<h2>The json_group_array() and json_group_object()

aggregate SQL functions</h2>

<p>The json_group_array(X) function is an
[Aggregate Functions|aggregate SQL function] that returns a JSON array
comprised of all X values in the aggregation.
Similarly, the json_group_object(NAME,VALUE) function returns a JSON object
comprised of all NAME/VALUE pairs in the aggregation.




<tcl>hd_fragment jeach {json_each table-valued function} {json_each}</tcl>
<tcl>hd_fragment jtree {json_tree table-valued function} {json_tree}</tcl>
<h2>The json_each() and json_tree() table-valued functions</h2>

<p>The json_each(X) and json_tree(X) [table-valued functions] walk the







>
>
>
>

|
>







>
>







1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
</tcl>

<tcl>
hd_fragment jgrouparray {json_group_array SQL function} \
   {json_group_array}
hd_fragment jgroupobject {json_group_object SQL function} \
   {json_group_object}
hd_fragment jgrouparrayb {jsonb_group_array SQL function} \
   {jsonb_group_array}
hd_fragment jgroupobjectb {jsonb_group_object SQL function} \
   {jsonb_group_object}
</tcl>
<h2>The json_group_array(), json_group_object(),
jsonb_group_array(), and jsonb_group_object()
aggregate SQL functions</h2>

<p>The json_group_array(X) function is an
[Aggregate Functions|aggregate SQL function] that returns a JSON array
comprised of all X values in the aggregation.
Similarly, the json_group_object(NAME,VALUE) function returns a JSON object
comprised of all NAME/VALUE pairs in the aggregation.
The "<tt>jsonb_</tt>" variants are the same except that they return their
result in the binary [JSONB] format.


<tcl>hd_fragment jeach {json_each table-valued function} {json_each}</tcl>
<tcl>hd_fragment jtree {json_tree table-valued function} {json_tree}</tcl>
<h2>The json_each() and json_tree() table-valued functions</h2>

<p>The json_each(X) and json_tree(X) [table-valued functions] walk the

Added pages/jsonb.in.





























































































































































































































































































































































































































































































































































































































































>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
<title>The SQLite JSONB Format</title>
<alt-title>The SQLite JSONB Format</alt-title>
<tcl>hd_keywords {JSONB format} {SQLite JSONB format}</tcl>

<table_of_contents>

<p>This document describes SQLite's [JSONB] binary encoding of
JSON.

<h1>What Is JSONB?</h1>

<p>
Beginning with version 3.45.0 ([dateof:3.45.0]), SQLite supports an
alternative binary encoding of JSON which we call "JSONB".  JSONB is
a binary format that stored as a BLOB.

<p>
The advantage of JSONB over ordinary text RFC 8259 JSON is that JSONB
is both slightly smaller (by between 5% and 10% in most cases) and
can be processed in less than half the number of CPU cycles.  The built-in
[JSON SQL functions] of SQLite can accept either ordinary text JSON
or the binary JSONB encoding for any of their JSON inputs.

<p>
The "JSONB" name is inspired by [https://postgresql.org|PostgreSQL], but the
on-disk format for SQLite's JSONB is not the same as PostgreSQL's.
The two formats have the same name, but they have wildly different internal
representations and are not in any way binary compatible.

<p>
The central idea behind this JSONB specification is that each element
begins with a header that includes the size and type of that element.
The header takes the place of punctuation such as double-quotes,
curly-brackes, square-brackets, commas, and colons.  Since the size
and type of each element is contained in its header, the element can
be read faster since it is no longer necessary to carefully scan forward
looking for the closing delimiter.  The payload of JSONB is the same
as for corresponding text JSON.  The same payload bytes occur in the
same order.  The only real difference between JSONB and ordinary text
JSON is that JSONB includes a binary header on
each element and omits delimiter and separator punctuation.

<h1>The Purpose Of This Document</h1>

<p>
JSONB is not intended as an external format to be used by
applications.  JSONB is designed for internal use by SQLite only.
Programmers do not need to understand the JSONB format in order to
use it effectively.
Applications should access JSONB only through the [JSON SQL functions],
not by looking at individual bytes of the BLOB.

<p>
However, JSONB is intended to be portable and backwards compatible
for all future versions of SQLite.  In other words, you should not have
to export and reimport your SQLite database files when you upgrade to
a newer SQLite version.  For that reason, the JSONB format needs to
be well-defined.

<p>
This document is therefore similar in purpose to the
[SQLite database file format] document that describes the on-disk
format of an SQLite database file.  Applications are not expected
to directly read and write the bits and bytes of SQLite database files.
The SQLite database file format is carefully documented so that it
can be stable and enduring.  In the same way, the JSONB representation
of JSON is documented here so that it too can be stable and enduring,
not so that applications can read or writes individual bytes.

<h1>Encoding</h1>

<p>
JSONB is a direct translation of the underlying text JSON. The difference
is that JSONB uses a binary encoding that is faster to parse compared to
the detailed syntax of text JSON.

<p>
Each JSON element is encoded as a header and a payload.  The header
determines type of element (string, numeric, boolean, null, object, or
array) and the size of the payload.  The header can be between 1 and
9 bytes in size.  The payload can be any size from zero bytes up to the
maximum allowed BLOB size.

<h2>Payload Size</h2>

<p>
The upper four bits of the first byte of the header determine size of the
header and possibly also the size of the payload.
If the upper four bits have a value between 0 and 11, then the header is
exactly one byte in size and the payload size is determined by those
upper four bits.  If the upper four bits have a value between 12 and 15,
that means that the total header size is 2, 3, 5, or 9 bytes and the
payload size is unsigned big-endian integer that is contained in the
subsequent bytes.  The size integer is the one byte that following the
initial header byte if the upper four bits
are 12, two bytes if the upper bits are 13, four bytes if the upper bits
are 14, and eight bytes if the upper bits are 15.  The current design
of SQLite does not support BLOB values larger than 2GiB, so the eight-byte
variant of the payload size integer will never be used by the current code.
The eight-byte payload size integer is included in the specification
to allow for future expansion.

<p>
The header for an element does <i>not</i> need to be in its simplest
form.  For example, consider the JSON numeric value "<tt>1</tt>".
That element can be encode in five different ways:

<ul>
<li>  <tt>0x13 0x31</tt>
<li>  <tt>0xc3 0x01 0x31</tt>
<li>  <tt>0xd3 0x00 0x01 0x31</tt>
<li>  <tt>0xe3 0x00 0x00 0x00 0x01 0x31</tt>
<li>  <tt>0xf3 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 0x31</tt>
</ul>

<p>
The shortest encoding is preferred, of course, and usually happens with
primitive elements such as numbers.  However the total size of an array
or object might not be known exactly when the header of the element is
first generated.  It is convenient to reserve space for the largest
possible header and then go back and fill in the correct payload size
at the end.  This technique can result in array or object headers that
are larger than absolutely necessary.

<h2>Element Type</h2>

<p>
The least-significant four bits of the first byte of the header (the first
byte masked against 0x0f) determine element type.  The following codes are
used:

<ol>
<li value="0"><p><b>NULL</b> &rarr;
The element is a JSON "null".  The payload size for a true JSON NULL must
must be zero.  Future versions of SQLite might extend the JSONB format
with elements that have a zero element type but a non-zero size.  In that
way, legacy versions of SQLite will interpret the element as a NULL 
for backwards compatibility while newer versions will interpret the
element in some other way.

<li value="1"><p><b>TRUE</b> &rarr;
The element is a JSON "true".  The payload size must be zero for a actual
"true" value.  Elements with type 1 and a non-zero payload size are
reserved for future expansion.  Legacy implementations that see an element
type of 1 with a non-zero payload size should continue to interpret that
element as "true" for compatibility.

<li value="2"><p><b>FALSE</b> &rarr;
The element is a JSON "false".  The payload size must be zero for a actual
"false" value.  Elements with type 2 and a non-zero payload size are
reserved for future expansion.  Legacy implementations that see an element
type of 2 with a non-zero payload size should continue to interpret that
element as "false" for compatibility.

<li value="3"><p><b>INT</b> &rarr;
The element is a JSON integer value in the canonical
RFC 8259 format, without extensions.  The payload is the ASCII
text representation of that numeric value.

<li value="4"><p><b>INT5</b> &rarr;
The element is a JSON integer value that is not in the
canonical format.   The payload is the ASCII
text representation of that numeric value.  Because the payload is in a
non-standard format, it will need to be translated when the JSONB is
converted into RFC 8259 text JSON.

<li value="5"><p><b>FLOAT</b> &rarr;
The element is a JSON floating-point value in the canonical
RFC 8259 format, without extensions.  The payload is the ASCII
text representation of that numeric value.

<li value="6"><p><b>FLOAT5</b> &rarr;
The element is a JSON floating-point value that is not in the
canonical format.   The payload is the ASCII
text representation of that numeric value.  Because the payload is in a
non-standard format, it will need to be translated when the JSONB is
converted into RFC 8259 text JSON.

<li value="7"><p><b>TEXT</b> &rarr;
The element is a JSON string value that does not contain
any escapes nor any characters that need to be escaped for either SQL or
JSON.  The payload is the UTF8 text representation of the string value.
The payload does <i>not</i> include string delimiters.

<li value="8"><p><b>TEXTJ</b> &rarr;
The element is a JSON string value that contains
RFC 8259 character escapes (such as "<tt>\n</tt>" or "<tt>\u0020</tt>").
Those escapes will need to be translated into actual UTF8 if this element
is [json_extract|extracted] into SQL.
The payload is the UTF8 text representation of the escaped string value.
The payload does <i>not</i> include string delimiters.

<li value="9"><p><b>TEXT5</b> &rarr;
The element is a JSON string value that contains
character escapes, including some character escapes that part of JSON5
and which are not found in the canonical RFC 8259 spec.
Those escapes will need to be translated into standard JSON prior to
rendering the JSON as text, or into their actual UTF8 characters if this
element is [json_extract|extracted] into SQL.
The payload is the UTF8 text representation of the escaped string value.
The payload does <i>not</i> include string delimiters.

<li value="10"><p><b>TEXTRAW</b> &rarr;
The element is a JSON string value that contains
UTF8 characters that need to be escaped if this string is rendered into
standard JSON text.
The payload does <i>not</i> include string delimiters.

<li value="11"><p><b>ARRAY</b> &rarr;
The element is a JSON array.  The payload contains
JSONB elements that comprise values contained within the array.

<li value="12"><p><b>OBJECT</b> &rarr;
The element is a JSON object.  The payload contains
pairs of JSONB elements that comprise entries for the JSON object.
The first element in each pair must be a string (types 7 through 10).
The second element of each pair may be any types, including nested
arrays or objects.

<li value="13"><p><b>RESERVED-13</b> &rarr;
Reserved for future expansion.  Legacy implements that encounter this
element type should raise an error.

<li value="14"><p><b>RESERVED-14</b> &rarr;
Reserved for future expansion.  Legacy implements that encounter this
element type should raise an error.

<li value="15"><p><b>RESERVED-15</b> &rarr;
Reserved for future expansion.  Legacy implements that encounter this
element type should raise an error.
</ol>

<p>
Element types outside the range of 0 to 12 are reserved for future
expansion.  The current implement raises an error if see an element type
other than those listed above.  However, future versions of SQLite might
use of the three remaining element types to implement indexing or similar
optimizations, to speed up lookup against large JSON arrays and/or objects.

<h2>Design Rationale For Element Types</h2>

<p>
A key goal of JSONB is that it should be quick to translate
to and from text JSON and/or be constructed from SQL values.
When converting from text into JSONB, we do not want the
converter subroutine to burn CPU cycles converting elements
values into some standard format which might never be used.
Format conversion is "lazy" - it is deferred until actually
needed.  This has implications for the JSONB format design:

<ol>
<li><p>
Numeric values are stored as text, not a numbers.  The values are
a direct copy of the text JSON values from which they are derived.

<li><p>
There are multiple element types depending on the details of value
formats.  For example, INT is used for pure RFC-8259 integer
literals and INT5 exists for JSON5 extensions such as hexadecimal
notation.  FLOAT is used for pure RFC-8259 floating point literals
and FLOAT5 is used for JSON5 extensions.  There are four different
representations of strings, depending on where the string came from
and how special characters within the string are escaped.
</ol>

<p>
A second goal of JSONB is that it should be capable of serving as the
"parse tree" for JSON when a JSON value is being processed by the
various [JSON SQL functions] built into SQLite.  Before JSONB was
developed, operations such [json_replace()] and [json_patch()]
and similar worked in three stages:

<ol>
<li> Translate the text JSON into a internal format that is
     easier to scan and edit.
<li> Perform the requested operation on the JSON.
<li> Translate the internal format back into text.
</ol>

<p>
JSONB seeks to serve as the internal format directly - bypassing
the first and third stages of that process.  Since most of the CPU
cycles are spent on the first and third stages, that suggests that
JSONB processing will be much faster than text JSON processing.

<p>
(Aside:  As of this writing, the legacy internal binary format is still used
for many operations.  Only [json_extract()] has, thus far, been converted
to use JSONB directly.  The plan is to convert all JSON processing to
use JSONB internally.  Hopefully this will be accomplished prior to the
first official release.)

<p>
So when processing JSONB, only the second stage of the three-stage
process is required.  But when processing text JSON, it is still necessary
to do stages one and three.  If JSONB is to be used as the internal
binary representation, this is yet another reason to store numeric
values as text.  Storing numbers as text minimizes the amount of
conversion work needed for stages one and three.  This is also why
there are four different representations of text in JSONB.  Different
text representations are used for text coming from different sources
(RFC-8259 JSON, JSON5, or SQL string values) and conversions only
happen if and when they are actually needed.

<h2>Valid JSONB BLOBs</h2>

<p>
A valid JSONB BLOB consists of a single JSON element.  The element must
exactly fill the BLOB.  This one element is often a JSON object or array
and those usually contain additional elements as its payload, but the
element can be a primite value such a string, number, boolean, or null.

<p>
When the built-in JSON functions are attempting to determine if a BLOB
argument is a JSONB or just a random BLOB, they look at the header of
the outer element to see that it is well-formed and that the element
completely fills the BLOB.  If these conditions are met, then the BLOB
is accepted as a JSONB value.