ADDED doc/json-enhancements.md Index: doc/json-enhancements.md ================================================================== --- /dev/null +++ doc/json-enhancements.md @@ -0,0 +1,123 @@ +# JSON Functions Enhancements (2022) + +This document summaries enhancements to the SQLite JSON support added in +early 2022. + +## 1.0 New feature summary: + + 1. New **->** and **->>** operators that work like MySQL and PostgreSQL (PG). + 2. New functions: **json_nextract()** and **json_ntype()**. + 3. JSON functions are built-in rather than being an extension. They + are included by default, but can be omitted using the + -DSQLITE_OMIT_JSON compile-time option. + +## 2.0 The **json_nextract()** function. + +The new **json_nextract()** function works like **json_extract()** with +one exception: if the input text in the first argument is not well-formed +JSON, then json_nextract() returns NULL whereas json_extract() raises +an error. The extra "n" in the name of json_nextract() can be throught +of as meaning "null-if-error". + +A call to json_nextract($JSON,$PATH) is logically equivalent to: + +> ~~~ +CASE WHEN json_valid($JSON) THEN json_extract($JSON,$PATH) END +~~~ + +The json_nextract() function is intended for use in tables where a +column might hold a mixture of datatypes - some rows holding JSON and other +rows holding primitive SQL datatypes such as INT, REAL, and TEXT. The +json_nextract() function makes it easier to write robust queries +against such tables. + +## 3.0 New operators **->** and **->>** + +The SQLite language adds two new binary operators **->** and **->>**. +The -> operator works like the two-argument version of json_nextract() +and the ->> operator works like the two-argument version of json_extract(). +The left-hand operand of -> and ->> is JSON. The right-hand operand +is a JSON path expression. These operators extract and return a value +from the left-hand JSON that is specified by right-hand path expression. + +The operators work exactly the same if the left-hand side is well-formed +JSON. The only difference is that if the left-hand side is not well-formed +JSON, the ->> raises an error whereas the -> operator simply returns NULL. + +### 3.1 Compatibility with MySQL + +The ->> operator should be compatible with MySQL in the sense that +a ->> operator that works in MySQL should work the same way in SQLite. +But (see below) the SQLite ->> operator is also extended to support PG +syntax so not every use of ->> that wworks in SQLite will work for MySQL. + +The -> operator is *mostly* compatible with MySQL. Key differences +between the SQLite -> operator and the MySQL -> operator are: + + * The SQLite -> operator returns NULL if the left-hand side is + not well-formed JSON whereas MySQL will raise an error. + + * When the JSON path expression on the right-hand side selects a + text value from the JSON, the -> operator in MySQL returns the + string quoted as if for JSON, whereas the SQLite -> operator + returns an unquoted SQL text value. + +This second difference - the handling of text values extracted from JSON - +is also a difference in the json_extract() function between SQLite and +MySQL. Because json_extract() has been in active use for 6 years, and +because the SQLite semantics seem to be more useful, there +are no plans to change json_extract() to make it compatible with MySQL. + +### 3.2 Compatibility with PostgreSQL (PG) + +The ->> operator in PG does not accept a JSON path expression as its +right-hand operand. Instead, PG looks for either a text string X +(which is then interpreted as the path "$.X") or an integer N (which +is then interpreted as "$[N]"). In order to make the SQLite ->> operator +compatible with the PG ->> operator, the SQLite ->> operator has been +extended so that its right-hand operand can be either a text label or +a integer array index, as it is in PG. The SQLite ->> operator also +accepts full JSON path expressions as well. + +The enhancement of accepting JSON path expression that consist of just +a bare object label or array index is unique to the -> and ->> operators. +All other places in the SQLite JSON interface that require JSON path +expressions continue to require well-formed JSON path expressions. +Only -> and ->> accept the PG-compatible abbreviated path expressions. + +The -> operator in SQLite is *mostly* compatible with the -> operator +in PG. The differences are the same as for MySQL. + +## 4.0 The **json_ntype()** function. + +The **json_ntype()** function works like **json_type()** except that when +the argument is not well-formed JSON, the json_ntype() function returns +NULL whereas json_type() raises an error. The extra "n" in the name can +be understood as standing for "null-if-error". + +The json_ntype($JSON) function is logically equivalent to: + +> ~~~ +CASE WHEN json_valid($JSON) THEN json_type($JSON) END +~~~ + +The json_ntype() function can be seen as an enhanced version of +the json_valid() function, that in addition to indicating whether or +not the string is well-formed JSON, also indicates the top-level type +of that JSON. + +## 5.0 JSON moved into the core + +The JSON interface is now moved into the SQLite core. + +When originally written in 2015, the JSON functions were an extension +that could be optionally included at compile-time, or loaded at run-time. +The implementation was in a source file named ext/misc/json1.c in the +source tree. JSON functions were only compiled in if the +-DSQLITE_ENABLE_JSON1 compile-time option was used. + +After these enhancements, the JSON functions are now built-ins. +The source file that implements the JSON functions is moved to src/json.c. +No special compile-time options are needed to load JSON into the build. +Instead, there is a new -DSQLITE_OMIT_JSON compile-time option to leave +them out. Index: ext/misc/json1.c ================================================================== --- ext/misc/json1.c +++ ext/misc/json1.c @@ -1579,55 +1579,126 @@ } } sqlite3_result_int64(ctx, n); } +/* +** Bit values for the flags passed into jsonExtractFunc() or +** jsonSetFunc() via the user-data value. +*/ +#define JSON_NULLERR 0x01 /* Return NULL if input is not JSON */ +#define JSON_ABPATH 0x02 /* Allow abbreviated JSON path specs */ +#define JSON_ISSET 0x04 /* json_set(), not json_insert() */ + /* ** json_extract(JSON, PATH, ...) +** json_nextract(JSON, PATH, ...) +** "->"(JSON,PATH) +** "->>"(JSON,PATH) +** +** Return the element described by PATH. Return NULL if that PATH element +** is not found. For leaf nodes of the JSON, the value returned is a pure +** SQL value. In other words, quotes have been removed from strings. +** +** If there are multiple PATHs, then the value returned is a JSON array +** with one entry in the array for each PATH term. +** +** Throw an error if any PATH is malformed. +** +** If JSON is not well-formed JSON then: +** +** (1) raise an error if the JSON_NULLERR flag is not set. +** +** (2) Otherwise (if the JSON_NULLERR flags is set and) if there +** is a single PATH argument with the value '$', simply quote +** the JSON input as if by json_quote(). In other words, treat +** the JSON input as a string and convert it into a valid JSON +** string. +** +** (3) Otherwise (if JSON_NULLERR is set and the PATH is not '$') +** return NULL ** -** Return the element described by PATH. Return NULL if there is no -** PATH element. If there are multiple PATHs, then return a JSON array -** with the result from each path. Throw an error if the JSON or any PATH -** is malformed. +** If the JSON_ABPATH flag is set and there is only a single PATH, then +** allow abbreviated PATH specs that omit the leading "$". */ static void jsonExtractFunc( sqlite3_context *ctx, int argc, sqlite3_value **argv ){ JsonParse *p; /* The parse */ JsonNode *pNode; const char *zPath; + int flags = *(int*)sqlite3_user_data(ctx); JsonString jx; - int i; if( argc<2 ) return; - p = jsonParseCached(ctx, argv, ctx); - if( p==0 ) return; - jsonInit(&jx, ctx); - jsonAppendChar(&jx, '['); - for(i=1; inErr ) break; - if( argc>2 ){ + p = jsonParseCached(ctx, argv, (flags & JSON_NULLERR)!=0 ? 0 : ctx); + if( p==0 ){ + /* If the form is "json_nextract(IN,'$')" and IN is not well-formed JSON, + ** then return IN as a quoted JSON string. */ + if( (flags & JSON_NULLERR)!=0 + && argc==2 + && (zPath = (const char*)sqlite3_value_text(argv[1]))!=0 + && zPath[0]=='$' && zPath[1]==0 + ){ + jsonQuoteFunc(ctx, argc, argv); + } + return; + } + if( argc==2 ){ + /* With a single PATH argument, the return is the unquoted SQL value */ + zPath = (const char*)sqlite3_value_text(argv[1]); + if( zPath && zPath[0]!='$' && zPath[0]!=0 && (flags & JSON_ABPATH)!=0 ){ + /* The -> and ->> operators accept abbreviated PATH arguments. This + ** is mostly for compatibility with PostgreSQL, but also for convenience. + ** + ** NUMBER ==> $[NUMBER] // PG compatible + ** LABEL ==> $.LABEL // PG compatible + ** [NUMBER] ==> $[NUMBER] // Not PG. Purely for convenience + */ + jsonInit(&jx, ctx); + if( safe_isdigit(zPath[0]) ){ + jsonAppendRaw(&jx, "$[", 2); + jsonAppendRaw(&jx, zPath, (int)strlen(zPath)); + jsonAppendRaw(&jx, "]", 2); + }else{ + jsonAppendRaw(&jx, "$.", 1 + (zPath[0]!='[')); + jsonAppendRaw(&jx, zPath, (int)strlen(zPath)); + jsonAppendChar(&jx, 0); + } + pNode = jx.bErr ? 0 : jsonLookup(p, jx.zBuf, 0, ctx); + jsonReset(&jx); + }else{ + pNode = jsonLookup(p, zPath, 0, ctx); + } + if( p->nErr ) return; + if( pNode ) jsonReturn(pNode, ctx, 0); + }else{ + /* Two or more PATH arguments results in a JSON array with each + ** element of the array being the value selected by one of the PATHs */ + int i; + jsonInit(&jx, ctx); + jsonAppendChar(&jx, '['); + for(i=1; inErr ) break; jsonAppendSeparator(&jx); if( pNode ){ jsonRenderNode(pNode, &jx, 0); }else{ jsonAppendRaw(&jx, "null", 4); } - }else if( pNode ){ - jsonReturn(pNode, ctx, 0); - } - } - if( argc>2 && i==argc ){ - jsonAppendChar(&jx, ']'); - jsonResult(&jx); - sqlite3_result_subtype(ctx, JSON_SUBTYPE); - } - jsonReset(&jx); + } + if( i==argc ){ + jsonAppendChar(&jx, ']'); + jsonResult(&jx); + sqlite3_result_subtype(ctx, JSON_SUBTYPE); + } + jsonReset(&jx); + } } /* This is the RFC 7396 MergePatch algorithm. */ static JsonNode *jsonMergePatch( @@ -1855,10 +1926,11 @@ jsonReturnJson(x.aNode, ctx, argv); } replace_err: jsonParseReset(&x); } + /* ** json_set(JSON, PATH, VALUE, ...) ** ** Set the value at PATH to VALUE. Create the PATH if it does not already @@ -1916,14 +1988,17 @@ jsonParseReset(&x); } /* ** json_type(JSON) +** json_ntype(JSON) ** json_type(JSON, PATH) ** -** Return the top-level "type" of a JSON string. Throw an error if -** either the JSON or PATH inputs are not well-formed. +** Return the top-level "type" of a JSON string. json_type() raises an +** error if either the JSON or PATH inputs are not well-formed. json_ntype() +** works like the one-argument version of json_type() except that it +** returns NULL if the JSON argument is not well-formed. */ static void jsonTypeFunc( sqlite3_context *ctx, int argc, sqlite3_value **argv @@ -1930,11 +2005,11 @@ ){ JsonParse *p; /* The parse */ const char *zPath; JsonNode *pNode; - p = jsonParseCached(ctx, argv, ctx); + p = jsonParseCached(ctx, argv, *(int*)sqlite3_user_data(ctx) ? 0 : ctx); if( p==0 ) return; if( argc==2 ){ zPath = (const char*)sqlite3_value_text(argv[1]); pNode = jsonLookup(p, zPath, 0, ctx); }else{ @@ -2638,34 +2713,38 @@ int sqlite3Json1Init(sqlite3 *db){ int rc = SQLITE_OK; unsigned int i; static const struct { const char *zName; + void (*xFunc)(sqlite3_context*,int,sqlite3_value**); int nArg; int flag; - void (*xFunc)(sqlite3_context*,int,sqlite3_value**); } aFunc[] = { - { "json", 1, 0, jsonRemoveFunc }, - { "json_array", -1, 0, jsonArrayFunc }, - { "json_array_length", 1, 0, jsonArrayLengthFunc }, - { "json_array_length", 2, 0, jsonArrayLengthFunc }, - { "json_extract", -1, 0, jsonExtractFunc }, - { "json_insert", -1, 0, jsonSetFunc }, - { "json_object", -1, 0, jsonObjectFunc }, - { "json_patch", 2, 0, jsonPatchFunc }, - { "json_quote", 1, 0, jsonQuoteFunc }, - { "json_remove", -1, 0, jsonRemoveFunc }, - { "json_replace", -1, 0, jsonReplaceFunc }, - { "json_set", -1, 1, jsonSetFunc }, - { "json_type", 1, 0, jsonTypeFunc }, - { "json_type", 2, 0, jsonTypeFunc }, - { "json_valid", 1, 0, jsonValidFunc }, + { "json", jsonRemoveFunc, 1, 0 }, + { "json_array", jsonArrayFunc, -1, 0 }, + { "json_array_length", jsonArrayLengthFunc, 1, 0 }, + { "json_array_length", jsonArrayLengthFunc, 2, 0 }, + { "json_extract", jsonExtractFunc, -1, 0 }, + { "json_nextract", jsonExtractFunc, -1, JSON_NULLERR }, + { "->", jsonExtractFunc, 2, JSON_NULLERR|JSON_ABPATH }, + { "->>", jsonExtractFunc, 2, JSON_ABPATH }, + { "json_insert", jsonSetFunc, -1, 0 }, + { "json_object", jsonObjectFunc, -1, 0 }, + { "json_patch", jsonPatchFunc, 2, 0 }, + { "json_quote", jsonQuoteFunc, 1, 0 }, + { "json_remove", jsonRemoveFunc, -1, 0 }, + { "json_replace", jsonReplaceFunc, -1, 0 }, + { "json_set", jsonSetFunc, -1, JSON_ISSET }, + { "json_type", jsonTypeFunc, 1, 0 }, + { "json_ntype", jsonTypeFunc, 1, JSON_NULLERR }, + { "json_type", jsonTypeFunc, 2, 0 }, + { "json_valid", jsonValidFunc, 1, 0 }, #if SQLITE_DEBUG /* DEBUG and TESTING functions */ - { "json_parse", 1, 0, jsonParseFunc }, - { "json_test1", 1, 0, jsonTest1Func }, + { "json_parse", jsonParseFunc, 1, 0 }, + { "json_test1", jsonTest1Func, 1, 0 }, #endif }; static const struct { const char *zName; int nArg; Index: src/parse.y ================================================================== --- src/parse.y +++ src/parse.y @@ -234,11 +234,11 @@ // values that are early enough so that all jump operations are clustered // at the beginning. // %token ABORT ACTION AFTER ANALYZE ASC ATTACH BEFORE BEGIN BY CASCADE CAST. %token CONFLICT DATABASE DEFERRED DESC DETACH EACH END EXCLUSIVE EXPLAIN FAIL. -%token OR AND NOT MATCH LIKE_KW BETWEEN IS IN ISNULL NOTNULL NE EQ. +%token OR AND NOT IS MATCH LIKE_KW BETWEEN IN ISNULL NOTNULL NE EQ. %token GT LE LT GE ESCAPE. // The following directive causes tokens ABORT, AFTER, ASC, etc. to // fallback to ID if they will not parse as their original value. // This obviates the need for the "id" nonterminal. @@ -284,11 +284,11 @@ %left GT LE LT GE. %right ESCAPE. %left BITAND BITOR LSHIFT RSHIFT. %left PLUS MINUS. %left STAR SLASH REM. -%left CONCAT. +%left CONCAT PTR. %left COLLATE. %right BITNOT. %nonassoc ON. // An IDENTIFIER can be a generic identifier, or one of several @@ -1232,10 +1232,16 @@ {A = sqlite3PExpr(pParse, @B, X, 0);/*A-overwrites-B*/} expr(A) ::= PLUS|MINUS(B) expr(X). [BITNOT] { A = sqlite3PExpr(pParse, @B==TK_PLUS ? TK_UPLUS : TK_UMINUS, X, 0); /*A-overwrites-B*/ } + +expr(A) ::= expr(B) PTR(C) expr(D). { + ExprList *pList = sqlite3ExprListAppend(pParse, 0, B); + pList = sqlite3ExprListAppend(pParse, pList, D); + A = sqlite3ExprFunction(pParse, pList, &C, 0); +} %type between_op {int} between_op(A) ::= BETWEEN. {A = 0;} between_op(A) ::= NOT BETWEEN. {A = 1;} expr(A) ::= expr(A) between_op(N) expr(X) AND expr(Y). [BETWEEN] { Index: src/tokenize.c ================================================================== --- src/tokenize.c +++ src/tokenize.c @@ -288,10 +288,13 @@ case CC_MINUS: { if( z[1]=='-' ){ for(i=2; (c=z[i])!=0 && c!='\n'; i++){} *tokenType = TK_SPACE; /* IMP: R-22934-25134 */ return i; + }else if( z[1]=='>' ){ + *tokenType = TK_PTR; + return 2 + (z[2]=='>'); } *tokenType = TK_MINUS; return 1; } case CC_LP: { Index: test/func.test ================================================================== --- test/func.test +++ test/func.test @@ -1504,7 +1504,22 @@ } {} do_execsql_test func-35.200 { CREATE TABLE t0(c0 CHECK(ABS(-9223372036854775808))); PRAGMA integrity_check; } {ok} + +# 2021-01-07: The -> and ->> operators. +# +proc ptr1 {a b} { return "$a->$b" } +db func -> ptr1 +proc ptr2 {a b} { return "$a->>$b" } +db func ->> ptr2 +do_execsql_test func-36.100 { + SELECT 123 -> 456 +} {123->456} +do_execsql_test func-36.110 { + SELECT 123 ->> 456 +} {123->>456} + + finish_test Index: test/json102.test ================================================================== --- test/json102.test +++ test/json102.test @@ -69,25 +69,74 @@ SELECT json_array_length('{"one":[1,2,3]}', '$.two'); } {{}} do_execsql_test json102-250 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$'); } {{{"a":2,"c":[4,5,{"f":7}]}}} +do_execsql_test json102-251 { + SELECT json_nextract('{"a":2,"c":[4,5,{"f":7}]}', '$'); +} {{{"a":2,"c":[4,5,{"f":7}]}}} +do_execsql_test json102-252 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> '$'; +} {{{"a":2,"c":[4,5,{"f":7}]}}} do_execsql_test json102-260 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$.c'); } {{[4,5,{"f":7}]}} +do_execsql_test json102-261 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> '$.c'; +} {{[4,5,{"f":7}]}} +do_execsql_test json102-262 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> 'c'; +} {{[4,5,{"f":7}]}} +do_catchsql_test json102-265 { + SELECT json_extract('[1,2,3', '$[2]'); +} {1 {malformed JSON}} +do_catchsql_test json102-266 { + SELECT json_nextract('[1,2,3', '$[2]'); +} {0 {{}}} +do_catchsql_test json102-267 { + SELECT json_extract('[1,2,3', '$'); +} {1 {malformed JSON}} +do_catchsql_test json102-268 { + SELECT json_nextract('[1,2,3', '$'); +} {0 {{"[1,2,3"}}} +do_catchsql_test json102-269a { + SELECT '[1,2,3' ->> '$'; +} {1 {malformed JSON}} +do_catchsql_test json102-269b { + SELECT '[1,2,3' -> '$'; +} {0 {{"[1,2,3"}}} + do_execsql_test json102-270 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$.c[2]'); } {{{"f":7}}} +do_execsql_test json102-271 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> '$.c[2]'; +} {{{"f":7}}} +do_execsql_test json102-272 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> 'c' -> 2; +} {{{"f":7}}} do_execsql_test json102-280 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$.c[2].f'); } {{7}} +do_execsql_test json102-281 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> 'c' -> 2 -> 'f'; +} {{7}} +do_execsql_test json102-282 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> 'c' -> '[2]' -> 'f'; +} {{7}} do_execsql_test json102-290 { SELECT json_extract('{"a":2,"c":[4,5],"f":7}','$.c','$.a'); } {{[[4,5],2]}} do_execsql_test json102-300 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$.x'); } {{}} +do_execsql_test json102-301 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> 'x'; +} {{}} +do_execsql_test json102-302 { + SELECT '{"a":2,"c":[4,5,{"f":7}]}' -> NULL; +} {{}} do_execsql_test json102-310 { SELECT json_extract('{"a":2,"c":[4,5,{"f":7}]}', '$.x', '$.a'); } {{[null,2]}} do_execsql_test json102-320 { SELECT json_insert('{"a":2,"c":4}', '$.a', 99); @@ -147,10 +196,13 @@ SELECT json_remove('{"x":25,"y":42}','$'); } {{}} do_execsql_test json102-510 { SELECT json_type('{"a":[2,3.5,true,false,null,"x"]}'); } {{object}} +do_execsql_test json102-511 { + SELECT json_ntype('{"a":[2,3.5,true,false,null,"x"]}'); +} {{object}} do_execsql_test json102-520 { SELECT json_type('{"a":[2,3.5,true,false,null,"x"]}','$'); } {{object}} do_execsql_test json102-530 { SELECT json_type('{"a":[2,3.5,true,false,null,"x"]}','$.a'); @@ -180,10 +232,16 @@ SELECT json_valid(char(123)||'"x":35'||char(125)); } {{1}} do_execsql_test json102-620 { SELECT json_valid(char(123)||'"x":35'); } {{0}} +do_catchsql_test json102-630 { + SELECT json_type('["a",'); +} {1 {malformed JSON}} +do_catchsql_test json102-631 { + SELECT json_ntype('["a",'); +} {0 {{}}} ifcapable vtab { do_execsql_test json102-1000 { CREATE TABLE user(name,phone); INSERT INTO user(name,phone) VALUES