SQLite User Forum

Extract SQLite parser

(1) By Andrew (splatter) on 2022-08-22 13:01:19

I am trying to write a NodeJS/JavaScript program that parses SQL queries and CREATE TABLE statements and generates an AST that I can traverse, as part of a kind of ORM for SQLite. The only way I can think of to do this is to use the parser that SQLite itself uses. Even though it is written in C, I should be able to create an interface between NodeJS and the C program.

I tried to create the parser using Lemon. I pointed it at the grammar, and it generated a parse.c file, among other things. I then tried to compile parse.c, but it gave me the error "Undefined symbols for architecture x86_64", followed by a list of functions that are declared in the sqliteInt.h header file (which is #included in parse.c). I haven't written any C in 9 or so years, and I have no idea what to do now. Writing the C itself isn't the problem; it is just that I think I have to build all of SQLite, or something like that, to get those functions compiled, and I have no idea how to go about that.

(2) By Larry Brasfield (larrybr) on 2022-08-22 14:23:51 in reply to 1

There are publicly available grammars for SQL (in its various stages of evolution), so you need not be stuck using the grammar ensconced in SQLite's parse.y Lemon input.

The C code that Lemon produces can be split into two categories: control of the parse; and actions associated with recognition of constructs in the grammar. The actions in parse.y are generally not suitable for your project. You might copy or emulate the syntax tree construction portions but anything related to VDBE code generation should be left behind. Leaving actions or portions thereof behind which are extraneous to your quest will go far toward avoiding those "undefined symbol" problems. You will, of course, have to define any code called from your actions that remain after substantial editing of the Lemon input.

I suggest, emphatically, that you start small to get something which builds and works. Perhaps the expression recognizer would be a good starting point until you grow comfortable and familiar with using Lemon. This will be far less frustrating than trying to use parse.y just as it is found in the SQLite project. (For that, you would have to build a huge portion of the SQLite DBMS. That's going to take some obstinacy that will divert you from progress toward that ORM.) Once you have a small grammar implemented and something to show that your generated AST is sensible, it will be relatively simple to incrementally add in more of the grammar found in parse.y or elsewhere.

I see no reasonable way to use Lemon without writing some C (or C++). Trying to avoid that will be more trouble than it's worth.

I presume you have researched and studied LALR parsing and how generated parsers are used for other languages (such as C). If not, a few hours spent on that will repay themselves as you continue your present quest.

Good luck.

(3) By ddevienne on 2022-08-22 14:29:04 in reply to 1

Just an FYI, if using Rust is an option: https://crates.io/crates/sqlite3-parser

(4) By Domingo (mingodad) on 2022-08-22 15:30:00 in reply to 1

Have a look at https://cgsql.dev/. They have already done what you are looking for; maybe you could provide the JavaScript code generator for it.

(5) By Rico Mariani (rmariani) on 2022-08-23 09:10:44 in reply to 4

There's a lot you could do starting with the parser we have and its AST. There's a ton of docs as well. You might be able to just use the amalgam build and then link that into your own thing. There are lots of ways you could move forward.