SQLite Forum

XML extension
Login

XML extension

(1) By little-brother on 2021-04-11 14:16:01

I made a small extension ([link](https://github.com/little-brother/sqlite-gui/blob/master/extensions/xml.cpp)) to processing XML. It uses a [pugixml](https://pugixml.org/)-library to parse XML-tree.<br>
There are five scalar functions and one table-valued. Usage examples can be found in the code and in a [Wiki](https://github.com/little-brother/sqlite-gui/wiki#extensions).<br>
`xml_valid(xml)
xml_extract(xml, xpath, sep = "")
xml_append(xml, xpath, insertion, pos = after)
xml_update(xml, xpath, replacement)
xml_remove(xml, xpath)
xml_each(xml, xpath)
`

It looks like XML-format is more dead than alive replaced by json, yaml and protobuf. And parsing XML is not usual in databases too.<br>
But perhaps this extension will be useful to someone. 




The extension can be build by mingw like that: `g++ -I ../include -shared xml.cpp ../include/pugixml.cpp -o xml.dll -s -static -DPUGIXML_NO_STL -Os`

(2) By ddevienne on 2021-04-12 08:57:21 in reply to 1 [link]

Thanks for sharing! I do use XML inside SQLite, so that interesting to me.  
Although ATM to do that outside the DB, just storing the XML.

One thing to note though is that *pugixml* is not Namespace aware.  
**Namespace prefixes** have no meaning in XML processing (except for the `xml:` one)  
and you're on your own with *pugixml* to get the **Namespace URI** that really matters.  
For multi-Namespaces XML documents, that makes it challenging.

(3) By little-brother on 2021-04-12 22:35:25 in reply to 2 [link]

> One thing to note though is that pugixml is not Namespace aware.
 
I know about this limitation. I choose `pugixml` because it's a simple to build and simple to use. It seems to me that namespaces are rare case, therefore it can be skipped.

In any case, I have to add this nuance to code and Wiki. Thank you for pointing that out.

(4) By anonymous on 2021-04-12 23:01:24 in reply to 3 [link]

> I choose `pugixml` because it's a simple to build and simple to use. It seems to me that namespaces are rare case, therefore it can be skipped.

That's a matter of perspective, I guess. Namespaces are anything but rare if you're dealing with standardized vocabularies like SVG, XHTML, XMPP, RSS, Atom, OfficeXML, and so on.

(5) By Rob (yarg56) on 2022-11-06 00:41:19 in reply to 1 [link]

little-brother, I am trying out your sqlite-gui and XML extensions.

I am new to sqlite3 and so far have been unable to figure out how to pass XML stored in a table to an xml_ extension function using only SQL.   

For example, 

I hoped the following SQL would return a list of ITEMID values, but the result is a SQL error.  Is it possible to accomplish this using just sqlite3 SQL?

select xml_extract(select distinct xml from xmlData where file='inventory.xml', 'ITEMID/text()'); 



Here is how I imported XML into a sqlite3 table.

c:\\xml-import\\inventory.xml
<?xml version="1.0" encoding="UTF-8"?>
<INVENTORY>
<ITEM>
<ITEMID>2397</ITEMID>
</ITEM>
</INVENTORY>

create table xmlData as 
select line file, exec('powershell Get-Content c:\\xml-import\\' || line || ' -Encoding UTF8') xml
from exec('powershell Get-ChildItem -Path c:\\xml-import\\ -Name');

select * from xmlData
      
inventory.xml    <?xml version="1.0" encoding="UTF-8"?><INVENTORY><ITEM><ITEMID>2397</ITEMID></ITEM></INVENTORY>

(6) By little-brother on 2022-11-06 02:41:46 in reply to 5 [link]

Try 
```
with t (xml) as (select distinct xml from xmlData where file='inventory.xml')
select xml_extract(xml, 'ITEMID/text()') from t; 
```

P.S. The best way to communicate with me is to use email lb.im@ya.ru because I rarely check the forum.

(7) By Rob (yarg56) on 2022-11-06 15:50:34 in reply to 6 [link]

little-brother:

Thank you for the quick answer to my question. 

I am writing this response to document what I have learned with the hope that it helps people in the future (perhaps even future me).  

After using your suggestion, I next realized that I did not know xpath syntax well enough to extract the data in the way that I wanted. I used https://www.w3schools.com/xml/xpath_syntax.asp to review.

I arrived at this statement which achieves my goal.

with t(xml) as (select distinct xml from xmlData where file='inventory.xml')
select xml_extract(xml, '//ITEMID/text()',',') as extract from t; 


A follow up question that I answered for myself is "How can xml_each() be used to convert values in XML text into rows?"

Reviewing syntax examples for json_each() helped me figure this out.
https://www.sqlite.org/json1.html#jeach

select file, itemid.value
from xmlData, xml_each(xmlData.xml,'//ITEMID/text()') as itemid


Thanks again,  little-brother