SQLite Forum

regexp for Unicode ranges
Okay, I've tried it and I get the same result.

sqlite> .load /usr/lib/sqlite3/pcre
sqlite> .mode csv
sqlite> select *, name regexp '\p{Greek}', name regexp '\p{Latin}', name regexp 'ß'
   ...> from (
   ...> select 'ascii' as name
   ...> union
   ...> select 'γλώσσα'
   ...> union
   ...> select 'straße'
   ...> );
ascii,0,1,0
"straße",0,1,1
"γλώσσα",0,1,0

P.S. I use the latest version of SQLite (3.33) and I've compiled PCRE on my machine.
I think PCRE is the first suspect to investigate.

So why are the Unicode properties causing an issue? I found this in the PCRE documentation:
If PCRE is built with Unicode character property support (which implies UTF support), the escape sequences \p{..}, \P{..}, and \X can be used. The available properties that can be tested are limited to the general category properties such as Lu for an upper case letter or Nd for a decimal number, the Unicode script names such as Arabic or Han, and the derived properties Any and L&. Full lists are given in the pcrepattern and pcresyntax documentation. Only the short names for properties are supported. For example, \p{L} matches a letter. Its Perl synonym, \p{Letter}, is not supported. Furthermore, in Perl, many properties may optionally be prefixed by "Is", for compatibility with Perl 5.6. PCRE does not support this.
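
The passage ties \p{..} to UTF support, which suggests one possible explanation for the output above: if the pattern is compiled without UTF mode, PCRE would treat each byte of the UTF-8 string as a separate character in the range 0-255. That is an assumption, not something confirmed in this thread. The Python sketch below only illustrates the effect. The helper names `has_script` and `as_single_bytes` are made up for this example, and checking the prefix of a character's Unicode name is a rough stand-in for \p{Greek} and \p{Latin}, not PCRE's actual script tables.

```python
import unicodedata


def has_script(chars: str, script: str) -> bool:
    # Rough approximation of a script test: does any character's Unicode
    # name start with the script name (e.g. "GREEK SMALL LETTER GAMMA")?
    # Characters without a name (such as control codes) yield "".
    prefix = script.upper() + " "
    return any(unicodedata.name(ch, "").startswith(prefix) for ch in chars)


def as_single_bytes(text: str) -> str:
    # Re-read the UTF-8 bytes of `text` as if every byte were one character
    # (code points 0-255). This mimics matching without UTF mode.
    return text.encode("utf-8").decode("latin-1")


for name in ("ascii", "straße", "γλώσσα"):
    raw = as_single_bytes(name)
    print(
        name,
        "| by code point: Greek", has_script(name, "Greek"),
        "Latin", has_script(name, "Latin"),
        "| by byte: Greek", has_script(raw, "Greek"),
        "Latin", has_script(raw, "Latin"),
    )
```

Read byte by byte, 'γλώσσα' starts with 0xCE, which as a single character is Î, a Latin letter, and no byte value below 256 belongs to the Greek script. That would give Greek = 0 and Latin = 1 for all three names, which matches the first two result columns in the session. If this is what is happening, one thing to try is PCRE's documented (*UTF8) prefix at the start of the pattern, for example '(*UTF8)\p{Greek}'. Whether the loaded extension honors it is untested here.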