SQLite Forum

Joining on Hebrew words including vowel points and cantillation marks
Hello, I'm working with three different sources of Hebrew text and would like to join the tables on the Hebrew words. I'm fairly sure everything is in UTF-8, and all three sources claim to have used the exact same source for the Hebrew, but many words don't match even though they appear identical on screen, including vowel points and cantillation marks.

For example, each source displays the word below identically, but when the words are written out character by character, they appear to be built in a different order. In one table, the rightmost letter is built as first the reversed-looking C symbol, then the dot inside it, and then the double stacked dots below it. In another table, the stacked dots come second and the single dot last.

<span style="font-size: 50px;">בְּרֵאשִׁית</span>

Is there anything that can be done to make them match for a join? Or is it a case of bytes are bytes and, if they're not in the same order, they're different no matter how they render?
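To make the ordering difference described above concrete, here is a minimal Python sketch. It assumes the rightmost letter is bet (U+05D1) with a dagesh (U+05BC, the dot inside) and a sheva (U+05B0, the stacked dots below); the table names `t1` and `t2` and the SQL function name `nfc` are made up for illustration. It also tries one possible avenue, Unicode normalization, which puts combining marks into a canonical order. I have not checked this against the real data, so treat it as a sketch rather than a confirmed fix.

```python
import sqlite3
import unicodedata

# Two hypothetical encodings of the same visible letter:
# bet + dagesh + sheva versus bet + sheva + dagesh.
word_a = "\u05D1\u05BC\u05B0"
word_b = "\u05D1\u05B0\u05BC"


def code_points(text):
    """List the code points of a string, e.g. ['U+05D1', ...]."""
    return [f"U+{ord(ch):04X}" for ch in text]


def nfc(text):
    """Return the NFC-normalized form of text, passing NULL through."""
    if text is None:
        return None
    return unicodedata.normalize("NFC", text)


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL as nfc(x).
conn.create_function("nfc", 1, nfc)

conn.execute("CREATE TABLE t1 (word TEXT)")
conn.execute("CREATE TABLE t2 (word TEXT)")
conn.execute("INSERT INTO t1 VALUES (?)", (word_a,))
conn.execute("INSERT INTO t2 VALUES (?)", (word_b,))

# Join on the raw text: the code point sequences are compared as stored.
raw_matches = conn.execute(
    "SELECT count(*) FROM t1 JOIN t2 ON t1.word = t2.word"
).fetchone()[0]

# Join on the normalized text: combining marks are reordered first.
normalized_matches = conn.execute(
    "SELECT count(*) FROM t1 JOIN t2 ON nfc(t1.word) = nfc(t2.word)"
).fetchone()[0]

print(code_points(word_a))
print(code_points(word_b))
print(raw_matches, normalized_matches)
```

Calling a function in the join condition means an ordinary index on `word` will not be used, so for larger tables it would probably be better to store a normalized copy of each word in its own column once and join on that.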

Thank you.