SQLite Forum

Subqueries and "no such column" error on references to other tables
Login
> The performance will be, at best case, O(N*M^2) where N is the number of rows in chat and M is the number of rows in post. It will probably degrade a lot faster than that.

All right. Thank you very much for pointing this out! I can't use this technique, then, because I won't ship a library which generates SQL with such a bad complexity.

> We recently discussed that in another thread.

I'd be happy to check this conversation.

> You appear to be trying to find the "earliest post" for each "thread".

Actually, the example wants to load both thread ("chat") and latest post. Imagine you want to feed the main screen of a chat app, with all conversations and a preview of the last message.

> the chat table is useless when querying data as it does nothing except waste time and you can tell this because 
> (a) there is nothing from that table in the select list; and, 
> (b) removing all reference to it does not affect the query result in any way.

So, the point is actually to get chat information along: its a `SELECT chat.*, post.*`

> (chatid, date) must be a candidate key for the post table (that is, it must be unique).

I really appreciate your help. This is not an acceptable constraint, in the general case, unfortunately:

In the chat/post example, we can not require unique post dates in a thread. This would exclude a whole set of reasonable and real use cases. If the user wants to disambiguate the "last post", the user would would have to sort per date and another disambiguation column (such as the primary key). If the user does not perform such disambiguation, then *any* post can be chosen (meaning I can add the primary key as an extra disambiguation ordering clause in the SQL I generate).

The chat/post example is just one example, for illustration purpose. My general case is the following:

Input:

- one "parent" table
- one "child" table
- a foreign key from child to parent
- a set of ordering clauses on the child table, which identifies one child per parent. The ordering clauses *may not* identify a single child per parent. The set of ordering clauses may actually be *empty*.

Ouput:

- An SQL query
    - which selects (`parent.*, child.*`) pairs,
    - where the child is the first child of each parent according by the set of ordering clauses (any child is ok if the ordering clauses are ambiguous)
    - where the child may actually be missing (if a parent has no child), in a similar fashion to LEFT JOIN
    - of acceptable complexity
- Advice for indexes in order to get the desired complexity.

Of course, if this general case is too relaxed, and does not satisfy the requirements for such a desired output, then I'll have to adapt, and maybe distinguish more cases. For example, unique indexes, if acceptable by the user, can be used. But I have to deal with the fact that users may comes with a "weak" setup. Yet I do not want to punish users who have a "strong" setup with an SQL query which has a bad O(...) complexity.

It's a difficult exercice, but I hope I can find out a solution :-)

Edit: if there is no general solution to the general problem as I describe it, then I may have to provide an advice such as "in such situation, setup an extra foreign key to the desired child, and consider using triggers so that it get automatically updated whenever the child table is modified". That would be an acceptable solution also, even if much less easy for the users to setup. I consider the "all chats with their latest post" and *all similar queries* to be an important use case to address, and I want to help users who do not want to write their SQL themselves (because they don't want to, or because they can't).