RETURNING clause require cursor consumption

(1) By beetlejuice (coleifer) on 2021-02-04 16:41:16 [link] [source]

I was a bit surprised that using a RETURNING clause requires explicit consumption of the resulting statement object. Example using Python:

conn = sqlite3.connect(':memory:')
conn.execute('create table reg(id integer primary key, k text)')

cur = conn.execute('insert into reg (k) values (?), (?), (?) returning id',
                   ('k1', 'k2', 'k3'))
print(next(cur))  # First result row.

cur = conn.execute('select * from reg')
print(cur.fetchall())

The above code prints out:

(1,)  <-- the first id inserted
[(1, 'k1'), (2, 'k2')]  <-- first and second row

The python driver has called sqlite3_step twice under-the-hood for our INSERT, and as a result only the first 2 rows are inserted.

My expectation, and the way postgres seems to work, is that the INSERT is performed regardless of whether we consume all the returned rows. I want to raise this now, so if this is expected behavior, I can plan how to work around it in my library code.

(2) By beetlejuice (coleifer) on 2021-02-04 16:58:55 in reply to 1 [link] [source]

I had been looking at the commit log and not the corresponding draft documentation for the release. It seems that the docs clarify that this is an implementation detail and undefined behavior:

https://www.sqlite.org/draft/lang_returning.html

However, applications should not depend on this ordering. Future versions of SQLite might precompute the outputs and return them all before any rows have been modified, or it might save all the outputs into a buffer then play the buffer back after all database modifications have been applied. Or future versions of SQLite might use a mixture of these algorithms.

Seems like it is behaving as intended, although I'm a little concerned about integrating this into libraries that interop with Postgres where the behavior is different. It certainly caught me off-guard.

(3) By Larry Brasfield (LarryBrasfield) on 2021-02-04 17:38:12 in reply to 2 [source]

Hard as it may be, you probably should wait to decide upon work-arounds, alternative approaches, feature avoidance or feature reliance. Pre-release code is unstable, lacking the backwards compatibility guarantees that the SQLite project honor, (as do many other projects.) The pre-release drops are made for improving bug-catching prospects, exposing problems such as you report, and getting feedback on new or enhanced features.

(4) By Richard Hipp (drh) on 2021-02-04 23:32:32 in reply to 1 [link] [source]

The latest trunk check-in uses a new RETURNING algorithm that embargos all outputs until after all changes have been completed. Please try to break the new code and report back success or failure.

(5) By beetlejuice (coleifer) on 2021-02-05 00:24:29 in reply to 4 [link] [source]

This change is working well for me. Here's a small test program I used to check the behaviors:

Inserting multiple rows working as expected:

conn = sqlite3.connect(':memory:')
conn.execute('create table reg(id integer primary key, k text)')

cur = conn.execute('insert into reg (k) values (?), (?), (?) returning id',
                   ('k1', 'k2', 'k3'))
print(next(cur))  # First result row.

# Prints (1,)

cur = conn.execute('select * from reg')
print(cur.fetchall())

# All inserts have been processed, output is now:
# Prints: [(1, 'k1'), (2, 'k2'), (3, 'k3')]

Similar workflow with an UPDATE this time also works as expected:

cur = conn.execute('update reg set k=k||? where k in (?, ?, ?) returning id, k',
                   ('x', 'k1', 'k3', 'k2'))
print(next(cur))  # First result row.

# Prints: (1, 'k1x')

cur = conn.execute('select * from reg')
print(cur.fetchall())

# All updates were processed. Output is:
# [(1, 'k1x'), (2, 'k2x'), (3, 'k3x')]

Do an upsert which will be executed the first time, and not the 2nd time. Both operations working as expected:

# Check behavior w/upsert
conn.execute('create table r2(id integer primary key, k text unique, v real)')
conn.execute('insert into r2 (k, v) values (?, ?), (?, ?)',
             ('k1', 1, 'k2', 2))

for _ in range(2):
    curs = conn.execute('insert into r2 (k, v) values (?, ?) '
                        'on conflict(k) do update set v = v + ? '
                        'where v <= ? '
                        'returning id, k, v', ('k1', 3, 0.5, 1))
    print(curs.lastrowid)
    try:
        print(next(curs))
    except StopIteration:
        pass

# Correctly prints:
# 2
# (1, 'k1', 1.5)
# 2
# <no output here, since no rows via RETURNING>