SQLite Forum

RAM consumption spike when using SAVEPOINT and FK update
Login

RAM consumption spike when using SAVEPOINT and FK update

(1) By Paul (greenscape) on 2021-02-17 14:27:03 [link] [source]

This is probably a regression. Back in 2014, it was already reported and fixed thanks to Dan. See reference [1] on the bottom.

It all seems to be related to the use of nested SAVEPOINTs (Tx + Sp or Sp + Sp) and FK operations.

Here's the simple python script to reproduce the issue:

import sqlite3
import os

DATABASE_FILE = "/tmp/test.sqlite3"
JOURNAL_FILE = f"{DATABASE_FILE}-journal"


def prepare_database():
    if os.path.exists(DATABASE_FILE):
        os.remove(DATABASE_FILE)
    if os.path.exists(JOURNAL_FILE):
        os.remove(JOURNAL_FILE)

    result = sqlite3.connect(DATABASE_FILE)
    result.isolation_level = None

    # Requirement (1): 'PRAGMA foreign_keys = ON;'
    result.executescript("""
        PRAGMA page_size = 4096;
        PRAGMA temp_store = MEMORY;
        PRAGMA synchronous = OFF;
        PRAGMA journal_mode = DELETE;
        PRAGMA foreign_keys = ON;
        """)

    # Requirement (2): foo_id is FK
    result.executescript("""
        CREATE TABLE IF NOT EXISTS foo (
          id       INTEGER PRIMARY KEY,
          data     TEXT DEFAULT NULL
        );
        
        CREATE TABLE IF NOT EXISTS bar (
          id       INTEGER PRIMARY KEY,
          foo_id   INTEGER,
          data     TEXT DEFAULT NULL,
          FOREIGN KEY(foo_id) REFERENCES foo(id) ON DELETE RESTRICT
        );
        """)

    # Optional requirement: causes x3 more memory consumption
    result.execute("CREATE INDEX IF NOT EXISTS foo_bar_idx ON bar(foo_id, id);")

    # Just filling the data
    result.execute("BEGIN IMMEDIATE;")
    result.execute("INSERT INTO foo(id) VALUES(1),(2);")
    for i in range(1, 500000):
        result.execute(f"INSERT INTO bar(id, foo_id) VALUES({i}, 1);")
    result.execute("COMMIT;")
    return result


def main():
    db = prepare_database()

    # Requirement (3): top-level transaction
    db.execute("BEGIN IMMEDIATE;")
    # Requirement (4): savepoint inside top-level transaction
    db.execute("SAVEPOINT foo1;")

    # This is the place where memory consumption skyrockets
    # Each update allocates a large chunk of memory
    for i in range(1, 500000):
        db.execute(f"UPDATE bar SET foo_id = 2 WHERE id = {i};")

    # At this point, RAM consumption is around 6GiB
    db.close()

    # This is a result of *ALL* 4 requirements being met, if either of them isn't:
    #  - (1) foreign_keys not enabled
    #  - or (2) foo_id isn't a FK
    #  - or (3) there's no top-level transaction
    #  - or (4) there's no savepoint
    # then no memory consumption spike occurs


if __name__ == "__main__":
    main()


References

[1] Original thread about similar issue http://sqlite.1065341.n5.nabble.com/Abnormal-memory-usage-when-removing-rows-within-nested-transaction-from-a-table-referenced-by-anothe-td79640.html

(2) By Paul (greenscape) on 2021-02-22 09:56:19 in reply to 1 [link] [source]

Sorry for being annoying and bumping this thread. But this is a pretty severe issue, that renders SAVEPOINTs, as means of creating nested transactions (what they are designed for), unusable.

Is there any way to work this around, PRAGMAs, compile options or anything? With the exception of changing PRAGMA temp_store = MEMORY; to anything else. I know I could just use disk, but that would create an even worse problem.

(3) By Dan Kennedy (dan) on 2021-02-22 15:55:03 in reply to 2 [link] [source]

I think this is a slightly different problem than the one in 2014. Back then we were worried about statement journal growth when writing to the parent table of an FK constraint - this example is writing to the child table.

Given that you have no triggers or anything, and that your DML statement affects a single row, it's most likely possible to run this SQL without a statement journal at all, which would fix the problem. But it's a fairly tricky change.

This patch:

https://sqlite.org/src/info/e36327fb22db0876

attempts to prevent the statement journal from growing indefinitely at a lower level. I think it will help in many cases, but not all. It does with your example, FWIW.

If you get the chance, can you try this in your application code and let us know if it helps any?

Thanks,

Dan.

(4) By Paul (greenscape) on 2021-02-23 10:05:28 in reply to 3 [link] [source]

Hi Dan,

Yet again, thank you! We're tried your patch and it works perfectly: RAM usage is, as if there was no SAVEPOINT at all!

Can we hope for it to be included in the next upcoming release?

(5) By Dan Kennedy (dan) on 2021-02-23 10:40:30 in reply to 4 [link] [source]

I think so. These things are never certain though, we might find a problem with the patch yet.

Dan.

(6) By Paul (greenscape) on 2021-02-23 10:45:08 in reply to 5 [source]

I see. We'll be keeping an eye on it.

(7) By Richard Hipp (drh) on 2022-01-06 15:48:07 in reply to 3 [link] [source]

Historical note: That patch can cause database corruption. The problem was first identified in PoC #3 of forum post d7338bf4901f1151 and fixed by check-in 73c2b50211d3ae26.