Placeholder for table name when inserting data?

(1) By anonymous on 2020-09-13 16:02:26 [link] [source]

I couldn't find example, how to use placeholder for table name when inserting data. Can someone help me fix SyntaxError in my simplified example bellow?

Why do I need table name placeholder? Say I have 100 tables like so: first_table, second_table... My (Python) script loop over 100 times, manipulate & extract data, that needs to be written into these tables. At first loop it writes into first table, at second loop it writes into second table... and so on. First row is UNIQUE (that's why I have "INSERT OR IGNORE" in insert statement) For now I'm looping and inserting into CSV files but I would like to save data into SQLite3 database file.

I was told, there is no table name placeholder in SQLite3, but that I could use python's f-string. I can't get code to work. Bellow is simple code I would like to get it to work, if anyone has any suggestion:

import sqlite3

conn = sqlite3.connect('./test.db')
c = conn.cursor()

c.execute('''CREATE TABLE IF NOT EXISTS first_table (value1 INTEGER, value2 INTEGER, value3 INTEGER)''')

table_name = 'first_table'

numbers = [111, 222, 333]

c.execute(f"INSERT OR IGNORE {table_name} INTO VALUES (?,?,?)", numbers)

conn.commit()

(2) By Warren Young (wyoung) on 2020-09-13 16:10:28 in reply to 1 [link] [source]

You can't do that in SQLite on purpose. Preparing a SQL statement requires that SQLite be able to build a query plan, and it can't do that if it doesn't know what tables and columns to operate over.

If you need SQL queries to vary beyond that, you build the query string in your calling code and prepare each version in turn.

(3) By anonymous on 2020-09-13 16:30:45 in reply to 2 [link] [source]

(I created account in the mean time) :-)

OK I understand the reason.

Question is, what are my options? One would be, that my script don't loop over and I use the same (data manipulation code) 100 times which goes against the coding practice: don't repeat yourself. Plus code would become verrrry long ;)

Is there any other solution, that I can implement in my Python code? I don't understand: "you build the query string in your calling code and prepare each version in turn" - I'm beginner in Python and I have never run SQLite commands alone. Only withing Python script, since storing data is just part of the bigger Python project.

Do you have any idea, how I could modify my sample code in my original post, to get this looping functionality working? I would really like to stop using *.CSV files and start using SQLite database if possible.

(4.1) By Warren Young (wyoung) on 2020-09-13 16:44:08 edited from 4.0 in reply to 3 [link] [source]

you build the query string in your calling code and prepare each version in turn" - I'm beginner in Python and I have never run SQLite commands alone

The rest isn't a SQLite question at all, but a Python question. There are certainly better places to get such answers. However, briefly:

for...table_name...
   qstr = "INSERT OR IGNORE INTO {0} VALUES (?,?,?)".format(table_name)
   c.execute(qstr, arg1, arg2, arg3)

EDIT: I believe I fixed a syntax error in your SQL with respect to the placement of INTO.

(5) By anonymous on 2020-09-13 16:52:24 in reply to 4.1 [link] [source]

Ah, thank you very much! I knew there has to be some solution for my problem :)

Will give it a try, you made my day, thanks again!

(6) By anonymous on 2020-09-13 17:03:37 in reply to 4.1 [link] [source]

Yes, it is working! Thank you so much!!! I'm now going to convert all my code where data is being stored into CSV files, to now be stored into SQLite database file! I'm so happy!!! You guys here rock!

(7) By TripeHound on 2020-09-13 21:08:26 in reply to 6 [link] [source]

I'm by no means a database expert, but having a hundred tables is very often not the best way of doing things. If your 100 tables are essentially storing the same (or very similar) types of data, it is often better to put all records in the same table, with an additional field to indicate which "table" it came from.

For instance, in a manual system you might keep a list of books on Sci-Fi, a list of books on history and another list of books on cooking. In a database, it is usually better to have one "Books" list, with a field indicating which type of book it is (often as a "type code": 1=SciFi, 2=History etc.).

Similarly, if the original tables are split by time (e.g. a different table for each month), then it is usually better to use one database table with a date or month-number field.

With a (semi) manual system, it's can be easier to have separate lists, so you're not hunting for the entry you want among lots of irrelevant entries. With a database, it's (usually) just a matter of adding WHERE BOOKTYPE=2 to the query.

Example benefits of reduced numbers of tables (apart from the "don't repeat yourself" practice you mention) include: it is easier to make cross-subject queries (how many books in total were published before 2000); it is easier to change how the data is stored (say you suddenly decide you want to record the number of pages in each book). With 100 tables, you'd need to make 100 queries and combine the results, or make alterations to 100 table structures. With one table, things are much easier.

(8) By anonymous on 2020-09-13 21:24:51 in reply to 7 [link] [source]

Hi, anonymous, So you have bunch of XLSX files that you want to put into the N tables of the SQLite DB, right?

That's not the efficient way to do that.

What are those XLSX files hold? It would be nice to know that so we can advise you on the best DB schema to create and make you queries work efficiently.

One such idea is done by TripeHound. But he gave just an example.

Thank you.

(9) By anonymous on 2020-09-14 09:59:43 in reply to 7 [source]

Hi TripeHound, you are making good point about the books, will keep this in mind, thank you.

My data are financial stock prices. So database is say stocks.db and then I have one table for each stock price. Schema is the same for each table, date is UNIQUE, the rest are 5 columns of numbers, open, hi, low, close and volume. This would be example with lots of tables, it probably can't be structured any other way?

Or, another example are some economic numbers I'm scraping on a weekly basis. Again, first column, date is UNIQUE, this prevents double entries. Then I have some 10+ columns of numbers. Here tables have a different schema, since number of columns vary from table to table.

(10) By anonymous on 2020-09-14 10:10:11 in reply to 8 [link] [source]

My bad, I should have being more specific. Yes, TripeHound is making a good point. Something, I will think about it in my future projects. In this case, I'm probably OK.

Data is simple financial stock market data with UNIQUE date, then columns: open, high, low, close and volume. That's where I have most tables. Before I was writing data into simple txt files (CSV) and I had one file for each stock.

Now I have stocks.db and each table is one stock ticker. I like this so much better :) In this case, I'm probably OK with my schema? Also, these are all learning, hobby projects.

P.s. Interestingly, I created account, I'm logged in but I still come out as anonymous ;)

(11) By Gunter Hick (gunter_hick) on 2020-09-14 11:25:48 in reply to 10 [link] [source]

You should NOT be storing the name of a stock as the table name. This is THE prime example of how NOT to do things.

Instead of 

CREATE TABLE stock_xyz(date unique, ...);

do something like

CREATE TABLE stocks (stockID INTEGER PRIMARY KEY, short TEXT, long TEXT);
CREATE TABLE ticker (stockID INTEGER REFERENCES stocks(stockID),
                     date, ...,
                     unique(stockID,date));

Then you will always be inserting into ticker using the stockID read from table stocks. To simplify selecting, you can create a view

CREATE VIEW stock_ticker AS
SELECT s.short,s.long,t.date [,...] FROM stocks s JOIN ticker t ON s.stock_ID=t.stockID;

So that you can refer to stocks via short or long name.

(12) By anonymous on 2020-09-14 11:31:38 in reply to 10 [link] [source]

Data is simple financial stock market data with UNIQUE date, then columns: open, high, low, close and volume.

each table is one stock ticker

In this case, I'm probably OK with my schema?

I don't think so, and having to parametrise the table name is usually a red flag meaning that someone is not using SQL the way it was intended to be used. In this case each stock ticker is like a book type in TripeHound's example, so they might better be merged in one table. You won't have to parametrise table names if you make the ticker part of the table definition and make the constraint UNIQUE(date, ticker). Making reports per each ticker would also become an SQL problem instead of "how to iterate over multiple similarly-defined tables in $programming_language" problem; instead you'd just add a GROUP BY ticker to an aggregating query.

(13) By Adrian Ho (lexfiend) on 2020-09-14 12:57:42 in reply to 10 [link] [source]

Besides Gunter and anonymous' suggestions, you could also consider storing each counter's data in a separate SQLite file. On server-based DBMSs, this is roughly equivalent to "sharding on counter", and the same general considerations apply (e.g. you'd only ever process one counter's data at a time, and JOINS across counters are rare-to-nonexistent).

(14) By Simon Slavin (slavin) on 2020-09-14 13:56:43 in reply to 9 [link] [source]

Here's what tells us you're probably doing something wrong: you have multiple tables with the same structure. To people who are experienced with SQL, this is a very obvious sign. Not only does it mean that your programming will be inefficient, but it means that your software will take longer to run, that your database will take more space in storage.

Examples of why doing it this way is bad

SQLite is extremely well optimized for searching through an index. It does it very quickly, with a minimum of memory used. However, the design of SQL assumes you will have a limited number of tables. Because of this SQLite is not well optimized for searching through a long list of tables. Every operation which includes a table name requires this search. It takes far longer to search through a thousand table names than it does to search through a thousand values in an index.

SQLite's database file is based around pages. Each page is assigned to a table or an index. If a table or an index doesn't take up an entire page, then the unused part of the page is wasted. Suppose your database holds data for a thousand stocks. Then you will have a thousand pages partially unused. This takes up valuable storage space, makes backups take longer, etc..

(The above section is slightly distorted and dramatically simplified for clarity.)

Suggested new structure

If you have two or more tables with the same columns and the same indexes, those tables should be merged into one table. You add an extra column to hold what was previously the table name.

This may require changing your primary key. It looks like for your existing primary key was a date field. It looks like you had a separate table for each stock. So your new primary key should be (stock, date). SQLite will ensure that each combination of those two is unique.

How to proceed

Don't import your existing CSV files into many separate tables. Or if you've already done so, drop those tables. Instead import your existing CSV files into one big table, setting the extra column for the stock code as you do so.