SQLite User Forum

Why doesn’t negative rounding (very useful for binning) work for sqlite3 in the Pandas read_sql_query() function?
Login

Why doesn't negative rounding (very useful for binning) work for sqlite3 in the Pandas read_sql_query() function?

(1) By George Hayward (ghayward) on 2022-07-16 16:49:43 [link] [source]

No one's solved this yet in StackOverflow, so I'm posting here:

For some reason, simple negative rounding doesn't seem to work in sqlite3, when we pass it through Pandas via the read_sql_query() function.

Does anyone know why?

And better yet, anyone know how to make it work?

For a quick primer on negative rounding check out Dr. Greg Low's blog post.

I've managed to re-produce the error here, in the code below.

You'll see that you should get a negative round to 12,000 for bobg89 but instead you get 12,425.0. This should not happen.

Here's the reproducible code:

import pandas as pd import numpy as np import sqlite3 as sql

d = { 'nickname': ['bobg89', 'coolkid34','livelaughlove38'], 'state': ['NY', 'CA','TN'], 'score': [12425.22, 200.002,310.2656] } df = pd.DataFrame(data=d) stack_overflow_example_dataset = df.copy() print(stack_overflow_example_dataset)

When you print that, just to take a peek, you'll get this:

nickname state score 0 bobg89 NY 12425.2200 1 coolkid34 CA 200.0020 2 livelaughlove38 TN 310.2656

Then, I am going to make that table into a sqlite database in the notebook so you can see what happens next. This is where it gets crazy.

conn = sql.connect("stack.db") stack_overflow_example_dataset.to_sql("stack",conn,if_exists='replace',index=False)

And this next part is truly head-scraching. I've done this a million times to peek at the data inside a notebook, but this is my first time trying to do a negative round, which is super useful for bucketing stuff in data science.

test_output = pd.read_sql_query(""" SELECT *, round(1.0*score,1.0) as regular_round, round(1.0*score,-3.0) as negative_round FROM stack ; """, conn)

print(test_output)

And inexplicably you get this:

nickname state score regular_round negative_round 0 bobg89 NY 12425.2200 12425.2 12425.0 1 coolkid34 CA 200.0020 200.0 200.0 2 livelaughlove38 TN 310.2656 310.3 310.0

So it's like no negative rounding is allowed. It should be 12,000 not 12,425, because I've used a -3.0 rounding strategy.

Does anyone know how to make the negative rounding work within read_sql_query()? Or, less preferred, anyone know how to do this better in Pandas or sqlite?

Best, George

You can see my post on this in StackOverflow here via the below link:

https://stackoverflow.com/questions/72998925/why-doesnt-negative-rounding-very-useful-for-binning-work-for-sqlite3-in-the

Thank you for all your hard work on this very helpful library!!! :)

The Lord Always Delivers!!!

(2) By Larry Brasfield (larrybr) on 2022-07-16 18:50:51 in reply to 1 [link] [source]

No one's solved this yet ...

Not surprising given that the amount of information presented overwhelms what would be needed to explicate the issue.

[Summarizing instead of quoting:]

The SQL function, round(X,Y), does not treat negative Y values as the (negative) number digits to round to the left of the decimal point.

That might be reasonable, but it is not suggested by the documentation. A few minutes ago, the documented effect was to produce a result "rounded to Y digits to the right of the decimal point" and nothing was said about the effect of negative values. Now, in the source for that doc page, it says "If the Y argument is omitted or negative, it is taken to be 0." This reflects what the code has done since its inception and will continue to do.

(3.1) By Keith Medcalf (kmedcalf) on 2022-07-16 19:08:11 edited from 3.0 in reply to 1 [link] [source]

What you are calling "negative rounding" is nothing new. It has been done for centuries if not millenia.

The round function in SQLite3 only does kindergarten rounding. It takes advanced mathematics and computational theory to progress from "kiddy" rounding to "adult" rounding.

You can generate your own round function which implements proper rounding (as in round nearest ties to even) and also permits a negative "places" specification.

The reason that proper rounding is not included is likely because proper rounding would generate more complaints -- five-nines of the population cannot even comprehend binary floating point, so expecting them to understand an advanced mathematical concept such as rounding is already a lost cause.

Not to mention that implementing proper rounding is difficult (without harware/runtime support), which is not necessarily available to all compilers. (In fact, the specific required language feature is unavailable in the versions of the C language standard used by SQLite3, and writing a "generic" compliant alternative that maintians IEEE precision guarantees is difficult).

The below function will perform "proper" rounding provided that your compiler is IEEE compliant, supports extended precision, and includes the IEEE remainder function. (Precision is generally 1 ULP but may reach 512 ULP, uncommonly, due to the scaling).

SQLITE_PRIVATE void _fastroundingFunc(sqlite3_context *context, int argc, sqlite3_value **argv)
{
    register long double x, scale;
    register int digits = 0;

    if ((argc == 0) || (argc > 2) || (sqlite3_value_type(argv[0]) == SQLITE_NULL))
    {
        return;
    }
    x = sqlite3_value_double(argv[0]);
    if ((intptr_t)sqlite3_user_data(context) == 1)
    {
        digits = 4;
    }
    else if (argc == 2)
    {
        digits = sqlite3_value_int(argv[1]);
    }
    digits = digits < 16 ? digits : 15;

    scale = powl(10.0, digits);
    x *= scale;
    x -= remainder(x, 1.0);
    x /= scale;

    sqlite3_result_double(context, x);
    return;
}

And the functions registered:

    // Use this function for roundhe and money(4 places) rounding because it has higher IEEE compliance

    nErr += sqlite3_create_function(db, "roundhe",     -1, SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS,  0,              _fastroundingFunc,0, 0);
    nErr += sqlite3_create_function(db, "money",        1, SQLITE_UTF8|SQLITE_DETERMINISTIC|SQLITE_INNOCUOUS,  (void*)1,       _fastroundingFunc,0, 0);

You should note that you can make the built-in round function work as you wish thusly (so that it respects a negative place specification however it will still do kindergarten rounding):

round(x, y) -> round(x*pow(10,y),0)/pow(10,y)

(4) By Keith Medcalf (kmedcalf) on 2022-07-16 19:23:34 in reply to 3.1 [source]

Note this function does "significant digits", that is, rounded to display the requested number of non-zero digits.

SQLITE_PRIVATE void _rounddigitsFunc(sqlite3_context *context, int argc, sqlite3_value **argv)
{
    static int maxdigits = 14;
    register long double x, scale;
    register int digits = maxdigits;

    if ((argc==0) || (argc>2) || (sqlite3_value_type(argv[0]) == SQLITE_NULL))
    {
        return;
    }

    x = sqlite3_value_double(argv[0]);
    if (argc == 2)
    {
        digits = sqlite3_value_int(argv[1]);
    }

    digits = (digits > 0 ? (digits <= maxdigits ? digits : maxdigits) : 1) - ceil(log10(fabs(x)));
    scale = powl(10.0, digits);
    x *= scale;
    x -= remainder(x, 1.0);
    x /= scale;

    sqlite3_result_double(context, x);
    return;
}
sqlite> select sigdigits(28374658734.0, 3);
┌─────────────────────────────┐
│ sigdigits(28374658734.0, 3) │
├─────────────────────────────┤
│ 28400000000.0               │
└─────────────────────────────┘

(5) By George Hayward (ghayward) on 2022-07-17 15:31:38 in reply to 2 [link] [source]

Just wanted to have reproducible code so anyone, anywhere with access to a Jupyter Notebook could reproduce the error.

Certainly surprised that it doesn't throw a warning; and I wasn't even sure if it was a sqlite3 or Pandas-parsing error.

And now, I can see, it's more of a choice than an error.

Thanks for the time.

(6) By George Hayward (ghayward) on 2022-07-17 15:32:08 in reply to 3.1 [link] [source]

Thank you!!