SQLite Forum

GUI with drill-down?

(1) By JoergMeyer (JrgMyr) on 2022-04-21 09:06:35 [link] [source]

Dear all.

Do you know of an application with the following properties?

  1. It displays an initial query's results as a table.

  2. Table columns can be associated with further views/queries.

  3. Table cells (not necessarily all) are clickable.

  4. Clicking on a cell runs the next view/query (as associated with the column), with the cell value provided as an argument.

  5. The result replaces the previous table. (Repeat steps 3 to 5)

I have a complex (finance) data set and many queries. I would like to "jump around" and run queries interactively, a.k.a. "exploratory data analysis".

Example: We start with an account list, jump into a specific account, look up a selected posting in that account, jump into its posting type, and so on.

Right now, all of that is possible, but only manually, including priming queries with look-up values.

Any helpful hints or ideas? Maybe a JavaScript framework to build it from scratch?

Cheers, Jörg

(2) By Stephan Beal (stephan) on 2022-04-21 09:19:33 in reply to 1 [link] [source]

I have a complex (finance) data set and many queries. I would like to "jump around" and run queries interactively, a.k.a. "exploratory data analysis".

The closest thing I've used to what you describe is Tableau. I've no idea whether it works with an SQLite backend (we used it with Oracle back then, about 6 years ago), but it probably does. It's not cheap, but it is powerful and lets even those who are not experts in data analytics build powerful reports with ease. It was a genuine pleasure to work with and an amazing piece of software.

(3) By Aask (AAsk1902) on 2022-04-21 09:34:08 in reply to 1 [link] [source]

It seems you are looking for a multi-dimensional pivot table, based on a view, with drill-down features.

For a prototype, you could draw the underlying data from an SQLite view into a Microsoft Excel pivot table. This will let you fine-tune your ideas before building the real or bespoke application from scratch.

I am not aware of an off-the-shelf package that you could use, certainly not a cross-platform one.

Sounds interesting!

(4) By skywalk on 2022-04-21 15:45:52 in reply to 1 [link] [source]

I use DB Browser for SQLite for prototyping. It has most of what you need, short of "drill-down"-triggered queries. Those you can achieve manually and later automate in your eventual code once you are satisfied with the structure.

(5) By JoergMeyer (JrgMyr) on 2022-04-21 16:14:12 in reply to 3 [link] [source]

Thanks for responding. The data set is a bit big for a spreadsheet, I have to say.

I thought about a config file that associates "drill down" queries with column names. Example: Whenever a column is called "account" then it would run a certain query to select an account's content.

That way I would avoid having to associate drill-down queries with every query result again.

And then a generic HTML file with a table (that gets replaced all the time) and an SQLite interface to run the queries against the transaction database.
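A minimal sketch of that config idea, assuming Python with the stdlib sqlite3 module; the table names, column names, and queries are made up for illustration:

```python
import sqlite3

# Hypothetical config: column name -> drill-down query with one placeholder.
# Whenever a displayed column has one of these names, clicking a cell runs
# the associated query with the cell value bound as the argument.
DRILL_DOWN = {
    "account": "SELECT * FROM postings WHERE account = ?",
    "posting_type": "SELECT * FROM posting_types WHERE name = ?",
}

def drill(con, column, cell_value):
    """Run the drill-down query associated with a column, feeding in the clicked cell value."""
    query = DRILL_DOWN.get(column)
    if query is None:
        return None  # cell is not clickable
    return con.execute(query, (cell_value,)).fetchall()

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE postings(account TEXT, amount REAL);
INSERT INTO postings VALUES ('cash', 10.0), ('cash', -3.5), ('bank', 99.0);
""")
print(drill(con, "account", "cash"))  # all postings of the clicked account
```

The front end (the generic HTML table) would only need to render whatever `drill` returns and repeat.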

(6) By JoergMeyer (JrgMyr) on 2022-04-21 16:15:40 in reply to 4 [link] [source]

I use SQLiteStudio and I absolutely love it.

(7) By MBL (UserMBL) on 2022-04-21 16:25:57 in reply to 4 [link] [source]

I have always liked, and still like, SQLite3Explorer. It is old but still good and works well with the latest sqlite3.dll versions. (Yes, it is a Windows 32-bit GUI tool.)

It is possible to work on a query result set with additional filters and grouping ... not all, but several of your wishes might come true this way.

(8.3) By Aask (AAsk1902) on 2022-04-22 13:00:42 edited from 8.2 in reply to 5 [link] [source]

The data set is a bit big for a spreadsheet

I had such a constraint in mind when I responded; hence, I suggested Excel Pivot Table as a means of prototyping your application.

However, although an Excel worksheet has a limit of 1,048,576 rows, I have built Excel pivot tables using external data, i.e. a database which contains significantly more raw rows. I have built pivot tables with 5 million rows of data, each with 18 columns; the limit appears to be available memory.

I also suggested multidimensional pivot tables: you can use several Filters and multiple Rows as dimensions; these appear as combo boxes whose selections serve as drill-downs.

That way I would avoid having to associate drill-down queries with every query result again.

Drill-down does not fetch the data from the source, but you do have the option of refreshing the underlying data manually or automatically, say, when the workbook opens.

Excel pivot tables are intuitive both to build and to use, and would serve well as a prototyping environment.

Also, rather than assuming that the data set is a bit big for a spreadsheet, you should verify that that is indeed the case by initialising the pivot table with external data, i.e. using Insert | Pivot Table | From External Data Source.

Other plus points:

  • You cannot modify the underlying data from Excel.
  • You do not have to write SQL queries - your selections from Filters and Rows automatically fetch the correct data.

(9) By MBL (UserMBL) on 2022-04-22 16:55:52 in reply to 5 [link] [source]

You may want to give the KNIME Analytics Platform a trial. It is freeware; only the server solution costs money, and it is available for large community usage and automation. It is very good for organizing repeated working steps into workflows, which can be executed step by step or all at once, again and again as needed. (They can also be exported or imported and sent to someone per email, with or without data included.)

The principle is the interactive connection of different nodes into a whole workflow. Each node has to be configured, and after each step the intermediate results can be inspected. There is a lot of training material available worldwide. Its roots go back several years to bio-chemical analysis work.

The Sunburst chart is one that supports drill-down in a visual way. The selection made can be taken out and used to feed a parameterized SQL query, or to assemble a completely new one. There are countless possibilities, but you will have to spend some time learning all the available nodes.

SQLite3 is one of the supported sources. As this nice tool is a Java application, it is usable on multiple platforms, and the DB interface is JDBC.

I use it a lot. The complicated query arrangements I let SQLite3 or another database do per SQL statement; the preparation, bindings, and presentation I do with the support of the workflows.

Very nice: two nodes can be combined to process chunks in parallel. My SQL query can then run on some dozen servers simultaneously, and I get the results from all of them in the same time as if the query had run on only one of them.

(10) By anonymous on 2022-04-22 19:01:47 in reply to 4 [link] [source]

+1 for KNIME and for DB Browser for SQLite as tools I enjoy using.

I am not sure I fully understand your use case, but these might go in that direction:

https://github.com/multiprocessio/datastation

https://github.com/orchest/orchest

Cheers!

(11.1) By little-brother on 2022-04-24 02:09:06 edited from 11.0 in reply to 1 [link] [source]

sqlite-gui (Windows only) can do exactly what you need.

Honestly, I read your post and then refined one of the existing features.

Here is how to set up queries for your data.

(12) By MBL (UserMBL) on 2022-04-24 10:36:50 in reply to 11.1 [link] [source]

Indeed, it seems to be a very good tool. I will run many more trials on it in the near future.

The exec.dll extension was deleted by my McAfee scanner as a malware suspect, so I will not be able to try that. I replaced the sqlite3.dll with my own, which also contains my extension functions and virtual tables, without problems. I also put in the inja.dll to give that a try as well.

(13) By MBL (UserMBL) on 2022-04-24 17:43:06 in reply to 12 [source]

Drill-down can be done by 'filtering on click', but it could also be done as in Sqlite3explorer: by dragging and dropping a column name into a special paste area and then using that info to do 'grouping by folding'. sqlite-gui could, for example, use that to produce various result tabs. But it could also be done by using the group info to build a tree and only executing the subquery when unfolding, with the advantage that groups of no interest waste neither query time nor memory.

Each node in the tree view would represent a query statement with filter items from all parents (the ancestor-or-self axis in XPath terms). Each sub-node (child-of-self in XPath terms) would be the same query but with an additional filter on one of the group items. In other words, the drill-down would be a generated enhancement with filter conditions. I think CTEs are excellent for such automated SQL generation with cascaded filters.

The click to 'drill down' would execute the rule for how the filter extension has to be added to the previous CTEs. It could be a text-field entry, but it could also be a bin width around a numeric field's value. Each field could get its own rule, and when clicked, that rule is used for the filter-extension chaining.

I hope my idea is understandable ... each drill-down step as the generation of a number of filter-extended SQL query statements.

(14) By anonymous on 2022-04-24 23:31:38 in reply to 13 [link] [source]

Did you have a look at Datastation ( https://datastation.multiprocess.io/ )?

Maybe this video explains how you can create a sequence of queries very easily:

https://www.youtube.com/watch?v=q_jRBvbwIzU

Each “query” can be in SQL, Python or R among others. The result can be shown as a table or a chart and serves as an input for another query.

Cheers!

(15) By MBL (UserMBL) on 2022-04-25 06:55:02 in reply to 14 [link] [source]

Did you have a look at Datastation?

Yes, I did. Thank you very much for sharing these links with us, and for the hint.

It looks like another powerful tool, and I will spend some more time in the near future learning how to use it for my work with data.

Drilling down seems easy: each additional Panel takes the data of the previous one. But it is not done interactively; it needs some coding.

At 300 MB it is not a lightweight tool like sqlite-gui and SQLite3Explorer are; but the community edition is free, and that's good.

(16) By anonymous on 2022-04-25 13:19:59 in reply to 15 [link] [source]

I tested it only very briefly, and it is still at an early stage of development. I reckon the developer is open to your suggestions, e.g. for more interactive use of the tables.

Yes; for me, being able to build one panel/query on top of another while the sequence remains visible makes it quite easy to follow and to reproduce each step.

Actually, there is no need for any coding other than SQL; Python, R, Julia etc. just add more flavor.

At its current stage it might not be stable enough for production, though.

If you want to process the data further, e.g. some statistics, missing values, outliers etc., while remaining at the GUI level, KNIME can help (and much more, incl. ML), but it is not lightweight …

Cheers!

(17) By MBL (UserMBL) on 2022-04-25 15:52:19 in reply to 16 [link] [source]

DataStation should consider supporting the KNIME table format and the H2 database, too. That would be great for the future. H2 is a lightweight Java SQL database with similarities to SQLite3; it can be used with the same ODBC driver as PostgreSQL (though that is not yet recommended for production usage) or with JDBC connectivity, which KNIME likes most. H2 is MPL 2.0 licensed.

From a benchmark with SQLite 3.36.0.3 they say:

Performance Comparison

In many cases H2 is faster than other (open source and not open source) database engines. Please note this is mostly a single connection benchmark run on one computer, with many very simple operations running against the database. This benchmark does not include very complex queries. The embedded mode of H2 is faster than the client-server mode because the per-statement overhead is greatly reduced.

SQLite 3.36.0.3, configured to use WAL and with synchronous=NORMAL was tested in a separate, less reliable run. A rough estimate is that SQLite performs approximately 2-5x worse in the simple benchmarks, which perform simple work in the database, resulting in a low work-per-transaction ratio. SQLite becomes competitive as the complexity of the database interactions increases. The results seemed to vary drastically across machine, and more reliable results should be obtained. Benchmark on your production hardware.

The benchmarks used include multi-threaded scenarios, and we were not able to get the SQLite JDBC driver we used to work with them. Help with configuring the driver for multi-threaded usage is welcome.

(18) By little-brother on 2022-04-25 22:45:10 in reply to 13 [link] [source]

Drill down can be done by 'filtering on click'

Could you provide some video to lb-im@ya.ru? I don't know where the special paste area is in sqlite3explorer.

I think your idea of drill-down is different from mine: if there is an identifier in the row, then it should be possible to fetch the corresponding data from its table by that identifier. And that's all.
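That "identifier in the row" association can even be derived automatically from the schema's foreign keys; a sketch (Python, with hypothetical tables, using SQLite's PRAGMA foreign_key_list):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE accounts(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE postings(id INTEGER PRIMARY KEY,
                      account_id INTEGER REFERENCES accounts(id),
                      amount REAL);
INSERT INTO accounts VALUES (1, 'cash');
INSERT INTO postings VALUES (10, 1, 42.0);
""")

def drill_by_fk(con, table, column, value):
    """If `column` is a foreign key of `table`, fetch the referenced row."""
    for fk in con.execute(f"PRAGMA foreign_key_list({table})"):
        # row layout: (id, seq, ref_table, from_col, to_col, ...)
        if fk[3] == column:
            ref_table, ref_col = fk[2], fk[4] or "rowid"
            return con.execute(
                f"SELECT * FROM {ref_table} WHERE {ref_col} = ?", (value,)
            ).fetchone()
    return None  # not a foreign-key column: not clickable

print(drill_by_fk(con, "postings", "account_id", 1))  # the referenced account row
```

With declared foreign keys, no per-column configuration is needed at all.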

Grouping can be a helpful feature. But it should be easy to understand for easy use.

The exec.dll extension was deleted

It's a known MinGW issue. I found a way to trick VirusTotal, but I can't influence local antiviruses :(

I replaced the sqlite3.dll with my own one

So you lost database encryption support. It's okay if you don't need it.

(19) By anonymous on 2022-04-26 00:18:19 in reply to 17 [link] [source]

I hope we are not getting too much off-topic for this forum:

I guess you already know DBeaver. It supports "any" database with an available JDBC driver.

And you might also be aware of this blog post: https://duckdb.org/2022/03/07/aggregate-hashtable.html

Cheers!

(20) By MBL (UserMBL) on 2022-05-02 12:24:25 in reply to 18 [link] [source]

... I don't know where is special paste in sqlite3explorer.

On the left side you have two tabs, Schema and Snippets, both in tree form.

On top you have the tabs with the editor sections; the one labeled 1 is shown by default. (The New Tab button will create more of them.)

On the bottom (the majority of the application canvas) you get the query results in table form.

You may not yet have paid attention to the gray area just above that table view. That is the paste area, and it shows the following text in white:

Drag a column header here to group by that column

You can drag a column header from the table view and drop it into that gray area. This will produce a grouped view in the table area where all groups are folded by default. Click on a plus-box to unfold the summarized table item.

When more than one column is in there, they get combined in order (looking like a tree, but with only one branch each). Below an unfolded plus-box line you will get more plus-boxed lines, which can be unfolded individually.

After drag-and-drop of a table header, that column appears only in the group name and disappears from the table view. This can be changed by clicking the icon left of the left-most column header, just above the little column with the row arrow.

When you open that pop-up dialog, you can check the boxes for the columns you want to see in the table view; this can bring the dropped-out columns back to the table even when filtered by the grouped view.

You can drag-and-drop within the gray area to a different position to get differently grouped table output.

You can also drag-and-drop from the gray area back to the table-header area to remove the grouping and make the header visible again.

My thinking about drill-down was not only filtering and grouping but also restricting the amount of output data by using cascaded CTEs. Each level down could be the result of an automated SQL generation step. In my example I set a fixed limit on each deeper step. What would make sense here could be a restriction to the first N items of each group, similar to a clause like: having partition_row_number <= 20. I still have to learn more about the WINDOW functions.

WITH Level_1 as (
  select * from requests 
  group by addr 
--having partition_row_number <= 20
  order by addr
  limit 2000
)
, Level_2 as (
  select * from Level_1 order by uri
  limit 200
)
, Level_3 as (
  select * from Level_2 order by query_string
  limit 5
)
select * from Level_3

Each CTE in an application could be handled like a separate edit field, and execution could generate the chained cascade of CTEs for just the next drill-down step on a table click.


To stay not completely unlinked from SQLite3 features: I would like to know what such a "limit per group" SQL statement would have to look like. Does anyone have an idea?

I have already experimented with Pareto sorting to get the most important items to the top and not too many in total. Something like that per group would be very helpful.
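One way to express such a per-group limit in SQLite (3.25 or later) is ROW_NUMBER() over a partition; a sketch with a hypothetical requests(addr, uri) table, shown here via Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE requests(addr TEXT, uri TEXT);
INSERT INTO requests VALUES
  ('10.0.0.1','/a'), ('10.0.0.1','/b'), ('10.0.0.1','/c'),
  ('10.0.0.2','/x'), ('10.0.0.2','/y');
""")

# Keep at most N rows per group: number the rows inside each addr partition,
# then filter on that number in the outer query (a window function cannot
# appear directly in a WHERE or HAVING clause).
rows = con.execute("""
WITH numbered AS (
  SELECT addr, uri,
         ROW_NUMBER() OVER (PARTITION BY addr ORDER BY uri) AS rn
  FROM requests
)
SELECT addr, uri FROM numbered WHERE rn <= 2 ORDER BY addr, uri
""").fetchall()
print(rows)  # at most two rows per addr
```

The `rn <= 2` filter plays the role of the `having partition_row_number <= 20` pseudo-clause above.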

D:\UTL>sqlite3 pareto2.sqb
SQLite version 3.38.3 2022-04-27 12:03:15
Enter ".help" for usage hints.
sqlite> .load sqlite3.dll
sqlite> .mode qbox
sqlite> .tab
Pareto                    dotFolder_
sqlite> select * from Pareto;

┌──────────────────────────────────┬──────────┬───────────┬───────────┬────────┬──────┬────────┬───────┬───────┬───────┬─...─────────────┐
│                    Entry         │   Size   │  cumsum   │   ftot    │ pareto │  No  │  Npct  │ sel98 │ sel95 │ sel90 │ ...   Path      │
├──────────────────────────────────┼──────────┼───────────┼───────────┼────────┼──────┼────────┼───────┼───────┼───────┼─────────────────┤
│ 'Logfile-Process-Monitor.Log'    │ 11776032 │ 11776032  │ 549578185 │ 2.143  │ 1    │ 0.065  │ 98    │ 95    │ 90    │ '.\'            │
│ 'f64.exe'                        │ 9424896  │ 21200928  │ 549578185 │ 3.858  │ 2    │ 0.13   │ 98    │ 95    │ 90    │ '.\'            │
│ 'f64-2-18.exe'                   │ 9355264  │ 30556192  │ 549578185 │ 5.56   │ 3    │ 0.196  │ 98    │ 95    │ 90    │ '.\'            │
│ 'f64-v214.exe'                   │ 8974336  │ 39530528  │ 549578185 │ 7.193  │ 4    │ 0.261  │ 98    │ 95    │ 90    │ '.\'            │
│ 'f64-v215.exe'                   │ 8952832  │ 48483360  │ 549578185 │ 8.822  │ 5    │ 0.326  │ 98    │ 95    │ 90    │ '.\'            │
...                                                                     ... rows 6 to 1529 ...
│ 'manifest.uuid'                  │ 41       │ 549578055 │ 549578185 │ 100.0  │ 1530 │ 99.739 │ NULL  │ NULL  │ NULL  │ '.\'            │
│ '.SQLiteRC'                      │ 39       │ 549578094 │ 549578185 │ 100.0  │ 1531 │ 99.804 │ NULL  │ NULL  │ NULL  │ '.\'            │
│ 'fs.bat'                         │ 38       │ 549578132 │ 549578185 │ 100.0  │ 1532 │ 99.87  │ NULL  │ NULL  │ NULL  │ '.\'            │
│ 'manifest.tags'                  │ 33       │ 549578165 │ 549578185 │ 100.0  │ 1533 │ 99.935 │ NULL  │ NULL  │ NULL  │ '.\'            │
│ 'try.cmd'                        │ 20       │ 549578185 │ 549578185 │ 100.0  │ 1534 │ 100.0  │ NULL  │ NULL  │ NULL  │ '.\tcc\'        │
└──────────────────────────────────┴──────────┴───────────┴───────────┴────────┴──────┴────────┴───────┴───────┴───────┴─...─────────────┘

The SQL for this, using my virtual table extension for reading the file structure:

drop view if exists Pareto;
create view Pareto as
WITH sumtable(FilePath, Entry, Size, No, cumsum, ftot) as (
select FilePath, Entry, Size
, rank()     over win    as No
, sum(Size)  over win    as cumsum
, sum(Size)  over wintot as ftot
 from dotFolder_
 where Size>0
 window win as ( order by size desc, entry Rows between unbounded preceding AND current row )
    ,wintot as ( Rows between unbounded preceding AND unbounded following )
), percentable as (
select Entry, Size, cumsum, ftot, (cumsum * 100.0 / ftot) as pareto, No
, round(No * 100.0 / (select max(No) from sumtable),3) as Npct
, substr(FilePath,1,instr(FilePath,Entry)-1) as Path
 from sumtable
)
select Entry, Size, cumsum, ftot, round((cumsum * 100.0 / ftot),3) as pareto, No, Npct
, case when pareto <= 98 then 98 end as sel98
, case when pareto <= 95 then 95 end as sel95
, case when pareto <= 90 then 90 end as sel90
, case when pareto <= 80 then 80 end as sel80
, case when pareto <= 70 then 70 end as sel70
, case when pareto <= 60 then 60 end as sel60
, case when pareto <= 50 then 50 end as sel50
, case when pareto <= 40 then 40 end as sel40
, case when pareto <= 30 then 30 end as sel30
, case when pareto <= 20 then 20 end as sel20
, case when pareto <= 10 then 10 end as sel10
, Path
from percentable;

(21) By little-brother on 2022-05-05 11:20:09 in reply to 20 [link] [source]

I found the latest version of sqlite3explorer in the WebArchive, and there I can see the "group" header. But IMHO it's not drill-down functionality; it's grouping.

Each level down could be the result from an automated SQL generation step

To work with a result set there is no need to use SQL, because the result set is in-memory data stored as a 2D array.

P.S. DBeaver has both features: drill-down and grouping.

(22) By Sunny Saini (SunnySaini_com) on 2022-10-17 13:03:49 in reply to 11.1 [link] [source]

I love sqlite-gui