FYI: Personal Data Warehouses: Reclaiming Your Data
(1) By Andreas Kupries (andreas-kupries) on 2020-11-14 19:03:18
Article found on [HN](https://news.ycombinator.com/item?id=25090218), written by our own simonw (of datasette): [Personal Data Warehouses: Reclaiming Your Data](https://simonwillison.net/2020/Nov/14/personal-data-warehouses/)
(2) By Andreas Kupries (andreas-kupries) on 2020-11-14 22:54:38 in reply to 1
Replying to myself: now that I have read the article, the first thing that comes to mind is [perkeep](https://perkeep.org/). DS and PK both seem to have a go at the same kind of thing, although from different angles. PK looks more at the storage and at keeping that under control. It has indexing and searching, but the searching uses a custom language and is limited to what the PK engine indexes. DS, OTOH, looks to me to be all about the indexing and the searching/analysis. Where the data sources live is not as important, only that they can be indexed (converted to sqlite) in some way, so that datasette can then have a go at them. As another connection, the article mentions WireGuard/Tailscale for personal mesh networking. The main PK developer, [Brad Fitzpatrick, works at Tailscale](https://bradfitz.com/2020/01/30/joining-tailscale).
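To make the "converted to sqlite" step concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not how Dogsheep or datasette actually import data; the `records` list, the `items` table, and the helper names are all made up for illustration.

```python
import sqlite3

# Hypothetical records standing in for an export from some personal data source.
records = [
    {"id": 1, "source": "notes", "created": "2020-11-14", "text": "Read the data warehouse article"},
    {"id": 2, "source": "bookmarks", "created": "2020-11-14", "text": "perkeep.org"},
    {"id": 3, "source": "notes", "created": "2020-11-17", "text": "Compare Perkeep and Dogsheep"},
]


def load_records(conn, rows):
    """Create the (hypothetical) items table and upsert the given rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, source TEXT, created TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO items (id, source, created, text) "
        "VALUES (:id, :source, :created, :text)",
        rows,
    )
    conn.commit()


def count_by_source(conn):
    """Once the data is in SQLite, arbitrary SQL can be run against it."""
    return conn.execute(
        "SELECT source, COUNT(*) FROM items GROUP BY source ORDER BY source"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_records(conn, records)
print(count_by_source(conn))
```

Writing to a file instead of `:memory:` (say, a hypothetical `personal.db`) would give a database that can be opened with `datasette personal.db`.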
(3) By Simon Willison (simonw) on 2020-11-17 17:12:17 in reply to 2
Yeah, I see Perkeep as being a little bit more aimed at the personal archiving problem, especially around larger files such as photos and videos. It feels a bit like an open source equivalent of Dropbox in that regard. Dogsheep is much more aimed at smaller, highly structured data which you want to run arbitrary queries against, hence the focus on SQLite. I think the two projects could complement each other: having Perkeep archive my personal SQLite database files seems particularly appropriate, for example.
(4) By Andreas Kupries (andreas-kupries) on 2020-11-17 18:28:39 in reply to 3
Heh. While I was thinking of them as complementary too, I was more wondering how feasible it would be to replace perkeep's indexing with something based on dogsheep. It would at least need some importer for the perkeep schema documents, plus ways to import information about non-schema documents (like the metadata supported by various image formats, etc.). With the reference to Dropbox I now also wonder if perkeep has importers for data in Dropbox, or Google Drive, or ... Found an [issue for writing an importer from gdrive](https://github.com/perkeep/perkeep/issues/896).
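As a rough idea of what such an importer might look like, the sketch below loads JSON schema documents into a SQLite table. It assumes only that perkeep schema blobs are JSON objects carrying a `camliType` key; the sample blobs, the `schema_docs` table, and the function name are hypothetical, and nothing here talks to a real perkeep server.

```python
import json
import sqlite3

# Made-up, simplified stand-ins for blobs fetched from a perkeep store.
sample_blobs = [
    '{"camliVersion": 1, "camliType": "permanode", "random": "abc123"}',
    '{"camliVersion": 1, "camliType": "file", "fileName": "photo.jpg"}',
    "not json at all: raw file bytes would land here",
]


def import_schema_blobs(conn, blobs):
    """Store every blob that parses as a schema document; skip the rest."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_docs ("
        "id INTEGER PRIMARY KEY, camli_type TEXT, doc TEXT)"
    )
    imported = 0
    for blob in blobs:
        try:
            doc = json.loads(blob)
        except json.JSONDecodeError:
            continue  # non-schema blob (e.g. file contents)
        if not isinstance(doc, dict) or "camliType" not in doc:
            continue  # JSON, but not a schema document
        conn.execute(
            "INSERT INTO schema_docs (camli_type, doc) VALUES (?, ?)",
            (doc["camliType"], json.dumps(doc, sort_keys=True)),
        )
        imported += 1
    conn.commit()
    return imported


conn = sqlite3.connect(":memory:")
imported = import_schema_blobs(conn, sample_blobs)
types = [row[0] for row in conn.execute(
    "SELECT camli_type FROM schema_docs ORDER BY camli_type"
)]
print(imported, types)
```

The non-schema blobs skipped here are exactly where a second importer (for image metadata and the like) would have to come in.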