Some links into the map require javascript! Sorry!
Well, it's been a year since I started the draft, so I guess it's about time to publish this! :)
This is a map of my personal data liberation infrastructure, with links to the scripts and tools used, and to my blog posts elaborating on different parts of it.
My goal for data liberation is approximating the 'personal data mirror' concept,
often despite crappy interoperability (or lack thereof) of different platforms.
I prepared this diagram for several reasons:
to give more context for my blog posts about data liberation and tools around it
to highlight the complexity and the hoops we have to jump through because of the lack of interoperability
it was also sort of fun :)
This time I won't write too much text and just let you explore it. Tips for exploring the diagram:
perhaps open the full size SVG in a new tab
make sure to read the legend
links you can follow are marked with blue (and sometimes other colours)
there is a bubble (💬) near some nodes/edges, you can hover it to see the comment
some integrations are in progress: marked with WIP, construction signs (🚧🚧) and dashed edges
arrows roughly represent the direction of data flow
arrow colors roughly correspond to the data source (so it's easier to track how it flows)
there are some rendering issues
it's probably not very mobile friendly (it's barely desktop friendly!)
SVG support varies among web browsers, so there might be some minor artifacts (chromium works better, but firefox works well enough)
navbar centering isn't broken on this page – it's just a temporary hack to fit in the diagram till I figure out wide pages properly
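The conventions in the tips above (arrows for data flow, one colour per data source, dashed edges for WIP) map directly onto Graphviz edge attributes. Here is a minimal sketch of that mapping in Python. It is not the actual source of the diagram: the `Edge` type, the `to_dot` helper and the node names are illustrative assumptions.

```python
from typing import Iterable, NamedTuple


class Edge(NamedTuple):
    src: str
    dst: str
    wip: bool = False  # integrations in progress are drawn dashed


def to_dot(edges: Iterable[Edge], color: str = "blue") -> str:
    """Render edges as Graphviz DOT source; arrows follow the data flow."""
    lines = ["digraph G {"]
    for e in edges:
        attrs = [f'color="{color}"']  # one colour per data source
        if e.wip:
            attrs += ['style="dashed"', 'label="WIP"']
        lines.append(f'  "{e.src}" -> "{e.dst}" [{", ".join(attrs)}];')
    lines.append("}")
    return "\n".join(lines)


# One data source traced through the layers (illustrative node names).
reddit_flow = [
    Edge("reddit", "exp_rexport"),
    Edge("exp_rexport", "fs_reddit"),
    Edge("fs_reddit", "hpi_in_fs_reddit"),
    Edge("hpi_node", "hpi_memacs", wip=True),
]

dot_source = to_dot(reddit_flow)
print(dot_source)  # pipe the output into `dot -Tsvg` to render it
```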
[Diagram: map of my personal data liberation infrastructure (SVG)]
Legend: device; cloud service; disk storage; automatic script; manual step; user facing interface; dead service/product; entry from my blog (clickable).
The diagram is organised into the following groups:
Devices: Android phone (Garmin, Runnerup, Gpslogger, Materialistic and Bluemaestro apps, GPS), Wahoo Tickr X (HR monitor), Jawbone sleep tracker, Bluemaestro (environment sensor), Garmin watch, Emfit QS sleep tracker, Kobo reader, Remarkable 2 tablet, scales.
Cloud services: Google (browser history, location, Takeout), Telegram, FB Messenger, VK.com, Twitter, Discord, Pinboard, Github, Pocket, Reddit (API, GDPR export, pushshift), Instapaper, Emfit, Garmin Connect, Jawbone (dead), Endomondo (dead).
Manual sources: blood tests (GP/Thriva/etc), sleep data (subjective), exercise.
Export layer: telegram_backup, fbmessengerexport, vkexport, twint, pinbexport, ghexport, pockexport, rexport, pushshift_export, instapexport, kobuddy, emfitexport, jbexport, GarminDB, endoexport, a sync script for Remarkable, plus manual requests, manual downloads and manual input.
Filesystem: sqlite, json, zip/json, html, orgmode, gpx tracks, tcx workouts, and a custom format for Remarkable.
Human Programming Interface (HPI): one module per data source, many of them fed through a DAL; the Remarkable and Garmin inputs are marked WIP.
Consumers of HPI: Promnesia (browser extension, Archivebox), Orger (plaintext files: read-only data mirrors, todo lists and interactive queues; Emacs (Doom), Logseq), Jupyter/IPython, Sqlite (via cachew), Dashboard (WIP), Timeline/Memex (WIP), HTTP API (WIP), and further integrations, several of them marked WIP: Memacs, Influxdb, Grafana, Metabase, Datasette, Memri, Solid project, openhumans.org, a spreadsheet-like interface, other programming languages (FFI, Apache Arrow).
Blog posts linked from the diagram: The sad state of personal data and infrastructure; How to cope with a human brain; What data I collect and why?; Building data liberation infrastructure; In search of a friendlier scheduler; Against unnecessary databases; Ensuring backup safety (WIP); Data exports deduplication (WIP); Orger: plaintext reflection of your digital self; Managing inbound digital content; My journey in fixing browser history; Building personal search engine; Making sense of Endomondo's calorie estimation; Extending my personal infrastructure; Using mypy for error handling.
Comments (💬) on the diagram:
Google: unclear retention rules https://beepb00p.xyz/takeout-data-gone.html
Jawbone: discontinued in 2017
Endomondo: discontinued in December 2020
Garmin Connect website: scraping is inherently fragile
FB Messenger: private API, fragile
VK.com: Messages API locked down
Twitter API: Twitter is getting more and more hostile to hobbyist projects and 3rd party clients
Twitter website: scraping Twitter is extremely fragile
Discord: hostile against alternative clients, e.g. can't retrieve DMs with the API
Github: only 300 latest events via API
Reddit API: only 1000 latest items via API
Reddit GDPR export: only on email request
Some notes regarding the diagram:
it's plotted via graphviz, and you can find the source here (although the code is quite domain specific)
even though there is a lot of stuff on the diagram, it's still incomplete!
here is an (also incomplete) list of data I collect/export
HPI modules are a good proxy for the data I'm using
note that despite some platforms dying (e.g. Jawbone/Endomondo), I can still use data produced with them!
E.g. after Endomondo was discontinued, I was able to quickly switch to the open source RunnerUp app,
while preserving complete data compatibility.
note how many services are outright malicious with their anti-API/anti-scraping/anti-interoperability measures (yellow/red highlight for API nodes)
probably more platforms have GDPR exports, I just haven't tried yet
indirection is crazy
Note how for some data, before I can get it on my computer, it goes as
device –> phone (over bluetooth)
phone –> cloud (over internet)
cloud –> computer (over internet)
for many phone apps the only way I can sync the data is by rooting my phone in order to access the /data/data directory
This is getting worse and worse with every Android version. I understand the security concerns, but this is ridiculous.
some modules/packages (marked with sb superscript) were developed by Sean Breckenridge
He's forked my HPI package and is working on it in parallel.
For now, we decided to hack on it independently, in the hope that eventually we figure out what's a good model for cooperating and maintaining the modules.
Also, he's done some cool work on automatic HTTP API for HPI!
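Since some APIs only return the most recent items (the diagram notes 1000 for Reddit and 300 events for Github), a useful mirror has to be assembled from many periodic exports kept on disk. That is also why data from dead services stays usable. Here is a rough sketch of that merging step, assuming each export is a JSON file holding a list of objects with an `id` field. The file layout, the field name and the `merged_items` helper are all hypothetical, and this is not how HPI or the export tools actually implement it.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterator


def merged_items(export_dir: Path) -> Iterator[dict]:
    """Yield each item once across all snapshots in the directory."""
    seen: Dict[str, dict] = {}
    # Timestamped filenames sort chronologically, so later snapshots win.
    for snapshot in sorted(export_dir.glob("*.json")):
        for item in json.loads(snapshot.read_text()):
            seen[item["id"]] = item
    yield from seen.values()


# Two overlapping snapshots, as a periodic export job might leave behind.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "2020-01-01.json").write_text(json.dumps([
        {"id": "a", "title": "old post"},
        {"id": "b", "title": "first draft"},
    ]))
    (root / "2020-02-01.json").write_text(json.dumps([
        {"id": "b", "title": "edited"},
        {"id": "c", "title": "new post"},
    ]))
    items = list(merged_items(root))

print([item["id"] for item in items])
```

Item "a" has dropped out of the newer snapshot but survives in the merged view, and "b" carries the value from the most recent export.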
1 TODOs
TODO [C] [2021-02-07 19:53] hmm some 'HTML label' boxes seem to have extra padding?
although only in svg mode? png renders fine.
STRT [C] [2020-02-03 01:57] fix css so it's occupying full screen width
[2020-02-07 19:49] a bit adhoc, but works for now
STRT [C] [2020-02-03 01:57] legend
DONE [B] [2020-02-07 19:51] labels don't fit into the boxes??
[2020-02-14 21:25] apparently only on desktop Firefox =/
[2021-02-07 19:46] looks fine now?
STRT [C] [2020-02-14 21:30] Chrome doesn't support svg side attribute, so some labels appear upside down :(
2 ---
Let me know what you think, and as always, I'm happy to answer your questions!