Axol: search alerts
Eventually I'll write about it here
Table of Contents
- [A] Motivation
- TODO [B] I don't necessarily want to read everything found by 'scott alexander', but it's still interesting to run search to see the overlap between people?
- TODO why not hnrss?
- TODO why axol over rss bridges?
- STRT motivation: I don't understand how google search alerts work. e.g. try on openbci query (see my old emails from google alert)
- TODO [D] Ask HN: Do you still use RSS? | Hacker News rsstoblog
- [D] problems with diff approach
- [C] TW at Does anyone know tools like https://t.co/EbpbNZWQFC , but where you could enter, roughly speaking, any search query or API (e.g. reddit/github), and it would track the results?
- [C] I feel the same. So many cool things I'd love to learn about, but not enough tim… | Hacker News pkmaxol
- [B] Similar/existing projects
- [C] Show HN: Mailbrew – Automated Email Digests from HN, RSS, Reddit, Twitter
- TODO [C] awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted locally. Selfhosting is the process of hosting and managing applications instead of renting from Software-as-a-Service providers
- DONE [B] trackreddit only two subscriptions
- DONE [D] tool to search on reddit or even custom services? special ordering ('least likely' for showing least occurring subreddits). could also do it on rust? pkm
- [B] * Make it more user friendly
- [B] * Blacklisting
- TODO [B] maybe button to ban user? it would write to config or something? maybe I can even use some public API constructor?
- TODO [B] I suppose pouchdb would be perfect for blacklisting couchdb
- TODO [B] for blacklisting, instead could just apply custom per-user classes? or even edit them. that would allow to highlight properly
- TODO [C] yeah, blacklisting could both update backend and hide locally
- TODO [C] axol results for redditpkm, rendered at Fri 12 Apr 2019 05:07
- [C] shit, top lifelogging tweets are in Japanese… twitter
- TODO [C] would be interesting to ignore links I already visited from results. It can even be done automatically…. promnesiaaxol
- TODO [D] huh, quite a few bots on reddit? reddit
- [C] huh, lots of stuff from twitter is just garbage. need a good way of suppressing it… axoltwitter
- [B] What would be a good UI for axol?
- TODO [B] I really need some sort of proper frontend browser for it…
- TODO [C] would be nice to have some html dashboard, so it's easy to blacklist terms?
- STRT [B] need a UI to easily add items to axol. e.g. Alexei Kitaev
- TODO [C] use metabase or something? could use a column to mark as seen? would be much easier than rss
- TODO [B] dunno about rss interface… really need a more efficient way of processing content, reordering, etc
- [C] Queries
- TODO [A] search for 'data export' or something?
- TODO [C] github.com/karlicoss - Twitter Search / Twitter self
- STRT [C] my. package | beepb00p postprivacyqstoread
- STRT [B] What Universal Human Experiences Are You Missing Without Realizing It? | Slate Star Codex mind
- TODO [D] https://github.com/hypotext/notation - Twitter Search
- [C] karlicoss/cachew - Twitter Search / Twitter cachew
- TODO [B] All | Search powered by Algolia Noon Universe search
- STRT [C] mypy – exclude mypython; prioritize topics mypy
- TODO [C] sleep tracking sleepqs
- STRT [C] add bret victor? bretvictor
- TODO [C] ted chiang – pretty nice to search on twitter tedchiang
- TODO [C] complex numbers group; argonov; transhumanism? argonov
- STRT [C] kobo; spaced repetition? spacedrep
- STRT [C] scott alexander unsong - Twitter Search
- STRT [C] karlicoss! self
- STRT [C] cancel scott alexander search alert
- TODO [D] set up alerts for nutrition stuff
- TODO [B] add "lagrangian mechanics"??? lagrangian
- [C] #promnesia
- STRT [A] kedr livansky kedr
- STRT [B] exobrain? exobrain
- TODO [D] Pinboard bookmarks tagged eeg
- TODO [D] Pinboard bookmarks tagged km pkm
- STRT [B] memex? esp github memex
- STRT [B] george hotz?
- DONE [C] add mypy to search??
- [D] tried axol for
- TODO [C] pkm for twitter can probably be removed…
- STRT [C] initial query… mypy
- TODO cleanup 'extended mind' – certainly lots of crap in the database twitter
- TODO hmm, beepb00p.xyz isn't resolving anything? selftwitter
- [D] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05
- [D] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05
- DONE [B] subscribe to more news on QS, BCI and gadgets qs
- [C] Sources
- STRT [C] wonder if I could search among hypothesis users… hypothesis
- TODO [D] could add google search too I suppose.. but that's def lowest priority
- STRT [C] implement for reddit. release reddit/github searchers (as library, then import and use)
- STRT [C] youtube? could search quantified self at least
- TODO [C] Would be great to search in comments axolreddit
- TODO [C] hypothesis
- TODO [C] Schedule - pushshift.io
- TODO [C] New API endpoint – Now you can search comments! : redditdev
- TODO [D] for google search, only notify about new results; not about changes. wonder how?
- [C] Search Reddit Comments by User
- TODO [C] pushshift/api: Pushshift API
- TODO [C] duckduckgo?
- [C] Pushshift Reddit Search redditscrape
- [C] hacker-news-favorites-api/main.js at master · reactual/hacker-news-favorites-api
- TODO [B] Hypothesis
- TODO [C] could run HN more often hackernews
- [[https://grep.app/search?q=import%20my%5C..%2A%24&regexp=true&filter[lang][0]=Python][import my\..*$ - grep.app]]
- [C] CI/testing
- TODO [B] Sort tags by number of total occurrences?
- TODO [B] Use cachew and keep stuff as blobs with id cachew
- TODO [B] warn when there are too many atom items?
- TODO [B] suppress some feeds in the config?
- TODO [B] Show HN: I made an alternative to Google Alerts that listens to social media
- STRT [C] shit, seems that the timestamps are wrong and also I got the link wrong
- TODO [C] Maybe record a video on the phone? demo
- STRT [C] maybe check crawled pinboard users for interesting tags/links?
- STRT [C] maybe, summary and 'rendered' are really sort of the same page? just different sorting…
- STRT [C] Def interesting to see user stats
- TODO [C] Sort tags by number of total occurrences?
- TODO [C] Maybe better way of normalising? E.g. look at tedchiang and gq article. Display 'bumped' entries separately? Like a different way of sorting
- TODO [C] prepend # in tag?
- TODO [C] could search for occurrences of interesting tags without them actually being scraped
- TODO [C] might be good to do some sort of fuzzy grouping?
- TODO [C] would be interesting to have an explorer for users that looks for some relevant tags/keywords? pinboard
- TODO [C] Hmm also need real-time search and notify I guess? hackernews
- TODO [C] Eh, better idea would be a tag subscription… mypy
- STRT [C] would be nice to have some efficient frontend + backend thing timeline
- hmmm. actually could do it in a twitter account??
- TODO could ask on HN? outbox
- or RSS? https://github.com/awesome-selfhosted/awesome-selfhosted#feed-readers
- TODO [C] Edit Feed: beepb00p.xyz - Miniflux
- TODO [C] Command Line Usage - Documentation
- TODO [C] could make a filter to release items slowly? e.g. tweets with more than 10 likes, if update pops it up, then it ends up in the feed. although I need 'processed' entries
- [C] Axol: Personal automatic news feed – crawl Reddit/Twitter/HN and read as RSS | Hacker News
- TODO [C] perhaps redefine everything in entities? and have relations – people, subreddits, urls, tags, etc
- TODO [C] rename adhoc to 'search'?
- TODO [C] think about a special tag to mark stuff that should be autoimported in a similar manner my kibitzr thing worked
- TODO [C] some todos
- TODO [C] def should keep original results in the DB as far as possible
- TODO [C] to start with, only support exact queries? e.g. demand them in queries and mention that support for fuzzier might be added later
- TODO [C] think about multiple small databases vs one huge?
- TODO [C] thinking about query language
- TODO [C] for people to try it out it really needs a simplest service possible they can run with docker? ideally without auth etc
- STRT [D] Track most active pinboard users? They might have interesting other stuff
- TODO [D] running under docker results in /app/axol/js/sorttable
- TODO [D] use different font?
- TODO [D] might need two pass algorithm? One for crawling, second for filtering?
- related pkmsearchdegoogle
- [C] Pinboard: network for karlicoss pinboardaxol
- TODO [C] spinboard: something's not right. e.g. try
- STRT [D] classes — classes 0.1.0 documentation
- axol hmm, something I was trying to do in axol?…
- doesn't look active. all top results are from 2017 axolupspin
- STRT [D] ScriptSmith/socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs redditscrapeaxol
- TODO [B] pruning – for now via sqlitedbbrowser? make sure it locks the db? axol
¶[A] Motivation axol
¶TODO [B] I don't necessarily want to read everything found by 'scott alexander', but it's still interesting to run search to see the overlap between people? axol
¶TODO why not hnrss? axol
it's very likely more convenient to use if you only want a few HN queries, and don't care about historic ones
¶TODO why axol over rss bridges? axol
rss is awesome! downsides:
- might be trickier to do various post-filtering
- with axol, you can compare results across queries (user summary)
- can be used with promnesia maybe
¶STRT motivation: I don't understand how google search alerts work. e.g. try on openbci query (see my old emails from google alert) axol
¶TODO [D] Ask HN: Do you still use RSS? | Hacker News axolrsstoblog
https://news.ycombinator.com/item?id=21913598
I've just started using Feedbin about a month ago, and although my HN firehose feed is at like 1100 something, it definitely limits the rest of the HN feeds. The show and ask feeds are both stuck around 400 something.
¶[D] problems with diff approach axol
¶random errors, resulting in empty diff axol
¶small differences in output (e.g. google search) axol
¶not always interested in items disappearing from query axol
the downside – having to keep the state :(
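One way to sidestep the diff problems listed above is to keep only a set of already seen result ids per query and report whatever is new. A minimal sketch, assuming each result has some stable id; the `seen` table and the `new_results` function are hypothetical names, not something axol is known to have:

```python
import sqlite3


def new_results(conn, query, results):
    """Return only the (uid, payload) pairs not seen before for this query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "query TEXT, uid TEXT, PRIMARY KEY (query, uid))"
    )
    fresh = []
    for uid, payload in results:
        known = conn.execute(
            "SELECT 1 FROM seen WHERE query = ? AND uid = ?", (query, uid)
        ).fetchone()
        if known is None:
            # First time this id shows up for this query: remember and report it.
            conn.execute("INSERT INTO seen (query, uid) VALUES (?, ?)", (query, uid))
            fresh.append((uid, payload))
    conn.commit()
    return fresh
```

An empty fetch caused by a random error just yields no new items, and items that disappear from the query are ignored. The state still has to live somewhere, but it is only a table of ids.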
¶[C] TW at Does anyone know tools like https://t.co/EbpbNZWQFC , but where you could enter, roughly speaking, any search query or API (e.g. reddit/github), and it would track the results? axol
¶ huh axol
¶[C] I feel the same. So many cool things I'd love to learn about, but not enough tim… | Hacker News axolpkm
I feel the same. So many cool things I'd love to learn about, but not enough time.
¶[B] Similar/existing projects axol
¶TODO [C] awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted locally. Selfhosting is the process of hosting and managing applications instead of renting from Software-as-a-Service providers axol
https://github.com/awesome-selfhosted/awesome-selfhosted
Search Engines
¶DONE [B] trackreddit only two subscriptions axol
wanted lifelogging
trackreddit
¶DONE [D] tool to search on reddit or even custom services? special ordering ('least likely' for showing least occurring subreddits). could also do it on rust? axolpkm
searched as 'keyword monitoring tool'
tried searching on reddit, but nothing really useful..
https://github.com/trulia/thoth – unclear what it's doing
keyword tracking (SERP) – not sure if an overkill..
¶DONE just implement a provider for kibitzr? axolpkm
¶CANCEL rust? axolpkm
¶[B] * Make it more user friendly axol
¶TODO add axol doctor config axolproject
also axol doctor to check individual providers + reuse in tests
¶TODO [B] rely on user config dirs axol
¶TODO [C] provide an ASCII diagram for crawler + report + feed reader? axol
¶[B] * Blacklisting axol
¶TODO [B] maybe button to ban user? it would write to config or something? maybe I can even use some public API constructor? axol
¶TODO [B] I suppose pouchdb would be perfect for blacklisting axolcouchdb
¶TODO [B] for blacklisting, instead could just apply custom per-user classes? or even edit them. that would allow to highlight properly axol
¶TODO [C] yeah, blacklisting could both update backend and hide locally axol
¶TODO [C] axol results for redditpkm, rendered at Fri 12 Apr 2019 05:07 axol
redditpkm.html
shit. need to ignore the weapons subreddits
I think generally, my tools need to have a database…
¶[C] shit, top lifelogging tweets are in Japanese… axoltwitter
¶TODO [C] would be interesting to ignore links I already visited from results. It can even be done automatically…. axolpromnesia
¶TODO [D] huh, quite a few bots on reddit? axolreddit
azncbot
bprogramming even maybe?
autotldr
tabledresser
¶[C] huh, lots of stuff from twitter is just garbage. need a good way of suppressing it… axoltwitter
¶ twittermypy (211) - Miniflux axoltwitter
https://axol.karlicoss.xyz/feed/53/entries
/aymk_mypy/status/1211970059205107712 All twitter_mypy 7 hours ago Original @Witch_Astaroth みどりさん!この垢にしてから相互になった方の中では割と話せたと思ってます笑 来年もよろしくお願いします!
¶ twittermypy (111) - Miniflux axoltwitter
https://axol.karlicoss.xyz/feed/53/entries
/mypy2424/status/1211845733210443778 All twitter_mypy 7 hours ago Original 事実でも噂でも、クズとかいうやつお前はその人より努力してからいえよな〜って思うよ!!!!! 好きな
¶ twittermypy (111) - Miniflux axoltwitter
https://axol.karlicoss.xyz/feed/53/entries
/soe1113/status/741281801323175936 All twitter_mypy 7 hours ago O
¶ twitterlifelogging (20) - Miniflux axoltwitter
https://axol.karlicoss.xyz/feed/52/entries
/jager_atami/status/24390787028 All twitter_lifelogging 2 days ago Original #udetate #lifelogging 陶房で壺割り 12 個 201
¶ twitterquantifiedself (36) - Miniflux axoltwitter
https://axol.karlicoss.xyz/feed/55/entries
/hiperesoterismo/status/1212803558203985920 All twitter_quantified_self 4 hours ago Original mis únicos 4 moodspic.twitter.com/5RgPiKKhMx ★
¶[B] What would be a good UI for axol? axol
¶TODO [B] I really need some sort of proper frontend browser for it… axol
¶TODO [C] would be nice to have some html dashboard, so it's easy to blacklist terms? axol
¶STRT [B] need a UI to easily add items to axol. e.g. Alexei Kitaev axol
maybe some simple cmdline available from anywhere. or org mode as source?
¶TODO [C] use metabase or something? could use a column to mark as seen? would be much easier than rss axol
¶TODO [B] dunno about rss interface… really need a more efficient way of processing content, reordering, etc axol
¶[C] Queries axol
¶TODO [A] search for 'data export' or something? axol
¶ not much on reddit for 'data liberation' axol
¶ 'data export' looks promising on github axol
¶TODO [C] github.com/karlicoss - Twitter Search / Twitter axolself
¶ All | Search powered by Algolia axolself
¶STRT [C] my. package | beepb00p axolpostprivacyqstoread
https://beepb00p.xyz/mypkg.html
Interesting experiment! Thanks for sharing :-) You might find this person's musings about such experiments interesting: https://www.plomlompom.de/index.en.html#topic_postprivacy
¶TODO axol it axolpostprivacyqstoread
¶STRT [B] What Universal Human Experiences Are You Missing Without Realizing It? | Slate Star Codex axolmind
https://slatestarcodex.com/2014/03/17/what-universal-human-experiences-are-you-missing-without-realizing-it/
search this post on reddit or something
¶ actually even found something interesting on gh.. axolmind
https://github.com/search?q=what-universal-human-experiences-are-you-missing-without-realizing-it&type=Code
although, it's code search, not repo search
¶ so trying to google that query axolmind
if looking for past month, that basically results in random keywords
what universal human experiences are you missing without realizing it
¶ yeah, twitter feed is not too huge, so could subscribe to it axolmind
¶TODO [D] https://github.com/hypotext/notation - Twitter Search axol
¶[C] karlicoss/cachew - Twitter Search / Twitter axolcachew
¶TODO [B] All | Search powered by Algolia Noon Universe search axol
¶STRT [C] mypy – exclude mypython; prioritize topics axolmypy
¶TODO [C] sleep tracking axolsleepqs
¶STRT [C] add bret victor? axolbretvictor
¶ uh. need a proper interface for it axolbretvictor
¶TODO [C] ted chiang – pretty nice to search on twitter axoltedchiang
¶TODO [C] complex numbers group; argonov; transhumanism? axolargonov
¶STRT [B] youtube.com/watch?v=YrXk2buqsgg axolargonov
can find some interesting stuff on twitter..
¶DONE "виктор аргонов" got some good results on twitter axolargonov
¶STRT [C] kobo; spaced repetition? axolspacedrep
¶ eh, kobo not so interesting.. axolspacedrep
¶STRT [C] scott alexander unsong - Twitter Search axol
¶TODO could add this to my twitter poller thing (again, via API) or kibitzr? axol
¶STRT [C] karlicoss! axolself
¶ doesn't look like much on pinboard… axolself
¶ not much interesting axolself
¶STRT [C] cancel scott alexander search alert axol
¶TODO [D] set up alerts for nutrition stuff axol
¶TODO [B] add "lagrangian mechanics"??? axollagrangian
¶ or 'Hamiltonian'? at least on HN axollagrangian
¶[C] #promnesia axol
GitHub - karlicoss/promnesia - Another piece of your extended mind
search on pinboard? or even axol..
¶STRT [A] kedr livansky axolkedr
¶STRT [B] exobrain? axolexobrain
¶TODO [D] Pinboard bookmarks tagged eeg axol
¶TODO [D] Pinboard bookmarks tagged km axolpkm
¶STRT [B] memex? esp github axolmemex
¶STRT [B] george hotz? axol
¶DONE [C] add mypy to search?? axol
¶TODO [C] pkm for twitter can probably be removed… axol
¶STRT [C] initial query… axolmypy
mypy -from:mypy2424 -from:mypy1031 -from:aymkmypy -to:aymkmypy -from:mypy0229
ugh, not sure how convenient it'd be to filter this shit
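The exclusions above could be generated from a list instead of being typed by hand. A minimal sketch, assuming the noisy accounts are kept in some config; `build_twitter_query` and its parameters are hypothetical, and only the `-from:` / `-to:` operators come from the query itself:

```python
def build_twitter_query(term, blocked_from=(), blocked_to=()):
    """Append -from:/-to: exclusions for noisy accounts to a search term."""
    parts = [term]
    parts += [f"-from:{user}" for user in blocked_from]
    parts += [f"-to:{user}" for user in blocked_to]
    return " ".join(parts)


query = build_twitter_query(
    "mypy",
    blocked_from=["mypy2424", "mypy1031", "aymkmypy", "mypy0229"],
    blocked_to=["aymkmypy"],
)
```

This keeps the account list editable in one place, though the query grows with every blocked account, so filtering after the crawl may scale better.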
¶TODO cleanup 'extended mind' – certainly lots of crap in the database axoltwitter
¶TODO hmm, beepb00p.xyz isn't resolving anything? axolselftwitter
¶[D] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05 axol
axol/summary/hackernewspkm.html
Personal Knowledge database
¶[D] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05 axol
axol/summary/hackernewspkm.html
Personal knowledge base
¶[C] Sources axol
¶STRT [C] wonder if I could search among hypothesis users… axolhypothesis
¶ eh, search is a bit weird… axolhypothesis
¶TODO [D] could add google search too I suppose.. but that's def lowest priority axol
¶STRT [C] implement for reddit. release reddit/github searchers (as library, then import and use) axol
¶STRT [C] youtube? could search quantified self at least axol
¶ eh, tried a few queries and it doesn't look like results appear that often… axol
¶TODO [C] Would be great to search in comments axolreddit
¶TODO [C] hypothesis axol
¶ not that many results on pkm/quantified self.. axol
¶ more on spaced repetition and ted chiang axol
¶TODO [C] Schedule - pushshift.io axol
https://pushshift.io/schedule/
Current Schedule April comments should be available around May 20, 2018.
¶TODO [C] New API endpoint – Now you can search comments! : redditdev axol
https://www.reddit.com/r/redditdev/comments/3fv8vv/new_api_endpoint_now_you_can_search_comments/
New API endpoint -- Now you can search comments!
¶TODO [D] for google search, only notify about new results; not about changes. wonder how? axol
¶[C] Search Reddit Comments by User axol
https://redditcommentsearch.com/
Search through comments of a particular reddit user.
¶TODO [C] pushshift/api: Pushshift API axol
¶TODO [C] duckduckgo? axol
¶[C] Pushshift Reddit Search axolredditscrape
¶[C] hacker-news-favorites-api/main.js at master · reactual/hacker-news-favorites-api axol
https://github.com/reactual/hacker-news-favorites-api/blob/master/src/main.js
const x = require('x-ray')()
hmm, it's got 'paginate'?
¶TODO [B] Hypothesis axol
eh need to run orger I guess? or axol!
¶TODO [C] could run HN more often axolhackernews
also use more generic hooks?
¶ [[https://grep.app/search?q=import%20my%5C..%2A%24&regexp=true&filter[lang][0]=Python][import my\..*$ - grep.app]] axol
¶[C] CI/testing axol
¶TODO HN is very quick, so prob really good to test on (even on CI) axol
¶TODO [B] Sort tags by number of total occurrences? axol
¶TODO [B] Use cachew and keep stuff as blobs with id axolcachew
Not sure if I should overwrite or update? Could decide later and query with unique ids to start with?
¶TODO [B] warn when there are too many atom items? axol
¶TODO [B] suppress some feeds in the config? axol
¶TODO [B] Show HN: I made an alternative to Google Alerts that listens to social media axol
¶ eh, demands to register etc axol
¶STRT [C] shit, seems that the timestamps are wrong and also I got the link wrong axol
might need to work on this: axol/databases/twitterextendedmind.sqlite
¶TODO [C] Maybe record a video on the phone? axoldemo
¶STRT [C] maybe check crawled pinboard users for interesting tags/links? axol
¶ yeah, need to make this bit more efficient.. axol
¶STRT [C] maybe, summary and 'rendered' are really sort of the same page? just different sorting… axol
¶STRT [C] Def interesting to see user stats axol
¶TODO [C] Sort tags by number of total occurrences? axol
¶TODO [C] Maybe better way of normalising? E.g. look at tedchiang and gq article. Display 'bumped' entries separately? Like a different way of sorting axol
¶TODO [C] prepend # in tag? axol
¶TODO [C] could search for occurrences of interesting tags without them actually being scraped axol
¶TODO [C] might be good to do some sort of fuzzy grouping? axol
wonder what's an efficient way of doing it? sort of similarity connected components?
/TheGoogleDotCom/status/915750443275444226
Can Google's AI-powered Clips make people care about lifelogging? - TechCrunch http://ift.tt/2wyk69G
2017-10-05 01:28 by TheGoogleDotCom
/gauravndhankar/status/915750414774972416
Can Google’s AI-powered Clips make people care about lifelogging? http://dlvr.it/PsRpwK pic.twitter.com/IAPiiqacKo
2017-10-05 01:28 by gauravndhankar
/animesh1977/status/915749491344596992
Can Google’s AI-powered Clips make people care about lifelogging? http://ift.tt/2xUwbaz
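The 'similarity connected components' idea could look roughly like this: normalise the text, link every pair that is similar enough, and take the connected components. A minimal sketch using `difflib` from the standard library; the 0.8 threshold, the URL-stripping regex and the function names are assumptions, and the pairwise comparison is quadratic, so it only suits small batches:

```python
import re
from difflib import SequenceMatcher

# Links differ between reposts of the same headline, so drop them before comparing.
_URL = re.compile(r"(https?://|pic\.twitter\.com/)\S+")


def normalise(text):
    text = text.lower().replace("\u2019", "'")  # unify curly apostrophes
    text = _URL.sub("", text)
    return " ".join(text.split())


def group_similar(texts, threshold=0.8):
    """Return groups of indices into `texts` whose normalised text is similar."""
    parent = list(range(len(texts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    norm = [normalise(t) for t in texts]
    for i in range(len(norm)):
        for j in range(i + 1, len(norm)):
            if SequenceMatcher(None, norm[i], norm[j]).ratio() >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

On the three tweets above this should put all of them into one group, since once the links are stripped they differ only by the '- TechCrunch' suffix.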
¶TODO [C] would be interesting to have an explorer for users that looks for some relevant tags/keywords? axolpinboard
¶TODO [C] Hmm also need real-time search and notify I guess? axolhackernews
¶TODO [C] Eh, better idea would be a tag subscription… axolmypy
¶STRT [C] would be nice to have some efficient frontend + backend thing axoltimeline
¶ hmmm. actually could do it in a twitter account?? axoltimeline
¶TODO could ask on HN? axoltimelineoutbox
¶ or RSS? https://github.com/awesome-selfhosted/awesome-selfhosted#feed-readers axoltimeline
¶TODO [C] Edit Feed: beepb00p.xyz - Miniflux axol
https://axol.karlicoss.xyz/feed/56/edit
Scraper Rules Rewrite Rules Title Filter Content Filter
¶TODO [C] Command Line Usage - Documentation axol
https://miniflux.app/docs/cli.html
miniflux -config-file /etc/miniflux.conf
¶TODO [C] could make a filter to release items slowly? e.g. tweets with more than 10 likes, if update pops it up, then it ends up in the feed. although I need 'processed' entries axol
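A rough sketch of such a release gate, assuming every crawl yields `(uid, likes)` pairs and that the 'processed' entries are just a set of ids kept between runs; `release` and `min_likes` are hypothetical names:

```python
def release(items, processed, min_likes=10):
    """Return ids that crossed the like threshold and were not released before.

    `items` is an iterable of (uid, likes) pairs from the latest crawl;
    `processed` is a mutable set of already released ids, updated in place.
    """
    released = []
    for uid, likes in items:
        if uid in processed:
            continue  # already ended up in the feed earlier
        if likes > min_likes:
            processed.add(uid)
            released.append(uid)
    return released
```

An item below the threshold is simply looked at again on the next update, so it ends up in the feed only once it becomes popular enough.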
¶[C] Axol: Personal automatic news feed – crawl Reddit/Twitter/HN and read as RSS | Hacker News axol
¶TODO [C] perhaps redefine everything in entities? and have relations – people, subreddits, urls, tags, etc axol
¶TODO [C] rename adhoc to 'search'? axol
¶TODO [C] think about a special tag to mark stuff that should be autoimported in a similar manner my kibitzr thing worked axol
¶TODO [C] some todos axol
- [ ] move individual data sources to files within the repo.. not even submodules, too much hassle
  if someone needs, they can just import axol.sources.src directly
- [ ] cleanup the json shit.. ideally use some proper library
- [ ] not sure what to do with RSS feeds.. but could start with HTML report generation
- [ ] query language:
  might be better to adopt
  service:sub:query
  e.g.
  pinboard:tag:whatever
  or
  github:some query
  not sure what to do with colons though.. but maybe think about this later. most won't support searching them anyway
¶TODO [C] def should keep original results in the DB as far as possible axol
¶TODO [C] to start with, only support exact queries? e.g. demand them in queries and mention that support for fuzzier might be added later axol
¶TODO [C] think about multiple small databases vs one huge? axol
multiple small:
- easier to mess with/explore
- easier concurrency
- easier to remove from reports (although for that need to make sure it's really 1-1 correspondence with source and query? dunno)
single db:
- easier to bulk clean/somewhat easier to bulk normalise
  although this would be kind of useless if I store raw json outputs
- easier to do queries across multiple (e.g. associating users?)
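Cross-database queries are still possible with multiple small databases, since SQLite can attach a second file to an open connection. A minimal sketch, assuming every database has a `results` table with an `author` column; that schema and the `shared_authors` name are made up for illustration:

```python
import sqlite3
from contextlib import closing


def shared_authors(db_a, db_b):
    """Authors that appear in both databases (hypothetical `results` table)."""
    with closing(sqlite3.connect(db_a)) as conn:
        # ATTACH makes the second file visible under the schema name `other`.
        conn.execute("ATTACH DATABASE ? AS other", (str(db_b),))
        rows = conn.execute(
            "SELECT DISTINCT a.author FROM main.results AS a "
            "JOIN other.results AS b ON a.author = b.author "
            "ORDER BY a.author"
        ).fetchall()
    return [author for (author,) in rows]
```

So 'associating users' does not strictly need a single db, although the number of databases that can be attached at once is limited, which matters with many small files.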
¶TODO [C] thinking about query language axol
how it could look in adhoc mode
github:'scott alexander' twitter:'scott alexander'
in config, allow something nicer like
[twitter,github,reddit]:'scott alexander'
or [twitter,github,reddit, pinboard]:['scott alexander', 'quantified self']
pinboard:tag:scottalexander
- [ ] NOTE: echo twitter:'scott alexander' – the quotes are gonna get swallowed by bash… suggest to always quote the whole argument?
- [ ] NOTE: treat " and ' the same? twitter does it…
- [ ] TODO: make sure that query parsing is defensive
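A sketch of how parsing for both forms above might work, using `shlex` so that `"` and `'` are treated the same and unbalanced quotes fail loudly. The function names are hypothetical, the sub-kind (`tag:`) is left inside the query string rather than parsed out, and multi-word queries are assumed to always be quoted:

```python
import itertools
import shlex


def parse_adhoc(line):
    """Parse e.g. `github:'scott alexander' twitter:"scott alexander"`."""
    pairs = []
    for token in shlex.split(line):  # raises ValueError on unbalanced quotes
        service, sep, query = token.partition(":")
        if not (sep and service and query):
            raise ValueError(f"expected service:query, got {token!r}")
        pairs.append((service, query))
    return pairs


def _split_list(text):
    """Split `a, 'b c', "d"` (optionally wrapped in [ ]) into items."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    lexer = shlex.shlex(text, posix=True)
    lexer.whitespace += ","      # commas separate items
    lexer.whitespace_split = True
    lexer.commenters = ""        # keep '#' usable inside queries
    return list(lexer)


def expand(entry):
    """Expand `[twitter,github]:['a b', 'c']` into (service, query) pairs."""
    entry = entry.strip()
    if entry.startswith("["):
        close = entry.find("]:")
        if close == -1:
            raise ValueError(f"missing ']:' in {entry!r}")
        services_part, queries_part = entry[1:close], entry[close + 2:]
    else:
        services_part, sep, queries_part = entry.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in {entry!r}")
    services = _split_list(services_part)
    queries = _split_list(queries_part)
    if not services or not queries:
        raise ValueError(f"empty services or queries in {entry!r}")
    return list(itertools.product(services, queries))
```

For example, `expand("[twitter,github]:['scott alexander']")` gives one pair per service, and `expand("pinboard:tag:scottalexander")` keeps `tag:scottalexander` as the query for the source to interpret.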
¶TODO [C] for people to try it out it really needs a simplest service possible they can run with docker? ideally without auth etc axol
¶STRT [D] Track most active pinboard users? They might have interesting other stuff axol
¶ maybe, try to intersect known user's tags and see what they got in common? axol
¶TODO [D] running under docker results in /app/axol/js/sorttable axol
¶TODO [D] use different font? axol
¶TODO [D] might need two pass algorithm? One for crawling, second for filtering? axol
e.g. I crawled quite a bit of pokemon crap, would be good to filter it?
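The second pass could be a plain filter over whatever the crawler stored, which also keeps the original results untouched. A minimal sketch; the `Result` shape, `second_pass` and its parameters are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    uid: str
    author: str
    text: str


def second_pass(results, banned_authors=(), banned_terms=()):
    """Drop results from banned authors or containing banned terms."""
    authors = {a.lower() for a in banned_authors}
    terms = [t.lower() for t in banned_terms]
    kept = []
    for r in results:
        if r.author.lower() in authors:
            continue
        text = r.text.lower()
        if any(term in text for term in terms):
            continue
        kept.append(r)
    return kept
```

Since nothing is deleted from the database, a term that was banned too eagerly can be dropped from the list and the report regenerated.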
¶related axolpkmsearchdegoogle
¶[C] Pinboard: network for karlicoss axolpinboard
https://pinboard.in/network/
shit… too many tweets. I need a way to filter the network…
¶ in fact it's the most common request to pinboard author apparently axolpinboard
¶TODO [C] spinboard: something's not right. e.g. try axol
querying t:quantified-self
https://pinboard.in/t:quantified-self
spinboard gives 220 total results. however, on the first page there are 50…
scraper is missing something?
eh. sooo, there are no dupes even!! BS4 actually sees only 20 per page (pinboard still gives us '50' in the next url).
whereas chrome does show 50 entries; but if you go to the second page, they are gonna overlap with the first.
¶TODO must be some pinboard bug?? axolpinboard
¶STRT [D] classes — classes 0.1.0 documentation axol
¶ hmm, something I was trying to do in axol?… axol
¶doesn't look active. all top results are from 2017 axolupspin
¶STRT [D] ScriptSmith/socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs axolredditscrape
https://github.com/ScriptSmith/socialreaper
Reddit Get the top 10 comments from the top 50 threads of all time on reddit