Using AI to Analyze My Saved Articles
I saved over 30k articles to Pocket, the read-it-later app, from 2011 until it announced its shutdown this year. This seemed like a data source worth analyzing.
ChatGPT’s analysis
Inspired by this post, I asked ChatGPT o3 to build a profile of me based on all the articles I read, but it wasn’t very accurate. I then asked it to group the topics I’ve read by theme, which was more interesting. It described its method as a “quick Python pass over 29673 titles, regex word-boundary matching into hand-rolled keyword buckets”. It was also able to chart the topic counts over time. I asked a few follow-up questions, such as focusing on just the last few years, but it always seemed to rewrite its Python from scratch. While it was thinking, I noticed it catching some of its own mistakes. For example, it initially counted any occurrence of the letters “ea” toward the category “philosophy and rationalism”, then realized those letters appear inside many unrelated words and adjusted its code!
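I didn’t keep the code ChatGPT wrote, but the approach it described is easy to sketch. The version below is my own rough reconstruction, not its actual script: the bucket names, the keywords, and the (year, title) input format are all made up for illustration. It also shows why the “ea” mistake matters, since a plain substring check matches “read” or “health” while a word-boundary regex only matches a standalone “EA”.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword buckets; the real ones ChatGPT used are not known.
BUCKETS = {
    "ai_ml": ["ai", "machine learning", "neural network", "llm"],
    "philosophy_rationalism": ["philosophy", "rationalism", "ea"],
    "health": ["health", "longevity", "sleep"],
}


def compile_bucket(keywords):
    """Build one case-insensitive regex that matches any keyword as a whole word."""
    alternation = "|".join(re.escape(k) for k in keywords)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


PATTERNS = {name: compile_bucket(kws) for name, kws in BUCKETS.items()}


def naive_match(title, keyword):
    """The buggy approach: plain substring search, so 'ea' hits 'read'."""
    return keyword in title.lower()


def classify(title):
    """Return every bucket whose keywords appear in the title as whole words."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(title)]


def count_by_year(items):
    """items: iterable of (year, title) pairs. Returns {year: Counter of buckets}."""
    counts = defaultdict(Counter)
    for year, title in items:
        counts[year].update(classify(title))
    return counts


if __name__ == "__main__":
    print(naive_match("How to read more books", "ea"))  # substring check fires
    print(classify("How to read more books"))           # word-boundary check does not
    print(classify("Why EA matters"))
```

The per-year counters are what you would feed into a chart of topic counts over time. Each title is counted at most once per bucket, but a title can land in several buckets.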
Book Recommendations
I then asked it for some book and article recommendations based on my interests; here were some of its book recommendations (I left out the article recommendations since some of them didn’t seem to exist):
AI & Machine Learning - Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow (3e, 2022) by Aurélien Géron—completely rebuilt around transformers and LLM fine-tuning. (amazon.com)
General science / big history: The Dawn of Everything (2021) by Graeber & Wengrow—anthro-archaeology that rewrites equality, urbanism, and progress. (amazon.com)
Long-term ethics & rationalism: What We Owe the Future (2022) by William MacAskill—crisp intro to longtermism & decision theory. (basicbooks.com)
Health & longevity: Outlive (2023) by Peter Attia MD—nutritional, exercise, & metabolic protocols for decades-long healthspan. (amazon.com)
Software craftsmanship & leadership: Software Engineering at Google (2e, 2024)—culture, design docs, reliability playbook. (amazon.com)
My Lacunae
I asked it: “What are potential lacunae in my reading to have a fuller understanding of the world?” and it suggested the following:
Climate & environment
Africa
Latin America
South & Southeast Asia beyond China
Sports & global pop entertainment
Practical civics & local governance
Maybe I need to read The Economist more!
NotebookLM
I often save articles in a Google Doc to print out and read over the weekend, so I have hundreds of docs containing thousands of articles in total. NotebookLM seemed like a good fit for analyzing these since it can work with the full text, unlike ChatGPT’s analysis, which only looked at titles. Each notebook can only contain 50 sources, so I split the docs into separate groups by time period.
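I did the grouping by hand, but the idea is simple enough to script. Here is a minimal sketch, assuming each doc is represented as a (date, name) pair; that representation and the function name are hypothetical, and the only real constraint is the 50-source limit.

```python
from datetime import date

MAX_SOURCES = 50  # per-notebook source limit mentioned above


def batch_by_date(docs, max_sources=MAX_SOURCES):
    """Sort (date, name) pairs chronologically and split them into
    consecutive batches of at most max_sources docs each."""
    ordered = sorted(docs, key=lambda doc: doc[0])
    return [
        ordered[i:i + max_sources]
        for i in range(0, len(ordered), max_sources)
    ]


if __name__ == "__main__":
    # Made-up example data: 120 weekly docs.
    docs = [(date.fromordinal(date(2020, 1, 4).toordinal() + 7 * i), f"doc-{i}")
            for i in range(120)]
    for batch in batch_by_date(docs):
        print(batch[0][0], "to", batch[-1][0], "-", len(batch), "docs")
```

Because the docs are sorted before slicing, each batch covers one contiguous time period, which is what makes per-notebook summaries comparable across years.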
NotebookLM offers various features to help with study and review. The “Generate Podcast” option is famous, but the result felt very random in this case, which isn’t surprising given that each notebook is a mix of unrelated articles grouped only by time period.
The “Generate mind map” option was cool since it automatically categorized all the topics.
Clicking on a topic generates a summary of what the sources said on that topic. For example, when I clicked on “hedonic treadmill” it generated a fairly detailed overview drawn from articles on philosophy, psychology, neuroscience and Buddhism. It ended as follows, which gives an idea of its scope, even if it’s a bit of a mishmash:
In summary, the Hedonic Treadmill reflects a fundamental aspect of human psychology where our capacity for adaptation and rising expectations can limit sustained happiness from achievements. This is influenced by the brain's reward systems, particularly dopamine's role in "wanting," and deeper psychological mechanisms like taṇhā (craving), which drive a continuous pursuit and "grasping" that can paradoxically lead to suffering. Understanding this complex interplay of biological drives, cognitive biases, and the nature of goal-oriented versus process-oriented activities is crucial for comprehending how people experience and seek satisfaction in life.
There’s a lot more room to explore here. I can use this as a tool to review what I’ve read, and maybe also as an “extended mind” if I want to quickly find something I read in the past.
In the future, maybe I can hack together some way to use it to recommend additional articles to read. Eventually AI could filter through all the noise and surface only the most relevant and interesting articles, based on its deep knowledge of the reader. Of course, this might lead one to recognize that they don’t need to read all the latest articles…