Artificial Intelligence Reading List
Thanks in no small part to OpenAI’s ChatGPT, the past year has seen an explosion in interest in artificial intelligence in general, and in large language models in particular. That democratization of access turned this niche research area into a common topic of conversation, and has led to a lot of fascinating writing on the subject. Although certainly not comprehensive, this article collects some of my favorite articles, papers, and resources into a single reading list.
For more on this subject, check out Vicky Boykis’ Anti-hype LLM reading list, which features some of the same artificial intelligence-related resources.
I have organized this reading list into several sections, grouped loosely by focus and generally ordered by publication date. Introduction to Artificial Intelligence contains a high-level primer on the field. Large Language Model Fundamentals delves into some lower-level details but should still be approachable for most laypeople. Artificial Intelligence Theory deals with interesting theoretical questions like, “What really counts as intelligence?” Military Applications for Artificial Intelligence contains some interesting reading on the possible applications — and limitations — of artificial intelligence in military settings. Finally, Additional Resources lists a few useful tools and resources not specifically related to the other sections. I added notes to some of these entries but not all.
Introduction to Artificial Intelligence #
This section is a high-level primer on the field of artificial intelligence.
- The AI Revolution: The Road to Superintelligence and The AI Revolution: Our Immortality or Extinction. Back in 2015, Tim Urban wrote a lengthy yet approachable two-part series speculating on the future impact of artificial superintelligence.
- How to Think Computationally about AI, the Universe and Everything. Stephen Wolfram’s TED AI talk discusses the central role of computation in AI and the universe. He theorizes that the universe is composed of discrete computational elements and introduces the “ruliad”, the entangled limit of everything that is computationally possible. He emphasizes the importance of computational language in bridging the gap between human understanding and computational reality, and he revisits the ruliad in Generative AI Space and the Mental Imagery of Alien Minds.
- AI. Ten years ago, Sam Altman described what I think is the current best use of artificial intelligence: “The most positive outcome I can think of is one where computers get really good at doing, and humans get really good at thinking.”
Large Language Model Fundamentals #
This section delves into some lower-level details of a particular form of artificial intelligence, large language models, but should still be approachable for most laypeople.
- What is ChatGPT doing and why does it work?. Stephen Wolfram explains in detail how large language models work.
- All Languages Are NOT Created (Tokenized) Equal. Yennie Jun explores the impact of English-centric training in large language models. Vox also made a good video on this subject: Why AI doesn’t speak every language. In a similar vein, The Babelian Tower Of AI Alignment discusses a related issue of cultural biases affecting AI.
- Multifaceted: the linguistic echo chambers of LLMs. In a similar vein, James Padolsey explores the root cause of curious linguistic tendencies in large language models. As artificial intelligence systems generate an ever-larger share of internet content, these tendencies will become more pronounced as successive generations of models exacerbate the biases of their predecessors.
- Llama from scratch. Brian Kitano walks through his own implementation of Meta’s LLaMA.
- A Survey of Large Language Models. This fantastic paper touches on every aspect of large language models, from their history to the underlying theory to their performance today.
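The tokenization disparity Yennie Jun describes is easy to glimpse even without a real tokenizer: modern byte-pair-encoding tokenizers start from UTF-8 bytes, and non-Latin scripts need more bytes per character, so equivalent text tends to cost more tokens outside English. Here is a minimal sketch in Python — the sample phrases are my own, and raw byte counts only approximate token counts, since an English-heavy vocabulary merges English byte runs far more aggressively:

```python
# Rough illustration of why byte-level BPE tokenizers favor English:
# UTF-8 encodes ASCII in 1 byte per character, Greek in 2, Devanagari in 3.
# A vocabulary trained mostly on English merges long English byte runs into
# single tokens, while other scripts stay closer to one token per byte.

samples = {
    "English": "good morning",
    "Greek": "καλημέρα",    # "good morning"
    "Hindi": "सुप्रभात",      # "good morning"
}

for language, text in samples.items():
    raw = text.encode("utf-8")
    print(f"{language:8s} {len(text):2d} chars -> {len(raw):2d} UTF-8 bytes")
```

The English phrase uses the fewest bytes per character, so before any merges happen it already starts with an advantage; training-data imbalance then widens the gap.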
Artificial Intelligence Theory #
This section deals with interesting theoretical questions like, “What really counts as intelligence?”
- Alien Intelligence and the Concept of Technology. Stephen Wolfram explores the idea that all processes are fundamentally equivalent in computational terms. He suggests that what we consider intelligence, governed by physics, may not be fundamentally different from “alien” processes, challenging traditional views of intelligence and technology.
- Artificial General Intelligence is Already Here. “Today’s most advanced AI models have many flaws, but decades from now, they will be recognized as the first true examples of artificial general intelligence.”
- The Many Ways that Digital Minds can Know. On the theme of, “What really counts as intelligence?”, Ryan Moulton shares some relevant thoughts. Michael Levin also explores this question in The Space Of Possible Minds.
- The Stochastic Parrot Hypothesis. Quentin Feuillade-Montixi and Pierre Peigne evaluate GPT-4’s performance against the stochastic parrot hypothesis, challenging the idea that it is “only” regurgitating words.
- Are Large Language Models Conscious?. Sebastian Konig discusses the role that language plays in determining consciousness in an interesting exploration of the question, “Are large language models more than ‘just’ machines?”
- Sparks of Artificial General Intelligence: Early Experiments with GPT-4. This controversial paper from 2023 stops short of declaring GPT-4 an instance of artificial general intelligence, but it does offer some compelling arguments that the model’s emergent abilities indicate it is more than just an autocomplete engine or math function.
- Are Emergent Abilities of Large Language Models a Mirage?. Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo argue that many seemingly emergent abilities of large language models may be artifacts of the evaluation metrics researchers choose rather than sudden changes in model behavior with scale.
- Are Language Models Good at Making Predictions?. An evaluation of large language models’ ability to predict outcomes.
- Applied Fallabilism: A Design Concept for Superintelligent Machines. In part one, the author argues that “induction constrains and cannot support deduction”, that deduction is necessary to achieve artificial general intelligence, and describes how it might be achieved. Part two explains design principles for building that world model. Part three deals with the apparent emergent properties of current models and promising avenues for achieving an explanatory world model. Part four offers predictions about what it would take to achieve artificial general intelligence and what that might look like. Part five walks through a high-level example of the process. While dense, this series is informative.
- Levels of AGI: Operationalizing Progress on the Path to AGI. From Google’s DeepMind team, this paper “proposes a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors.”
- Google’s Gemini Advanced: Tasting Notes and Implications. Under the guise of reviewing Google’s latest model, Gemini Advanced, Ethan Mollick shared some insightful observations on the state of large language models with an eye toward the future. I think the idea of ghosts is fascinating: “[What many have called ‘sentience’] is the illusion of a person on the other end of the line, even though there is nobody there. GPT-4 is full of ghosts. Gemini is also full of ghosts.”
- Claude’s Character. Anthropic, one of OpenAI’s primary competitors, talks about how the company imbues character — what some might call personality — within its flagship model, Claude. Experiments like these further blur the line between machine and human, making that distinction seem more academic than practical.
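Several of the entries above turn on what next-token prediction actually is. A toy bigram model makes the “stochastic parrot” caricature concrete: it produces fluent-looking text purely by sampling from co-occurrence statistics, with nothing resembling understanding. This sketch is my own illustration over a tiny invented corpus; real transformers condition on thousands of tokens of context rather than a single preceding word, which is precisely where the debate begins:

```python
import random
from collections import Counter, defaultdict

# Toy bigram "language model": predicts each word only from the previous one.
# This is the caricature at the heart of the stochastic-parrot debate.

corpus = "the cat sat on the mat and the cat ran".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(word, rng):
    """Sample a successor word in proportion to observed frequency."""
    followers = counts[word]
    words = list(followers)
    weights = [followers[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Generate by repeatedly sampling, stopping if a word has no known successor.
rng = random.Random(0)
text = ["the"]
for _ in range(6):
    if not counts[text[-1]]:
        break
    text.append(sample_next(text[-1], rng))
print(" ".join(text))
```

Every word the loop emits is grammatical within the toy corpus, yet the model manifestly “knows” nothing — the open question the papers above wrestle with is whether scaling this loop up changes that in kind or only in degree.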
Military Applications for Artificial Intelligence #
This section contains some reading on the possible applications — and limitations — of artificial intelligence in military settings.
- Laplace’s Demon and the Black Box of Artificial Intelligence. Thom Hawkins explores some of the challenges of relying on artificial intelligence in a military context. See also: You Don’t Need AI, You Need an Algorithm.
- What ChatGPT Can and Can’t Do for Intelligence. Stephen Coulthart, Sam Keller, and Michael Young explore uses for large language models like ChatGPT in intelligence work.
- PoisonGPT: How we Hid a Lobotomized LLM on HuggingFace to Spread Fake News. Researchers surgically modified a large language model and then distributed it in an interesting new supply chain attack vector.
- Trust the AI, But Keep Your Powder Dry: A Framework for Balance and Confidence in Human-Machine Teams. Thomas Gaines and Amanda Mercier discuss the application of principles for building human teams to building trust in human-machine hybrid teams.
Additional Resources #
This section lists a few useful tools and resources not specifically related to the previous sections.
- LLM University. A nice collection of videos and text-based explanations of large language models and the underlying technologies.
- How fast is AI improving?. An interactive website that demonstrates how large language models have increased in capability over the years — and the associated dangers.
- LLM Visualization. Brendan Bycroft created an informative, interactive guide to understanding large language models. The website walks through the entire inference process both visually and by explanation.
- Bullet Papers and Papers.day both provide artificial intelligence-generated summaries of arXiv papers.