
In an admittedly extremely rudimentary way, I’m very interested in artificial intelligence. On my “24 for 24” list, I included the item “Keep learning and experimenting with AI.”
As part of that effort, I listen to Hard Fork, a podcast I love, where the hosts often discuss issues related to AI.
In episode 85, from May 31, 2024, the two hosts discussed recent strides in understanding how large language models work, a subject that has long been quite mysterious.
The hosts were interviewing Josh Batson, a research scientist at the AI company Anthropic, about Claude, the AI assistant that Anthropic created. I often use Claude myself.
Josh Batson explained a method called “dictionary learning,” which means—from what I can tell, and I may get this totally wrong—that certain patterns of activity, which the researchers call “features,” correspond to particular pieces of information or concepts, and that looking at those patterns reveals that the AI organizes related concepts together. It’s not trained to do that; this kind of organization arises on its own. For instance, inner conflict, navigating a romantic breakup, and catch-22 political tensions are grouped together.
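For readers who like to see the idea in concrete form, here is a minimal sketch of dictionary learning in the generic machine-learning sense. This is my own toy illustration using synthetic data and scikit-learn, not Anthropic’s actual method or code: each made-up “activation” vector gets re-expressed as a sparse combination of a small set of learned components, which is the basic move behind treating directions in a model’s activity as interpretable “features.”

```python
# Toy sketch of dictionary learning (illustrative only; synthetic data,
# not Anthropic's method or Claude's real activations).
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Pretend each row is an activation vector recorded from a model.
activations = rng.normal(size=(200, 30))

# Learn a dictionary of 10 components; each activation is re-expressed
# as a sparse combination of those components ("features").
learner = DictionaryLearning(
    n_components=10,
    transform_algorithm="lasso_lars",
    transform_alpha=0.5,
    random_state=0,
)
codes = learner.fit_transform(activations)   # sparse codes, shape (200, 10)
dictionary = learner.components_             # learned components, shape (10, 30)

# Most code entries are (near) zero, so each input activates only a few features.
print("nonzero features per sample:",
      np.count_nonzero(np.abs(codes) > 1e-8, axis=1)[:10])
```

In the interpretability setting, the hope is that each learned component lines up with a human-recognizable concept, which is how related ideas can end up grouped together.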
So in this conversation, host Kevin Roose commented that his favorite feature was the one where you ask Claude, “What’s going on in your head?” In other words, in what context does the AI place itself?
At this point, I paused the podcast and asked myself, “What answer do I think Claude would give?” And I thought, maybe Claude would place itself in the context of the abacus, the encyclopedia, a hand calculator, or an IBM mainframe. What do you think?
Well, I was wrong.
Claude puts itself in the company of ghosts, angels, souls.
I can’t stop thinking about it.
Episode: “Google Eats Rocks + A Win for A.I. Interpretability + Safety Vibe Check”