💡 A Quick Q&A … with AI researcher Melanie Mitchell
'I feel more confident that these language models do have a lot of uses in the real world.'
Quote of the Issue
“At its root, steampunk venerates the artisan, celebrates an abundance of technology, and still damns the factory that destroyed the former’s livelihood to create the latter.” - Cory Doctorow
I have a book out: The Conservative Futurist: How To Create the Sci-Fi World We Were Promised is currently available pretty much everywhere. I’m very excited about it! Let’s gooooo! ⏩🆙↗⤴📈
Q&A
💡 Quick Questions for … AI expert Melanie Mitchell
Melanie Mitchell is the Davis Professor at the Santa Fe Institute, a non-profit research center for complex systems science, where her research focuses on conceptual abstraction and analogy-making in artificial intelligence systems. She is the author of six books, her latest being “Artificial Intelligence: A Guide for Thinking Humans,” released in 2019. (Mitchell also writes a Substack with the same title.) Her latest paper is “Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks,” written with two co-authors.
The following exchange is from a recent Zoom chat and has been edited for clarity.
1/ ChatGPT came out in November of last year. As you've seen the technology be used and evolve, across different use cases and studies, are you more impressed by its potential? Has it turned out to be what you thought it would be a year ago, or have things lagged more than you would've guessed?
ChatGPT came out a year ago, but the upgraded GPT-4 version of it came out more recently, and that version is a lot more capable. The original ChatGPT had a lot of problems with hallucinations and failures — very, very basic failures — and other problems that got improved quite a lot in GPT-4, not totally solved, but improved quite a lot. Given that degree of improvement, I feel more confident that these language models do have a lot of uses in the real world. I use them myself for various tasks, and I am definitely much more impressed with the GPT-4 version.
2/ I recall the Microsoft paper which had the provocative title, “Sparks of Artificial General Intelligence.” Have you detected any sparks, any embers, of artificial general intelligence as you look at these models?
AGI is such a hard thing to define, and I think that there was a paper that came out recently from the Harvard Business School that was trying to assess the capabilities of GPT-4 to help consultants with various tasks that they do. They found that, in some of the tasks, it was very helpful, it really augmented the consultant's ability to do things. And in some of the tasks, it hurt their ability to do things because they trusted it too much and it was not smart enough to do those tasks reliably.
I think that they called this “a jagged frontier.” That's kind of what we're seeing with these systems. They're not AGI in the sense that they can do all these different tasks that humans do at the same level as humans. They have some superhuman abilities alongside some subhuman performance.
I don't think I am seeing sparks of AGI, if we want to define AGI as better-than-human performance on all tasks. But certainly we're seeing very impressive and useful abilities on certain kinds of problems. I guess the challenge is to say, “What are the kinds of problems where we can trust these systems? What are the kinds of problems where we can't trust them?” That's still not well understood, and I think they still lack a lot of important aspects of human intelligence. One important aspect is what we might call metacognition, the ability to sort of reflect on one's own thinking and notice when one's own thinking is flawed. That's something that they lack.