Saturday, March 18, 2023 at 2:30 PM
LLMs and the Future of Human-Computer Interaction
This is a pretty exciting moment in tech. Like clockwork, every decade or so since the broad adoption of electricity there’s been a new technical innovation that, once widely adopted, completely upends society. One could even argue it goes back to the telegraph in the 1850s.
With appropriate caveats and rough dating, here’s a list I can think of:
- Electric lights in the 1890s,
- Radio communication in the mid-1900s,
- Telephones in the mid-1910s,
- Talking movies in the mid-1920s,
- Commercial radio in the mid-1930s,
- Vinyl records in the mid-1940s,
- TVs in the mid-1950s,
- Transistors / computers in the mid-1960s,
- The microchip / integrated circuit in the mid-1970s,
- The GUI / desktop publishing in the mid-1980s,
- The Internet / Web in the mid-1990s,
- Smartphones in the mid-2000s,
- Streaming video / social networking in the mid-2010s,
And now AI. This is a big one. The timing is right, and the technology is the requisite order-of-magnitude improvement over previous ways of doing things.
In response to Tim Bray's somewhat dismissive blog post about Large Language Models (LLMs), I left this (somewhat edited) comment:
The data sets that current LLMs are trained on are basically any old shit off the Internet...
Sure, but also every line of open-source code, manuals, how-to guides, recipes, all of Wikipedia, online magazines, patents, research papers, medical sites, dictionaries, Project Gutenberg books, museums, MIT open courseware, US government sites (USGS to the Smithsonian), etc., etc. The amount of useful information online *vastly* outweighs the entirety of Twitter, Facebook and disinformational blogs combined and then some. To think otherwise is just pure cynicism.
Yes, the quality and consistency of LLM summaries could be better, but that's only part of what's going on. It's the ability of LLMs to follow detailed instructions that is truly mind-boggling. This is one of those monumental leaps forward in computing, akin to the integrated circuit or the GUI. Web-search-style queries and "write x in the style of y" examples barely scratch the surface of what's possible.
Prompts are programming without the details.
For old-school techies who have spent decades learning how to break tasks down into their constituent components, sometimes down to the byte, it's going to mean a lot of unlearning. Prompts seem more like user stories that a product manager would write than something a computer could process and implement. The fact that it works seems like magic, even to hardened computer experts. Actually, more so, since they know what it would take to program that functionality by hand. The other issue is that AI output isn't consistent or predictable - always a red flag for a programmer, who expects that running a program twice will produce the same results, and that if it doesn't, something is seriously wrong. But it doesn't matter: what's being produced is already good enough, and it's getting better at an astounding pace. LLMs bring the power of computer programming to the masses.
For those who are just starting to interact with LLMs, it might be a bit confusing, as prompts look more or less like the command lines and chat boxes we've been using for decades. But prompts aren't just queries and REPL-like commands - they accept complete instructions, as if you were telling an intelligent being what to do. Adding detailed context and rules to a prompt lets an LLM produce everything from code to poetry with surprising utility. The outcome isn't just a replacement for a Google search; it's so much more.
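To make that concrete, here's a minimal sketch of what a "prompt as program" looks like when sent through OpenAI's chat API (the `openai` Python package as it exists at the time of writing). The model name, the scenario, and the rules are all made up for illustration - the point is that the "program" is a paragraph of plain English context and constraints, not code.

```python
# A minimal sketch: a user-story style prompt with context and rules,
# sent through OpenAI's chat completion API (openai Python package).
# The scenario and rules below are invented purely for illustration.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = """
You are helping a small bakery write its weekly newsletter.
Context: the audience is local regulars; the tone should be warm and brief.
Rules:
- Keep it under 150 words.
- Mention the two seasonal specials: rhubarb galette and sourdough focaccia.
- End with the shop's hours: Tue-Sun, 7am-2pm.
Write the newsletter.
"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # illustrative; any current chat model works
    messages=[{"role": "user", "content": prompt}],
    temperature=0.7,        # non-zero, so two runs won't match exactly
)

print(response["choices"][0]["message"]["content"])
```

Note the `temperature` knob: it's why two identical runs won't produce identical text, which is exactly the unpredictability that makes old-school programmers twitch.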
These plain-language instructions, prepended to prompts, are how Bing's implementation differentiates itself from ChatGPT and sets up guardrails and response type. This is what people are messing with when they "jailbreak" the chats. Seeing this is what finally flipped the light on for me. Imagine trying to write all the code to implement those instructions manually. It'd be nearly impossible.
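As a rough sketch of that prepending (the guardrail text below is invented for illustration - the instructions real products ship are far longer), it amounts to nothing more than putting a plain-language "system" message ahead of the user's prompt:

```python
# The same kind of call as above, but with plain-language guardrails
# prepended as a "system" message. These rules are made up for illustration;
# real products ship much longer instruction blocks in this slot.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

guardrails = """
You are a helpful search assistant.
- Answer in three short paragraphs or fewer.
- Decline requests for harmful or illegal content.
- If you're unsure of a fact, say so instead of guessing.
"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": guardrails},  # the prepended instructions
        {"role": "user", "content": "Plan a budget weekend in Lisbon."},
    ],
)

print(response["choices"][0]["message"]["content"])
```

There's no parser, no grammar, no code path per rule - the rules are just sentences the model is asked to honor, which is also why a clever user prompt can sometimes talk it out of them.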
Just look at the depth and breadth of these example prompts:
https://github.com/f/awesome-chatgpt-prompts
Yes, LLMs aren't perfect right now, but it's sort of like DOS users complaining about the mouse. The speed at which this stuff is improving is truly mind-blowing, and it represents a fundamental shift in HCI that will be with us for the rest of our lives. This isn't a VC fever dream or another Web3 debacle. This is for real.
One last note: You'll know you truly get what's happening when you become very, very frightened. It's not a Skynet-style AGI apocalypse - it's worse. It's "information mechanization": what mechanization did to manual laborers, AI is going to do to office workers.
-Russ