Generative AI is making waves across the entertainment industry. But what exactly is it, and how does this pattern-recognition technology work?
When you think of AI, pop-culture takes might come to mind: Agent Smith from “The Matrix” or HAL 9000 from “2001: A Space Odyssey,” for example. However, that’s not what generative artificial intelligence is. Those sci-fi characters would fall under the category of artificial general intelligence (AGI), a technology that hasn’t been developed in the real world (yet).
Chris LeDoux, the co-founder and senior visual effects supervisor of Crafty Apes, describes generative AI as “really advanced pattern recognition, which is incredibly powerful. But it’s not sentient or conscious by any means.”
However, AI is being trained to mimic the way we think. In the past, software systems needed to be explicitly programmed to perform each requested function. Now more advanced deep learning algorithms and larger swaths of data allow AIs to process information in ways loosely modeled on the human brain. The result is software that is both versatile and adaptive in its ability to make predictions, create new content, and heighten user productivity in a wide range of settings.
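To picture the difference between hand-written rules and patterns learned from data, here is a minimal Python sketch of a toy next-word predictor. It is only an illustration: the function names `learn_patterns` and `predict_next` and the tiny sample sentence are made up for this example, and real systems use deep neural networks rather than simple word counts.

```python
from collections import Counter, defaultdict


def learn_patterns(text: str) -> dict:
    """Count which word follows which in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows: dict, word: str):
    """Return the most frequent follower of `word`, or None if unseen."""
    options = follows.get(word.lower())
    if not options:
        return None
    return options.most_common(1)[0][0]


# Hypothetical training text; nothing below is a hand-written rule.
corpus = "the cat sat on the mat and the cat ate the fish"
model = learn_patterns(corpus)
print(predict_next(model, "the"))  # prints "cat"
```

Nobody told the program that “cat” tends to follow “the”; it picked that up from the sample text. Swap in a different text and the predictions change with it.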
While the idea of a mechanical mind goes all the way back to the ancient Greeks, the modern, feasible conception of AI came in the 1950s, when scientists such as Claude Shannon, Norbert Wiener, and Alan Turing established the foundations of the field.
Turing’s name went on to become synonymous with the idea of AI. That’s thanks to his Turing Test, which he proposed in 1950 to determine whether a machine could do more than regurgitate information on command and instead converse so convincingly that a human judge couldn’t tell it apart from a person. (If you’ve ever seen the Benedict Cumberbatch–starring film “The Imitation Game,” you’ll remember this test well.)
Now, developers appear to be closing in. Several AIs, including the chatbot Eugene Goostman and OpenAI’s ChatGPT, have been reported to pass versions of the Turing Test. While some experts have been critical of the test’s legitimacy as a benchmark for assessing artificial intelligence, these results are a sign of real progress in the field.
So while you may not have to worry about a robot apocalypse just yet, generative AI merits all the attention it’s been receiving, and it’s only the beginning. “None of us knows exactly where it’s headed,” says LeDoux. “I’ve never seen anything move this fast before.”
As an example, OpenAI trained each of its GPT language models on an enormous corpus (a body of text data), having a neural network predict the next word in a passage over and over and adjusting it each time it guessed wrong. Another well-known way to train generative models, the generative adversarial network (GAN), uses two neural networks: a generator and a discriminator.
Think of the generator as a student learning a craft and the discriminator as their mentor. The generator produces new content based on the patterns it recognizes, and the discriminator appraises it by comparing that content to real examples from the data set.
As the training process goes on, the generator improves its ability to create content that the discriminator can’t tell apart from the real content in the data set. Whichever method is used, the result is a generative AI that can create new content and dynamically respond to user prompts.
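The student-and-mentor loop described above can be sketched in a few lines of Python. This is a toy illustration, not OpenAI’s training code or anyone’s production setup: the “content” is just numbers, the real data is assumed to cluster around 4.0, and the names `train`, `b`, `w`, and `c` are invented for this example.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Turn any score into a value between 0 and 1 (clamped for safety)."""
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train(steps: int = 200, batch: int = 16, lr: float = 0.05, seed: int = 0):
    """Run a toy generator/discriminator loop on one-dimensional numbers.

    Assumed setup: "real" data are numbers near 4.0, and the generator
    only learns an offset `b` that it adds to random noise.
    """
    rng = random.Random(seed)
    b = 0.0          # generator parameter (the "student")
    w, c = 0.0, 0.0  # discriminator parameters (the "mentor")

    for _ in range(steps):
        real = [4.0 + rng.uniform(-1.0, 1.0) for _ in range(batch)]
        fake = [b + rng.uniform(-1.0, 1.0) for _ in range(batch)]

        # Mentor step: raise the score of real samples, lower it for fakes.
        grad_w = grad_c = 0.0
        for x in real:
            p = sigmoid(w * x + c)
            grad_w += (1.0 - p) * x
            grad_c += 1.0 - p
        for x in fake:
            p = sigmoid(w * x + c)
            grad_w -= p * x
            grad_c -= p
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Student step: shift the offset so fakes score higher with the mentor.
        grad_b = sum((1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        b += lr * grad_b

    return b, w, c


b, w, c = train()
print(f"generator offset after training: {b:.2f}")
```

Each round, the mentor learns to score real numbers higher than the student’s fakes, and the student then nudges its output in whatever direction earns a better score. Loops like this are known to wobble rather than settle neatly, so the final offset should be read as a rough demonstration, not a guaranteed result.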
- GPT-4: This AI is multimodal, meaning it can respond to both text and image prompts, although the latter feature isn’t yet publicly available. You can access GPT-4 with a monthly subscription. (GPT-3.5 powers the free version of OpenAI’s ChatGPT chatbot.) These models have a huge variety of uses, including answering complex questions, translating languages, writing fiction or scripts in various styles, generating code, and building functional websites.
- DeepMind: Google’s AI research lab builds versatile, general-purpose models. So far, its work is mainly being used to identify medical issues and advance scientific research.
- DALL-E 2: When given either text or image prompts, this model can create new images in styles ranging from cartoonish to photorealistic. It isn’t simply spitting out composites of different objects; it models the fundamental features of each item it’s asked to depict.
- Murf.ai: This is one of many text-to-speech generators that have hit the market in recent months. Murf.ai and competitors like Synthesys and Listnr are audio tools that generate realistic voiceovers, with a range of different voice options to choose from.
Developers are releasing new generative AIs all the time, so what’s viable today might be obsolete by next year, or even next month.