Gemini Grinds My Gears
At the end of the 20th century, excitement about stem-cells reached a fever pitch. Stem-cells, "from which all other cells with specialized functions are generated", have the potential to revolutionize disease research, regenerative medicine, and drug research (quote and info from Mayo Clinic). Look at this freaking diagram from U of Nebraska Medical Center on the myriad ways stem-cells could be used:
On February 12, 2004, South Korean researcher Woo Suk Hwang and his colleagues published a paper in Science: an astonishing announcement that they'd managed to clone human embryos and extract stem-cells. You can see how, given the excitement of the decades prior and the possible impacts, this was huge news. In 2005, another landmark paper dropped claiming a further advance in creating lines of stem-cells from patient skin cells.
By 2006, it had all unraveled. Ethical questions dogged early responses to the work. Scientists began to question data practices in the research, and by January 2006, Science had retracted both of these papers for “research misconduct” and “fabricated data”. Hwang was let go from Seoul National University.
The hubbub around the retractions died down. In August 2007 an independent investigation showed that, while the original results had been deceptive, Hwang and the team had contributed a breakthrough in stem-cell research. There had been valuable results after all - obscured by misdirection.
On December 6, 2023, Google published a series of videos and blogs about its latest AI model, Gemini. An announcement video featuring Google CEO Sundar Pichai and Google DeepMind CEO Demis Hassabis gave viewers soaring music, inspiring images, and discussions of lifelong missions in AI. Nerds all over the world had been eagerly waiting for Google to answer OpenAI’s GPT-4 with a worthy competitor.
One of the coolest parts of the announcement was a video featuring a person appearing to film their hands performing various activities while having an intelligent, responsive conversation with Gemini in real time. The video looked more impressive than anything the large language model era had produced so far, leaving many to wonder whether Google had leapfrogged the state of the art.
Within two days, the video was widely denounced as deceptive. What appeared to be Gemini processing live video was in fact Gemini processing still images. Gemini was given “detailed rules for the game and examples of correct and incorrect answers”. The interaction with the model was not by live voice - the audio was a voiced reproduction of chat-style text prompts. Gemini’s actual capabilities (at least at this point) appear to offer an experience similar to ChatGPT Plus. That is nothing to sneeze at! But because the video created controversy among techies, Gemini’s accomplishments were - similar to Hwang’s stem-cell research - obscured by misdirection.
One of the biggest things AI needs right now is clarity. We media consumers must understand that anything we view online may be the work of a human prompting an AI, and that this AI is not tightly tethered to truth and cannot verify its own output. Businesses already often use this AI as a first line of interaction with customers. In the future, governments might use it to reduce the cost of providing services. People need to understand the basics of how it works, so they can advocate for themselves and monitor their media consumption for deception.
(I’ll save the paperclip maximizers and Skynet and FOOM and p(doom) for a different post.)
I’ve given a couple of presentations on generative AI, and my friends and colleagues must be sick of me jumping into the subject whenever we have dinner. But I find that after those discussions, they grasp and use AI better than they did before. Clarity and simplicity make strange things approachable.
The deceptive Gemini video reduces clarity. It makes it seem like the thing is X, when behind the curtain it’s Y. I can understand that the video’s makers did not intend to deceive. And I am not saying that Gemini’s researchers engaged in data falsification. But the impact is similar: confusion and wheel-spinning where there could be clarity and focus. For the next couple of months, every bit of Gemini news I read will be tainted by the nagging question: am I being deceived? PM