The Cliff Clavin Effect: ChatGPT, Bard, And The Limits Of Generative AI

Bard, Google’s generative-AI search bot, flubbed its public debut. The mistake was superficial, but the risks it exposed run deep. Call it the Cliff Clavin effect. How should we cope?

Google’s Bard hits a flat note

Generative-AI search is THE hot technology.

ChatGPT’s global earthquake

OpenAI’s generative-AI search engine, ChatGPT, came to market first. Its debut shook the ground underneath Google, which currently controls about 93% of the online search market. Going forward, who will need search terms or keywords when you can have human-like conversations with a computer that ostensibly knows anything and everything?

The $200 billion online-search market is now in play.

So is the future itself, as people scramble to leverage generative-AI search’s possibilities.

Students celebrate the end of homework: plug any assignment into ChatGPT and get your answers in seconds. ChatGPT can write essays. It can compose images or poems in the style of any artist you name. It can check your computer code for errors and will, over time, write code and design websites of increasing sophistication.

Not everyone is cheering, though. ChatGPT has reportedly already passed the bar exam and medical-licensing exams.

John Henry had to drive steel against a steam drill. Knowledge workers will have to compete against self-learning algorithms.

The Google Empire strikes back

One of ChatGPT’s biggest backers has been Microsoft, which invested $1 billion in OpenAI and will pour in billions more while incorporating ChatGPT into Bing, its proprietary search engine.

Microsoft smells blood. But Google and Chinese competitor Baidu are striking back with their own generative-AI search tools.

However, in a recent debut of Google’s generative-AI search tool, Bard, the program reportedly made a mistake: it misidentified the first technology to photograph a planet outside our solar system.

The error was not immediately apparent to the layman, but it was demonstrable. (Counterspin argues the opposite.)

Suffice it to say that on this issue, investors voted more resoundingly than any viewers of American Idol or Eurovision ever have: the perceived blunder contributed to Google losing nearly $100 billion in market capitalization.

The putative mistake also showed the risks of treating as an oracle a technology that occasionally acts like Cliff Clavin on Cheers: a self-proclaimed know-it-all who in fact spouts nonsense.

All Too Human: The Cliff Clavin Effect

Ironically, the Achilles heel of generative-AI search remains human error.

Garbage in, garbage out

Self-learning algorithms hoover up vast quantities of data in microseconds. But the problem of garbage in, garbage out remains.

Blame human error. Where these algorithms ingest faulty data, incorrect information, and biased opinions, the results will misstate, misdirect, and mislead.

In interacting with a generative-AI chatbot, we can’t tell we’re not dealing with a human being. But in acquiring data, a generative-AI algorithm can’t tell it’s not dealing with a Cliff Clavin.

Bard’s apparent mistake was of small consequence — except to Google’s stock price. But catastrophe can arise from critical human decisions informed by faulty AI output.

Where high-speed AI systems run autonomously, disasters will unfold faster than humans can intervene.

The Deadly Seduction of Technology

The risks posed by overreliance on technology are not new. As a previous column discussed, Teslas may be self-driving, but they are also, on occasion, self-crashing.

What drove several Tesla owners, quite literally, to their graves was the seductiveness of self-driving technology. Tesla abetted this with marketing that labeled the feature “Autopilot.”

Generative AI’s ability to mimic human communication holds a similarly seductive, and potentially deadly, appeal. Interacting with such a chatbot seems like conversing with the smartest, most knowledgeable person who ever lived.

But it isn’t a person. AI, however powerful, remains bounded by its algorithms. It lacks understanding and, ultimately, discernment.

The Discernment Arms Race

The Cliff Clavin effect presents us with a choice: build discernment into generative-AI search, or discipline ourselves to anticipate the technology’s limitations.

More Capable Algorithms

The best and the brightest will opt for the challenge, and the rewards, of building a generative AI 2.0 with discernment, or something approaching it.

Sketching out this 2.0 has already begun. According to Kevin Buehler, co-founder of McKinsey & Company’s global Risk practice and Chairman of McKinsey’s wholly-owned risk data-science advisory firm, Risk Dynamics, “Current generative-AI models focus on statistically plausible statements. They currently do not check for veracity in the way a fact checker might, or try to pick apart factual statements and opinions as happens in an adversarial court proceeding.”

Having analyzed the limitations of ordinary statistical models, Buehler and his colleagues identify six new areas that need to be managed carefully to develop machine learning and artificial intelligence models that are more fit for purpose:

  • Interpretability: Being able to describe a model and its output
  • Bias: Ensuring that the model accurately represents the data without learning patterns that could impermissibly affect different groups
  • Feature Engineering: Assembling individual data elements into features that drive the model (e.g., using debt-to-income ratios as a feature in a credit model; see the sketch after this list)
  • Hyperparameters: Making choices about the design of a model (e.g., the number of hidden layers in a neural network)
  • Production Readiness: Anticipating how generative-AI algorithms must operate within a larger system. For example, can each part of the system handle the other’s data-flow rates? Will additional processing required for discernment throw either pricing or margins out of whack?
  • Dynamic Model Calibration: Adjusting processing on the fly to reflect new information — when should this happen, and who should make that determination?
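
To make the feature-engineering bullet concrete, here is a minimal sketch in Python of deriving a debt-to-income ratio from raw application fields and feeding it to a toy credit model. The column names, the data, and the logistic-regression setup are illustrative assumptions for this column, not anything drawn from Buehler’s framework.

    # Minimal, illustrative sketch (hypothetical data and column names):
    # engineering a debt-to-income feature for a toy credit model.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Raw application data (hypothetical values).
    applications = pd.DataFrame({
        "monthly_debt":   [500, 2200, 900, 3100],
        "monthly_income": [4000, 3500, 6000, 5200],
        "defaulted":      [0, 1, 0, 1],
    })

    # Feature engineering: combine two raw fields into a single,
    # more predictive feature, the debt-to-income ratio.
    applications["dti"] = applications["monthly_debt"] / applications["monthly_income"]

    # Fit a simple credit model on the engineered feature.
    model = LogisticRegression()
    model.fit(applications[["dti"]], applications["defaulted"])

    # Score a new applicant (hypothetical numbers).
    new_applicant = pd.DataFrame({"dti": [1500 / 4800]})
    print(model.predict_proba(new_applicant)[0, 1])  # estimated probability of default

Even in this toy example, choosing which ratio to engineer and which model to fit is a human design decision of exactly the kind the list above says must be managed.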

Language and dialogue models can be immense. LaMDA, upon which Bard is based, trained on 1.56 trillion words, and one version comprised 137 billion model parameters. As a result, bias and interpretability represent particularly prominent concerns. No human can review more than a trillion words to ensure there is no subtle bias. At best, one must rely on other models to do so.

Similarly, when a complex model has billions of parameters, it can be extraordinarily difficult to interpret precisely why the model has produced a specific output.

More Cautious Humans

It seems unwise to bet against the best and the brightest. But at the least, their efforts call for a taxonomy — a set of formal definitions — so people will know what they are saying and hearing.

So, for example, what does “discernment” mean in this context? Is it a search for “truthfulness”? For “validation”? If we rely on the work of “authorities” to validate statements or opinions, how should the algorithms assess and weigh “authoritativeness”? Bear in mind that a 1905 survey of all physicists about the relativity of time would have generated a single “Yes,” from an obscure Swiss patent clerk named Einstein.

In the short term, at least, avoiding potentially calamitous misuse or abuse of generative AI’s potential means humans will have to act more cautiously. This technology is seductive. Rather than deciding at what point algorithms may call something black or white, it will be safer and better to ensure human understanding of what shades of grey we’re dealing with.
