#5: Self-driving LLMs, AI Eats Art, All Kinds of Podcasts
Hi there!
Interesting self-driving, art, and podcastings in this one!
Don’t forget to leave a comment - hearing from fellow nerds is what makes us tick.
-The Boring Enterprise Nerds
Hot Takes
Self-driving Cars: Another GenAI Use Case?
Check out Wayve’s LINGO-1 model, where they use generative AI to explore enhancing the self-driving car experience. They call it a Vision-Language-Action Model (VLAM), saying “Incorporating language along with vision and action may have an enormous impact as a new modality to enhance how we interpret, explain and train our foundation driving models”. (For the insatiably LLM/driving curious, there’s also a paper out on arXiv studying the potential of using LLMs for driving. This paper talks about the “closed-loop” mentioned below, but is a bit less real than seeing the videos in the Wayve post.)
It seems like the right level of ambition. If I’m reading this right, LINGO-1 isn’t directly in control of a self-driving car; it’s providing commentary and question-answering along the way, and the data it generates might be useful for future training of driving models. Self-driving cars seem to be making headway lately, but I think a lot more people would feel at ease if there were a natural way to interact with the vehicle ferrying them to Cold Stone.
The end goal is what they call a “closed-loop architecture”, where the driving policy and the language capabilities directly influence one another. This is one of the reasons I like large language models: they’re the version of AI whose output is the most easily communicable. Of course, they admit the well-known hallucination problem persists in this model, but say there may be better ways to mitigate this than in the super-huge generalist models. Let’s hope so. PM
Artists Gonna Art - AI Gonna…Also Art?
Something can be shocking but not surprising. Imagine watching a scheduled building demolition, done by explosives: the time, place, and result are not in doubt. You know exactly when and how it’ll happen. But when the explosives go off, you can’t help your pulse quickening when the sound hits your body.
This is how I feel seeing the tiny beginnings of artist job loss to AI. You can’t miss the clear, unmistakable progress of image generation tools over the last couple of years. Now concept artists are saying they’re aware of production companies turning to AI for “blue sky” concept work: the mood-suggesting, look-finding stage of film. Artist Reid Southen says in Business Insider, “Companies can use generative AI to throw a bunch of stuff at the wall and churn out thousands of concepts in an afternoon.”
Midjourney, Stable Diffusion, and others are the scheduled demolition; Reid and his colleagues’ work is the building collapsing. I’m a (relatively) senior developer, when I do that kind of work. GitHub Copilot X, ChatGPT with Advanced Data Analysis, and others are starting to draw up the plans to demolish my building. When will it collapse? PM
Cool Links
AI Revolution: Just follow the link. There’s so much juicy goodness (also a couple podcasts condensing some of the material). Watch out for two cool moments: Dario Amodei talking about scaling, and Noam Shazeer’s quick mental math on GPU compute in the pipeline. PM
Sequoia’s Act Two: This is probably worth a longer story, but Sequoia Capital’s take on what’s upcoming in genAI is worth a mention as a cool link. Come for the insight, stay for the highly informative graphics: maps of the space we’re just entering. PM
Enterprise LLM Adoption: Listen to Sanjay Rajagopalan of Vianai wax philosophically practical on the ins and outs of enterprises starting to use LLMs. I’m personally biased here - stay tuned for this Friday’s edition, where yours truly has the privilege of interviewing Dr. Rajagopalan. We discuss design, empathy, and hallucinations. Hat tip Camille Morhardt. PM
Stay tuned this Friday for a video interview with Sanjay Rajagopalan! He’s bringing design, ideas, and truthiness to enterprise LLMs with Vianai.
The latest Boring Enterprise Nerdletter covers Oracle CloudWorld, RAP/CAP, and Nuve Platform.
If this is beneficial, consider dropping a few bucks in our tip jar. Thank you!