Less than two months have passed since the launch of the advanced neural network Gemini, and Google has already announced its successor. The large Gemini 1.5 language model was unveiled today and is immediately available to developers and enterprise users, with distribution to consumers to begin soon. Google has made it clear that it wants to use Gemini as a business tool, personal assistant, and more.
Gemini 1.5 brings many improvements. Gemini 1.5 Pro, which will power many of Google's services, outperforms Gemini 1.0 Pro on 87% of benchmark tests, putting it roughly on par with the high-end Gemini 1.0 Ultra. The new model is built with the increasingly popular “Mixture of Experts” (MoE) approach, in which only part of the overall model runs when a request is sent, rather than the whole thing. This approach should make the model faster for the user and more efficient for Google.
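To make the “only part of the model runs” idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration of the general MoE technique, not Google's implementation: the toy experts, the gate weights, and the names `make_expert` and `moe_forward` are all hypothetical, and Google has not published Gemini's routing details.

```python
import math

def softmax(values):
    """Turn raw scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """Hypothetical toy expert: scales every input feature by a constant."""
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only.

    gate_weights holds one weight vector per expert; an expert's score is
    the dot product of its vector with x. Experts outside the top_k are
    never called, which is where the compute saving comes from.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])

    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_output = experts[index](x)
        output = [o + weight * y for o, y in zip(output, expert_output)]
    return output, chosen

# Assumed toy setup: four experts, two input features.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen, "output:", output)
```

With four experts and `top_k=2`, half of the experts are skipped for this input. Production models use learned gates and neural-network experts, but the routing step follows the same pattern.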
But there's one new thing about Gemini 1.5 that everyone at Google, starting with CEO Sundar Pichai, is especially excited about. The new version of the neural network has a huge context window, which means it can process much larger queries and look at much more information at once. The window size is 1 million tokens, far more than the 128,000 tokens of OpenAI's GPT-4 and the 32,000 of the current Gemini Pro. “This is approximately 10 or 11 hours of video, tens of thousands of lines of code,” Pichai noted. He added that Google researchers are testing a context window of 10 million tokens, which is enough for, say, the entire Game of Thrones series in one request.
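The gap between these windows is easier to see with a little arithmetic. The sketch below uses only the three token limits quoted above; the helper names `models_that_fit` and `window_ratio` are hypothetical, and the prompt size passed in is an assumed figure, since real token counts depend on each model's tokenizer.

```python
# Context window sizes in tokens, as quoted in the article.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 1_000_000,
    "GPT-4": 128_000,
    "Gemini 1.0 Pro": 32_000,
}

def models_that_fit(prompt_tokens):
    """Return the models whose context window can hold the whole prompt."""
    return [name for name, limit in CONTEXT_WINDOWS.items() if prompt_tokens <= limit]

def window_ratio(larger, smaller):
    """How many times bigger one model's window is than another's."""
    return CONTEXT_WINDOWS[larger] / CONTEXT_WINDOWS[smaller]

# Assumed example: a 500,000-token prompt.
print(models_that_fit(500_000))
print(window_ratio("Gemini 1.5 Pro", "Gemini 1.0 Pro"))
print(window_ratio("Gemini 1.5 Pro", "GPT-4"))
```

By these figures, the new window is about 31 times the size of the current Gemini Pro's and roughly 8 times that of GPT-4.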
As an example, Pichai says that the entire Lord of the Rings trilogy could fit into this context window. That seems oddly specific, but perhaps someone at Google will check whether Gemini can find any continuity errors while trying to make sense of Middle-earth's complex lineages. Or the AI might finally be able to make sense of Tom Bombadil.
Pichai also believes that a larger context window will be very useful for business. “This will allow you to have use cases where you can add a lot of personal context and information at the time of the request,” he says. “Consider that we have significantly expanded the request window.” The Google CEO envisions filmmakers uploading an entire film and asking Gemini what reviewers might say, and companies using Gemini to process reams of financial documents. “I consider this one of the biggest breakthroughs we've made,” he says.
For now, Gemini 1.5 will only be available to business users and developers through Google's Vertex AI and AI Studio. It will eventually replace Gemini 1.0, and the standard version of Gemini Pro, the one available to everyone on gemini.google.com and in Google apps, will be replaced by 1.5 Pro with a context window of 128,000 tokens. To get the full million, you will have to pay extra. Google is also testing the safety and ethical boundaries of the model, especially with regard to the new, larger context window.
Google is now in a frantic race to create the best AI tool, while companies around the world are trying to define their own AI strategy and collaborate with OpenAI, Google or whoever. Just recently, OpenAI announced “memory” for ChatGPT and seems to be preparing to enter the web search market. While Gemini looks impressive, especially to those already in the Google ecosystem, the company still has a lot of work to do.
In the end, Pichai says, all these distinctions between 1.0 and 1.5, Pro and Ultra, and the corporate battles around them will not matter to users. “People will simply consume a better user experience,” he says. “It's like using a smartphone without paying attention to the processor under the hood.” But for now, he says, we're still at the stage where everyone knows what chip is inside their phone, because it matters. “The underlying technology is changing so quickly,” says Google's CEO. “People care.”