Google has introduced Gemini 1.5, its next-generation artificial intelligence model, which delivers significantly better performance than version 1.0 and can process far more information.
Gemini 1.5, now available to Google developers and enterprise customers, has a default context window of 128,000 tokens. For comparison, Gemini 1.0 is limited to 32,000 tokens, while competing models GPT-4 Turbo from OpenAI and Claude 2.1 from Anthropic offer 128,000 and 200,000 tokens, respectively.
“When tested on a comprehensive panel of text, code, image, audio, and video evaluations, 1.5 Pro outperforms 1.0 Pro on 87% of the benchmarks used for developing our large language models (LLMs),” Google says.
A select group of developers and enterprise customers can use Gemini 1.5 with a context window of up to 1 million tokens, which is enough to process over 700,000 words, a codebase of more than 30,000 lines of code, 11 hours of audio, or 1 hour of video.
“1.5 Pro can perform highly complex comprehension and reasoning tasks across multiple modalities, including video. For example, given a 44-minute silent Buster Keaton film, the model can accurately analyze various plot points and events, and even reason about small details in the film that could easily be missed,” says Google.
These advances are made possible by the new Mixture-of-Experts (MoE) architecture. Depending on the type of input provided, MoE models learn to selectively activate only the most relevant pathways in their neural network. The company is currently working on optimizing the performance of the updated AI model to “improve latency, reduce computational requirements and improve user experience.”
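The routing idea behind MoE can be illustrated with a toy example. The sketch below is not Google's implementation and uses made-up dimensions and weights; it only shows the core mechanism the paragraph describes: a gating network scores all experts for a given input, and only the top-k most relevant ones actually compute.

```python
# Illustrative Mixture-of-Experts (MoE) routing sketch -- NOT Gemini's actual
# architecture. A gating network picks the top-k experts per input, so only a
# fraction of the network's weights are active for any one token.
import numpy as np

rng = np.random.default_rng(0)

D, H, N_EXPERTS, TOP_K = 8, 16, 4, 2  # toy sizes chosen for illustration

# Each "expert" is a small feed-forward layer with its own weights.
expert_weights = [rng.normal(size=(D, H)) for _ in range(N_EXPERTS)]
gate_weights = rng.normal(size=(D, N_EXPERTS))  # gating (router) network

def moe_forward(x):
    """Route input x (shape [D]) through only the top-k scoring experts."""
    logits = x @ gate_weights                 # score every expert
    top = np.argsort(logits)[-TOP_K:]         # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                      # softmax over the chosen experts
    # Only the selected experts run; the rest stay inactive, which is what
    # keeps large sparse models cheaper than dense ones of the same size.
    return sum(p * np.maximum(x @ expert_weights[i], 0)  # ReLU expert output
               for p, i in zip(probs, top))

out = moe_forward(rng.normal(size=D))
print(out.shape)  # (16,)
```

In production systems the gate is trained jointly with the experts and balanced so no single expert is overloaded; this sketch omits training entirely and only demonstrates the selective-activation pathway the article refers to.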
As a reminder, earlier this month Google released an updated version of its Bard chatbot powered by the improved Gemini Ultra model, and the chatbot itself was renamed from Bard to Gemini. At the same time, a dedicated Google Gemini app for Android was launched and the Google app for iOS was updated.