ISG Software Research Analyst Perspectives

Google Expands AI Capabilities at Cloud Next ’24

Written by David Menninger | May 9, 2024 10:00:00 AM

The artificial intelligence (AI) market is exploding with activity, which is part of the reason we recently dedicated an entire practice at Ventana Research to the topic. Large language models (LLMs) and generative AI (GenAI) have taken the AI world by storm. In fact, we assert that through 2026, one-half of all AI investments will be based on generative rather than predictive AI. My colleague Rob Kugel has written about how AI can improve productivity and benefit the Office of Finance. Our team has also described how AI can help enterprises improve customer experiences, transform human capital management, increase marketing and sales effectiveness, streamline data integration and drive automation for greater efficiency. So, it was no surprise that AI was the big focus at the recent Google Cloud Next ’24 event.

Google is a massive organization, and it made 218 announcements at Google Cloud Next. In this analyst perspective, I’ll focus on those related to AI. Following the company’s introduction of Gemini, a multimodal LLM, in December 2023, Google announced Gemini 1.5. Multimodal LLMs support processing of text, images, audio and video. Gemini is available in three sizes: Ultra, the most capable; Pro, designed to support a wide variety of tasks; and Nano, the most compact and efficient version, which can run on mobile devices. Gemini 1.5 uses a mixture-of-experts architecture built from a collection of specialized sub-networks, only a subset of which is activated for any given input. This approach increases the capacity of the models without a proportional increase in compute. Gemini 1.5 will support a context window of up to 1 million tokens. Larger context windows are useful when processing long inputs or generating long responses. To illustrate the value of 1 million tokens of input, Google offered these equivalents: 1 hour of video, 11 hours of audio, 30,000 lines of code or 700,000 words. At the same time, larger context windows introduce additional complexity in terms of performance and cost. They can also influence how prompts should be designed, since material at the beginning and end of a prompt tends to have greater influence on the results than material in the middle. Gemini 1.5 is currently available to a limited number of developers.
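To make the context-window discussion concrete, here is a minimal sketch of checking a long document against Gemini 1.5’s token limit before submitting it. It assumes the google-generativeai Python SDK; the file name, prompt and model identifier are illustrative, not definitive.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-latest")  # assumed model name

# Hypothetical long document to test against the 1M-token context window.
with open("earnings_call_transcripts.txt") as f:
    document = f.read()

# count_tokens checks prompt size cheaply, before paying for a full request.
token_count = model.count_tokens(document).total_tokens
print(f"Document is ~{token_count:,} tokens")

# Repeat the instruction after the document as well as before it, since
# content in the middle of very long prompts tends to carry less weight.
prompt = (
    "Summarize the key revenue trends.\n\n"
    + document
    + "\n\nAgain: summarize the key revenue trends."
)
response = model.generate_content(prompt)
print(response.text)
```

Counting tokens first is an inexpensive way to catch prompts that exceed the window, and duplicating the instruction is one common workaround for the weaker influence of mid-prompt material.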

At the event, Google also announced a half dozen Gemini-based GenAI assistants, all in public or private preview. Gemini Code Assist (formerly Duet AI for Developers) helps developers create or modify applications by generating blocks of code in more than 20 languages, including Java, JavaScript, Python, C, C++, Go, PHP and SQL. Code Assist provides a natural language interface and can handle tasks such as test generation and code explanation. Gemini Cloud Assist helps with cloud operations, including designing an architecture to support specific objectives such as maximizing performance, optimizing cost savings or ensuring high availability. Cloud Assist can also troubleshoot and diagnose issues to help resolve incidents more easily and quickly. Gemini in Security Operations can be used to investigate security issues using natural language and to generate rules that check for patterns indicating malicious activity. Google also introduced Gemini capabilities in BigQuery, Looker and its databases, which my colleague Matt Aslett will cover in a separate analyst perspective.

In addition to the Gemini enhancements, Google announced new capabilities for Vertex AI, its AI development platform. In the developer keynote, Google demonstrated new prompt-engineering tools. Prompt engineering is one of the techniques used to improve responses from LLMs, but how does one know whether a prompt is well constructed? The prompt evaluation tool, now in preview, performs an assessment and provides feedback. The tools also include prompt management, so once a prompt has been tuned, it can be recalled and reused. In addition, Vertex AI has new preview capabilities to augment Gemini models with Google Search grounding, which anchors responses in current search results and reduces the likelihood of fabricated answers. To deliver these GenAI experiences to line-of-business personnel, Vertex AI now includes preview capabilities for developing custom, conversational and automation agents. These agents can be customized for different functions within the enterprise and can be linked together to execute workflows as part of business processes.
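As a rough illustration of the grounding capability, here is a minimal sketch using the Vertex AI Python SDK’s preview interface. The project, location and model identifier are placeholders, and the exact module paths may shift as these features move out of preview.

```python
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Tool, grounding

# Placeholders; substitute your own Google Cloud project and region.
vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.5-pro-preview-0409")  # assumed model name

# Attach Google Search grounding so responses draw on current web results
# rather than relying solely on the model's training data.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

response = model.generate_content(
    "What did Google announce at Cloud Next '24?",
    tools=[search_tool],
)
print(response.text)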

The company also announced infrastructure enhancements aimed at improving the performance, scalability and security of AI workloads. Google introduced Tensor Processing Units (TPUs) in 2016 to accelerate the processing of neural networks. Since then, there have been multiple revisions, the latest being the TPU v5p. A TPU v5p pod consists of 8,960 chips connected by a high-bandwidth inter-chip interconnect. Google has bundled the TPUs with storage- and networking-optimized hardware to create what it calls an AI Hypercomputer. Google also continues to adopt the latest enhancements from NVIDIA, which will be available in its AI Hypercomputer.
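For developers wondering what programming against this hardware looks like, here is a minimal sketch using JAX, Google’s open-source numerical library commonly used for TPU workloads. The matrix sizes are arbitrary, and on a machine without TPUs the code simply falls back to CPU or GPU.

```python
import jax
import jax.numpy as jnp

# List the accelerators visible to the runtime; on a Cloud TPU VM this
# reports TPU cores, elsewhere it falls back to CPU or GPU devices.
print(jax.devices())

# jit-compile a simple matrix multiply; the XLA compiler lowers it to
# TPU instructions when TPU devices are available.
@jax.jit
def matmul(a, b):
    return jnp.dot(a, b)

key = jax.random.PRNGKey(0)
a = jax.random.normal(key, (1024, 1024))
b = jax.random.normal(key, (1024, 1024))
print(matmul(a, b).shape)  # (1024, 1024)
```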

The AI arms race is on, and Google intends to be a leading contender. These developments show the company is serious. Organizations that are on the Google platform, or considering it, should learn about the latest advances in Gemini and explore the preview features so they are ready to adopt them as they become generally available.

Regards,

David Menninger