The last few days have been extremely busy in the technology sector. If we thought 2023 was the year of artificial intelligence, it was because we didn’t know what 2024 had in store for us. We could guess at some of it, but even assuming the advances would be great, few imagined that, before the first half of this year was over, we would have seen so many important new developments.
While waiting for what Apple will present at WWDC, all eyes were on Google I/O, taking place in San Francisco this week. But OpenAI is not in the habit of ceding the spotlight to its competitors if it can avoid it, and a few hours earlier it announced one of the most impressive advances in artificial intelligence since the original presentation of ChatGPT.
In a presentation lasting less than 30 minutes, the company announced not only the new version of its chatbot, GPT-4o, but a new form of multimodal interaction that goes far beyond anything seen so far. It is impossible not to recall Spike Jonze’s film “Her”. The impact on the sector is plain to see, with many analysts and professionals praising the new product from the company led by Sam Altman. A search engine was expected, and instead the company struck a blow against all its competitors. So much so that Google previewed what it would be showing at the Google I/O opening keynote: a short video recorded at Google headquarters in which you can see they were ready to present a system very similar to what OpenAI had just announced.
Gemini is the future of Google
Late last year, Google announced Gemini, its new brand for all things generative artificial intelligence. It even replaced Bard, the chatbot Google had launched as ChatGPT’s rival. This product is becoming more and more important at Sundar Pichai’s company, even replacing Google Assistant on its phones in certain markets, and it was the central focus of the Google I/O opening keynote.
It is currently integrated in one way or another not only into the new Pixels, including the Pixel 8a, but also into phones from other manufacturers, such as Samsung’s Galaxy S24. This shows how relevant the product is to Google, a company that has been calling itself AI-first for eight years but saw a newcomer, OpenAI, unexpectedly overtake it.
On this basis, Google presented its new features at Google I/O so as not to fall behind its competitors’ announcements. And it did so in a way its competitors can only dream of: by using its users’ data. And that can change everything. For example, the Gemini app can create content or understand what is on the phone’s screen and respond accordingly.
The new era of data
Data has been a key business asset for many years now, but in the era of artificial intelligence it is reaching a new level. Google has an advantage here thanks to the billions of users who already use its services, from the search engine to Gmail, including Google Maps and Google Photos.
This is Search in the Gemini era. #GoogleIO pic.twitter.com/JxldNjbqyn
– Google (@Google) May 14, 2024
Google announced improvements to several of these services, powered by Gemini. It does so multimodally and with a long context window, the two keys to this new generation of AI. For example, this summer Google Photos will gain a new feature that lets you make requests as complex as “show me how my nephew learned to swim”, and it will show a timeline of photographs of my nephew matching the request. Another demonstration involved Gmail, specifically Workspace. We will be able to ask it for a summary of all of one person’s emails, or search through the thousands of emails we have stored. Other examples covered purchases and managing a return, all from Gemini, which is possible because it has access to our data. This feature will roll out next month in select countries.
Project Astra
But the most impressive part of the event was Project Astra, a real-time video recognition system reminiscent of what OpenAI had presented a few hours earlier. To achieve this, Google had to cut latency as much as possible, for which it created Gemini Flash, a version of its artificial intelligence designed for exactly that.
Project Astra is a prototype from @GoogleDeepMind exploring how a universal AI agent can be truly useful in everyday life. Watch our prototype in action in two parts, each captured in a single take, in real time ↓ #GoogleIO pic.twitter.com/uMEjIJpsjO
– Google (@Google) May 14, 2024
Google has not announced a release date for now, but it aims to integrate it into the Gemini app by the end of the year. This could change the way we use voice assistants, including Google’s own, which is looking increasingly outdated, although Gemini still cannot do things like run routines tied to smart home devices.
Imagen 3 and Veo
But Google has not lost sight of the impact of Dall-E and, above all, of Sora, OpenAI’s video generation system which, although not available to users, showed what was coming. At Google I/O we saw the presentation of Veo, Google’s video generation system, which goes a little further than Sora, prioritizing not only quality but also consistency between frames.
Introducing Veo: our most powerful generative video model. 🎥
It can create high-quality 1080p clips that can last more than 60 seconds.
From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO pic.twitter.com/6zEuYRAHpH
– Google DeepMind (@GoogleDeepMind) May 14, 2024
In the examples shown, it stands out for its cinematic look, creating videos in Full HD resolution of a certain length, roughly one minute per shot. It can also apply visual effects, such as filters or smoke particles. But the best part is that these videos can be edited with text commands.
Alongside Veo, they presented Imagen 3, the new image generation engine, whose photorealistic results are reminiscent of Adobe Firefly. They also emphasized text rendering, something many AI image generators struggle with. Both products will be available in the Google Labs section, although there is no date for a wide commercial rollout.
On the way to AGI
All the advances Google demonstrated are impressive, such as an artificial intelligence that manages to predict the structure and interactions of all the molecules of life, and they keep the company in the race toward the ultimate goal: AGI. Artificial General Intelligence is what Google and OpenAI are seeking to develop: a system capable of performing many different tasks without needing to be trained for each one, capable of learning on its own.
It has not been created yet, but at the speed current developments are moving, it is not crazy to think that sooner or later one of these two companies will build such a product. Meanwhile, Google and OpenAI will continue to lead the development of artificial intelligence products, and we will be the ones to benefit.