The Intelligent Worker
Posts
Gemini 2.0 is an actual game-changer

Gemini 2.0 is an actual game-changer

🤖 Why? Video support, browser and screen control, dev companion + many more...

Henry Habib
December 13, 2024

Subscribe | Sponsor

Hi everyone,

We’ve wrapped up a full week of amazing AI news.

We got a bombshell yesterday - so much so that we’ve dedicated all the Deep Dive section to it.

It’s the release of Gemini 2.0 - GOOGLE IS BACK BABY!

Let’s get right into it.

In this issue:

🤿Deep Dive: Gemini 2.0 has been launched
🖼️AI Art: Examples of great and trending AI art
🤿Deep Dive: Google’s smarter AI agents for everyday life
🤝In Partnership: The one thing you’re not doing to stay ahead of the competition.
🤿Deep Dive: AI code agent enhances GitHub workflows
⚒️Tool Snapshots: Tools for AI, no-code, and productivity
📰Top News: News on AI, no-code, and productivity

🤿 DEEP DIVE

Gemini 2.0 Launch by Google DeepMind Highlights New AI Potentials

Intelligence: Google DeepMind has announced the launch of Gemini 2.0 Flash, a groundbreaking AI model specifically designed to enhance user experiences through multimodal capabilities and improved functionalities.

This new model is a successor to the popular 1.5 Flash, offering double the speed and better performance on key benchmarks.
Gemini 2.0 Flash supports various formats, enabling inputs and outputs like images, video, audio, and text, alongside tools like Google Search and coding functionalities.
- PAUSE: you heard that right - video!
The model is available to developers now through the Gemini API within Google AI Studio and Vertex AI, with wider availability expected in January, including additional model sizes.
The introduction of the Multimodal Live API allows for real-time processing of audio and video streams and supports multipurpose tool usage, aiding developers in creating interactive applications.
The experimental model can be accessed via the Gemini app globally, enhancing the capabilities of the AI assistant it powers.

🖼️ AI ART

Examples of great and trending AI art

Spice up your space with Japanese-inspired Street Fighter art. Kudos to vonfanaustin and Midjourney for this epic transformation.

🤿 DEEP DIVE

Google DeepMind's Projects Astra and Mariner Pioneering Multimodal AI Assistants

Intelligence: Google DeepMind has unveiled advancements in its Projects Astra and Mariner, leveraging the capabilities of Gemini 2.0 to enhance the functionality and safety of AI agents in everyday life.

Astra
- Project Astra can now converse in multiple languages, understand mixed languages, and handle accents and uncommon words more effectively.
- Astra can remember up to 10 minutes of in-session interactions and retain information from past conversations, allowing for a more personalized experience while ensuring user control over what is remembered.
- Google aims to integrate these functionalities into various products, including the Gemini app and prototype smart glasses and will expand its testing program to include more users.
Mariner
- Project Mariner operates within a web browser, understanding and reasoning over various screen elements to help users accomplish complex tasks through a Chrome extension. It has achieved a state-of-the-art score of 83.5% on the WebVoyager benchmark, demonstrating significant potential for aiding users with real-world web tasks.
- While the system is in its early stages and may not always provide fast or accurate results, it employs safety checkpoints, requiring user confirmation before executing sensitive actions like purchases.

🤝IN PARTNERSHIP WITH AIRTABLE

Not every project can be a P0. Snap your roadmap into focus with AI-powered prioritization.

Consistently align your people to the most strategic priorities, discover product opportunities from deep customer insights, and gain total visibility on execution with Airtable ProductCentral, the complete operating system for Product teams, built on Airtable's powerful platform.

🤿 DEEP DIVE

Jules & Gaming Agents Innovate Developer and Gaming Experiences

Intelligence: Google DeepMind is launching Jules, an experimental AI-powered code agent designed to assist developers within their GitHub workflows, while also expanding its agent capabilities into gaming and robotics.

Jules integrates into GitHub, enabling developers to address programming issues, plan solutions, and execute code, all while remaining under user supervision. This aims to streamline the coding process and expand AI support in various domains, particularly software development.
Continuing its legacy of using games for AI training, Google DeepMind has developed agents within Gemini 2.0 to assist players by analyzing real-time gameplay and offering suggestions based on actions displayed on the screen.
Partnerships with game developers, such as Supercell, are enhancing the testing and integration of these agents across a variety of game genres, from strategy games like "Clash of Clans" to simulation games like "Hay Day."
These agents can utilize Google Search to provide players with additional gaming knowledge, creating a more enriching experience.

⚒️ TOOL SNAPSHOTS

Futuristic tools within AI, no-code, and productivity

🔐 AgentAuth - Empowers AI agents with secure access to 250+ apps. Free option available.
🤖 Muku.ai - Transform products into sales-driven AI influencer videos. Free to try.
🌐 Remention - Effortlessly grow users and boost revenue through smart mentions. Free to try.
🎨 Seelab.ai - Generate and customize stunning brand-aligned images effortlessly. Free to try.
🖥️ Templated - Effortlessly integrate an editable and customizable design interface. Free to try.

📰 TOP NEWS

News on AI, no-code, automation, and productivity

🍎 Apple Unveils New AI Features in iOS 18.2 and macOS Sequoia

Apple introduced Image Playground for creative visuals, Genmoji for custom emojis, enhanced writing tools, ChatGPT integration, expanded language support, and improved privacy measures.

⏳ AI Growth Slows as Scaling Challenges Shift Focus

The era of massive AI investments faces hurdles as scaling improvements plateau, prompting the exploration of smarter algorithms and specialized tools. Lower costs and open-source models may empower startups to compete.

📊 Microsoft Unveils Phi-4 for Advanced AI Research

Phi-4, Microsoft’s latest generative AI model, excels in math problem-solving and efficient performance, available for limited research use on Azure AI Foundry.

🤖 How to Use AI and When to Avoid It

Discover 15 useful applications for AI, such as brainstorming and summarizing, as well as 5 cases where care is required, such as learning new ideas or demanding high accuracy.

🎥 ChatGPT Adds Real-time Video Analysis and Screen Sharing

OpenAI's ChatGPT now offers Advanced Voice Mode with vision, allowing Plus, Team, and Pro users to analyze objects, explain screen content, and assist with tasks in near real-time.

ℹ️ ABOUT US

The Intelligent Worker helps you to be more productive at work with AI, automation, no-code, and other technologies.

We like real, practical, and tangible use-cases and hate hand-wavy, theoretical, and abstract concepts that don’t drive real-world outcomes.

Our mission is to empower individuals, boost their productivity, and future-proof their careers.

We read all your comments - please provide your feedback!

Did you like today's email?

Your feedback is more valuable to us than coffee on a Monday morning!

What more do you want to see in this newsletter?

Please vote