jtrent238 test blog: This AI assistant wants to do your boring office chores

By Will Knight | 11.09.23

Hello again future gazers,

This week, let's take a look at the automated future of office work. One startup is looking beyond chatbots and testing programs that learn, through demonstration and dialog, how to do chores like sorting job candidates and handling invoices and expenses (hallelujah!).

This AI Assistant Wants to Do Your Boring Office Chores 💻 🤖 📂

Collage showing multiple robotic hands reaching out to cursors around a desktop.

This week, OpenAI announced a service that makes it possible for just about anyone to build a custom version of ChatGPT, no coding skills required. The company suggests that users may want to build a bot that knows the rules of all board games, teaches kids about math, or can offer culinary advice. These GPTs, as OpenAI calls them, can also perform simple actions by connecting with internet services, for example searching through emails or ordering products from an online store.

You can't fault OpenAI for trying to build on the success of its smash hit ChatGPT. But maybe more chatbots is not what we need?

Adept AI, a startup in San Francisco founded by veterans of OpenAI, Google, and DeepMind, is today launching an experimental AI agent that automates common chores in a more sophisticated and potentially powerful way than chatbots like ChatGPT. Instead of being limited to using online services that provide APIs to make them accessible to software, ACT-2 attempts to use a computer more like a human—by making sense of the pixels on a display and then taking action to control a browser and online services.

Adept's demos show how ACT-2 can be used to do things like gathering info from emails and documents to fill out insurance claims, inputting information from emailed invoices into accounts-payable software, and coming up with a walking tour for a city by interacting with Google Maps.

The way ACT-2 attempts to use the same user interfaces that humans do promises to make it a lot more capable and expansive. In theory that approach could allow a chatbot to do literally anything a person might do on their phone or computer. But operating that way is also more challenging for algorithms, and for now makes the agent more error prone.

Under the hood, ACT-2 uses a large language model called Fuyu. It is similar to the models that power many chatbots and, like ChatGPT, it can handle both text and images (making it a "multimodal model"). The model analyzes what it sees on a computer screen and tries to translate the request a user typed into useful actions the bot should take. Adept uses reinforcement learning, a technique used to teach computers tasks including playing board games and video games, to instruct its AI on how to perform different tasks. This involves watching lots of humans perform specific tasks and trying to achieve similar performance for itself.
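For readers who like to see the shape of the idea, here is a minimal sketch of that look-then-act loop. It is a toy, not Adept's code: `FakeBrowser`, `Action`, `toy_policy`, and `run_agent` are all hypothetical names made up for illustration, the "screen" is just a list of text labels rather than pixels, and a few hand-written rules stand in for the multimodal model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # label of the on-screen element to act on
    text: str = ""     # text to enter, for "type" actions


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser; the screen is a list of labels."""
    screen: List[str]
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> List[str]:
        # A real agent would receive pixels here; this toy returns text labels.
        return list(self.screen)

    def apply(self, action: Action) -> None:
        # Record the action and update the screen to reflect its effect.
        self.log.append(f"{action.kind}:{action.target}:{action.text}")
        if action.kind == "type":
            self.screen.append(f"filled:{action.target}")
        elif action.kind == "click" and action.target == "Submit":
            self.screen.append("submitted")


Policy = Callable[[str, List[str]], Action]


def toy_policy(request: str, screen: List[str]) -> Action:
    """Hand-written rules standing in for the model (an assumption, not Fuyu)."""
    if "Amount" in screen and "filled:Amount" not in screen:
        return Action("type", "Amount", "120.00")
    if "Submit" in screen and "submitted" not in screen:
        return Action("click", "Submit")
    return Action("done")


def run_agent(request: str, browser: FakeBrowser, policy: Policy,
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, pick an action, apply it, and repeat until done."""
    taken: List[Action] = []
    for _ in range(max_steps):
        action = policy(request, browser.screenshot())
        taken.append(action)
        if action.kind == "done":
            break
        browser.apply(action)
    return taken


browser = FakeBrowser(screen=["Amount", "Submit"])
actions = run_agent("Enter the invoice total and submit", browser, toy_policy)
print([a.kind for a in actions])
```

The step cap (`max_steps`) is there because an agent that misreads the screen could otherwise loop forever, which is one small illustration of why reliability is the hard part.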

David Luan, founder and CEO of Adept and previously VP of engineering at OpenAI, says that while chatbots have wowed everyone with their capabilities, it has proven challenging to get AI agents to work reliably. But he believes Adept and others are getting a lot closer to solving that.

"This year they just weren't there," Luan says of today's agents, including his own. "I think what's going to happen is next year there's going to be a giant war around agents that actually work." Adept is initially designing its agents to perform only a limited number of simple but common office tasks, and it says they are now at least 95 percent reliable, which is sufficient for them to be commercially deployed at a few companies.

Reaching that level of reliability just for the initial, limited tasks ACT-2 is designed for is a major breakthrough. For years, tools have existed to automate office tasks—what's known as robotic process automation—but these are finicky to build and prone to breaking. If Adept and others can use AI to reliably automate a lot more tasks, it could transform office work and increase productivity.

If Luan is right, then the battle to automate your most tedious chores could make the chatbot wars of 2023 seem relatively tame.

Will Knight, Senior Writer

Need to Know

illustration showing toy soldiers with floating giant blocks with tech-inspired graphics

Big Tech Ditched Trust and Safety. Now Startups Are Selling It Back As a Service

The burgeoning trust and safety industry promises to help tech companies navigate scrutiny and regulation. But these services bring problems of their own.

Left: the security cameras at the laundromat at Laguna Street and Magnolia Street in the Marina District in San Francisco, Calif., on October 22, 2023. Right: Magnolia Street in the Marina District on the same day.

What a Bloody San Francisco Street Brawl Tells Us About the Age of Citizen Surveillance

When a homeless man attacked a former city official, footage of the onslaught became a rallying cry. Then came another video, and another—and the story turned inside out.

Googly eyes taped to a red wall with Youtube play buttons reflected in the eyes

YouTube's Crackdown Spurs Record Uninstalls of Ad Blockers

YouTube expanded a "test" that threatens to cut off users who don't turn off their ad blocker. Developers of the tools are scrambling to respond.

Elon Musk Announces Grok, a 'Rebellious' AI With Few Guardrails

xAI, Elon Musk's new company, claims to have built a powerful language model with cutting-edge performance in just two months.

For all our future-gazing tech coverage, visit WIRED Business.

So, This Happened

Anthropic, a startup competing with OpenAI, will use Google's chips to train its AI models, as the companies expand their partnership. (Bloomberg)

Meta plans to require advertisers to disclose when they run ads featuring content that has been manipulated by AI. (The Wall Street Journal)

GM's Cruise is issuing an emergency software update to 950 cars following an accident involving a pedestrian in San Francisco last month. The company also admits that its vehicles require human assistance every four to five miles. (Reuters & CNBC)

The latest version of Runway's video-generating AI software suggests that the technology is advancing at a rapid pace. (X)

Not entirely unrelated: Scarlett Johansson asks an AI app to stop using her likeness in ads without her permission. (NBC News)

Until Next Time

That's it for now. Drop me a line if you've experimented with using AI to automate your workload. And if I send out five newsletters next week, you'll know that I'm using Adept's tool to accelerate all of my daily chores.