
AI agents were a prominent feature at Microsoft’s Build conference in May, and they remain a hot topic in the tech world. Agents such as Microsoft’s Copilot Actions are designed not just to respond to user prompts but to act on the user’s behalf. Copilot Actions integrates with various third-party web services, letting users perform tasks such as booking show tickets or making restaurant reservations. It is free to use, but it is not yet available in the EU because of privacy regulations. Google’s Project Mariner offers a similar promise.
Having tested many of Copilot’s AI features, I found Copilot Actions to be an intriguing yet underdeveloped technology. It is still faster to interact with websites and services directly. However, exploring Copilot Actions in its current form might be worthwhile, as it could become a standard way of interacting with the web in the future.
How to Use Copilot Actions
To access Copilot Actions, users must sign in to their Microsoft account and navigate to the web version of Copilot using an up-to-date browser. The feature is available everywhere except the EU. Free accounts have a limited number of interactions, while a Copilot Pro subscription offers more. Microsoft does not specify the exact limits, but my free account was cut off after four sessions.
Once the Action option is selected, Copilot suggests actions below the text entry box. The feature is not limited to the suggested sites; it should work with any public website that Microsoft deems safe. Voice commands, however, are not supported in Action mode, even though a microphone icon is present.
Experiencing Copilot Actions
To initiate an action, users simply prompt Copilot with a specific task on a specified site. For instance, I asked Copilot to make a dinner reservation for two at a Japanese restaurant using OpenTable. Copilot opens a web browser window, which runs on a cloud-based virtual machine, alongside the original Copilot window. It then drives the page with clicks and text entries based on the user’s prompt.
One peculiar aspect was that Copilot could not determine my location in Action mode and mistakenly assumed I was in Chicago. This could be due to the virtual machine’s location. In contrast, the standard Copilot mode accurately identified my location and suggested local restaurants, although it couldn’t make reservations.
Testing Beyond Reservations
In another test, I asked Copilot to find and purchase a book on Barnes & Noble’s website. Copilot asked for my book preferences before suggesting a title, a sign that its capabilities are evolving. Even so, the need for user intervention remains a barrier to its full potential.
Challenges and Limitations
Copilot Actions faces several challenges, chief among them the need for user intervention when websites impose checks and permissions, such as CAPTCHAs and sign-ins. Microsoft labels the feature as “experimental” and “early stage.” The process is also slower than manual interaction, a drawback that would be acceptable only if the process were entirely autonomous.
Despite these issues, the AI process does not significantly tax local system resources, as most actions occur in the cloud. This contrasts with AI browsers such as Perplexity’s Comet, which rely heavily on local machine resources.
Privacy Considerations
Privacy concerns are minimal at this stage, as Copilot Actions operates in the cloud and cannot access local machine data. However, it does capture screenshots of websites to analyze and determine interaction points. Future developments that allow Copilot to log into accounts or handle credit card details could raise privacy issues, but these capabilities would also enhance its utility.
The Road Ahead for Copilot Actions
Currently, Copilot Actions is an intriguing demonstration of AI technology, but it cannot yet proactively suggest tasks or complete them autonomously. The feature’s future usefulness hinges on its ability to access personal details and navigate site barriers more adeptly. Additionally, third-party sites must improve their handling of AI-based interactions.
While it remains uncertain when or if these challenges will be overcome, the potential for Copilot Actions to revolutionize web interaction is significant. As Microsoft continues to develop this feature, it will be interesting to see how it balances privacy concerns with the need for more comprehensive capabilities.