Manus, a Fully Autonomous AI Agent Unveiled: China's New AI Breakthrough or Just Hype?

On March 6, many people stayed up all night hunting for invitation codes to what was touted as the world's first AI agent, hoping to be among the first to try it. That day, the Tencent-backed Chinese startup Butterfly Effect introduced Manus, positioning it as a general-purpose AI agent capable of making decisions and performing complex tasks independently, without constant user control.

According to the demo video on the Manus website, the agent can quickly create travel itineraries, evaluate stocks, develop educational content, compare insurance policies, provide real estate advice at the user's request, and much more. Some of the first users who managed to get access called Manus the "killer" of OpenAI's Operator, while others noted its low performance. Neither view is easy to verify. Manus is still in beta testing and restricted to users with access codes, so the wider audience has no way to check how powerful and useful the agent is or what tasks it can handle in practice.

So, is the hype around the new Chinese development justified? We’ll try to sort it out below.

Who is the founder of Butterfly Effect?

Xiao Hong, 33, graduated with a bachelor's degree in software engineering from Huazhong University of Science and Technology (HUST) in Wuhan. He founded Butterfly Effect two months before ChatGPT launched in 2022. The company's first product was an AI assistant called Monica. But Xiao is not the only one attracting attention. The company's research lead, Ji Yichao, was the youngest person to appear on the Forbes China "30 Under 30" list in 2012 and 2013. While still in school, he created the Mammoth mobile browser, and in 2012, he founded Peak Labs, receiving investments from ZhenFund and Sequoia China (now HongShan).

What is Manus and How Does It Work?

The name "Manus" comes from the Latin phrase Mens et Manus, which means mind and hand. This metaphor emphasizes the agent's ability to help users by performing routine tasks for them. Unlike generative AI models like ChatGPT and DeepSeek, which simply react to prompts, Manus is designed to operate on its own, making decisions, completing tasks, and producing results with minimal human intervention. This development signals a paradigm shift in AI evolution, moving away from reactive models to fully autonomous agents.

The platform requires no API setup or other complex configuration. It's designed to understand a prompt, access the Internet, analyze the information on the screen, and execute tasks autonomously. At its core, Manus operates through a structured agent loop that mimics human decision-making processes. When given a task, it first analyzes the request to determine goals and constraints. It then selects tools from its toolbox, such as web scrapers, data processors, or code interpreters, and executes commands in a secure Linux sandbox environment. This sandbox allows Manus to install software, manage files, and interact with web applications while preventing unauthorized access to external systems. After each action, the AI evaluates the results and refines its approach, iterating until the task meets predefined success criteria.
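The loop described above can be sketched in a few lines of Python. This is only an illustration of the analyze, select, execute, and evaluate cycle, not Manus's actual code: the tool names, the `choose_tool` heuristic (a stand-in for the LLM planning step), and the success check are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Tracks the goal, what has been tried so far, and whether we are done."""
    goal: str
    history: List[str] = field(default_factory=list)
    done: bool = False


def word_count(text: str) -> str:
    return str(len(text.split()))


def uppercase(text: str) -> str:
    return text.upper()


# Hypothetical toolbox; a real agent would expose browsers, interpreters, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": word_count,
    "uppercase": uppercase,
}


def choose_tool(goal: str) -> str:
    # Stand-in for the LLM planning step: pick a tool from the goal text.
    return "word_count" if "count" in goal.lower() else "uppercase"


def run_agent(goal: str, payload: str, max_steps: int = 5) -> AgentState:
    state = AgentState(goal=goal)
    for _ in range(max_steps):  # hard cap so the loop always terminates
        tool_name = choose_tool(state.goal)               # analyze and select
        result = TOOLS[tool_name](payload)                # execute
        state.history.append(f"{tool_name} -> {result}")  # record the step
        if result:                                        # evaluate (toy criterion)
            state.done = True
            break
    return state
```

Calling `run_agent("Count the words", "one two three")` records a single `word_count` step and stops. The `max_steps` cap is a design choice worth noting: without it, an agent whose success check never passes would repeat itself indefinitely.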

A defining feature of Manus is its multi-agent architecture. This architecture primarily relies on a central executor agent responsible for managing various specialized sub-agents. These sub-agents can handle specific tasks such as web browsing, data analysis, or even coding, allowing Manus to work on multi-step tasks without the need for additional human intervention. Additionally, Manus runs in a cloud-based, asynchronous environment. Users can assign tasks to Manus and then disconnect, knowing that the agent will continue to run in the background and send results when they are complete.
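To make the executor and sub-agent idea concrete, here is a minimal sketch using Python's `asyncio`. The three sub-agents, their names, and the plan format are hypothetical placeholders; the real system's interfaces are not public.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


async def browse_agent(task: str) -> str:
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"browsed: {task}"


async def analysis_agent(task: str) -> str:
    await asyncio.sleep(0)  # placeholder for real data processing
    return f"analyzed: {task}"


async def coding_agent(task: str) -> str:
    await asyncio.sleep(0)  # placeholder for real code generation
    return f"coded: {task}"


# Hypothetical registry mapping a task kind to the sub-agent that handles it.
SUB_AGENTS: Dict[str, Callable[[str], Awaitable[str]]] = {
    "browse": browse_agent,
    "analyze": analysis_agent,
    "code": coding_agent,
}


async def executor(plan: List[Tuple[str, str]]) -> List[str]:
    """Central executor: dispatch each (kind, task) pair to its sub-agent."""
    jobs = [SUB_AGENTS[kind](task) for kind, task in plan]
    # gather runs the jobs concurrently and returns results in plan order.
    return await asyncio.gather(*jobs)


def run_plan(plan: List[Tuple[str, str]]) -> List[str]:
    return asyncio.run(executor(plan))
```

For example, `run_plan([("browse", "flights"), ("analyze", "prices")])` returns one result per sub-task, in the order the plan listed them. In a cloud deployment the same pattern would run on a server, which is what lets the user disconnect while work continues.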

Manus was not built from the ground up. Unlike DeepSeek, Butterfly Effect has not developed its own foundation models. Instead, Manus functions as an additional layer on top of existing large language models. Ji Yichao posted on X: "We use Claude & Qwen-finetunes. Started with Claude 3.5 Sonnet v1, now testing 3.7—promising!" The company employs its own fine-tuned models and proprietary techniques, but its main services appear to rely primarily on those two model families. The LLMs are complemented by deterministic scripts for data processing and system operations. For example, while an LLM might write Python code to analyze a data set, Manus's backend executes the code in a controlled environment, validates the output, and adjusts parameters if errors occur. This hybrid model balances the creativity of generative AI with the reliability of programmed workflows, allowing it to perform complex tasks such as deploying web applications or automating cross-platform interactions.
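The generate, execute, validate, and retry pattern described here might look roughly like the sketch below. It is an assumption-laden illustration: `fake_llm` stands in for a real model call, the validation rule (exit code zero and non-empty output) is invented for the example, and a plain subprocess is not a real sandbox.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_snippet(code: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    # Run the generated code in a separate Python process.
    # Note: this isolates crashes, but it is NOT a security sandbox.
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def execute_with_retry(
    generate: Callable[[Optional[str]], str], max_attempts: int = 3
) -> str:
    """Ask the generator for code, run it, and feed errors back on failure."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(error)              # creative step (LLM stand-in)
        proc = run_snippet(code)            # deterministic execution
        output = proc.stdout.strip()
        if proc.returncode == 0 and output:  # validation (toy criterion)
            return output
        error = proc.stderr or "no output produced"
    raise RuntimeError(f"gave up after {max_attempts} attempts: {error}")


def fake_llm(previous_error: Optional[str]) -> str:
    # Hypothetical model: the first draft forgets to define `values`;
    # once it sees an error message, it returns a corrected script.
    if previous_error is None:
        return "print(sum(values) / len(values))"
    return "values = [2, 4, 6]\nprint(sum(values) / len(values))"
```

Here `execute_with_retry(fake_llm)` fails on the first draft with a `NameError`, passes the error text back to the generator, and succeeds on the second attempt. Keeping execution and validation outside the model is what gives the workflow its reliability.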

Manus Benchmarks

GAIA (General AI Assistants) is a benchmark designed to evaluate the capabilities of general AI assistants on real-world problems. It was proposed in 2023 by researchers from Meta AI (FAIR), Hugging Face, and AutoGPT to evaluate the performance of agent-based AI systems. It is divided into three levels, Lv.1, Lv.2, and Lv.3, with increasing difficulty. The benchmark assesses an AI system's proficiency in web search, tool use, coding, file processing, and multimodal reasoning on tasks that require external tools and logical reasoning. Butterfly Effect claims it achieved state-of-the-art performance on the GAIA benchmark across all difficulty levels, outperforming leading AI models, including OpenAI's GPT-4.

However, the South China Morning Post (SCMP) has raised questions about these claims, citing limited access to Manus as a potential reason for skepticism. In March, reporters from the Chinese publication National Business Daily visited Butterfly Effect's office in Wuhan but found it closed for development, with a notice requesting not to be disturbed. This lack of transparency has contributed to the cautious stance taken by some media outlets regarding Manus's purported achievements.

What Are Manus's Use Cases?

Manus is pitched as a highly versatile tool, designed to solve a wide range of complex and dynamic tasks. Whether you need deep market analysis, processing of large volumes of documents, or professional data interpretation, Manus is expected to handle these tasks on its own and deliver finished results rather than just offering suggestions or answers. In business and marketing, it can help with recruiting, interview optimization, market analysis, SEO, and supply chain management. As a personal assistant, it generates documents, plans travel, and creates meditation audio. For data analysis, it provides financial insights, consumer analytics, and social sentiment tracking. In content creation and education, it transcribes audio, organizes learning resources, and develops educational materials. Finally, for research, it supports industry analysis, policy studies, and market intelligence.

To demonstrate Manus's practical capabilities, the developers presented numerous use cases, with videos showing how Manus copes with each task. In one of them, Manus was asked to create a personalized travel itinerary for Japan. It took into account not only the user's preferences but also external factors such as weather conditions, local crime statistics, and rental trends. According to the developers, it went beyond simple data mining and reflected a deeper understanding of the user's unspoken needs, illustrating Manus's ability to perform independent, context-sensitive tasks.

When asked to handle a hiring process, Manus did more than simply sort a given set of resumes by keywords or qualifications. It analyzed each resume, matched skills with job market trends, and ultimately provided the user with a detailed hiring report and a recommended decision. Manus completed this task without additional human intervention or supervision. This use case demonstrates its ability to autonomously handle a complex workflow.

First Users' Feedback

The lucky holders of invitation codes who have already tried Manus gave mixed feedback.

Andrew Wilkinson, co-founder of technology holding company Tiny, wrote in a post on X, "It's absolutely insane. I feel like I just time travelled six months into the future. I threw it a zip file of 20 applicants for a CEO job and it did a deep dive on each, one by one, browsing the web and taking notes, then ranked them based on the requirements. Now I have it building me a web app to replace a piece of software we pay $6,000 per year for."

Victor Mustar, the Head of Product at Hugging Face, tweeted, "Got access and it's true... Manus is the most impressive AI tool I've ever tried. The agentic capabilities are mind-blowing, redefining what's possible. The UX is what so many others promised... but this time it just works." The prompt he shared was: "code a threejs game where you control a plane".

And Deedy Das, an investor at Menlo Ventures, wrote, "Manus, the new AI product that everyone's talking about, is worth the hype. This is the AI agent we were promised. Deep Research+Operator+Computer Use+Lovable+memory. Asked it to "Do a professional analysis of Tesla stock" and it did ~2wks of professional-level work in ~1hr!"

Other users, on the contrary, noted the agent's low performance. Derya Unutmaz, a professor and biomedical scientist researching aging and cancer immunotherapy, tweeted: "Deep Research finished in under 15 minutes. Unfortunately, Manus AI failed after 50 minutes at step 18/20! It was performing quite well - I was watching Manus' output & it seemed excellent. However, running the same prompt a second time is a bit frustrating as it takes too long!" In another post, he mentioned that Manus AI was missing the Deep Research-style citations and references, which he considered very important.

Alexander Doria, the co-founder of AI startup Pleias, pointed out, "After experimenting with it, I do like the UI. But it's fundamentally a workflow like Devin, not an actual agent (at least nothing really beyond the built-in agentic capacities of Claude)."

Other users complained that Manus makes factual errors and omits obvious information that can be found online. Some early adopters also seemed disappointed that Manus relies on Claude rather than on a proprietary foundation model. They speculated that, since it uses third-party models, an open-source alternative would appear soon, and they were right. For those unwilling to wait for a Manus invitation, a four-person team from the MetaGPT community created OpenManus, an open-source project replicating Manus's core functionality, in just 3 hours.

With access to the product still closed, some users are allegedly paying large sums to use the tool. Specifically, a secondary market for invitation codes has appeared on the Chinese trading platform Xianyu. At its peak, the price of a beta account reportedly reached 10 million yuan (about $1.3 million). A deal for such a large sum was probably arranged to create FOMO. So, while OpenAI charges $200 a month for access to advanced AI features like Operator, Manus seems to be free, but many users, even those willing to pay, cannot access it now. After its full release, we can expect a monthly fee to be charged.

Because of this buzz, some users claim that the platform is predominantly a marketing stunt that generated hype by creating an effect of exclusivity and getting many Chinese AI influencers to praise it. For instance, an X post from @grok on March 18, 2025, states: "Manus, an AI hyped as a breakthrough, has been linked to scams before. Its rapid rise and server issues sparked doubts—some even accused the team of scarcity marketing on Xianyu."

Challenges and Considerations

The published use cases show Manus at its best, but independent verification of its capabilities is limited. This lack of transparency makes it difficult to establish trust, especially when companies consider delegating sensitive tasks to autonomous systems. Early adopters have also reported the system entering "loops," where it repeatedly performs inefficient actions and requires human intervention to reset the task. These failures highlight the difficulty of developing AI that can consistently navigate unstructured environments.

Additionally, while Manus operates in isolated sandboxes for security purposes, its web automation capabilities raise concerns about potential misuse, such as stealing sensitive data or manipulating online platforms.

Conclusions

Manus represents a significant step forward in the evolution of autonomous agents. The future of AI is no longer limited to passive assistants: it's about creating systems that think, act, and learn on their own, performing tasks across a wide range of industries independently and without human supervision. To be objective, Manus is still in the early stages of development, and there are too many unknowns to make any definite assessment of its capabilities. More information about Manus's data sources and algorithms, along with full public access, would help determine whether it lives up to its promise. Whatever the outcome for Manus, it is clear that such AI systems are becoming more sophisticated and have the potential to redefine industries and change labor markets. Moreover, the rapid rise of projects like Manus and DeepSeek has also sparked debate about whether China could overtake the US in the AI field.

Charlie Lambropoulos

03/21/2025

Artificial Intelligence