According to CBI’s latest buyer interviews, AI agent reliability varies dramatically across providers. Many customers report a gap between marketing and reality.
"Whatever was promised didn't work as great as said," one LangChain user told us about the company’s APIs. "We encountered cases where we were getting partially processed information, and the data we were trying to scrape was not exactly clean or was hallucinating."
For many customers, reliability is largely a function of how complex the data and use cases are. For instance, the LangChain customer saw ~80% accuracy for simpler tasks, but “for complex tasks, the accuracy dropped to around 50%.”
Organizations are tackling the reliability issue in two ways: human oversight and more extensive model training.
An Ema customer, for instance, first has a subject-matter expert review outputs, and once “more than 90% of the responses that we have tested are now accurate, we let it fly.”
A CrewAI customer, meanwhile, still needs to intervene with its own ML algorithms when CrewAI is unable to handle outliers or unconventional data structures. If CrewAI is able to tackle these cases in the future, “that would be a huge leap forward.”
Source: CB Insights — AI agent market map featuring CrewAI and LangChain in the infrastructure category
2. Integration headaches
Integration limitations rank as another top customer pain point.
An Artisan AI customer echoes this: “It was a bit of a gamble that we were signing up for a product where they didn't have quite all the integrations that we wanted.”
Customers see real value from these tools when they support seamless data flow, especially across customers' existing tech stacks. One buyer went with Decagon because of its integrations.
"There's so many short-term moats, but in the long term there is no moat," one customer observed. "Whatever you build will be rapidly reproduced."
In a crowded market, specialization will determine success.
Hebbia, for instance, has tailored its solution to financial players. An exec at a PE firm framed this as a selling point when getting internal buy-in: “When I bring tools to the deal team that live and breathe diligence and deal execution, ensuring that it's aligned to what they know and understand and [that it] speaks their language is incredibly important.”
While many horizontal AI agents are actively deploying or even scaling their solutions, vertical AI agents remain nascent, with half still in the first 2 levels of Commercial Maturity.
They’ll gain more momentum this year as enterprises prioritize solutions that are highly tailored to the needs of individual industries.
Let’s make a Deel: HR and payroll platform Rippling is suing its rival Deel for allegedly planting a spy to access Rippling’s internal sales pipeline data and customer interactions. Rippling CEO Parker Conrad confirmed they set up a “honeypot” to prove that Deel’s senior leadership was orchestrating the illegal activity — and the double agent fell for it. When served with a court order, the spy locked himself in a bathroom and allegedly tried to flush his phone down the toilet. Rippling is seeking damages.
Source: X
Huawei in hot water: Belgian police arrested multiple individuals in a corruption probe involving forgery and falsified documents linked to Chinese tech giant Huawei. Authorities believe lobbyists paid off European Parliament members with cash, expensive gifts, and luxury trips to promote Huawei’s business interests in the region. Huawei denies wrongdoing but insists it is taking the allegations “seriously.”
Dirty laundry: Indian authorities arrested Lithuanian national Aleksej Besciokov, co-founder of Russian crypto exchange Garantex, at the request of the US. Besciokov is accused of facilitating money laundering linked to North Korean hackers and other cybercriminals. The arrest follows the US government’s seizure of Garantex’s website and $26M in frozen assets. Garantex, now under fire, claims it has plans to compensate customers for blocked funds.
Lilac claps back: Former employees of lithium tech startup Lilac Solutions sued the company, claiming exposure to toxic chemicals left them with severe health issues. The Breakthrough Energy Ventures-backed firm fired back with its own lawsuit, accusing the workers of leaking trade secrets. OSHA has already hit Lilac with multiple citations, and a legal battle is now brewing over safety, whistleblower retaliation, and IP theft. Oh my.
Apple vs. UK: Apple is fighting a UK order demanding that it build a “back door” into its security systems. Privacy activists are suing the government, calling the demand a major privacy violation. Apple already pulled its iCloud Advanced Data Protection from the UK after getting a secret government order. Now, US lawmakers are jumping in, pushing the UK to be more transparent about its legal process.
Scalpel scandal: AI imaging firm ChemImage is suing Johnson & Johnson, claiming the healthcare giant stole its tech after a multibillion-dollar partnership went south. J&J argues the contract was scrapped because ChemImage failed to meet key milestones, while ChemImage says J&J bailed to cut losses on its struggling surgical robotics venture. The Manhattan federal court trial will decide whether ChemImage gets its patents and IP back, as well as $180M in contract termination penalties.