(Mark Greene at UiPath Forward 2024)
Despite my piece lambasting AI agent (over)hype, I am not anti-agent, even if “agentic” is low on my list of favorite buzzwords. I’m just pushing for more precision on the pros and cons – and use case design. Most enterprise events fell short of that this fall.
But as I wrote in UiPath Forward – where does agentic AI go from here, and is RPA still relevant?, UiPath shared a different narrative – one where old school RPA executes transactions with certainty where needed, alongside newer agent technology, perhaps orchestrating – but without forcing a square peg into a round hole.
No question about this part: LLM technology is vastly superior to RPA bots at handling unstructured data. AI agents are not a new thing, but generative AI agents do open up new possibilities – not just in autonomy and orchestration, but in their ability to parse a wider range of data and advance a workflow, perhaps making something resembling a human decision along the way.
RPA bots are deterministic, but AI agents are not
As UiPath CEO Daniel Dines pointed out during his UiPath Forward keynote, these generative AI agents are probabilistic technologies. RPA bots are deterministic, but AI agents are not:
It’s non-deterministic – you cannot predict the answer of gen AI. It’s simply impossible to predict the answer. So, while we all understand that it’s extremely powerful, its own nature makes it extremely difficult to use in the context of an enterprise workflow, because enterprise workflows need to be reliable and deterministic. Our job right now is to make it exactly that – reliable, deterministic, and capable of use in an enterprise workflow – and we are going to spend the next few years to make that happen.
Just how deterministic and accurate agents will become – and when – is a subject of fierce debate, and for good reason. No, that doesn’t prevent agents from being useful, but doesn’t it impact use case selection and design – and where adult (human) supervision fits in? To hash this out, Alyx MacQueen and I sat down in Las Vegas with Mark Greene, SVP & General Manager of Product Management at UiPath.
What is controlled agency?
Some might object: it’s obviously in UiPath’s self-interest to ensure that RPA-like, rules-based automations aren’t relegated to legacy status. But consider this from Greene:
I’ve spent time in probably 50 individual customer meetings during this week already; what they’re excited about in terms of our platform is that we already have autonomy in our platform. So robots operate autonomously, and we built the best governance structure for how you can take autonomous actions.
Our customers use robots to update their most critical systems of record. They have built these automations over and over again – millions of them. Every customer I’ve talked to is excited about the fact that they can use those robots in conjunction with agents.
This led UiPath to “controlled agency.” Greene explains:
So the last mile of ‘I’m going to change a system of record,’ or ‘There’s going to be money changing hands,’ we’re calling it controlled agency, where I can bound the agent with humans, but with robots as well. Because the agent can have structured inputs and structured outputs, you can pass its output along to automations that are already proven to operate autonomously. They have a proven level of accuracy in how they engage your systems.
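To make the pattern concrete, here is a minimal sketch of how “controlled agency” might look in code. Everything here is hypothetical – the names and schema are mine, not UiPath’s API – but it shows the core idea: the agent’s structured output is validated against a schema and an allow-list before a proven, deterministic automation touches the system of record.

```python
from dataclasses import dataclass

# Hypothetical structured output the agent must produce. The schema itself is
# the "bound": free-form LLM text never reaches the system of record.
@dataclass
class RecordUpdate:
    record_id: str
    field: str
    new_value: str

ALLOWED_FIELDS = {"address", "phone", "email"}  # fields the agent may touch

def run_proven_automation(update: RecordUpdate) -> None:
    # Stand-in for a deterministic, already-tested robot that does the write.
    print(f"Robot updating {update.record_id}: {update.field} -> {update.new_value}")

def escalate_to_human(update: RecordUpdate) -> None:
    # Human-in-the-loop path for anything outside the agent's bounds.
    print(f"Escalating for human review: {update}")

def controlled_agency(agent_output: dict) -> None:
    """Validate the agent's structured output, then hand off to a robot."""
    update = RecordUpdate(**agent_output)  # rejects unknown or missing keys
    if update.field not in ALLOWED_FIELDS:
        escalate_to_human(update)
        return
    run_proven_automation(update)

controlled_agency({"record_id": "PAT-1042", "field": "address", "new_value": "12 Elm St"})
```

The design point is that the LLM never gets a free hand on the last mile; only validated, structured output flows into the automation that has already earned trust.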
When compliance is at stake, customers can accept no less. Greene:
I don’t have to give that agent the ability to go change, you know, the patient record in Epic for example. When I was talking to a major healthcare company, they already trust the robot to do that – and the automation to do that. The agent can use that automation as another tool that we provide, to be able to reliably execute those actions. Or we can just put the agent in a workflow, with robots surrounding it on either side.
Greene challenged my example of Uber as an agent. Yes, it’s an AI agent, but a gen AI agent, he argues, is a different beast: it can orchestrate automations in a less confined (yet still controlled) way. This is what he calls dynamic planning.
Now you can have this digital workflow worker that works alongside those robots. That’s a bit different. Uber is obviously using AI models to make very intelligent decisions on how to route you to the right driver and all the rest of it. But I bet there’s a workflow to what Uber does that’s defined. A [gen AI] agent creates the plan on the fly. The LLM model essentially creates the plan, guided and bounded by the context and the tools you give the agent. So I think there’s something there that’s next level.
Greene refined our AI accuracy discussion: more specialized agents are going to work better. Agentic orchestration of specialized agents holds promise:
Where I really agree with your article is if you say, ‘An agent is going to do the job of a salesperson.’ A salesperson does quotes; a sales manager might hire people. Their job responsibilities are so massive, and I don’t believe that if you give an agent a massive set of job responsibilities like that, it is going to be accurate.
Much more interesting is saying, ‘This is the quote agent,’ and ‘This is the customer research agent that’s going to research the background of that customer.’ If you take narrow, specialized agents that have guardrails around what you’re asking them to do, you will get higher accuracy results. We already know that with LLMs, right? The broader the question you ask, the more risk you have of hallucinations.
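That scoping principle is easy to sketch. In the hypothetical example below (the names and structure are mine, not any vendor’s), an orchestrator routes each request to a narrow specialist agent with a deliberately small task and toolset, rather than handing everything to one broad “salesperson agent”:

```python
# Hypothetical sketch: narrow specialist agents, each with one task and a
# small toolset, behind an orchestrator -- instead of one broad agent that
# "does the job of a salesperson."

SPECIALISTS = {
    "quote":    {"task": "Draft a quote from the price book", "tools": ["price_lookup"]},
    "research": {"task": "Summarize this customer's background", "tools": ["crm_read"]},
}

def orchestrate(request_type: str, payload: dict) -> str:
    agent = SPECIALISTS.get(request_type)
    if agent is None:
        return "escalate: no specialist covers this request"
    # Each agent sees only its own task and tools -- the guardrail that
    # narrows the question and, per Greene, lowers hallucination risk.
    return f"run '{agent['task']}' with tools {agent['tools']} on {payload}"

print(orchestrate("quote", {"customer": "Acme", "items": ["widget"]}))
```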
Is payment dispute resolution a valid agentic workflow?
Greene and I had a bit of a debate around the topic of invoice dispute resolution. This is an early agentic use case I’ve heard from several enterprise vendors. But is dispute resolution a straightforward use case? After all, this is not just internal anomaly detection.
This is an agent having an interaction with a customer, around a topic that many times is routine (such as a payment terms discrepancy or invoice typo). But at other times this could be volatile, such as a discount issue with a VIP customer. How far up the sophistication chain can we go? Greene describes the use case:
A robot can do two-way matching of the payment process, or two-way, three-way matching of the purchase order, invoice and bill. But what happens when it doesn’t match? Well, let’s have a dispute investigation, like we showed on stage today. Once you investigate the reason for the dispute, why not have a dispute resolution agent that lets you interact with the vendor through email? But that resolution agent just has one task: to research a potential solution and look at it. It’s not doing your entire payables process. And I think that’s where the overhype goes too broad – looking at these agents as taking on a full business role versus a business task.
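The matching step Greene describes is classic deterministic territory. Here’s a minimal sketch (the field names and values are my own invention) of three-way matching, where only mismatches get handed off to a dispute-investigation agent – the happy path never needs an LLM:

```python
# Hypothetical sketch of three-way matching: the purchase order, invoice, and
# goods receipt must agree before payment. A robot handles this check
# deterministically; only mismatches go to a dispute-investigation agent.

def three_way_match(po: dict, invoice: dict, receipt: dict) -> str:
    if po["amount"] != invoice["amount"]:
        return "dispute: invoice amount differs from PO"
    if po["quantity"] != receipt["quantity"]:
        return "dispute: received quantity differs from PO"
    return "matched: release payment"

po      = {"amount": 5000, "quantity": 100}
invoice = {"amount": 5500, "quantity": 100}   # e.g. a terms discrepancy or typo
receipt = {"quantity": 100}

print(three_way_match(po, invoice, receipt))
# -> "dispute: ..." hands off to the investigation agent; it never auto-pays
```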
Here is my verbatim exchange with Greene on this:
Jon Reed: I can imagine situations, though, where it is higher stakes, and where the agent may be doing a great job, but maybe they are just not aware that your salesperson promised a particular type of discount that wasn’t applied within the system or whatever. And I could see an agent like that really annoying the heck out of a VIP-type customer. So how do you intelligently design a system like that?
Mark Greene: In this case, let’s say you have the context that I’m going to give [the AI agent] from Salesforce: “Before you, the AI agent, reach out to this customer, you’re going to look up what the revenue is with that customer. If it’s less than $100,000 a year, we’ll let you interact directly with them. If not, we have a customer success manager for any customers over $100,000 a year. And I want you to investigate this, and look into the potential reasons as the investigation agent, but you’re not going to be the resolution agent. We’re going to send that to our person on our team, but as the AI agent, you’re going to give them full context of a suggestion of how to resolve this based on the information you looked up…” You can be very specific in the governance from our platform on what you can give the agent… This would be in natural language, in the instruction. I don’t have to code it in a workflow.
Greene confirmed that this “instruction” could come in the form of the prompt-based “context window” used to invoke the agent, or, alternatively, in an AI policy (“policy” in this case is an LLM output instruction on how to handle a certain category of situations; I’ll get back to this in my conclusion).
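To illustrate where such an instruction might live – a rough sketch only; the policy text, function, and field names are hypothetical, not UiPath’s actual mechanism – the governance rule stays in natural language and is injected into the agent’s context window at invocation time:

```python
# Hypothetical sketch: the governance rule stays in natural language and is
# injected into the agent's context window (or attached as a reusable policy),
# rather than being hard-coded as workflow logic.

DISPUTE_POLICY = """
Before contacting a customer, look up their annual revenue in Salesforce.
If revenue is under $100,000 a year, you may interact with them directly.
Otherwise, act only as the investigation agent: research likely causes and
hand a suggested resolution to the customer success manager.
"""

def invoke_agent(task: str, policy: str) -> dict:
    # Stand-in for a real agent invocation; a platform would pass `policy`
    # as part of the prompt / context window.
    return {"system_instruction": policy.strip(), "task": task}

call = invoke_agent("Resolve disputed invoice INV-2291", DISPUTE_POLICY)
print(call["system_instruction"].splitlines()[0])
```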
During our interview with Daniel Dines, he asserted that UiPath can play a big role in proper agent governance:
As I said in my keynote, I think the governance and security aspects of agents are tremendously important, because agents running autonomously without proper guardrails can be dangerous. So to me, one of our biggest propositions that we can bring is that we can limit, actually, what agents can do – by surrounding them with very predictable actions. Especially when it comes to action, I think it would be a lot more dangerous to give an agent access to a third party system, and let them type and click directly – without very strict governance.
With just about every software vendor pushing agentic workflows, customers will expect these agents to get along with each other, and complete tasks across vendors. Will that fly? Dines thinks UiPath has a key role to play here, acting as a “Switzerland” of AI orchestration:
SAP agents, Salesforce agents, ServiceNow agents are just simple APIs in the end, right? But you will need this process orchestration piece on top of them, because if you want to create an agent that works across multiple systems, what do you do? Where do you host it? It’s better to host it in a neutral platform.
My take – companies need a future of work narrative
One area where I’ve found no consensus? The impact of agentic automation on the future of work. Some believe the disruption will be speedy and profound. I happen to think the job disruption will be more gradual – though with pockets where the impact will be felt more quickly (e.g. customer support, coding teams, graphic design).
I do think AI will one day have massive impact on human job availability, but only when a further breakthrough comes that encompasses real problem solving on the one hand, or highly adaptive physical robotics on the other hand (see my piece on Active Inference AI for different approaches, beyond gen AI).
Right now, we hear the words “reasoning,” “planning,” and “complex decision making” used freely with today’s gen AI iterations, but I expect more formal studies that cast doubt on these evangelistic assertions, as in: Apple study exposes deep cracks in LLMs’ “reasoning” capabilities.
A more potent question is: when will we reach “good enough” to remove humans from workflows? One person’s “good enough” is different from another’s. My “good enough” is likely a higher bar than that of some large organizations under investor scrutiny for AI returns, so that will be a story to watch.
Why does this philosophical question matter now? Because I believe companies – and software vendors – need a future of work story that resonates with their employees. Will they manage out of AI fear, or possibility? Dines shared his view:
I think eventually the work of humans will be just to read some kind of summaries of the information flowing to them, and make decisions that will trigger actions. We don’t believe that human [work] will disappear, but I think we will reduce the input of humans.
But given the staffing issues in many industries, we’re still at a point where most organizations can use as much automation – and human talent – as they can get. Dines:
Today, I was talking with one of the largest healthcare providers in the United States, and they said, ‘We are short thousands of people to reduce the admission times of our patients.’ So if we reduce the time someone spends on patient admissions, they can process a lot more patients, and they can deliver a much better outcome to the patients. So really, the key is to reduce the time people spend on processes.
That’s why UiPath’s ability to employ both gen AI and rules-based automations is worth watching. When I asked Dines for his take on the skills employees need now, he pointed to prompt engineering and query manipulation skills. Maximizing interactions with AI agents is, indeed, a skill. Those who have it will be on firmer ground.
And how will Dines judge UiPath’s progress toward agentic AI? By the only metric that really matters: customer adoption by industry. I’m glad Dines hit on prompt skills; I’ve heard from some vendors that don’t want to educate users on prompt engineering. Why not? I think they want to believe their AI agents are foolproof enough that any user can have success. I suspect the answer will land in the middle. Some agents will take workflows out of the hands of users – no skill required. But in other cases, prompting skills will make a difference – even for companies where prompt engineers hone part of the context window that the user doesn’t have to see.
In closing, I want to note that even the AI experts I trust are divided on whether RAG-based context and LLM policies are enough to limit/control output to auditable levels. Most believe we are not at the auditable level yet – which does not make agentic AI irrelevant. Rather, it points to the relevance of UiPath’s view of different types of automation for different needs.
There is no doubt that LLMs, without some type of internal policy or additional instruction, will sometimes disregard context windows in favor of their own internal data or predilections (Google Gemini just had an already-notorious example). Whether embedded policies and prompt instructions can make this foolproof, we’ll have to see. I’ve seen some impressive demos of this lately, but they pertain to smaller models, and to agents focused on serving up insights or smaller tasks. Large, general-purpose LLMs are a different beast. That gives UiPath plenty to work on with customers. Let’s see how Dines and team progress.
Also see: Alyx MacQueen on UiPath’s Second Act – RPA combined with agentic AI to enable real organizational change.