Master or Servant? The role of AI in arbitration and decision-making

At Accuracy, we frequently assist in high-stakes international arbitrations and litigation, using our multidisciplinary expertise to help our clients and tribunals resolve disputes. The emergence of AI tools is having a seismic effect on the industry, changing how disputes are organised, how data is managed and how reports are produced. It is incumbent on firms such as ours to consider the impacts and opportunities presented by AI if we are to maintain excellence and value for clients.

Calum Mackenzie is a Senior Manager and data specialist at Accuracy, working across our infrastructure, disputes and advisory practices. Alongside his work in traditional disputes settings, he holds a prominent role in data science and software development, with a focus on applying automation, Natural Language Processing (NLP), and AI in construction projects and arbitration.

Read our latest interview with Calum, in which he explores the role of AI in arbitration and decision-making.

How is your role at Accuracy aligned with the world of AI?

My role allows me to work in two worlds at the same time: dispute resolution and technology. The former is traditionally conservative, risk-averse, and high-stakes, with a tendency to emphasise credentials, procedures and precedents; the latter is innovative, tolerant of risk in search of productivity and insight, but low-stakes. The former views “legacy” systems and solutions favourably, while the latter sees them as things to be superseded and improved on.
On any given day, I may find myself presenting findings to executives and counsel, explaining why the construction of their oil refinery suffered delays; on another day, I may be developing and testing AI-powered tools to help our team extract insights from messy and unstructured datasets, automating tasks that previously had to be done manually and accelerating those that were done slowly. Because I am familiar with both worlds, I think I am uniquely placed to offer some useful perspective on the topic of AI and decision-making.

What are your views on the changing landscape of AI to date?

AI is penetrating knowledge and service work, from individuals to businesses to entire industries. Many saw this coming long before the advent of ChatGPT and were excited by its potential. As early as 2019, experienced scholars and practitioners such as Maxi Scherer predicted that AI would have valuable applications in research, data management, and decision prediction in the legal space. AI tools were assumed to be – and are still widely assumed to be – free from certain biases to which humans are vulnerable. The general consensus has been that humans must remain accountable for decision-making and should not rely solely on AI tools.

The reality, so far at least, has been much messier than predicted. On the one hand, the set of problems which computers can now solve is much larger. On the other hand, how to use and regulate these tools is a huge and complicated problem with no clear solution. But there are still some big opportunities. 

Arbitration exists, in theory, as a faster, cheaper and private alternative to litigation. In practice, its speed and cost advantages over litigation aren’t so clear-cut. With AI, there is an opportunity to streamline and accelerate the dispute resolution process in a way that promotes quick and amicable settlement, from negotiation and mediation to adjudication and arbitration:

  • AI systems could review and evaluate claims, explore the strengths and weaknesses of a case, examine precedents, prepare strategies, and predict outcomes at speeds impossible for humans to match.
  • Moreover, even the best and most productive firms can only handle a small number of cases in parallel; AI-powered alternatives would have no such bottlenecks.
  • AI could also reinforce the principle of “equality of arms” (that the outcome of a dispute ought not to be influenced by the scale of one party’s resources) by offering cheaper, more efficient and more accessible legal representation to address potential imbalances.

As a minimum, the capability offered by AI could allow professionals to focus the bulk of their time and resources on the most complex and contentious cases.

Attractive as these opportunities are, they require us to address serious challenges. One is that these systems and tools are highly complex; we cannot expect all users to fully comprehend how they work, nor how such models behave in response to rules and directives. Whatever “guardrails” are in place, there is always the possibility that someone will figure out how to circumvent them or that some deeper, undiscovered biases lie within.

With much AI development and innovation being “open-source”, i.e. public, it may not even be possible to regulate the use and spread of these systems even if we wanted to. Websites like HuggingFace provide free “pre-trained” models which can be downloaded like any other file and are already widely disseminated. Much research into the “architecture” of these models is likewise conducted in public and published via platforms like arXiv.

And even if you could restrict public access to new models, research, and computing power, it still wouldn’t be enough. Training models from scratch is hugely time-, cost-, and resource-intensive, but replicating a trained model is trivially easy and takes seconds.
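To make this concrete, here is a minimal sketch assuming the open-source Hugging Face `transformers` library, using the small, publicly hosted “distilbert-base-uncased” model purely as an example:

```python
from transformers import AutoModel

# A minimal sketch, assuming the Hugging Face `transformers` library.
# One line downloads a fully trained model from the public Hub; anyone
# running the same line receives an identical copy of the weights,
# without bearing any of the original training cost.
model = AutoModel.from_pretrained("distilbert-base-uncased")

# DistilBERT carries roughly 66 million trained parameters.
print(sum(p.numel() for p in model.parameters()))
```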

Take your TV as an analogy. The image on your TV is determined by the values of its settings like brightness, contrast, and so on. It takes time and effort to get the image how you want it. But if someone else buys the same model of TV, they can copy your settings and have the exact same image – with no further effort. Viewed in this way, top-down regulation is not a viable strategy.

So, let’s assume that AI tools aren’t going away. How can we know the extent to which their output is truly independent and objective? Emerging “best practices” and principles indicate how such tools are likely to be applied and should stand the test of time, namely:

  • Transparency: Declare up front the AI tools you have used, and when, where, how, and why you used them.
  • A Useful Servant, A Bad Master: Critical, high-consequence work and real-world decision-making should never be naively outsourced to computer systems.
  • Don’t Trust It, Test It: Don’t take what AI says as truth unless you have the competence to verify it. Thoroughly evaluate its output before you put your name to it.

We must also recognise the challenge of confidentiality. To use these systems, you must share your data with them. But in arbitration – where the information is, by definition, extremely sensitive – using a centralised and external model is simply not an option, no matter how smart and powerful the model is (and however much assurance the owners of these tools may give as to their trustworthiness).
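One way to address this – offered here only as an illustrative sketch – is to run an open-weights model entirely on your own hardware, so that case material never leaves your environment. The model name and text below are assumptions for illustration:

```python
from transformers import pipeline

# A minimal sketch, assuming an open-weights summarisation model small
# enough to run on local hardware; the model name and text are purely
# illustrative. Inference runs on this machine, so the privileged
# material is never sent to an external service.
summarise = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

privileged_text = (
    "The contractor asserts that site access was delayed by 42 days owing "
    "to the late issuance of permits, and claims prolongation costs."
)
print(summarise(privileged_text, max_length=30, min_length=5)[0]["summary_text"])
```

The weights are downloaded once and cached, after which inference can run entirely offline.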

What would you say the weaknesses of AI decision-making tools are?

Three big ones come to mind, all in the realm of bias:

  • Firstly, we know and understand human biases. Removing human bias from our processes, systems, and decision-making with AI-powered tools would be a good thing – but there is a trade-off: we don’t (yet) understand these systems in the same way. We risk exchanging our own well-known biases for unknown, subtler ones. This undermines their use in real-world decision-making today.
  • Secondly, these systems are not in fact free of human bias. You cannot train an AI on all data – and even if you could, you wouldn’t want to. So the “training sets” used to create these tools must be filtered, and it is humans who do the filtering. After training, a model’s behaviour is further conditioned through “reinforcement”, which is again human intervention. So we haven’t removed human bias; we have only made it more subtle. Again, this undermines any real-world decision-making by these systems.
  • Thirdly, there is the risk of as-yet undiscovered biases and behaviours. As experts we are paid for our competence, but we have value in large part because we pay a price when we are wrong. Let’s imagine that AI tools are used to inform arbitral awards in the future, and that some flaw in the “reasoning” of these systems is then discovered. Whether or not it happens, it’s certainly a plausible scenario. What would happen to the awards that these systems influenced? Would cases be re-opened? Who would bear responsibility for the wrongdoing or malfunctioning of these systems? We don’t have good answers to these questions and, for now, it’s unclear whether they can be answered.

How do you see the growth potential for these new technologies in arbitration?

There are causes for optimism about the potential of these technologies in arbitration. Used correctly, they could complement the humans involved rather than replace them – or, at the very least, deliver more benefits than costs.

  • AI tools can “see all sides”, digest information, present differing interpretations, and advance lines of reasoning at much greater speed and efficiency than humans. Such processing capacity opens the door to “Monte Carlo”-style approaches to dense, complex, and primarily verbal legal disputes (see the toy sketch after this list). What if you could enter your next dispute with thousands of simulated mock hearings behind you?
  • They could improve the efficiency and affordability of dispute resolution mechanisms. The facilitation of settlements, negotiations, arbitrations, and so on may decrease the price of dispute resolution while vastly increasing its accessibility. Imagine ten times as many arbitrations, each at one tenth the price!
  • Much evidence and data today remain trapped in PDF files of poorly scanned documents and tables. Unstructured data like this is costly to extract and process, and experts don’t have the time to rectify this alone. These tools could liberate it, helping lawyers and experts to leave no stone unturned and offer the maximum possible value.
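To make the “Monte Carlo” idea from the first point concrete, here is a deliberately toy simulation; every probability in it is an invented assumption rather than a model of any real case:

```python
import random

# A deliberately toy Monte Carlo sketch: each simulated "hearing" resolves
# three contested issues for or against us with assumed probabilities, and
# we prevail if we win a majority. All numbers are invented for illustration.
ISSUE_WIN_PROBABILITIES = [0.7, 0.5, 0.4]
N_SIMULATIONS = 10_000

favourable = 0
for _ in range(N_SIMULATIONS):
    issues_won = sum(random.random() < p for p in ISSUE_WIN_PROBABILITIES)
    if issues_won >= 2:
        favourable += 1

print(f"Estimated chance of a favourable outcome: {favourable / N_SIMULATIONS:.1%}")
```

Real tools would need far richer representations of a case than three weighted coin flips, but the principle – estimating an outcome distribution by running many cheap simulations – is the same.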
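And on the last point, here is a minimal sketch of the first step in liberating such evidence – recovering text from a scanned PDF – assuming the open-source pdf2image and pytesseract libraries (which require local Poppler and Tesseract installations); the file name is purely illustrative:

```python
from pdf2image import convert_from_path
import pytesseract

# A minimal sketch: render each page of a scanned PDF as an image, then
# OCR it to plain text so the evidence becomes searchable and analysable.
pages = convert_from_path("scanned_exhibit.pdf", dpi=300)
text = "\n\n".join(pytesseract.image_to_string(page) for page in pages)

print(text[:500])  # inspect the opening of the recovered text
```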


What are your predictions for 2029?

Prediction is a dangerous game – but I think there are a few things we can expect.

  1. Expect to find LLMs and AI tools at your disposal across most devices. Phones, laptops, and other hardware will ship with built-in LLMs optimised for specific purposes.
  2. Working behaviours will change because of tools like ChatGPT and Copilot. People will read less and will choose to outsource comprehension, translation, and summarisation of complex or challenging text to their AI tools.
  3. Important, high-consequence work will prove resistant to LLM encroachment; however, low-consequence aspects of jobs will be increasingly automated or made redundant.
  4. Critical decision-making by humans will become harder, not easier, because of asymmetries in how these systems can be used, at least in early adoption. The extent to which we are guided by (or eventually trust) AI to support us in critical work is a question we are likely to wrestle with for years to come.

Calum Mackenzie – Senior Manager and data specialist – Accuracy