Introduction
Between September and December 2024, Goaco partnered with the Ministry of Justice (MoJ) to build a proof of concept leveraging generative AI to demonstrate how the Criminal Injuries Compensation Authority (CICA) could improve the triaging of documents whilst maintaining human oversight.
The project objective was:
To evaluate and implement AI-driven solutions to enhance efficiency in the MoJ’s document triage processes while ensuring compliance with data privacy and retaining human oversight.
The key deliverables were:
- Recommendations for AI integration in workflows.
- Demonstrations of AI capabilities in document extraction and classification.
- A comprehensive report and presentations on findings and scalability options.
Key Stages
Kick-off
The kick-off phase began with the blended Goaco and MoJ team defining the project goals to ensure alignment and identifying key success metrics that included reductions in manual processing time, improvements in data validation accuracy, and scalability to support future workloads. To structure the project effectively, the team divided the work into three stages: Discovery, PoC Build, and Playback.
The initial scope of the project was carefully delineated to concentrate on tasks related to document extraction, classification, and validation. Particular attention was given to complex document types such as police reports and handwritten notes, which were identified as high-impact areas. The team prioritised potential use cases by considering both feasibility and the anticipated impact of automation.
To ensure compliance with GDPR and MoJ’s internal policies, the team established a comprehensive data handling framework. Because real case files could not be accessed, the team used lifelike dummy data instead.
AWS was chosen as the primary infrastructure platform for deploying the AI solution given its proven scalability, robust compliance features, and alignment with the MoJ’s operational requirements. Preliminary technical discussions also highlighted the need for API integrations to enable seamless data flow, and emphasised the development of modular AI models. These modular models would allow for flexible adjustments and scalability, ensuring the system could evolve with changing requirements.
Discovery
The discovery phase provided the foundation for understanding the current workflows and identifying opportunities for AI integration. This began with an in-depth review of the end-to-end case triage processes, spanning initial intake to final resolution. The team meticulously mapped critical manual tasks such as document review, eligibility checks, and the complex process of linking duplicate cases. This highlighted pivotal decision points within the workflow where AI could be employed to streamline operations while retaining the critical human oversight necessary for accuracy and compliance.
The team also identified pain points that hindered efficiency in the existing processes. Key challenges included the considerable time required for manual handling of diverse document types, such as police reports, handwritten notes, and scanned PDFs. Additionally, linking duplicate or related cases required labour-intensive cross-referencing across systems. The process of verifying the authenticity and relevance of documents was frequently delayed, compounding operational inefficiencies.
The review of existing policies and infrastructure was a critical component of this phase. The team conducted an in-depth analysis of the MoJ’s data privacy requirements, ensuring that all proposed solutions adhered to GDPR and internal compliance standards. The existing IT infrastructure was evaluated to confirm compatibility with AI-based solutions, with AWS reaffirmed as the preferred platform due to its scalability and robust security.
Workshops and stakeholder engagements played a pivotal role in gathering insights and refining project scope. Multiple workshops were conducted with MoJ teams to gain a deeper understanding of current workflows and the specific challenges faced by users. Policy experts were engaged to ensure the proposed AI solutions were aligned with existing legal and procedural obligations. These sessions also allowed stakeholders to provide feedback on potential AI use cases, focusing on areas that promised the highest impact.
The discovery phase produced several outputs. Initial use cases for AI integration were identified, including the application of Optical Character Recognition (OCR) for digitising documents, automated classification of document types, and AI-powered summarisation to extract key insights from lengthy case files. Key performance metrics for the PoC’s success were defined, focusing on time savings, accuracy improvements, and user satisfaction. Foundational workflows were established for integrating AI into the triage process while maintaining human oversight, ensuring both efficiency and compliance. Additionally, Human-Computer Interaction (HCI) was considered in the AI tool’s design, ensuring that caseworkers could interact seamlessly with AI-generated outputs, validate automated decisions, and refine AI responses when necessary. Lastly, some operational challenges were identified, including:
- Document diversity and complexity: Handling various formats such as handwritten notes, scanned PDFs, and structured forms complicates processing and increases the risk of error.
- Manual workflows: Resource-intensive tasks like sorting, categorising, and policy cross-referencing inflate operational costs and hinder scalability.
- Prolonged case timelines: Manual checks delay case progression.
- Critical document management issues: Some documents arrive unattached or unidentified, risking oversight of vital case information.
PoC Build
During the Proof of Concept (PoC) build, we focused on testing AI’s ability to automate document triage while ensuring compliance and human oversight. Given the sensitivity of MoJ data, templated dummy documents were used to simulate real-world scenarios. This allowed us to validate AI capabilities without handling personal information.
AI Components Implemented
- Optical Character Recognition (OCR) – Applied to digitise scanned PDFs and handwritten documents, enabling structured text extraction.
- Local AI Model Deployment – A fine-tuned Llama model was hosted locally to enhance text refinement and summarisation.
- Document Categorisation – AI classified documents into predefined categories, demonstrating automated sorting for caseworkers.
- Vector Database Storage – Processed documents were stored in Chroma, enabling efficient querying and retrieval.
- Chatbot Querying via RAG – Integrated Retrieval-Augmented Generation (RAG) to allow policy-aligned chatbot responses for caseworkers.
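As a rough illustration of how the classification, storage, and retrieval components above fit together, the following is a minimal sketch using only the Python standard library. It is not the PoC's implementation: keyword rules stand in for the AI classifier, a bag-of-words cosine similarity stands in for the embeddings and Chroma vector search, OCR is assumed to have already produced plain text, and no language model is called. The category names, keywords, and every function and class name are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical categories and keywords; the PoC's real categories are not
# listed in this case study.
CATEGORY_KEYWORDS = {
    "police_report": {"police", "officer", "incident", "crime"},
    "medical_evidence": {"hospital", "injury", "diagnosis", "treatment"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text):
    """Return the category with the most keyword hits.

    Documents with no hits are labelled 'unclassified' so that a caseworker
    can sort them manually (the human-oversight step).
    """
    tokens = set(tokenize(text))
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database such as Chroma."""

    def __init__(self):
        self.items = []  # list of (doc_id, text, term-count vector)

    def add(self, doc_id, text):
        self.items.append((doc_id, text, Counter(tokenize(text))))

    def query(self, question, k=1):
        """Return the k stored documents most similar to the question."""
        q = Counter(tokenize(question))
        ranked = sorted(self.items, key=lambda it: cosine(q, it[2]), reverse=True)
        return [(doc_id, text) for doc_id, text, _ in ranked[:k]]


def build_prompt(question, store, k=1):
    """Assemble a retrieval-augmented prompt from the top-k documents."""
    context = "\n".join(text for _, text in store.query(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a fuller version, `classify` would be replaced by a model call, `ToyVectorStore` by the real vector database, and the string returned by `build_prompt` would be passed to the locally hosted model, with the caseworker reviewing the answer.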
Outcomes
The OCR demonstrated high accuracy by successfully extracting text from various document formats, including both typed and handwritten notes. Summarisation performance was strong, generating concise and policy-relevant summaries that significantly reduced document review time. Automated document classification effectively structured case data, improving retrieval efficiency. Additionally, the AI-driven querying, powered by a RAG-enabled chatbot, delivered relevant and context-aware responses, highlighting its potential to assist caseworkers.
Key Learnings
The PoC successfully demonstrated that AI is feasible for document triage, proving its effectiveness in digitisation, classification, and retrieval. Additionally, the PoC ensured data compliance by using a locally deployed AI model and non-sensitive dummy data, confirming that AI integration can be done securely within MoJ’s regulatory framework. Overall, the PoC validated AI’s potential to enhance document processing workflows, laying the foundation for a scalable deployment in future phases.
Playback
The playback phase focused on presenting the findings and showcasing the AI solution. The team highlighted key efficiency metrics, such as potential reductions in manual triage times and the scalability of the solutions to handle increasing workloads. Stakeholder questions were addressed, including concerns about system adaptability to policy changes and GDPR compliance implications.
Technical workshops were conducted to delve into the specifics of model performance. These sessions highlighted the accuracy of the OCR capabilities and the reliability of automated classification processes. Demonstrations of the end-to-end AI-enabled document processing system were provided, illustrating how digitisation, classification, and policy alignment workflows were seamlessly integrated. The workshops also served as a forum to discuss the technical and operational aspects of scaling the solutions.
The final report provided a comprehensive, end-to-end strategy for implementing AI-driven document processing within CICA. It outlined key findings from the PoC, infrastructure requirements, and a structured roadmap for integrating AI into MoJ workflows. The report detailed a strategic approach to AI adoption, identifying the most effective models for classification, summarisation, and retrieval, along with a step-by-step deployment plan for seamless integration.
A high-level technical architecture was presented, illustrating how AI components interact with existing systems through cloud-based infrastructure (AWS) and open-source solutions. The report evaluated different AI models, comparing proprietary and open-source approaches, and set out a cost-benefit analysis to assess long-term sustainability.
The report also addressed key risks and compliance considerations, ensuring AI adoption aligns with MoJ’s security, GDPR regulations, and human oversight frameworks. Challenges such as OCR accuracy, AI bias, and integration complexities were identified, with mitigation strategies included.
Ultimately, the report provided clear recommendations for transitioning from PoC to full-scale deployment, ensuring AI integration is scalable, cost-effective, and enhances operational efficiency while maintaining compliance and human oversight.
The playback phase culminated in live demonstrations that highlighted the capabilities of the AI PoC system. These demonstrations emphasised the synergy between AI and human decision-making, illustrating how the system could reduce workloads while maintaining accuracy and compliance. By the end of this phase, stakeholders were equipped with the insight and recommendations needed to make informed decisions about next steps.
The project identified several opportunities to leverage AI, including:
- Document Processing: Automate intake, categorisation, and digitisation of incoming documents, and convert unstructured formats (e.g., handwritten notes) into searchable data.
- Completeness and Validation: Automatically identify missing or incomplete components, streamlining validation processes.
- Policy Verification: Cross-reference documents with guidelines for compliance and consistency.
- Decision Support: Provide summaries and insights to assist caseworkers, enabling focus on high-priority tasks.
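The completeness and validation opportunity listed above could be sketched roughly as follows. This is an illustrative example only, not part of the PoC: the set of required components, the `CaseFile` structure, and the status labels are all hypothetical. It assumes documents have already been classified and their text extracted.

```python
from dataclasses import dataclass, field

# Hypothetical list of components a complete case file must contain.
REQUIRED_COMPONENTS = {"application_form", "police_report", "medical_evidence"}


@dataclass
class CaseFile:
    case_id: str
    # Maps a document category to the text extracted from that document.
    documents: dict = field(default_factory=dict)


def missing_components(case):
    """Return the required components that are absent or have empty text."""
    present = {cat for cat, text in case.documents.items() if text.strip()}
    return sorted(REQUIRED_COMPONENTS - present)


def triage(case):
    """Flag incomplete cases; every outcome is still routed to a caseworker."""
    missing = missing_components(case)
    status = "needs_caseworker_review" if missing else "ready_for_caseworker_decision"
    return {"case_id": case.case_id, "status": status, "missing": missing}
```

Both statuses hand the case to a person: the check only tells the caseworker what appears to be missing, which keeps the decision with them.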
Conclusion
The PoC successfully demonstrated the potential for generative AI to enhance MoJ’s document handling efficiency while maintaining data security and human oversight. It led to a recommendation that future efforts focus on scaling AI solutions, acquiring larger datasets, and refining integration with existing workflows. The work demonstrated that automation could reduce manual workload by up to 82%, significantly shortening processing times and enabling caseworkers to focus on complex decisions. This aligns with CICA’s 2024-2025 priorities to scale operations sustainably, enhance claimant experiences, and maintain service fairness and efficiency. Importantly, it proved the benefits of integrating AI solutions into workflows to augment, not replace, human expertise: caseworkers retain control over decisions, ensuring fairness, accuracy, and compliance.