The goal was to replace manual triage and inconsistent labels with a reliable, scalable pipeline for classifying and validating legal documents. I built production-oriented Python scripts that use LLMs plus SQL to classify documents, enforce clear criteria, and surface issues for quick remediation.
Focus: trustworthy automation, with measurable accuracy, clear audit trails, and low-friction ops for the legal team.
Rasa Legal’s teams were spending significant time manually classifying documents and correcting mismatches in a large PostgreSQL datastore. I developed Python scripts leveraging LLMs to automate the classification step and standardized the criteria used to decide each label. The pipeline queries large tables, fetches candidate records, applies deterministic preprocessing, calls the model with prompt templates tuned for our taxonomy, and writes back results with confidence scores and reasons for later audit.
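A minimal sketch of that fetch, preprocess, classify, and write-back loop, not the production code. It assumes a hypothetical `documents` table with `id`, `body`, `label`, `confidence`, and `reason` columns, uses the standard-library `sqlite3` module in place of PostgreSQL so it runs without a server, and swaps the real model call for a keyword stub (`fake_llm`). The taxonomy and prompt wording are placeholders.

```python
import json
import re
import sqlite3

# Hypothetical taxonomy and prompt template; the real ones are not shown here.
TAXONOMY = ["contract", "court_filing", "correspondence"]
PROMPT_TEMPLATE = (
    "Classify the document into one of: {labels}.\n"
    "Reply as JSON with keys label, confidence, reason.\n\n"
    "Document:\n{text}"
)


def preprocess(text):
    """Deterministic cleanup: collapse whitespace and cap the input length."""
    return re.sub(r"\s+", " ", text).strip()[:4000]


def fake_llm(prompt):
    """Stand-in for the model call (assumption); returns a JSON string."""
    body = prompt.split("Document:\n", 1)[1].lower()
    if "agreement" in body:
        return json.dumps(
            {"label": "contract", "confidence": 0.9, "reason": "mentions an agreement"}
        )
    return json.dumps(
        {"label": "correspondence", "confidence": 0.5, "reason": "no strong signal"}
    )


def classify_pending(conn, call_model=fake_llm, batch_size=100):
    """Fetch unlabeled rows, classify each one, and write the results back."""
    rows = conn.execute(
        "SELECT id, body FROM documents WHERE label IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for doc_id, body in rows:
        prompt = PROMPT_TEMPLATE.format(
            labels=", ".join(TAXONOMY), text=preprocess(body)
        )
        result = json.loads(call_model(prompt))
        # Store the label with its confidence and reason for later audit.
        conn.execute(
            "UPDATE documents SET label = ?, confidence = ?, reason = ? WHERE id = ?",
            (result["label"], result["confidence"], result["reason"], doc_id),
        )
    conn.commit()
    return len(rows)
```

Passing the model call in as `call_model` keeps the database logic testable without network access; a real client would replace `fake_llm` and would also need handling for malformed JSON replies.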
To ensure reliability, I added validation rules (schema checks, allowed-value sets, and field-presence checks) and logging for traceability. This replaced ad hoc review with a repeatable, inspectable process that legal ops can trust.
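The three kinds of rule can be illustrated with a small validator. This is a sketch under assumptions: the field names, the allowed labels, and the logger name are hypothetical, and the real rules may be stricter.

```python
import logging

logger = logging.getLogger("classification.validation")  # hypothetical logger name

ALLOWED_LABELS = {"contract", "court_filing", "correspondence"}  # hypothetical taxonomy
REQUIRED_FIELDS = ("label", "confidence", "reason")


def validate_result(result):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []

    # Field presence: every required key must exist.
    for field in REQUIRED_FIELDS:
        if field not in result:
            errors.append(f"missing field: {field}")

    # Allowed-value set: the label must come from the taxonomy.
    if "label" in result:
        label = result["label"]
        if not isinstance(label, str) or label not in ALLOWED_LABELS:
            errors.append(f"label not in taxonomy: {label!r}")

    # Schema check: confidence must be a number in [0, 1] (bool is rejected).
    if "confidence" in result:
        confidence = result["confidence"]
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            errors.append("confidence must be a number")
        elif not 0.0 <= confidence <= 1.0:
            errors.append(f"confidence out of range: {confidence}")

    # Schema check: the reason must be a non-empty string.
    if "reason" in result:
        reason = result["reason"]
        if not isinstance(reason, str) or not reason.strip():
            errors.append("reason must be a non-empty string")

    # Log each violation so a rejected record can be traced later.
    for error in errors:
        logger.warning("validation failed: %s", error)
    return errors
```

Returning the violations as a list, instead of raising on the first one, lets a reviewer see everything wrong with a record in a single pass.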
I then added a data quality audit pass: large-scale SQL queries flag duplicates, drift from the taxonomy, and label conflicts; sampled batches get human spot-checks to monitor precision/recall over time. This workflow helped the team quickly identify and correct inconsistencies across 10,000+ records and reduced day-to-day manual review time for routine classifications.
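The audit pass can be sketched as a set of aggregate queries plus a precision/recall calculation over a human-reviewed sample. Everything named here is an assumption for illustration: a `documents` table with `id`, `content_hash`, and `label` columns, duplicates defined as rows sharing a `content_hash`, a placeholder taxonomy, and `sqlite3` standing in for PostgreSQL.

```python
import sqlite3

ALLOWED_LABELS = ("contract", "court_filing", "correspondence")  # hypothetical taxonomy

AUDIT_QUERIES = {
    # Same content stored more than once.
    "duplicates": """
        SELECT content_hash, COUNT(*) AS n
        FROM documents
        GROUP BY content_hash
        HAVING COUNT(*) > 1
        ORDER BY content_hash
    """,
    # Same content carrying more than one distinct label.
    "label_conflicts": """
        SELECT content_hash, COUNT(DISTINCT label) AS n_labels
        FROM documents
        WHERE label IS NOT NULL
        GROUP BY content_hash
        HAVING COUNT(DISTINCT label) > 1
        ORDER BY content_hash
    """,
}


def run_audit(conn):
    """Run each audit query and return the flagged rows keyed by check name."""
    findings = {
        name: conn.execute(sql).fetchall() for name, sql in AUDIT_QUERIES.items()
    }
    # Taxonomy drift: labels that are not in the allowed set.
    placeholders = ", ".join("?" for _ in ALLOWED_LABELS)
    findings["taxonomy_drift"] = conn.execute(
        "SELECT id, label FROM documents "
        f"WHERE label IS NOT NULL AND label NOT IN ({placeholders}) ORDER BY id",
        ALLOWED_LABELS,
    ).fetchall()
    return findings


def precision_recall(pairs, label):
    """Compute (precision, recall) for one label from (predicted, reviewed) pairs."""
    pairs = list(pairs)
    tp = sum(1 for pred, truth in pairs if pred == label and truth == label)
    fp = sum(1 for pred, truth in pairs if pred == label and truth != label)
    fn = sum(1 for pred, truth in pairs if pred != label and truth == label)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return precision, recall
```

`precision_recall` returns `None` for a metric whose denominator is zero, so a sampled batch that happens to contain no examples of a label is not reported as a score of 0.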