Yimeng Li

Fraud investigation in finance

The problem

Financial investigation teams spend a lot of time stitching together context: case notes, alerts, policies, customer histories, and prior decisions. The harder product question was not whether AI could help, but where an agent was worth the extra complexity compared with a rule, a classifier, or a simpler workflow improvement.

Who it was for

The product had two audiences. Investigators needed less repetitive collection and summarization work without losing control of the final judgment. Business and technology leaders needed a clear way to decide which AI bets were credible enough for a regulated environment.

What I did

I broke the investigation process into candidate jobs and scored each one by manual effort, decision risk, data availability, explainability needs, and how often the task required multi-step reasoning across systems. That created a shortlist of places where an agent could add real leverage instead of novelty.
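
A minimal sketch of how scoring like this could work, assuming each criterion is rated 1 to 5 and combined as a weighted sum. The weights, the 0.6 cutoff, the choice to invert decision risk and explainability needs, and the two example jobs are all hypothetical. The case study does not state how the criteria were combined.

```python
from dataclasses import dataclass

# Hypothetical weights: the case study names the criteria but not their weighting.
WEIGHTS = {
    "manual_effort": 0.25,
    "multi_step_reasoning": 0.25,
    "data_availability": 0.20,
    "explainability_needs": 0.15,
    "decision_risk": 0.15,
}

# Assumption: a higher rating on these criteria makes an agent a weaker fit,
# so the rating is flipped before it is weighted.
INVERTED = {"decision_risk", "explainability_needs"}


@dataclass
class CandidateJob:
    name: str
    ratings: dict  # criterion name -> rating on a 1-5 scale


def agent_fit_score(job: CandidateJob) -> float:
    """Return a 0-1 score; higher suggests an agent adds more leverage."""
    total = 0.0
    for criterion, weight in WEIGHTS.items():
        rating = job.ratings[criterion]
        if not 1 <= rating <= 5:
            raise ValueError(f"{criterion} rating must be between 1 and 5")
        if criterion in INVERTED:
            rating = 6 - rating
        total += weight * (rating - 1) / 4  # rescale 1-5 to 0-1
    return total


def shortlist(jobs, threshold=0.6):
    """Names of jobs at or above the threshold, best score first."""
    scored = sorted(((agent_fit_score(j), j.name) for j in jobs), reverse=True)
    return [name for score, name in scored if score >= threshold]


# Hypothetical example jobs with made-up ratings.
jobs = [
    CandidateJob("evidence gathering and narrative drafting", {
        "manual_effort": 5, "multi_step_reasoning": 5, "data_availability": 4,
        "explainability_needs": 2, "decision_risk": 2,
    }),
    CandidateJob("threshold-based alert triage", {
        "manual_effort": 2, "multi_step_reasoning": 1, "data_availability": 5,
        "explainability_needs": 5, "decision_risk": 4,
    }),
]
print(shortlist(jobs))
```

The ranking matters more than the exact numbers. The weights mainly force stakeholders to say out loud which criteria they care about most.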

For the selected flow, I designed an assistant-style investigation path that collects evidence, drafts a structured case narrative, and surfaces the key uncertainties an investigator should review. I also built a selection framework for when to use an agent, neural network, traditional ML model, deterministic rule, or no AI at all.

The hard tradeoffs

The most important calls were the places where I did not recommend agents. High-volume, stable decisions with clear thresholds were usually better served by deterministic logic or conventional models. Agentic flows made sense where the task was messy, context-heavy, and reviewable, and where the system could show its work rather than silently decide.
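
The selection logic can be pictured as a short decision function that tries the simplest option first. This is an illustrative sketch, not the framework itself. The `TaskProfile` fields, the order of the checks, and the split between a neural network and a traditional ML model are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    NO_AI = "no AI"
    RULE = "deterministic rule"
    ML_MODEL = "traditional ML model"
    NEURAL_NETWORK = "neural network"
    AGENT = "agent"


@dataclass
class TaskProfile:
    # All fields are hypothetical simplifications of the real criteria.
    worth_automating: bool
    clear_thresholds: bool
    stable_logic: bool
    has_labeled_history: bool
    unstructured_inputs: bool
    multi_step_across_systems: bool
    human_reviewable: bool


def recommend(task: TaskProfile) -> Approach:
    """Pick the simplest approach that fits, checking cheaper options first."""
    if not task.worth_automating:
        return Approach.NO_AI
    # Stable decisions with clear thresholds: deterministic logic wins.
    if task.clear_thresholds and task.stable_logic:
        return Approach.RULE
    # Messy, context-heavy work is an agent candidate only if a human
    # can review what the system did.
    if task.multi_step_across_systems and task.human_reviewable:
        return Approach.AGENT
    # Assumption: labeled history enables a learned model, and unstructured
    # inputs tip the choice toward a neural network.
    if task.has_labeled_history:
        if task.unstructured_inputs:
            return Approach.NEURAL_NETWORK
        return Approach.ML_MODEL
    return Approach.NO_AI


narrative_drafting = TaskProfile(
    worth_automating=True, clear_thresholds=False, stable_logic=False,
    has_labeled_history=False, unstructured_inputs=True,
    multi_step_across_systems=True, human_reviewable=True,
)
print(recommend(narrative_drafting).value)
```

Putting the rule check ahead of the agent check encodes the bias described above: an agent has to earn its complexity.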

I kept the human decision point explicit. The design reduced manual gathering and drafting, but left judgment, escalation, and final disposition with the investigator.

Outcome

The recommended flow was projected to save roughly $10.6M a year and cut manual effort by about 97% for the targeted workstream. Just as importantly, the framework gave stakeholders a shared vocabulary for deciding when agentic AI was the right product shape and when a simpler system was the better answer.

What I’d do next

I would validate the workflow with real investigator review loops: measure time saved, correction rate, missed-evidence rate, and user trust over repeated cases. I would also keep a close eye on drift, policy changes, and any signs that reviewers were accepting generated narratives too quickly.
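
As a rough sketch of how those review-loop metrics could be computed from logged cases: the `ReviewRecord` fields, the five-minute quick-acceptance cutoff, and the sample numbers below are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass
class ReviewRecord:
    # Hypothetical fields an investigator review log might capture.
    baseline_minutes: float    # typical manual handling time for this case type
    assisted_minutes: float    # handling time with the assistant
    narrative_corrected: bool  # investigator edited the drafted narrative
    evidence_missed: bool      # later review found evidence the draft left out


def summarize(records, min_review_minutes=5.0):
    """Aggregate time saved, correction rate, and missed-evidence rate."""
    if not records:
        raise ValueError("need at least one review record")
    n = len(records)
    baseline = sum(r.baseline_minutes for r in records)
    assisted = sum(r.assisted_minutes for r in records)
    # Uncorrected cases closed faster than the cutoff are a possible sign
    # of reviewers accepting generated narratives too quickly.
    quick_accepts = sum(
        1 for r in records
        if r.assisted_minutes < min_review_minutes and not r.narrative_corrected
    )
    return {
        "time_saved_pct": 100 * (baseline - assisted) / baseline,
        "correction_rate": sum(r.narrative_corrected for r in records) / n,
        "missed_evidence_rate": sum(r.evidence_missed for r in records) / n,
        "quick_accept_rate": quick_accepts / n,
    }


# Made-up sample data for illustration only.
sample = [
    ReviewRecord(60, 15, True, False),
    ReviewRecord(40, 10, False, False),
    ReviewRecord(100, 2, False, True),
    ReviewRecord(50, 23, False, False),
]
print(summarize(sample))
```

Tracking the quick-accept rate next to the missed-evidence rate helps separate real efficiency gains from over-trust in the drafts.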
