Behind the Screen Door: False Positive Reduction Models

I have always divided False Positive Reduction (FPR) approaches in sanctions screening into two camps: systems that rely on matching algorithms, and systems that rely more on user-built tools like scorecards and FPR rules.

What did Gemini think of my way of looking at things?

First, my prompt:

I am familiar with 2 basic kinds of false positive reduction for sanctions screening systems. One model relies on sophisticated name matching algorithms rather than whitelisting or rules-based processing to minimize false positive matches. I call that a “one funnel” model. In contrast, a “two funnel system” uses a more basic set of matching algorithms, combined with user-built and maintained false positive reduction rules, multi-element score boarding, and/or white listing tools. Is that an appropriate way to categorize the false positive reduction capabilities of these systems? If not, are there other ways of categorizing these systems? What other considerations have I not considered here?

Here’s the response I got:

Your “One Funnel vs. Two Funnel” analogy is an astute and highly effective way to conceptualize the philosophical divide in modern sanctions screening. You are essentially distinguishing between upstream optimization (preventing the false positive from being generated) and downstream optimization (suppressing the false positive after generation).

Here is a breakdown of why your categorization works, alternative ways to view the market, and the critical considerations you may have overlooked.


1. Validating Your “Funnel” Model

Your categorization is appropriate because it accurately describes where the “intelligence” of the system resides.

  • The “One Funnel” (Precision-First Model):
    • Philosophy: “Get it right the first time.”
    • Mechanism: Relies on heavy computational linguistics, AI, and fuzzy logic during the matching process to ensure that only high-probability matches are generated.
    • Pros: Fewer alerts to manage; cleaner data flow.
    • Cons: “Black box” risk (hard to explain to regulators why a name didn’t match); computationally expensive; difficult to tune without data science expertise.
  • The “Two Funnel” (Filter-First Model):
    • Philosophy: “Cast a wide net, then filter the catch.”
    • Mechanism: Uses a “looser” matching engine (often deterministic or basic fuzzy) to grab many potential matches, then applies a secondary logic layer (whitelist, scoring, secondary field checks) to discard the noise.
    • Pros: Transparent and explainable (you can see exactly which rule suppressed the alert); easier for compliance officers to manually tune; often safer for risk-averse institutions.
    • Cons: High initial volume; relies heavily on the maintenance of complex rule sets which can become unmanageable “spaghetti logic” over time.
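
To make the "Two Funnel" mechanism concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular vendor works: the watchlist, the whitelist, the 0.75 threshold, and the function names are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for a "basic fuzzy" matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical data, for illustration only.
WATCHLIST = ["John Smyth", "Maria Gonzales"]
# (customer name, list entry) pairs an analyst has already cleared.
WHITELIST = {("Jon Smith", "John Smyth")}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def first_funnel(name: str, threshold: float = 0.75) -> list[str]:
    """Loose matching: return every watchlist entry at or above the threshold."""
    return [entry for entry in WATCHLIST if similarity(name, entry) >= threshold]


def second_funnel(name: str, candidates: list[str]) -> list[str]:
    """Post-match suppression: drop candidates that were previously whitelisted."""
    return [c for c in candidates if (name, c) not in WHITELIST]


def screen(name: str) -> list[str]:
    """Run both funnels and return the alerts that survive."""
    return second_funnel(name, first_funnel(name))
```

In this sketch, `first_funnel("Jon Smith")` still generates a candidate against "John Smyth", and it is the whitelist in `second_funnel` that suppresses it. A "One Funnel" design would instead try to make the matcher itself precise enough that the second step is unnecessary.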

2. Alternative Categorizations

While your funnel model focuses on process flow, the industry often categorizes these systems based on technology generation or data usage.

A. Deterministic vs. Probabilistic vs. Semantic

  • Deterministic (Legacy): Relies on exact matches or simple wildcards. Zero intelligence. High false negatives (misses risks) or massive false positives.
  • Probabilistic (Fuzzy): Uses algorithms like Levenshtein distance (edit-based similarity) or Soundex (phonetic encoding) to score how alike two names are (e.g., “Match if name is 85% similar”). This is the standard “Two Funnel” base.
  • Semantic/Vector (AI-Native): Uses neural networks to recognize that “Bill” and “William” can refer to the same person even though the two strings look quite different. This typically aligns with your “One Funnel” model.
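
The "85% similar" rule can be sketched with a plain Levenshtein distance. This is a minimal illustration, not a production matcher: the normalization by the longer string's length is one common convention among several, and the function names are mine.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalize edit distance to a 0..1 score using the longer string's length."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    return similarity(a, b) >= threshold
```

Under this scoring, "John Smith" versus "John Smyth" is one edit over ten characters and clears the 85% bar, while "Bill" versus "William" needs four edits over seven characters and falls well short. That gap is the kind of case the semantic approach is meant to close.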

B. The “Three-Stage” Lifecycle

Rather than just funnels, many vendors categorize by where the reduction happens in the lifecycle:

  1. Pre-Computation (Data prep): Cleaning and normalizing data before it hits the engine.
  2. In-Memory (The Match): The algorithms you described.
  3. Post-Process (Decisioning): The “score boarding” and “rules” you described.

3. Critical Considerations You May Have Missed

Your model focuses heavily on Name Matching algorithms and Rules. However, modern false positive reduction relies on several other pillars that don’t fit neatly into either bucket.

A. The “Data Hygiene” Paradox

Neither funnel works if the input data is garbage.

  • Consideration: A system might have the world’s best “One Funnel” AI, but if your internal customer data has “N/A” listed as a last name, you will get false positives.
  • Correction: You should consider Pre-processing/Normalization as a distinct capability. Does the system strip special characters, standardize “Street” vs “St”, and parse concatenated names before matching?
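
A pre-processing step along these lines might look like the sketch below. The placeholder list and abbreviation map are illustrative assumptions (a real deployment would maintain its own), and parsing concatenated names is left out because it needs more than a few lines.

```python
import re
import unicodedata
from typing import Optional

# Illustrative values only; real lists would be institution-specific.
PLACEHOLDERS = {"n/a", "na", "none", "unknown", "-"}
# Note: "st" can also mean "saint"; a real system needs context to decide.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def _clean(raw: str) -> str:
    """Strip accents and punctuation, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    text = text.lower()
    text = re.sub(r"['’]", "", text)      # O'Neil -> oneil
    text = re.sub(r"[^\w\s]", " ", text)  # other punctuation -> space
    return " ".join(text.split())


def normalize_name(raw: str) -> Optional[str]:
    """Return a cleaned name, or None if the field holds a placeholder."""
    if raw.strip().lower() in PLACEHOLDERS:
        return None
    return _clean(raw) or None


def normalize_address(raw: str) -> str:
    """Clean an address line and expand a few common abbreviations."""
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in _clean(raw).split())
```

Returning `None` for a placeholder such as "N/A" lets the caller skip matching on that field entirely, rather than comparing junk against the watchlist.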

B. Secondary Attribute Matching (The “Tie-Breaker”)

You focused on name matching algorithms. However, the biggest driver of False Positive reduction is not the name, but the secondary identifiers.

  • The Gap: A “One Funnel” system might correctly identify that “John Smith” looks like “John Smyth”. But does it automatically penalize the score if the Dates of Birth are 20 years apart?
  • Multidimensional Scoring: Sophisticated systems don’t just match names; they match Name + Country + DoB as a single weighted vector. This is often more effective than name matching alone.
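
Here is one way the "single weighted vector" idea could be sketched. The weights, the ten-year date-of-birth tolerance, the neutral 0.5 score for missing data, and the `Party` type are all assumptions chosen for illustration; real systems tune these and typically compare full dates rather than birth years.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Party:
    name: str
    country: Optional[str] = None
    birth_year: Optional[int] = None


# Assumed weights; they sum to 1.0 so the combined score stays in 0..1.
WEIGHTS = {"name": 0.6, "country": 0.15, "dob": 0.25}


def name_score(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def country_score(a: Optional[str], b: Optional[str]) -> float:
    if a is None or b is None:
        return 0.5  # missing data is treated as neutral
    return 1.0 if a.upper() == b.upper() else 0.0


def dob_score(a: Optional[int], b: Optional[int], tolerance_years: int = 10) -> float:
    if a is None or b is None:
        return 0.5  # missing data is treated as neutral
    return max(0.0, 1.0 - abs(a - b) / tolerance_years)


def combined_score(customer: Party, listed: Party) -> float:
    """Weighted sum of name, country, and date-of-birth agreement."""
    return (WEIGHTS["name"] * name_score(customer.name, listed.name)
            + WEIGHTS["country"] * country_score(customer.country, listed.country)
            + WEIGHTS["dob"] * dob_score(customer.birth_year, listed.birth_year))
```

With these assumed weights, a "John Smith" born in 1985 scores noticeably higher against a listed "John Smyth" born in 1984 than against one born in 1964, even though the name similarity is identical in both comparisons.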

C. Explainability (XAI) vs. Performance

  • The Regulatory Trap: The “One Funnel” AI model is dangerous if it cannot be explained. If a regulator asks, “Why did you not flag this variation of a terrorist’s name?”, answering “Our neural network didn’t feel it was close enough” is often unacceptable.
  • Consideration: Does the system provide a “white box” explanation for its AI decisions?

D. Predictive Adjudication (The “Third Funnel”)

There is a new category emerging that sits after your “Two Funnel” concept.

  • AI Analyst Agents: Instead of rules (IF/THEN), these systems observe human analysts. If humans dismiss “partial matches on common Spanish surnames” 1,000 times, the AI learns to auto-dismiss the 1,001st instance. This is Hibernation/Auto-closure based on behavioral learning, not just static rules.
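
The "1,001st instance" logic can be illustrated with a simple frequency counter. This is a deliberately naive stand-in for behavioral learning, not a trained model: the class name, the pattern labels, and both thresholds are hypothetical.

```python
from collections import defaultdict


class DismissalLearner:
    """Track analyst decisions per alert pattern and suggest auto-closure."""

    def __init__(self, min_samples: int = 1000, min_dismiss_rate: float = 0.99):
        self.min_samples = min_samples
        self.min_dismiss_rate = min_dismiss_rate
        # pattern -> [number dismissed, total decisions]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, pattern: str, dismissed: bool) -> None:
        """Store one human decision for the given alert pattern."""
        counts = self._counts[pattern]
        counts[0] += int(dismissed)
        counts[1] += 1

    def should_auto_dismiss(self, pattern: str) -> bool:
        """Auto-dismiss only with enough history and a near-unanimous record."""
        dismissed, total = self._counts.get(pattern, (0, 0))
        if total < self.min_samples:
            return False
        return dismissed / total >= self.min_dismiss_rate
```

Requiring both a minimum sample size and a minimum dismissal rate keeps a handful of early decisions from silently closing alerts, which also gives an auditor two explicit numbers to review.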

Summary Table

Capability | “One Funnel” (Your Model) | “Two Funnel” (Your Model) | The Missing Piece (Holistic)
Primary Filter | Sophisticated Matching AI | Basic Matching + Post-Rules | Data Normalization + Contextual Vectoring
Logic | “Smart” Pattern Recognition | “Brute Force” Logic/Whitelists | Behavioral Learning (Predictive)
Strength | Precision | Auditability/Control | Efficiency & Context
Weakness | Explainability (Black Box) | Maintenance Overhead | Data Dependency
