Category: Behind the Screen Door

  • With apologies to George Carlin

    So, how might we design a workbasket system – regardless of whether our system actually looks like workbaskets, or just uses things like database queries to display logical subsets of the screening matches? Even if we have an entity-centric model (see that earlier post), there is usually significant utility in being able to segment the screened records.

    Of course, this list of possible match segmentations is very much firm-dependent… but let’s just say these are possible ways to look at our data:

    1. Probably the most generally useful segmentation (and the most frequently implemented) is by processing stage – like New Match, Pending Approval, False Positive Matches, True Matches
    2. It is not uncommon to segment by the general data source type – employees, vendors, customers, transactions – and/or the screening method – real-time, batch file processing
    3. If multiple business lines are screening and their data is screened separately, these results are not infrequently kept separately within the workflow
    4. If multiple geographic regions or records in different languages are screened in the same system, they may be kept separately
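
    One way to picture these segmentation dimensions: each screening match carries them as attributes, and a logical workbasket is simply a filter over them. A minimal Python sketch (field names and values here are illustrative, not taken from any particular product):

```python
from dataclasses import dataclass

@dataclass
class ScreeningMatch:
    match_id: int
    stage: str          # processing stage, e.g. "New Match", "Pending Approval"
    source_type: str    # e.g. "customer", "vendor", "employee", "transaction"
    business_line: str  # e.g. "retail", "wholesale"
    region: str         # e.g. "EMEA", "APAC"

def workbasket(matches, **criteria):
    """A logical workbasket: the subset of matches meeting every criterion."""
    return [m for m in matches
            if all(getattr(m, k) == v for k, v in criteria.items())]

matches = [
    ScreeningMatch(1, "New Match", "customer", "retail", "EMEA"),
    ScreeningMatch(2, "Pending Approval", "vendor", "wholesale", "EMEA"),
    ScreeningMatch(3, "New Match", "customer", "retail", "APAC"),
]

emea_new = workbasket(matches, stage="New Match", region="EMEA")
```

    Whether those filters are materialized as named workbaskets or run as ad-hoc queries is then a presentation choice, not a data-model one.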

    From a design standpoint, workbaskets are not the only tool that can be used – and, if not well thought-out, they can lead to operational headaches. 

    When one’s screening needs cover a significant variety of data (e.g. geographies, data types, business lines), keeping them separate can lead to complex naming and organizational challenges. In theory, this apparent complexity can be better managed by alternate means; one could run multiple instances of screening systems for segments of the business that do not or should not be commingled with other segments or staff. For example, even if a screening system can otherwise keep employee screening results out of the view of staff who are not supposed to see them, a firm might choose to run that screening physically separately, as a business decision and/or to reduce the complexity of the workflow configuration.

    On the other hand, relying exclusively on capturing every variation in screening results with a large population of workbaskets brings different design challenges. The easier challenge is that many workbaskets can, at any point in time, have no results in them. That complicates the job of reviewing operational dashboards and reports. However, suppressing empty workbaskets – or even reporting them separately – largely mitigates that problem.

    The harder issue is that, just like empty workbaskets, sparse data (e.g. lots of workbaskets with screening records, but many of them with very few results) complicates reporting and MIS – as well as managing operations. One of my former clients, for example, had 3 business units, each of which had its own instance of workflow – and each of those, despite my firm’s advice otherwise, had over 100 workbaskets.

    Where’s the sweet spot here? My inclination is to minimize the number of workbaskets for the purpose of getting work done, but to provide sufficient detail so that staff can prioritize their work as day-to-day needs dictate, there is proper separation of duties and visibility control, and management conducting oversight can retrieve sufficient operational detail for reporting and control needs.

  • I’ve worked with a number of processing systems over the years, both when I was a payments guy and during my longer stint in the sanctions world. From what I’ve seen, how the work is organized and processed falls into 2 major designs. And I wanted to check – because I’ve only really been exposed to a handful of systems (Accuity’s ComplianceLink, Dow Jones’ SAM and ASAM products, EastNets SIDE SafeWatch and LexisNexis Bridger Insight XG) – whether there was something else out there, and whether my categorization was even a fair way to look at things.

    Yes, let’s discount the fact that AI systems tend to validate our assumptions… but first I asked:

    I am familiar with two basic ways of visualizing and accessing work items in sanctions screening.

    In one model, all work items are in one collection. Some systems using this model provide filters or database queries to see subsets of the data. Each work item may have multiple attributes, like department or geography, person or role assigned to work the item, and the status of the item. While staff may be able to use a filter or query to see items assigned to them, there is no separate workbasket that one could view in a product’s list of workbaskets, in a vendor-provided MIS function or in vendor-provided reports – although one could use system capabilities to see the corresponding subset of work items.

    In the other model, there are separate named workbaskets, either vendor supplied or configured by system administrators. There are typically rules for populating workbaskets when new work items are created, methods for moving items between workbaskets (e.g. when an item is escalated for approval), and workbaskets for holding completed items.

    For sanctions screening systems, does their workflow functionality fit roughly into one of these two models? Are there other generalized workflow systems used by sanctions screening systems? If so, please describe them in general terms.

    and here was the response:

    You are absolutely on the right track. The two models you described cover the vast majority of Case Management Systems (CMS) used in sanctions screening today. In industry terms, these are generally distinguished as Queue-Based (your second model) versus Attribute-Based (your first model) workflows.

    [Image: case management workflow diagram (Shutterstock)]

    Here is a breakdown of how the industry classifies these two models, followed by the “third ways” that are emerging as technology evolves.

    1. The Two Standard Models

    Model A: The Queue-Based Workflow (Your “Named Workbaskets”)

    This is the traditional “Inbox” model found in legacy banking systems and major vendor platforms (like early versions of Fircosoft or Oracle Watchlist Screening).

    • How it works: An alert is generated and physically or logically “routed” to a specific container (e.g., “EMEA Level 1 Review,” “High Net Worth Escalations”).
    • User Experience: An analyst logs in and sees a folder structure. They enter a folder and “pull” work from it.
    • Pros: Clear ownership; easy for managers to visualize backlogs (e.g., “The EMEA basket is overflowing”).
    • Cons: Inflexible. Moving items requires specific routing rules or manual transfers. If an analyst is out sick, their personal basket might sit stagnant unless an admin intervenes.
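
    The routing step that populates these containers can be sketched as an ordered list of rules where the first matching predicate wins (the rule conditions and basket names below are hypothetical):

```python
# Ordered routing rules: the first predicate that matches decides the basket.
ROUTING_RULES = [
    (lambda a: a["risk_score"] >= 90, "High Risk Escalations"),
    (lambda a: a["region"] == "EMEA", "EMEA Level 1 Review"),
]
DEFAULT_BASKET = "General Level 1 Review"

def route(alert):
    """Physically/logically assign a new alert to exactly one container."""
    for predicate, basket in ROUTING_RULES:
        if predicate(alert):
            return basket
    return DEFAULT_BASKET
```

    The inflexibility noted above follows directly from this shape: moving items means either editing the rule list or manually reassigning rows.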

    Model B: The Attribute-Based / Pool Workflow (Your “Single Collection”)

    This is the modern “Data Lake” or “Flat View” approach, common in newer, agile platforms or custom-built internal tools.

    • How it works: All alerts sit in one massive table. “Queues” are virtual—they are simply saved search filters (e.g., WHERE status='Open' AND region='Asia').
    • User Experience: The analyst sees a filtered list.
    • Pros: Highly flexible. An admin can change a “queue” definition instantly by altering the filter query without moving data rows. It supports “swarming” (many users tackling one view) easily.
    • Cons: Danger of “cherry-picking” (analysts skipping difficult items). Risk of collision (two analysts opening the same item) if the system doesn’t handle record locking well.
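
    A rough sketch of the virtual-queue idea, including the record-locking step needed to avoid the collision problem (an in-memory stand-in for the real database):

```python
import threading

alerts = [
    {"id": 1, "status": "Open", "region": "Asia", "locked_by": None},
    {"id": 2, "status": "Open", "region": "EMEA", "locked_by": None},
]
_lock = threading.Lock()

def virtual_queue(region):
    """Equivalent of: SELECT * WHERE status='Open' AND region=:region"""
    return [a for a in alerts if a["status"] == "Open" and a["region"] == region]

def claim(alert_id, analyst):
    """Atomically lock an alert so two analysts cannot open the same item."""
    with _lock:
        for a in alerts:
            if a["id"] == alert_id and a["locked_by"] is None:
                a["locked_by"] = analyst
                return True
        return False
```

    Changing a “queue” here is just changing the filter; no data rows move.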

    2. Alternative “Generalized” Workflow Systems

    While most systems are variations of the two above, there are distinct methodologies that fundamentally change how users interact with work items.

    A. The “Get Next” / Push Model (Dynamic Assignment)

    This is increasingly common in high-volume environments (like FinTechs or Tier 1 banks) to combat the “cherry-picking” problem inherent in both models above.

    • The Concept: The analyst does not see a list of items or a basket at all. They have a single button: “Get Next Case.”
    • The Logic: A backend algorithm (often called a “distribution engine”) decides in real-time what the most important item is based on SLA deadlines, risk scores, and the analyst’s specific skill set.
    • Difference: It removes the “browse and select” aspect entirely. The workflow is purely algorithmic and priority-driven.
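
    A distribution engine of this kind can be sketched as a priority selection over the cases an analyst is eligible for (the SLA and skill fields are illustrative; real engines weigh many more factors):

```python
def get_next(cases, analyst_skills):
    """Return the single most urgent eligible case, or None.

    Earliest SLA deadline wins; ties are broken by higher risk score.
    The analyst never browses a list -- they only see this one item.
    """
    eligible = [c for c in cases if c["required_skill"] in analyst_skills]
    return min(eligible,
               key=lambda c: (c["sla_deadline"], -c["risk_score"]),
               default=None)

cases = [
    {"id": "A", "sla_deadline": 30, "risk_score": 40, "required_skill": "sanctions"},
    {"id": "B", "sla_deadline": 30, "risk_score": 90, "required_skill": "sanctions"},
    {"id": "C", "sla_deadline": 5,  "risk_score": 10, "required_skill": "pep"},
]
```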

    B. Entity-Centric (Profile) Workflow

    Traditional screening workflows are Alert-Centric—you review a “hit” (a potential match). This model changes the unit of work to the Customer.

    • The Concept: Instead of generating 5 separate work items for 5 different wire transfers made by “John Smith,” the system aggregates them into a single “John Smith Profile” case.
    • The Logic: The analyst reviews the relationship, not the transaction. If they clear “John Smith” once, the decision propagates to all associated alerts.
    • Difference: The workflow is hierarchical. You don’t work “items” (transactions); you work “entities.” This significantly reduces false positive volumes but requires a much more complex data architecture (often involving Entity Resolution engines).
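
    The aggregation and decision propagation can be sketched like this (assuming entity resolution has already stamped each alert with a resolved entity name, which is the hard part in practice):

```python
from collections import defaultdict

alerts = [
    {"id": 1, "entity": "John Smith", "status": "Open"},
    {"id": 2, "entity": "John Smith", "status": "Open"},
    {"id": 3, "entity": "Acme Corp", "status": "Open"},
]

def build_cases(alerts):
    """Aggregate alert-level hits into one case per resolved entity."""
    cases = defaultdict(list)
    for a in alerts:
        cases[a["entity"]].append(a)
    return dict(cases)

def decide_entity(alerts, entity, decision):
    """Propagate a single entity-level decision to every associated alert."""
    for a in alerts:
        if a["entity"] == entity:
            a["status"] = decision
```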

    C. The “Four-Eyes” / State-Machine Model

    This is less about where the item sits and more about who has touched it. It is strictly process-enforced.

    • The Concept: The item is locked into a rigid state transition diagram (e.g., Draft -> L1 Review -> L2 Review -> Final).
    • The Logic: An item cannot physically be seen by an “Approver” until a “Maker” has digitally signed it. It focuses on the separation of duties.
    • Difference: In the Queue model, an admin might accidentally drag a file into the “Completed” folder. In a State-Machine model, the system forbids the status change unless the cryptographic logic (User A ≠ User B) is satisfied.
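
    The state-machine enforcement can be sketched as an allowed-transition set plus a maker/checker check (state names follow the example above; the enforcement mechanics are illustrative):

```python
ALLOWED = {
    ("Draft", "L1 Review"),
    ("L1 Review", "L2 Review"),
    ("L2 Review", "Final"),
}

class FourEyesError(Exception):
    pass

def transition(item, new_state, user):
    """Enforce both the rigid state diagram and maker != checker."""
    if (item["state"], new_state) not in ALLOWED:
        raise FourEyesError(f"Illegal transition {item['state']} -> {new_state}")
    if user in item["touched_by"]:
        raise FourEyesError("Separation of duties: same user cannot act twice")
    item["state"] = new_state
    item["touched_by"].add(user)
```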

    Comparison Table

    Feature | Queue-Based | Attribute-Based | “Get Next” (Push)
    --- | --- | --- | ---
    Visibility | Distinct Folders | Filtered Lists | Single Item View
    Analyst Autonomy | High (Select from list) | High (Select from list) | None (System assigns)
    Management Focus | Managing Pile Sizes | Managing Query Logic | Managing SLAs & Skills
    Best For | Complex, multi-team flows | Agile, flat teams | High-volume operational floors

    What do I think of the other models? It depends on what you need out of them. I have no problems with the Push model – if you get some MIS out of what hasn’t been pushed out to a staff member. As I get closer and closer to a currency cutoff, I’d want to know how many high-value current-day payments are sitting, for how long, and for how much total money.

    Again, I have no issues with the entity-centric model – if I can identify higher-priority items. And it would seem to bring value only for “frequent flyers”. If you’ve got 5 “hits”, as the example goes, I’d assume that the client has actually performed 100s of transactions that day. For a wholesale B-to-B environment, that would seem to work – but it seems like the theory of its utility might outweigh its value on a day-to-day basis. But, if someone wants to show me one of these in actual production… I can’t really pass judgment quite yet.

    Finally, the “four-eyes” model? Does any solution not at least provide the tools to do this? Personally, I am used to being able to restrict either access to, or visibility of, items by role or user, and/or to control the state changes / workflow processing commands by role or user. And, to be perfectly frank, having a workflow system where that sort of restriction isn’t either built-in or configurable is an audit finding waiting to happen.

    Full disclosure: In my payments days, I supervised Swiss Bank’s (now UBS) wholesale funds transfer operation. And, for Accuity, a good chunk of my role was consulting with clients on how to configure their workflow setups. Did clients always make the best decisions? uh…. they thought so.

    By the way, Gemini asked me if I wanted to show which vendors used which models…

    Based on the two models you described and the industry landscape, here is how the major sanctions screening vendors typically align with those workflow styles.

    It is important to note a key distinction in this industry: Screening Engines (the math that finds the match) vs. Case Management (the workflow tool where you click “False Positive”).

    • Some vendors (like Fircosoft) are famous for the Engine but often feed data into a separate, third-party Workflow tool.
    • Others (like Oracle and Nice Actimize) provide massive, all-in-one suites that include both.

    1. The Queue-Based Vendors (Your “Named Workbaskets” Model)

    These vendors follow the traditional banking compliance structure: hierarchies, rigid ownership, and folder-based views.

    • Oracle FCCM (Financial Services Crime and Compliance Management):
      • The Model: Classic Queue-Based.
      • How it works: Oracle is the archetype of the “Inbox” model. It uses distinct “levels” of investigation (e.g., Level 1, Level 2, QA). Administrators configure rigid routing rules (e.g., “If Country = Iran, route to Sanctions Specialist Queue”).
      • User Experience: Analysts log in and see a list of queues they have permission to access. They “checkout” or “assign” items to themselves from these queues.
    • Pega (Pegasystems):
      • The Model: Highly Configurable Queue/Process.
      • How it works: While Pega is a BPM (Business Process Management) tool rather than a dedicated screening vendor, it is the underlying “Workflow” engine for many of the world’s largest banks (often sitting on top of a Fircosoft engine).
      • User Experience: It is famous for “Get Next” (push routing) logic but is architected around “Workbaskets” where items sit until a user or robot acts on them.

    2. The Attribute-Based Vendors (Your “Single Collection” Model)

    These vendors often have a “lighter” or more modern architecture, viewing alerts as a single dataset that you filter dynamically.

    • Refinitiv (World-Check One):
      • The Model: Attribute-Based / Filter Driven.
      • How it works: In their native UI, you essentially view a list of all matches. You use facets and filters (like “Group,” “Risk Level,” or “Date”) to slice the data.
      • User Experience: You don’t usually “move” an item from a “Level 1 Folder” to a “Level 2 Folder.” Instead, you change the status attribute from “Open” to “Resolved,” and it disappears from your “Open” filter view.
    • Fircosoft (Firco Continuity / Firco Trust):
      • The Model: Hybrid (History of Queues, moving toward Attributes).
      • How it works: Legacy implementations use “decision queues.” However, because Fircosoft is so often integrated via API into a bank’s own custom interface, it is frequently the backend for “Attribute-Based” custom tools.
      • Note: If you use their out-of-the-box UI, it feels more like a list/queue system. If you use them as an engine only, you build the workflow yourself.

    3. The Entity-Centric Model (The “Third Way”)

    As mentioned, this is the shift from “working alerts” to “working customers.”

    • Nice Actimize (WL-X):
      • The Model: Entity-Centric.
      • How it works: Actimize markets this heavily. Instead of a queue of 50 alerts, you might see a queue of 10 Entities.
      • User Experience: When you open a “Case,” you see the customer profile and all their associated alerts (sanctions, negative news, PEPs) in one view. You decision the relationship, not just the transaction line item.
    • Quantexa:
      • The Model: Network/Entity-Centric.
      • How it works: They are a newer player (often overlaying legacy engines) that focuses entirely on “Context.” They don’t just show you the match; they build a network graph showing who that person is transacting with.
      • User Experience: The “Workflow” is an investigation of a network graph rather than processing a list of tickets.

    4. The “Get Next” / Push Routing Model

    This is rare as a default setting in “out of the box” software but is the standard configuration for FinTechs and Neobanks using these tools.

    • Silent Eight / Rippleshot (AI Overlays):
      • The Model: API-driven “Push.”
      • How it works: These are often AI adjudicators that sit on top of Fircosoft or Oracle.
      • User Experience: The AI pre-closes the false positives. The human analyst only sees the complex “gray area” cases, often served up one by one because the volume of “easy” stuff has been removed.

    Summary Table

    Vendor | Primary Workflow Model | Best For
    --- | --- | ---
    Oracle FCCM | Queue-Based | Large traditional banks with rigid “Level 1 / Level 2” staffing structures.
    Refinitiv (WC1) | Attribute-Based | Compliance teams needing quick, flexible filtering without complex routing logic.
    Nice Actimize | Entity-Centric | Teams that want to reduce volume by reviewing “Customers” rather than “Alerts.”
    Pega | Queue/Push (Hybrid) | Massive operations that want custom “Get Next” logic to maximize efficiency.
  • Years ago, I worked with a system that had a very extensive set of global synonym data. Every once in a while, we caught it making matches it really shouldn’t have, so we got the vendor to turn some off – since our customers wanted an explanation that made sense to them.

    From that, I wondered whether AI/LLM-based systems (my “One Funnel” model) wouldn’t make things worse.

    So, my prompt:

    I am familiar with 2 basic kinds of false positive reduction for sanctions screening systems. One model relies on sophisticated name matching algorithms rather than whitelisting or rules-based processing to minimize false positive matches. I call that a “one funnel” model.

    In contrast, a “two funnel system” uses a more basic set of matching algorithms, combined with user-built and maintained false positive reduction rules, multi-element score boarding, and/or white listing tools.

    Assume, as a starting point, that this is a valid way to categorize the overwhelming bulk of systems offered by solutions providers.

    Now, assume that third party data is in the Latin character set, but is global in nature (e.g. could represent parties from any country on Earth).

    Assume that tokens that appear on sanctions lists may also have alternative, benign meanings in other languages (e.g. “PRIMO” can mean “best” or “first”) but that, due to the low incidence of single-token sanctioned party names, the actual incidence of such collisions is very small.

    Assume, too, that the system must be tuned with a significant bias toward minimizing false negatives, although there needs to be the capability to also reduce false positives in a significant way.

    How do these considerations affect the choice of a false positive reduction model and how it is designed, configured, implemented and maintained?

    And the response I got:

    Based on your scenario—specifically the need to handle global data, the presence of rare “dictionary word” collisions, and a strict requirement to miss zero sanctions hits—the considerations heavily favor a “Two Funnel” system architecture.

    Here is the analysis of how your assumptions dictate the design, configuration, implementation, and maintenance, with key industry terms explained for clarity.


    1. Choice of Model: The “Two Funnel” Imperative

    While “One Funnel” systems (often AI-driven models that try to decide “match or no match” in a single step) are efficient, your requirement to minimize False Negatives makes them risky.

    • The Problem with One Funnel: To ensure you don’t miss any bad actors in a “One Funnel” model, you have to tell the system to be very paranoid. This lowers the confidence threshold (the minimum similarity score required to trigger an alert). Without a secondary filter, this “paranoid” setting floods you with False Positives (innocent parties wrongly flagged as sanctioned) that the algorithm can’t easily explain away.
    • The Strength of Two Funnel: This architecture separates Recall from Precision.
      • Recall (The ability of the system to catch every possible bad actor): You handle this in Funnel 1 by casting a very wide net.
      • Precision (The ability of the system to filter out the noise and only present true risks): You handle this in Funnel 2 using specific logic rules.

    Verdict: Because you cannot afford to miss a hit, you need the Two Funnel approach. It allows you to be “loose” in the initial scan to catch everything, and “strict” in the second phase to clean up the mess.


    2. Design Implications

    Designing the system under these assumptions requires a “Wide Net, Fine Mesh” architecture.

    • Funnel 1 Design (The Wide Net):
      • Algorithm Selection: Since your data is global but uses the Latin alphabet (A-Z), simple “sounds-like” algorithms (phonetics) aren’t enough. You need algorithms focused on Edit Distance (calculating how many typos or character changes it takes to turn one name into another). This catches typos and variations in transliteration (the process of converting names from scripts like Arabic or Cyrillic into English characters).
      • Segmentation: You should design the system to treat people and companies differently. A collision with the word “PRIMO” is fine for a company (e.g., “Primo Logistics”) but suspicious for a person. The design must allow different rules for different entity types.
    • Funnel 2 Design (The Fine Mesh):
      • The “Weak Alias” Logic: Because you noted that single-word collisions (like “PRIMO”) are rare but possible, you should design a “Weak Alias” or “Stop List” module. This prevents the system from generating alerts solely on common dictionary words unless there is more corroborating data (like a matching country or date of birth).
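
    The Edit Distance approach named above can be sketched with the classic Levenshtein dynamic program, normalized into a similarity score (the normalization by longest-name length is one of several conventions in use):

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Normalize distance into a 0..1 score, case-insensitively."""
    a, b = a.upper(), b.upper()
    if not a and not b:
        return 1.0
    return 1 - edit_distance(a, b) / max(len(a), len(b))
```

    This is what lets the wide net catch transliteration variants that phonetics alone would miss.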

    3. Configuration & Tuning

    The global nature of your data and the “Zero False Negative” goal drive specific configuration choices.

    • Fuzzy Matching Thresholds:
      • To minimize False Negatives (missed hits), you must configure your matching thresholds loosely (e.g., flagging names that are only 80% similar). This accounts for the messy reality of global data.
      • Effect: This guarantees the system catches variations like “Mohammed,” “Mohamad,” and “Muhamad.”
    • Handling the “PRIMO” Assumption:
      • Since actual collisions with dictionary words are statistically rare, you do not need complex AI to figure out context.
      • Configuration: Instead, implement a Token Whitelist Rule.
        • Rule Logic: “If the matched token (the specific word or part of the name) is ‘PRIMO’, AND it is the only word that matches, THEN suppress the alert.”
        • This effectively neutralizes the issue without over-engineering the system.
    • Multi-Element Score-boarding:
      • Score-boarding is a method where the system assigns “points” to a match based on various factors.
      • Since the data is global, you cannot rely on name matching alone. You must configure geographic weighting. If the customer is in “Canada” and the sanctioned party is in “Iran,” the system should deduct points from the risk score, helping automate the dismissal of the alert.
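
    The Token Whitelist Rule and the geographic weighting can be sketched together as a post-match scoring step (the token list, point deduction, and field names are all illustrative):

```python
TOKEN_WHITELIST = {"PRIMO"}  # dictionary words with benign alternative meanings

def matched_tokens(record_name, listed_name):
    return set(record_name.upper().split()) & set(listed_name.upper().split())

def score(record, listed_party, base_score):
    tokens = matched_tokens(record["name"], listed_party["name"])
    # Suppress when the ONLY matching token(s) are whitelisted dictionary words.
    if tokens and tokens <= TOKEN_WHITELIST:
        return 0
    s = base_score
    # Geographic weighting: deduct points when countries diverge.
    if record.get("country") and listed_party.get("country") and \
            record["country"] != listed_party["country"]:
        s -= 20
    return s
```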

    4. Implementation Strategy

    Implementing this system requires a focus on cleaning data before the matching even begins.

    • Pre-Processing (The Hidden Funnel):
      • Because the data is global, implementation must include a normalization layer. This means stripping out accents, umlauts, and special characters (converting ñ to n, ç to c) so that the basic matching algorithms in Funnel 1 can compare “apples to apples.”
    • Testing for “Zero Miss” Bias:
      • You cannot just test random names. You must implement a “Sanctions Testing Deck”—a dataset of known sanctioned entities with intentionally corrupted spellings (e.g., testing “Usama bin Ladin” to see if it matches “Osama bin Laden”). The system is only ready for launch if Funnel 1 catches 100% of these variations.
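
    The normalization layer can be sketched with Python’s standard unicodedata module (a minimal version handling only diacritics, case, and whitespace; production layers do considerably more):

```python
import unicodedata

def normalize(name):
    """Strip diacritics (n~ -> n, c-cedilla -> c), fold case and whitespace."""
    # NFKD splits accented characters into base letter + combining mark...
    decomposed = unicodedata.normalize("NFKD", name)
    # ...then we drop the combining marks, keeping only base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.upper().split())
```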

    5. Maintenance & Governance

    The choice of a Two Funnel system shifts the operational burden from monitoring a “Black Box” AI to managing a set of transparent rules.

    • Rule Decay:
      • The “PRIMO” rule works today. But if the government sanctions a vessel named “PRIMO” tomorrow, your rule essentially creates a False Negative (a blind spot).
      • Maintenance Protocol: You must implement a “List Management” process. Every time a sanctions list is updated, an automated script must check if any new sanctioned names conflict with your existing “Good Guy” rules.
    • False Negative Tuning:
      • To ensure you aren’t missing things, you must perform “Below the Line” testing.
      • Process: Periodically look at customers who did not generate an alert. Run them against a secondary, more aggressive screening tool. If you find a missed match, you know you need to loosen the settings in Funnel 1.

    Summary Table

    Feature | Effect of your Assumptions
    --- | ---
    Model Choice | Two Funnel. It allows you to keep the “Screening” loose (to catch everything) while using “Rules” to manage the resulting noise.
    Algorithms | Must focus on Edit Distance (fuzzy matching) rather than phonetics to handle global spelling differences.
    Linguistic Noise | Addressed via exclusion lists (lists of words to ignore) due to the low frequency of these collisions.
    Risk Bias | False Negative intolerance dictates loose matching thresholds in Funnel 1 to ensure high Recall.
    Maintenance | Focus shifts to Rule Hygiene (ensuring your rules don’t accidentally hide new sanctions).

    Was my prompt exactly what I wanted to say? Nah, but it was pretty close – yes, in our experience, we had very few collisions between tokens that had meanings in multiple languages. But I think the larger point is that matching is, largely, a multiple-token business – with the exception of cargo vessels, of course. So, even if you hit one of those colliding tokens, you will usually have to match something else in addition – and that reduces the likelihood you’ve got a match to review.

    It also highlights a design consideration for matching systems. Do you need synonyms, semantic equivalents, etc. for an extensive set of common terms – or only for tokens in sanctions listings? I suspect it’s the latter, which might call into question the risk/reward of LLMs in cases where the source data is more heterogeneous.

  • These evaluation questions are designed to cut through the marketing fluff. Vendors often use terms like “AI-powered” loosely; these questions will force them to reveal whether they are truly building a “One Funnel” (precision) architecture or simply patching a “Two Funnel” (filtering) system.

    The evaluation framework has been restructured to distinguish between universal “Funnel” mechanics (applicable to both worlds) and those features that primarily drive efficiency in Static Data (KYC/Customer) screening.

    Part 1: Universal Design Considerations

    These factors are critical regardless of whether you are screening a static client database or a live wire transfer. They address the fundamental “One Funnel vs. Two Funnel” architecture of the engine itself.

    1. The Matching Engine Mathematics (The “Funnel” Type)

    Whether screening a customer name or a payment string, the core algorithm dictates the baseline efficiency.

    • Question: “Is the matching engine based on deterministic/probabilistic algorithms (e.g., Levenshtein, Jaro-Winkler) or Semantic AI (Vector/Neural Networks)?”
    • Why it’s Universal: Both transaction and static screening suffer from basic fuzzy matching errors (e.g., over-flagging “Main Street”). A “One Funnel” (Semantic) engine reduces noise in both environments by understanding context and meaning rather than just character overlap.

    2. Synonym & Variation Handling

    • Question: “Does the system identify synonyms (e.g., Bill = William, Ltd = Limited) via a hardcoded dictionary or learned semantic associations?”
    • Why it’s Universal: A terrorist or sanctioned entity can disguise their name in a payment instruction just as easily as they can in account opening documents. If the system relies on a manual dictionary (Two Funnel), you have to maintain that dictionary for both workflows.

    3. Data Normalization & Cleaning (Pre-Computation)

    • Question: “How does the system handle dirty or concatenated strings (e.g., ‘IBM_CORP_NY’ or ‘PaymentRef:Inv#1234’)?”
    • Why it’s Universal:
      • In Static: Bad data entry exists in legacy systems.
      • In Transactions: SWIFT/ISO messages often cram names, addresses, and references into single free-text fields.
      • Note: The system must be able to parse and clean these strings before matching, or the “One Funnel” engine will fail.

    4. Model Governance & Explainability

    • Question: “Can you generate a report explaining why a specific variation was not flagged?”
    • Why it’s Universal: Regulators (OFAC/OFSI) require validation for both systems. If you cannot explain why a wire transfer wasn’t stopped (False Negative), or why a customer wasn’t flagged, the regulatory penalty is the same.

    Part 2: Static Data-Specific Considerations

    These considerations rely on rich, structured data (Dates of Birth, Citizenship, full addresses). They are highly effective for “Two Funnel” reduction in Static Data screening but are less applicable or difficult to implement in Transaction Screening because payment messages are often ephemeral, unstructured, and lack these specific data fields.

    1. Multi-Dimensional Scoring (The “Tie-Breaker”)

    • The Capability: Weighting the match score based on non-name attributes. (e.g., “If Name matches 100% but Year of Birth is >10 years apart, reduce score by 50%”).
    • Why it’s Static-Dominant:
      • Static Data: You almost always have the KYC data (DOB, Country of Citizenship) to compare against the sanctions list.
      • Transaction Screening: A standard SWIFT MT103 or ISO 20022 message often does not contain the Date of Birth or Citizenship of the beneficiary. Therefore, a scoring model capable of weighing these factors is useless for the vast majority of payments.
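
    The tie-breaker logic – and why it degrades to nothing when the data element is absent, as in payment messages – can be sketched as follows (the 50% reduction and 10-year window follow the example above; the function name is made up):

```python
def adjusted_score(name_score, customer_yob, listed_yob):
    """Reduce a name-match score when years of birth are far apart.

    Only usable when the record carries a date of birth; a SWIFT MT103
    or ISO 20022 message typically does not, so the score passes through
    unchanged in the transaction-screening case.
    """
    if customer_yob is None or listed_yob is None:
        return name_score  # no secondary element available
    if abs(customer_yob - listed_yob) > 10:
        return name_score * 0.5
    return name_score
```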

    2. Entity-Based Whitelisting (“Golden Record” Suppression)

    • The Capability: Permanently suppressing a match for a specific customer entity ID after a human reviews it (e.g., “Client ID 12345 is NOT the terrorist John Smith. Never flag him again.”).
    • Why it’s Static-Dominant:
      • Static Data: You screen the same unique “Customer ID” periodically. Once cleared, the “Two Funnel” whitelist prevents re-alerting on that ID.
      • Transaction Screening: You are screening a string of text in a message, not a “Customer ID.” The text might change slightly (“J. Smith” vs “Mr. John Smith”). You cannot reliably “whitelist” a text string without risking that a bad actor might use that same string later.
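
    A sketch of entity-keyed suppression, which only works because the customer ID is stable across screening runs (the identifiers below are made up):

```python
suppressed = set()  # (customer_id, listed_party_id) pairs cleared by a human

def record_decision(customer_id, listed_id, is_true_match):
    """A reviewed non-match is remembered against the stable customer ID."""
    if not is_true_match:
        suppressed.add((customer_id, listed_id))

def should_alert(customer_id, listed_id):
    """Re-screening the same cleared ID will not re-alert.

    A free-text payment string has no such stable key, which is why this
    technique is risky for transaction screening.
    """
    return (customer_id, listed_id) not in suppressed
```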

    3. Delta Screening (Trigger-Based Scanning)

    • The Capability: Only screening records that have changed or when the sanctions list updates, rather than re-screening the whole database every day.
    • Why it’s Static-Dominant:
      • Static Data: This is the primary efficiency driver for customer databases.
      • Transaction Screening: Every transaction is a “new” event. You cannot “Delta screen” a wire transfer; you must screen the whole message every time it occurs.
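
    Delta screening can be sketched as a change-detection check, e.g. by hashing each record’s screenable content (a simplified single-process version):

```python
import hashlib

last_seen = {}  # record_id -> content hash at last screening

def needs_screening(record_id, record_text, list_version_changed=False):
    """Screen only if the record changed or the sanctions list did."""
    digest = hashlib.sha256(record_text.encode("utf-8")).hexdigest()
    changed = last_seen.get(record_id) != digest
    last_seen[record_id] = digest
    return changed or list_version_changed
```

    Every wire transfer, by contrast, arrives as a new event with no prior hash to compare against, so this check never suppresses anything there.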

    Summary of Differences

    Feature | Importance in Static Screening | Importance in Transaction Screening
    --- | --- | ---
    Semantic AI Engine | High (Reduces review volume) | High (Reduces stopped payments)
    DOB/Nationality Weighting | Critical (Major false positive reducer) | Low (Data rarely exists in message)
    Entity Whitelisting | Critical (Prevents “Groundhog Day” reviews) | Low (Too risky to whitelist text strings)
    Address Fuzzy Matching | High (Matches client address to sanctioned city) | Medium (Payment addresses are often unstructured)
  • So, you’ve got your matching algorithm and you may have applied some False Positive Reduction (FPR) tools and techniques. But, you still have some matches left that you need to research and make conclusions about. How do you organize and process them? By having some sort of workflow functionality in your screening system, of course.

    Workflow involves, at a bare minimum, the following:

    • A way of organizing new matched records
    • A way of moving records along a path from initial research, to one or more approvers
    • A way of, at the very end of the process, organizing items that were false positives (looked like a match, but were not) and true matches
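As a sketch, that bare-minimum path can be modeled as a small state machine; the state names and transitions here are mine, not from any particular product:

```python
# Minimal match-workflow state machine; terminal states have no exits.
ALLOWED = {
    "NEW": {"PENDING_APPROVAL"},
    "PENDING_APPROVAL": {"FALSE_POSITIVE", "TRUE_MATCH", "NEW"},  # approver may send back
    "FALSE_POSITIVE": set(),
    "TRUE_MATCH": set(),
}

def advance(state, new_state):
    """Move a match along the path, rejecting illegal jumps."""
    if new_state not in ALLOWED[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```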

    That is a bare-bones list. A more robust screening system should also support:

    • Ways to document the research performed and the justification for the decision made
    • Ways to notify external parties of actions taken (e.g. for Compliance Officers who only log into the screening system in specific instances)
    • Robust real-time MIS dashboards and reporting

    Making the system conform to one’s risk management standards also requires a design that can be configured to segment the work (e.g. along geographic or business unit lines), and to control access and visibility of the work.
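A sketch of that segmentation-plus-visibility idea, assuming each match is tagged with a region and business unit (the field names are invented for illustration):

```python
def visible_matches(matches, user_segments):
    """Show a reviewer only matches in segments they are entitled to see."""
    return [m for m in matches
            if (m["region"], m["business_unit"]) in user_segments]
```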

    I will cover all these in subsequent, bite-sized posts. As you may have noticed, this is all me – no Gemini involved. I’ve had exposure to a lot of workflow systems in my day, both when I was a sanctions nerd and, before then, a wholesale payments nerd.

  • I have always looked at False Positive Reduction (FPR) in sanctions screening in a certain way – that there were systems that relied on matching algorithms and others that relied more on user-built tools like scorecards and FPR rules.

    What did Gemini think of my way of looking at things?

    First, my prompt:

    I am familiar with 2 basic kinds of false positive reduction for sanctions screening systems. One model relies on sophisticated name matching algorithms rather than whitelisting or rules-based processing to minimize false positive matches. I call that a “one funnel” model. In contrast, a “two funnel system” uses a more basic set of matching algorithms, combined with user-built and maintained false positive reduction rules, multi-element score boarding, and/or white listing tools. Is that an appropriate way to categorize the false positive reduction capabilities of these systems? If not, are there other ways of categorizing these systems? What other considerations have I not considered here?

    Here’s the response I got:

    Your “One Funnel vs. Two Funnel” analogy is an astute and highly effective way to conceptualize the philosophical divide in modern sanctions screening. You are essentially distinguishing between upstream optimization (preventing the false positive from being generated) and downstream optimization (suppressing the false positive after generation).

    Here is a breakdown of why your categorization works, alternative ways to view the market, and the critical considerations you may have overlooked.


    1. Validating Your “Funnel” Model

    Your categorization is appropriate because it accurately describes where the “intelligence” of the system resides.

    • The “One Funnel” (Precision-First Model):
      • Philosophy: “Get it right the first time.”
      • Mechanism: Relies on heavy computational linguistics, AI, and fuzzy logic during the matching process to ensure that only high-probability matches are generated.
      • Pros: Fewer alerts to manage; cleaner data flow.
      • Cons: “Black box” risk (hard to explain to regulators why a name didn’t match); computationally expensive; difficult to tune without data science expertise.
    • The “Two Funnel” (Filter-First Model):
      • Philosophy: “Cast a wide net, then filter the catch.”
      • Mechanism: Uses a “looser” matching engine (often deterministic or basic fuzzy) to grab many potential matches, then applies a secondary logic layer (whitelist, scoring, secondary field checks) to discard the noise.
      • Pros: Transparent and explainable (you can see exactly which rule suppressed the alert); easier for compliance officers to manually tune; often safer for risk-averse institutions.
      • Cons: High initial volume; relies heavily on the maintenance of complex rule sets which can become unmanageable “spaghetti logic” over time.
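A toy “two funnel” sketch: a deliberately loose first pass, then user-style suppression rules. The token-overlap matcher and the sample rule are simplifications of mine, not a real engine:

```python
def screen(name, list_names, rules):
    """Funnel 1: crude token overlap. Funnel 2: rule-based suppression."""
    hits = []
    for list_name in list_names:
        overlap = set(name.lower().split()) & set(list_name.lower().split())
        if overlap:
            hit = {"name": name, "list_name": list_name, "matched_tokens": overlap}
            if not any(rule(hit) for rule in rules):  # any rule may discard it
                hits.append(hit)
    return hits

# Example user-built rule: a hit on a common surname alone is noise.
common_surname_only = lambda h: h["matched_tokens"] <= {"smith", "garcia"}
```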

    2. Alternative Categorizations

    While your funnel model focuses on process flow, the industry often categorizes these systems based on technology generation or data usage.

    A. Deterministic vs. Probabilistic vs. Semantic

    • Deterministic (Legacy): Relies on exact matches or simple wildcards. Zero intelligence. High false negatives (misses risks) or massive false positives.
    • Probabilistic (Fuzzy): Uses algorithms like Levenshtein Distance or Soundex to measure similarity scores (e.g., “Match if name is 85% similar”). This is the standard “Two Funnel” base.
    • Semantic/Vector (AI-Native): Uses Neural Networks to understand that “Bill” and “William” are the same entity without sharing letters. This typically aligns with your “One Funnel” model.
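The Levenshtein-style similarity mentioned above can be sketched like this; the normalized score supports rules in the “Match if name is 85% similar” style:

```python
def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb))) # substitution
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Normalized similarity in [0, 1], for threshold rules like >= 0.85."""
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))
```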

    B. The “Three-Stage” Lifecycle

    Rather than just funnels, many vendors categorize by where the reduction happens in the lifecycle:

    1. Pre-Computation (Data prep): Cleaning and normalizing data before it hits the engine.
    2. In-Memory (The Match): The algorithms you described.
    3. Post-Process (Decisioning): The “score boarding” and “rules” you described.

    3. Critical Considerations You May Have Missed

    Your model focuses heavily on Name Matching algorithms and Rules. However, modern false positive reduction relies on several other pillars that don’t fit neatly into either bucket.

    A. The “Data Hygiene” Paradox

    Neither funnel works if the input data is garbage.

    • Consideration: A system might have the world’s best “One Funnel” AI, but if your internal customer data has “N/A” listed as a last name, you will get false positives.
    • Correction: You should consider Pre-processing/Normalization as a distinct capability. Does the system strip special characters, standardize “Street” vs “St”, and parse concatenated names before matching?
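A sketch of that pre-processing step; the abbreviation map, the junk-value list, and the treatment of “N/A”-style placeholders are illustrative assumptions:

```python
import re

ABBREV = {"st": "street", "rd": "road", "ave": "avenue"}
JUNK = {"n/a", "na", "none", "unknown", ""}

def normalize(field):
    if field.strip().lower() in JUNK:
        return ""                                    # drop junk placeholder values
    s = re.sub(r"[^a-z0-9\s]", " ", field.lower())   # strip special characters
    tokens = [ABBREV.get(t, t) for t in s.split()]   # standardize abbreviations
    return " ".join(tokens)
```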

    B. Secondary Attribute Matching (The “Tie-Breaker”)

    You focused on name matching algorithms. However, the biggest driver of False Positive reduction is not the name, but the secondary identifiers.

    • The Gap: A “One Funnel” system might correctly identify that “John Smith” looks like “John Smyth”. But does it automatically penalize the score if the Dates of Birth are 20 years apart?
    • Multidimensional Scoring: Sophisticated systems don’t just match names; they match Name + Country + DoB as a single weighted vector. This is often more effective than name matching alone.
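A sketch of scoring Name + Country + DoB as a single weighted result; the weights and the ten-year DoB taper are illustrative assumptions, not an industry standard:

```python
def match_score(name_sim, dob_years_apart=None, country_match=None):
    """Blend name similarity with secondary identifiers into one score."""
    score = 0.6 * name_sim                       # name carries the most weight
    if dob_years_apart is not None:              # taper to zero over 10 years
        score += 0.25 * max(0.0, 1 - dob_years_apart / 10)
    if country_match is not None:
        score += 0.15 * (1.0 if country_match else 0.0)
    return score
```

Note how a 20-year DoB gap zeroes out the DoB contribution, penalizing an otherwise perfect name match.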

    C. Explainability (XAI) vs. Performance

    • The Regulatory Trap: The “One Funnel” AI model is dangerous if it cannot be explained. If a regulator asks, “Why did you not flag this variation of a terrorist’s name?”, answering “Our neural network didn’t feel it was close enough” is often unacceptable.
    • Consideration: Does the system provide a “white box” explanation for its AI decisions?

    D. Predictive Adjudication (The “Third Funnel”)

    There is a new category emerging that sits after your “Two Funnel” concept.

    • AI Analyst Agents: Instead of rules (IF/THEN), these systems observe human analysts. If humans dismiss “partial matches on common Spanish surnames” 1,000 times, the AI learns to auto-dismiss the 1,001st instance. This is Hibernation/Auto-closure based on behavioral learning, not just static rules.

    Summary Table

    | Capability | “One Funnel” (Your Model) | “Two Funnel” (Your Model) | The Missing Piece (Holistic) |
    | --- | --- | --- | --- |
    | Primary Filter | Sophisticated Matching AI | Basic Matching + Post-Rules | Data Normalization + Contextual Vectoring |
    | Logic | “Smart” Pattern Recognition | “Brute Force” Logic/Whitelists | Behavioral Learning (Predictive) |
    | Strength | Precision | Auditability/Control | Efficiency & Context |
    | Weakness | Explainability (Black Box) | Maintenance Overhead | Data Dependency |

  • The prompt:

    What are the most prominent sanctions screening challenges for Islamic personal names, and how are these addressed by current software and data solutions in the market? If there are notable challenges that are more prominent for male names, or female names, list and explain them. If there are notable challenges specific to transaction screening, list and explain them. Provide a source list (with URLs in parentheses) but do not produce citation markers within the response.

    The response:

    Prominent Sanctions Screening Challenges: Islamic Personal Names

    Screening Islamic personal names for sanctions compliance presents unique linguistic and cultural challenges. These difficulties often lead to high rates of “false positives” (incorrectly flagging innocent individuals) or “false negatives” (failing to catch sanctioned entities).

    1. General Screening Challenges

    These challenges apply broadly across Islamic naming conventions, regardless of gender.

    • Transliteration Variability (Romanization): The most significant challenge is converting names from Arabic script (or Persian/Urdu) into Latin characters for Western watchlists.
      • Vowel Ambiguity: Arabic script often omits short vowels. A name like “Mohammed” can be validly transliterated as Mohamed, Muhammad, Muhamed, Mahomet, etc.
      • Letter Substitution: Certain Arabic sounds have no direct English equivalent. The letter Qaf might be transliterated as ‘Q’, ‘K’, or ‘G’. For example, Qasem vs. Kasem vs. Gasem.
    • The “Mohammad Bias” (Frequency Noise): The name Mohammad (and its variants) is the most common given name in the world. In sanctions lists, it appears frequently as a first name or part of a patronymic chain.
      • Challenge: Simple exact-match or loose fuzzy-match logic often triggers thousands of false positives because the name is so ubiquitous.
    • Complex Naming Structures (The Chain of Names): Unlike the Western “First-Middle-Last” format, traditional Islamic names often follow a chain:
      • Ism: Proper name (e.g., Ali).
      • Nasab: Patronymic chain indicating lineage, often using bin or ibn (son of). (e.g., Ali bin Ahmed bin Saeed).
      • Nisba: Adjective indicating origin, tribe, or affiliation (e.g., Al-Masri = “The Egyptian”).
      • Challenge: Western systems frequently mistake bin Ahmed for a middle name or Al-Masri for a surname, failing to match when the components are reordered or when bin is dropped.
    • Honorifics and Titles: Titles are often embedded into the name string and can be mistaken for part of the name.
      • Examples: Sheikh, Hajji, Sayyid, Imam.
      • Challenge: If a sanctions list includes “Sheikh Yassin” and a customer is onboarded as “Yassin” (without the title), a strict system might miss the match. Conversely, treating “Sheikh” as a first name causes false alerts against anyone with that title.
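A sketch combining two of the fixes above – treating particles as connectors and folding transliteration variants – with a deliberately tiny variant map (real solutions use large curated dictionaries):

```python
# Particles are dropped as connectors; a variant map folds common
# transliterations onto one canonical token. Both lists are tiny samples.
PARTICLES = {"bin", "ibn", "binti", "bint", "al", "el", "abu", "umm"}
VARIANTS = {"mohamed": "muhammad", "mohammed": "muhammad", "muhamed": "muhammad",
            "kasem": "qasem", "gasem": "qasem"}

def core_tokens(name):
    """Reduce a name to comparable core tokens."""
    tokens = name.lower().replace("-", " ").split()
    return {VARIANTS.get(t, t) for t in tokens if t not in PARTICLES}
```

Because the result is a set of canonical tokens, reordered components (a common failure mode noted above) compare equal.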

    2. Gender-Specific Challenges

    While many challenges are shared, distinct issues affect male and female names differently.

    Challenges More Prominent for Male Names:

    • Use of Kunya (Technonyms): Men are frequently referred to by a Kunya—a name meaning “Father of [Eldest Son’s Name]”—rather than their legal given name.
      • Example: Abu Mazen (Father of Mazen).
      • Challenge: Sanctions lists often designate terrorists or political figures by their Kunya (nom de guerre). If a bank screens only the legal name (Mahmoud Abbas), they may miss the alias Abu Mazen.
    • Nasab Chains: The patronymic chain (bin X bin Y) is prevalent in male names and often leads to truncation errors where the grandfather’s name is mistaken for the surname.

    Challenges More Prominent for Female Names:

    • Teh Marbuta (Feminine Suffix) Transliteration: Many female names end with the Teh Marbuta character (ة), which denotes femininity.
      • Challenge: When transliterated, this can be written as an ‘a’, ‘h’, or ‘ah’. For example, Fatima vs. Fatimah. This inconsistency frequently breaks exact-match algorithms.
    • “Umm” (Mother of) Titles: Similar to the male Abu, women may be known as Umm [Son’s Name].
      • Challenge: Screening systems that do not account for “Umm” as a title rather than a name component may fail to match against lists using the legal name.
    • Maiden vs. Married Names: In many Islamic cultures, women traditionally retain their father’s family name after marriage rather than adopting their husband’s.
      • Challenge: Western compliance systems often assume a “maiden name” field is necessary or that a name change has occurred post-marriage. Incorrectly forcing a husband’s surname onto a female customer can lead to a data mismatch against sanctions lists that use her legal (father’s) surname.

    3. Transaction Screening Challenges

    Transaction screening involves monitoring payment messages (like SWIFT/ISO 20022) in real-time. Specific challenges for Islamic finance include:

    • Religious Terminology False Positives: Transaction description fields often contain religious terms that fuzzy matching algorithms mistake for names of sanctioned entities or groups.
      • Zakat / Sadaqah: Terms for charity. Can trigger false positives against entities with similar names (e.g., Al-Zakat Foundation).
      • Hajj / Umrah: Pilgrimage terms. “Hajj” is also a title (Hajji) often found in sanctioned names, leading to stops on payments simply labeled “Savings for Hajj.”
      • Eid: Festival names (e.g., Eid Gift) can trigger matches against individuals named Eid.
    • Compound Tokens in Institution Names: Islamic financial institutions often use compound names that get broken apart incorrectly.
      • Example: Islamic Bank of [Country].
      • Challenge: Generic words like “Islamic” or “Bank” are high-frequency. If the screening software does not treat the institution name as a single entity, it may flag the transaction based on partial matches with sanctioned banks containing those generic terms.

    4. Current Software and Data Solutions

    The market has shifted from simple “fuzzy matching” to more sophisticated AI and culture-aware approaches.

    • Culture-Sensitive Algorithms: Advanced software now includes specific logic for Arabic naming conventions.
      • Solution: These tools can identify and ignore “noise” words (like bin, al, abu) during matching or treat them as connectors rather than core name parts. They can also recognize that Mohammed and Mohd are equivalent.
    • Multi-Script Screening: Instead of relying solely on Latin transliteration, newer solutions screen names in their original Arabic script against the original script versions of sanctions lists.
      • Solution: This bypasses the transliteration ambiguity entirely (e.g., matching محمد directly to محمد).
    • Secondary Identifier Matching: To combat the “Mohammad Bias,” systems now weigh secondary data points (Date of Birth, Nationality, Location) much more heavily.
      • Solution: A match on the name “Mohammad Ali” is automatically discarded by the software unless the Year of Birth or Country also matches, reducing manual review volume.
    • AI & Entity Resolution: Machine learning models are trained on vast datasets of name variants to predict valid matches.
      • Solution: These models “learn” that Fatima and Fatimah are the same entity without needing hard-coded rules, and they can distinguish between a person named Hajji and a payment reference to a pilgrimage.

    Source List

  • This uses a similar prompt to the one I used for Islamic names, but it worked better when I removed the part about particles, which is specific to Islamic names. And I had to remind it to give me a source list instead of citation markers.

    Some of this I knew – like the patronymic name and the gendered endings – but not in this detail or nuance.

    Oh, and even though it’s pointed out in the text, let me introduce this by saying this is most relevant for Russian, Belarusian, and Ukrainian names.

    Here is the explanation of Eastern Slavic names and identity documents, written for a layperson.

    1. The “Three-Name” Structure

    In English-speaking countries, people typically have a First Name, Middle Name, and Last Name. In Eastern Slavic cultures (Russia, Ukraine, Belarus), the structure is strictly defined and always consists of three parts in a specific order:

    1. Given Name (First Name): The personal name chosen at birth (e.g., Ivan, Natalia).
    2. Patronymic (Father’s Name): This is not a “middle name” in the Western sense because it is not chosen freely. It is derived directly from the father’s first name and literally means “son of [Father]” or “daughter of [Father].”
    3. Surname (Family Name): The name inherited from the parents.

    How the Patronymic Works:

    If a man named Ivan has a son named Boris and a daughter named Elena, their full names would be:

    • Son: Boris Ivanovich (Boris, son of Ivan)
    • Daughter: Elena Ivanovna (Elena, daughter of Ivan)

    If the father’s name was Sergey, the names would be Boris Sergeevich and Elena Sergeevna.
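The two examples generalize into a toy rule; real Russian morphology has more cases than this sketch handles, so treat it as illustrative only:

```python
def patronymic(father, gender):
    """Form a patronymic from the father's given name (simplified rule)."""
    if father.endswith("y"):   # Sergey -> Sergeevich / Sergeevna
        stem = father[:-1] + "e"
    else:                      # Ivan -> Ivanovich / Ivanovna
        stem = father + "o"
    return stem + ("vich" if gender == "m" else "vna")
```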

    (Note: In daily life, adults often address each other by their Given Name + Patronymic as a sign of respect, rather than using Mr. or Mrs.)

    2. Male vs. Female Differences

    In English, a surname like “Smith” looks the same whether it belongs to a man or a woman. In Eastern Slavic languages, names act like adjectives—they change their shape to match the gender of the person holding the name.

    The table below shows how the endings (suffixes) typically change.

    | Name Part | Male Form | Female Form |
    | --- | --- | --- |
    | Given Name | Usually ends in a consonant (e.g., Ivan, Vladimir, Anton). | Usually ends in “a” or “ya” (e.g., Maria, Svetlana, Tatiana). |
    | Patronymic | Ends in -vich or -ich. | Ends in -vna or -na. |
    | Surname | Ends in a consonant (e.g., Ivanov). | Adds an “a” to the end (e.g., Ivanova). |

    Example of a Married Couple:

    • Husband: Alexei Petrovich Volkov
    • Wife: Tatiana Sergeevna Volkova
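As a sketch of the surname suffix rule (simplified, not full morphology; it also reflects the gender-neutral Ukrainian -enko ending discussed further down):

```python
def female_surname(male_form):
    """Feminize -ov/-ev/-in surnames by adding 'a'; -enko forms don't change."""
    if male_form.endswith(("ov", "ev", "in")):
        return male_form + "a"
    return male_form  # e.g. Ukrainian -enko surnames are gender-neutral
```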

    3. Differences Across Countries and Regions

    While Russia, Ukraine, and Belarus share the same basic system, there are distinct cultural differences.

    Russia

    Russian names follow the standard model most strictly. Surnames often end in -ov, -ev, or -in. The gender difference (adding an “a” for women) is almost universal here.

    Ukraine

    Ukrainian names have distinct endings.

    • The “-enko” Ending: Many Ukrainian surnames end in -enko (e.g., Shevchenko, Tkachenko). These are gender-neutral; they do not change for women. A brother and sister will both be named Shevchenko.
    • Spelling: When names are written in English (transliteration), Ukrainian spellings differ from Russian ones. For example, the Russian “Vladimir” becomes “Volodymyr” in Ukraine; “Alexander” becomes “Oleksandr.”

    Belarus

    Belarusian names share features with both Russian and Polish traditions.

    • The “-ich” Ending: Surnames ending in -ich or -vich are very common (e.g., Bogdanovich). In Russia, “-vich” usually signals a middle name (patronymic), but in Belarus, it is frequently a last name.
    • Phonetics: Belarusian spelling often reflects a strong “A” sound where Russian uses an “O.” For example, the name “Olga” may appear as “Volha.”

    4. Identity Documents and Lifecycle Updates

    In these countries, people generally have two different passports:

    1. Internal Passport (Domestic ID): Used for everything inside the country—banking, buying train tickets, voting, or getting married.
    2. International Passport: Used only for traveling abroad.

    Russia

    • Birth: A child receives a Birth Certificate. This is their only ID until age 14.
    • First Internal Passport: Issued at age 14.
    • Mandatory Updates: The internal passport expires and must be replaced at age 20 and again at age 45. If you do not replace it, the ID becomes invalid, and you cannot use a bank account or sign legal documents.
    • Marriage: If a woman takes her husband’s surname, her internal passport becomes invalid immediately, and she must apply for a new one within 30 days.
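The replacement schedule above reduces to a simple age-bracket check. This sketch uses whole-year ages; a real check would compare full dates and allow for grace periods:

```python
def internal_passport_valid(current_age, issued_at_age):
    """Russian internal passport: issued at 14, replaced at 20 and at 45."""
    if current_age < 20:
        return issued_at_age >= 14
    if current_age < 45:
        return issued_at_age >= 20
    return issued_at_age >= 45
```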

    Ukraine

    Ukraine has transitioned from old blue booklets to modern biometric ID cards.

    • First ID Card: Issued at age 14.
    • Validity: The first card is valid for 4 years (until age 18). After age 18, cards are issued for 10 years.
    • Old Passports: Citizens who still have the old booklet-style passports (issued before 2016) were required to paste new photos into the booklet at ages 25 and 45 to keep them valid.

    Belarus

    Belarus traditionally used a single passport for both domestic and international use, though they recently introduced biometric ID cards.

    • First Passport: Issued at age 14.
    • Validity: Passports are valid for 10 years.
    • Updates: Like Russia and Ukraine, a name change due to marriage requires a new passport.

    Sources

    • Wikipedia (wikipedia.org) – East Slavic naming customs; Eastern European name suffixes.
    • Gosuslugi – Russian Federation Portal of State Services (gosuslugi.ru) – Passport issuance and replacement ages.
    • State Migration Service of Ukraine (dmsu.gov.ua) – ID card validity and regulations.
    • Ministry of Foreign Affairs of the Republic of Belarus (mfa.gov.by) – Passport regulations.
    • MasterRussian (masterrussian.com) – Grammar rules for names.
  • This sort of summarization task, especially when you can rein it in by making AI share its work, and by not asking it make conclusions about the data it finds, is a fine use for Gemini.

    This was my prompt:

    In layperson terms, explain the structure of personal Islamic names. Include a section on the particles prevalent in such names. Explain the differences between male and female names. Include a section on the differences across countries, regions and cultures. If using linguistic terms like demonym or patronym, define them in parentheses after their first use. Also explain, by country, identity documents and how or when they are updated (e.g. when children are born). Include sources, but instead of inserting citation markers throughout the response, provide a source list, including URLs in parentheses (as appropriate), at the end of the response. If multiple citations refer to the same base domain, refer to the base domain instead of the specific page.

    And this was its response – I knew the real basics (e.g. the 5 name parts), but having it lay out some of the details, like the national identity documents, was beyond my available time and abilities:

    The structure of Islamic names can seem complex to the outsider because, historically, they functioned more like a “chain of identity” than a simple “First Name + Last Name” format used in the West.

    In many parts of the Arab world, a person’s name tells a story about who they are, who their father is, and where they come from. However, this structure varies significantly depending on the country and local culture.

    The Classical Structure of an Islamic Name

    A traditional name is often built from five distinct building blocks. While modern official documents often shorten these, understanding the full chain is key to understanding the name.

    1. Ism (The Given Name): This is the personal name given at birth, equivalent to a “First Name” in English (e.g., Muhammad, Fatima, Omar).
    2. Nasab (The Patronymic): This indicates lineage. It is a chain of names linking the person to their ancestors, usually defined by the particle bin (son of) or binti (daughter of).
      • Example: Ahmad bin Omar (Ahmad, son of Omar).
    3. Kunya (The Paedonymic): This is an honorific name derived from the person’s eldest child. It is a polite way to address someone socially. It usually begins with Abu (Father of) or Umm (Mother of).
      • Example: If Ahmad has a son named Hamza, Ahmad might be called Abu Hamza (Father of Hamza).
    4. Laqab (The Honorific/Nickname): This is a descriptive name or title, often religious or related to a physical trait or virtue.
      • Example: Al-Rashid (The Rightly Guided) or Al-Siddiq (The Truthful).
    5. Nisba (The Demonym/Origin): This functions like a surname and indicates where the person is from, their tribe, or their profession. It almost always ends in an “i” sound.
      • Example: Al-Masri (The Egyptian) or Al-Baghdadi (The one from Baghdad).

    Common Particles and Their Meanings

    Particles are the small connector words that link the names together.

    | Particle | Meaning | Context |
    | --- | --- | --- |
    | Al / El / Ul | The | The definite article. Used before tribe names, regions, or titles (e.g., Al-Saud = The Saud family). |
    | Bin / Ibn | Son of | Connects a man’s name to his father’s name (e.g., Saeed bin Ahmed). |
    | Binti / Bint | Daughter of | Connects a woman’s name to her father’s name (e.g., Aisha binti Ahmed). |
    | Abu | Father of | Used in a Kunya (e.g., Abu Bakr). |
    | Umm | Mother of | Used in a Kunya (e.g., Umm Kulthum). |
    | Abd / Abdul | Servant of | Always followed by one of the 99 names of God. Abdul is actually “Abd” + “Al” (The). (e.g., Abdullah = Servant of God). |
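A toy parser for the classical chain using the particles above. Real names are far messier; this sketch only handles the textbook ordering:

```python
LINKS = {"bin", "ibn", "binti", "bint"}  # lineage connectors

def parse_name(full_name):
    """Split a classical-form name into its building blocks (simplified)."""
    tokens = full_name.replace("-", " ").split()
    parts = {"ism": None, "nasab": [], "kunya": None, "nisba": None}
    i = 0
    if i < len(tokens) and tokens[i].lower() in ("abu", "umm"):
        parts["kunya"] = " ".join(tokens[i:i + 2])   # e.g. "Abu Hamza"
        i += 2
    if i < len(tokens) and tokens[i].lower() not in LINKS | {"al"}:
        parts["ism"] = tokens[i]
        i += 1
    while i + 1 < len(tokens) and tokens[i].lower() in LINKS:
        parts["nasab"].append(tokens[i + 1])          # father, grandfather, ...
        i += 2
    if i + 1 < len(tokens) and tokens[i].lower() == "al":
        parts["nisba"] = "al-" + tokens[i + 1]        # e.g. "al-Masri"
    return parts
```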

    Differences Between Male and Female Names

    1. Linguistic Markers

    • The “ah” ending: Many female names end with a “soft h” or “a” sound, represented in Arabic by the letter Ta Marbuta.
      • Male: Jamil (Beautiful/Handsome) → Female: Jamila (Beautiful).
      • Male: Samir (Companion) → Female: Samira.

    2. Thematic Differences

    • Male names often convey servitude to God (Abdullah, Abdulrahman), strength (Hamza – Lion), or historical religious figures.
    • Female names often refer to nature (Yasmin – Jasmine, Warda – Rose), virtues (Amal – Hope, Iman – Faith), or the wives and daughters of the Prophet.

    3. Marriage and Surnames

    • In strict Islamic tradition, women do not change their names upon marriage. A woman named Fatima binti Ali remains Fatima binti Ali for life, as she remains the daughter of Ali regardless of whom she marries. However, in some modern Westernized contexts or specific countries (like Turkey or parts of South Asia), women may adopt the husband’s surname legally.

    Regional and Cultural Differences

    While the religious roots are the same, naming conventions change drastically across borders.

    1. The Arab World (e.g., Saudi Arabia, UAE, Egypt)

    • Structure: Heavily relies on the Ism + Bin + Father + Bin + Grandfather + Family Name structure.
    • Legal IDs: Often drop the “Bin” in English fields but keep the four-name chain (Own Name + Father + Grandfather + Family).
    • Egypt specific: Often omits the “Bin” entirely. A person is simply named Ahmed Mohamed Ali (Ahmed [son of] Mohamed [son of] Ali).

    2. South Asia (Pakistan, India, Bangladesh)

    • Surnames: The “Bin” structure is rare. Instead, they use family surnames or caste/tribal names (Khan, Chaudhry, Patel).
    • Honorifics as Names: Titles often become part of the official name. Syed (lineage of the Prophet) or Sheikh may act as a first name or surname.
    • Naming: Names are heavily influenced by Persian and local languages (Urdu/Pashto) alongside Arabic.

    3. Southeast Asia (Malaysia, Indonesia)

    • Malaysia: Strictly follows the patronymic (name derived from father) system. Surnames are rare.
      • Format: [Own Name] BIN [Father’s Name].
    • Indonesia: Very fluid. Many people have only one name (Mononym), like Sukarno or Suharto. They do not strictly require surnames. A child’s name might be totally different from the father’s name.

    4. Turkey

    • Surnames: In 1934, Turkey mandated surnames for all citizens.
    • Turkification: Arabic names were adapted to Turkish spelling and pronunciation.
      • Muhammad became Mehmet.
      • Uthman became Osman.
    • Patronymics: The suffix -oğlu (son of) is common in surnames (e.g., Davutoğlu), but the “Bin” system is not used legally.

    Identity Documents and Updates by Country

    How governments capture these names varies, as does the timeline for updating them.

    Saudi Arabia

    • Document: National ID Card (Bataka Al-Ahwal / Hawiya).
    • Issuance/Update:
      • Birth: Parents must register the birth via the Absher digital platform after the hospital issues a notification. This generates a Family Register listing the child.
      • Adulthood: A separate photo ID card is issued to citizens usually starting at age 15 (mandatory by 15, optional from 10-14).

    Pakistan

    • Document: CNIC (Computerized National Identity Card).
    • Issuance/Update:
      • Birth: Parents register the child at the local Union Council to get a CRC (Child Registration Certificate), colloquially called the “B-Form.” This form assigns the unique 13-digit ID number immediately.
      • Adulthood: At age 18, the individual converts the B-Form into a full photo CNIC.

    Malaysia

    • Document: MyKad (Government Multipurpose Card).
    • Issuance/Update:
      • Birth: The National Registration Department (JPN) issues a Birth Certificate upon registration.
      • Childhood: From birth to age 12, children carry a MyKid (chip-based card without a photo).
      • Adulthood: At age 12, it is mandatory to apply for the MyKad (the primary photo ID), which must be updated again at age 18 to ensure the photo is current.

    Egypt

    • Document: National ID Card (Raqam Qawmi).
    • Issuance/Update:
      • Birth: Births must be registered within 15 days at the local Health Office to obtain a computerized Birth Certificate.
      • Adulthood: It is mandatory to obtain the National ID card at age 15 (lowered from 16 recently). This card is the primary proof of citizenship.

    Source List

    • Kingdom of Saudi Arabia National Portal (my.gov.sa): Information on “Register Newborn Service” and Civil Affairs updates.
    • Ministry of Interior – Egypt (moi.gov.eg): Regulations regarding Civil Registry and National ID issuance.
    • National Database and Registration Authority – Pakistan (nadra.gov.pk): Procedures for CRC (B-Form) and CNIC issuance.
    • National Registration Department – Malaysia (jpn.gov.my): Details on MyKid and MyKad application procedures.
    • Encyclopedia of Islam (Brill): Academic definitions of Ism, Nasab, and Nisba.
  • So, I was playing with Gemini to ask about Arabic name structure, and it came up with this term I had never heard of, so I decided to ask it what it meant:

    Nunation (called Tanween in Arabic) is a linguistic term that refers to adding an “n” sound to the end of a word.

    It is not written with the letter “N” (Nun). Instead, it is indicated by doubling the accent mark at the end of the word.

    1. What does it do?

    In standard grammar, it acts like the English word “A” or “An.” It marks a word as indefinite.

    • Kitab = Book
    • Al-Kitab = The Book (Specific)
    • Kitabun = A Book (General / Indefinite)

    2. Why is this in the video about names at the end of this response?

    This is the key difference between traditional male and female names in Classical Arabic grammar.

    • Male names typically take Nunation. Even though they are specific people, male Arabic names (like Muhammad, Zayd, or Khalid) are grammatically treated as “triptotes” (words that accept all case endings).
      • You say: “Muhammadun” (Nominative case).
    • Female names are “Diptotes” (Forbidden from Nunation). Female names (like Fatima, Zainab, or Maryam) fall into a grammatical category called Mamnu’ min al-sarf (forbidden from exchanging). They generally cannot take the “n” sound at the end.
      • You say: “Fatima-u” (NOT Fatima-un).

    3. Summary

    In the context of the video: Nunation is the “Un” sound you hear at the end of male names (Muhammad-un) that is grammatically stripped away from female names (Fatima-u).

    Sources

    • Council on American-Islamic Relations (cair.com)
    • Learn Arabic Online (learnarabiconline.com)
    • Transparent Language (blogs.transparent.com)
    • Wikipedia (en.wikipedia.org)

    Relevance of Video

    The video below provides a linguistic breakdown of the grammatical differences between male and female names in Arabic, specifically focusing on “Tanween” (nunation) and how endings change based on gender, which is useful for understanding the spoken nuances of these names.

    Male vs. Female Names (Tanween)

    [Ep. 11] Madinah Arabic – “Male vs. Female Names (Tanween)” – YouTube

    Saajid Lipham