Customisation of Possible Duplicate criteria, or bulk update options using export/import

It would be extremely useful to be able to customise the criteria used to identify possible duplicates. The current hard-coded matching criteria are far too broad for our use. With criteria this broad, the number of possible duplicates returned is unmanageable, and, as a result, the duplicates do not get managed at all.

Alternatively, it would help if we could export all identified possible duplicates for analysis outside of Blackbaud, then import the results back in as a bulk update, even if only for those records that should be set to "Not a Duplicate".
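To illustrate the export/analyse/import idea, here is a minimal sketch. The CSV layout, column names, and the birthdate rule are all assumptions for illustration, not Blackbaud's actual export format:

```python
import csv
import io

# Hypothetical export: one row per possible-duplicate pair. The column
# names below are assumptions; a real export would define its own schema.
EXPORT = """pair_id,name_a,name_b,birthdate_a,birthdate_b
1,John Smith,John Smith,1950-01-01,1985-06-30
2,Ann Lee,Ann Lee,1970-03-12,1970-03-12
"""

def pairs_to_mark_not_duplicate(export_csv: str) -> list[str]:
    """Flag pairs whose birthdates are both present and differ, so they
    could be imported back in bulk as "Not a Duplicate" (example rule only)."""
    flagged = []
    for row in csv.DictReader(io.StringIO(export_csv)):
        a, b = row["birthdate_a"], row["birthdate_b"]
        if a and b and a != b:
            flagged.append(row["pair_id"])
    return flagged

print(pairs_to_mark_not_duplicate(EXPORT))  # ['1']
```

The point is the round trip: any rule a site cares about could be applied offline, with only the resulting pair IDs imported back for bulk update.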

Currently we have almost 20k possible duplicates in NXT Health Check.

We had approximately the same number in the DB View Duplicate Constituent Management Tool. Of the 8,000 I checked through, only 5 were definite duplicates that could be merged.

So I now have just over 10k possible duplicates showing in DB View, but still 20k possible duplicates in NXT view. I'm told that marking a pair as Not a Duplicate in DB View does not update it in NXT view.

I'd like to get the data into a good enough position that NXT view can be used by our users on a regular basis to keep on top of duplicates, but with 20k possibles there is no way my users will ever have time to tackle that many, so they just won't do it at all.

  • Paula Bramble
  • Jan 27 2021
  • Jennifer Watson commented
    November 28, 2022 19:55

    The duplicate algorithm should include birthdays, or at least allow us to select birthday as a matching option. When a child has the same name as a parent, someone more than 20 years older with the same name and address is not going to be a duplicate.

  • Gwen Capers-Singleton commented
    July 05, 2022 14:43

    One more thing on the duplicate algorithm: it really should consider matching on addresses and not rely so much on names. Often a search using even just the first part of an address can identify an existing record that might be recorded under a nickname. Thank you all for your help with this, much appreciated!

  • Gwen Capers-Singleton commented
    July 05, 2022 14:27

    The duplicate algorithm is indeed too broad and relies on name matching too much. End-of-life care is our business, so tribute gifts represent 75-80% of our giving. Accordingly, our database contains a large number of deceased records. Deceased records are marked as deceased, with a deceased date and appropriate constituent codes of "Deceased". However, the NXT Duplicate Tool is picking up quite a large number of these records as duplicates. As I go through each record, you can clearly see the differing deceased dates: while the names of the deceased records may be the same, the deceased dates (dates of death) and birth dates (when available) are clearly different. Our list is showing over 33,000 duplicates, and wading through them is quite the task. Also, an NXT merge only copies the data from the source to the target record. Even though the source record is marked inactive, you still have two records with the same data presented in database view, so I'm not sure how an NXT merge is helpful.

  • Meghan Ogren commented
    May 04, 2022 21:00

    The way the duplicate algorithm works right now, it flags a name-only match as high likelihood as long as the records have no other bio data (address/phone/email) that conflicts. That means someone with a name and no other info on their record will show up as a high-likelihood match. When we try to go through the duplicate tool and quickly merge everyone with a high match likelihood, it is pretty frustrating to run into records with no information that qualifies them for merging. Likewise, we don't want to mark them as not a duplicate, because they may be duplicates; we have no way of knowing. And we can't sort or filter on the match types, so we can't even suppress the high-likelihood matches that only match on name. We need more customization around this so we can identify the duplicates that can actually be merged.

  • Paula Bramble commented
    January 27, 2021 11:39

    We have 55,853 total constituent records and 19,407 possible duplicates identified! Just saying, to show the impact of the breadth of the duplicate-identification criteria.
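Taken together, the comments above suggest what customisable criteria might look like: exclude pairs with differing deceased or birth dates, optionally require birthdate agreement rather than accepting name-only matches, and weight addresses. A toy sketch of such a scorer (this is not Blackbaud's actual algorithm; every field name, rule, and threshold here is an assumption for illustration):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Constituent:
    name: str
    birthdate: Optional[str] = None      # ISO date string or None
    deceased_date: Optional[str] = None  # ISO date string or None
    address: Optional[str] = None

def match_likelihood(a: Constituent, b: Constituent,
                     require_birthdate_match: bool = False) -> str:
    """Toy duplicate scorer illustrating the requested customisation."""
    if a.name.lower() != b.name.lower():
        return "low"
    # Differing deceased dates => clearly different people.
    if a.deceased_date and b.deceased_date and a.deceased_date != b.deceased_date:
        return "not-a-duplicate"
    # Differing birthdates, e.g. a parent and child with the same name.
    if a.birthdate and b.birthdate and a.birthdate != b.birthdate:
        return "not-a-duplicate"
    # Optionally suppress name-only matches that lack supporting bio data.
    if require_birthdate_match and not (a.birthdate and b.birthdate):
        return "low"
    if a.address and b.address and a.address != b.address:
        return "medium"
    return "high"
```

With `require_birthdate_match=True`, a pair of records that share only a name drops from "high" to "low", which is exactly the suppression the commenters ask for; pairs with differing birth or deceased dates come back as "not-a-duplicate" and could be cleared in bulk.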