Mismo
Mismo is a Python framework for record linkage, built on top of Ibis. It lets you deduplicate and link records in tables that lack a unique identifier. Typical use cases:
- Determining which contributions in a campaign finance database were made by the same person.
- Determining which product listings actually refer to the same item.
- Linking businesses across different datasets.
Example
See the example notebook.
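
For a rough sense of the underlying idea, here is a minimal sketch of deduplication without a unique identifier. It uses only the Python standard library and does not use Mismo's or Ibis's API. The sample records, the `normalize`, `similarity`, and `find_matches` helpers, and the 0.8 threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical contribution records with no shared unique identifier.
records = [
    {"id": 0, "name": "Jane A. Smith", "zip": "99501"},
    {"id": 1, "name": "Jane Smith", "zip": "99501"},
    {"id": 2, "name": "John Doe", "zip": "99701"},
    {"id": 3, "name": "Jon Doe", "zip": "99701"},
    {"id": 4, "name": "Alice Jones", "zip": "99501"},
]


def normalize(name):
    """Lowercase, drop periods, and collapse runs of whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def similarity(a, b):
    """Return a string similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_matches(records, threshold=0.8):
    """Return (id, id) pairs whose names look like the same person."""
    # Blocking: only compare records that share a zip code, so we avoid
    # scoring every possible pair in the table.
    blocks = {}
    for record in records:
        blocks.setdefault(record["zip"], []).append(record)

    matches = []
    for block in blocks.values():
        for left, right in combinations(block, 2):
            if similarity(left["name"], right["name"]) >= threshold:
                matches.append((left["id"], right["id"]))
    return matches


matches = find_matches(records)
print(matches)  # [(0, 1), (2, 3)]
```

The sketch groups records by zip code before comparing names, which keeps the number of pairwise comparisons small. A real linkage workflow would typically combine several fields and tune the threshold against labeled data.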