Phones API
This module contains utilities, blockers, and comparers for working with phone numbers.
mismo.lib.phone.clean_phone_number(numbers, *, default_area_code=None)
Extracts any 10-digit number from a string.
Drops leading 1 country code if present.
Parsing failures are returned as NULL.
Empty strings are returned as NULL.
If you supply a default_area_code, it will be prepended to 7-digit numbers.
If a number looks bogus, i.e. it contains "0000", "9999", or "12345", it is set to NULL.
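The rules above can be approximated in plain Python. This is only an illustrative sketch, not mismo's implementation: the function name `clean_phone_sketch` is hypothetical, it works on a single Python string rather than a column expression, and it assumes that "extracting" a number means stripping every non-digit character first.

```python
import re
from typing import Optional

# Substrings that mark a number as bogus, taken from the rules above.
BOGUS_FRAGMENTS = ("0000", "9999", "12345")


def clean_phone_sketch(
    raw: Optional[str], default_area_code: Optional[str] = None
) -> Optional[str]:
    """Hypothetical approximation of the cleaning rules; None stands in for NULL."""
    if raw is None or raw.strip() == "":
        return None  # empty input becomes NULL
    digits = re.sub(r"\D", "", raw)  # assumption: keep only the digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading 1 country code
    if len(digits) == 7 and default_area_code is not None:
        digits = default_area_code + digits  # prepend the default area code
    if len(digits) != 10:
        return None  # parsing failure
    if any(fragment in digits for fragment in BOGUS_FRAGMENTS):
        return None  # looks bogus
    return digits


print(clean_phone_sketch("+1 (907) 555-0142"))                  # 9075550142
print(clean_phone_sketch("555-0142", default_area_code="907"))  # 9075550142
print(clean_phone_sketch("907-555-0000"))                       # None
```

The real function may treat edge cases differently, for example strings that contain more than one run of digits.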
mismo.lib.phone.match_level(p1: ir.StringValue, p2: ir.StringValue, *, native_representation: Literal['integer', 'string'] = 'integer') -> PhoneMatchLevel
Match level of two phone numbers.
Assumes the phone numbers have already been cleaned and normalized.
PARAMETER | DESCRIPTION |
---|---|
p1 | The first phone number. TYPE: ir.StringValue |
p2 | The second phone number. TYPE: ir.StringValue |

RETURNS | DESCRIPTION |
---|---|
level | The match level. TYPE: PhoneMatchLevel |
mismo.lib.phone.PhoneMatchLevel
Bases: MatchLevel
How closely two phone numbers match.
mismo.lib.phone.PhoneMatchLevel.ELSE = 2
class-attribute
instance-attribute
Neither EXACT nor NEAR applies.
mismo.lib.phone.PhoneMatchLevel.EXACT = 0
class-attribute
instance-attribute
The numbers are exactly the same.
mismo.lib.phone.PhoneMatchLevel.NEAR = 1
class-attribute
instance-attribute
The numbers have a small edit distance.
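To make the three levels concrete, here is a minimal pure-Python sketch of how two already-cleaned numbers could be classified. The names `MatchLevelSketch`, `edit_distance`, and `match_level_sketch` are hypothetical. The source does not say which distance or threshold "small edit distance" refers to, so the use of Levenshtein distance and the cutoff of 1 are assumptions.

```python
from enum import IntEnum


class MatchLevelSketch(IntEnum):
    """Hypothetical stand-in that mirrors the documented level values."""
    EXACT = 0
    NEAR = 1
    ELSE = 2


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def match_level_sketch(p1: str, p2: str, max_near_distance: int = 1) -> MatchLevelSketch:
    """Classify two cleaned numbers; max_near_distance is an assumed threshold."""
    if p1 == p2:
        return MatchLevelSketch.EXACT
    if edit_distance(p1, p2) <= max_near_distance:
        return MatchLevelSketch.NEAR
    return MatchLevelSketch.ELSE


print(match_level_sketch("9075550142", "9075550142").name)  # EXACT
print(match_level_sketch("9075550142", "9075550143").name)  # NEAR
print(match_level_sketch("9075550142", "2125559876").name)  # ELSE
```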
mismo.lib.phone.PhonesDimension
Prepares, blocks, and compares sets of phone numbers.
This is useful if each record contains a collection of phone numbers. Two records are probably the same if they have a lot of phone numbers in common.
mismo.lib.phone.PhonesDimension.__init__(column: str, *, column_parsed: str = '{column}_parsed', column_compared: str = '{column}_compared')
Initialize the dimension.
PARAMETER | DESCRIPTION |
---|---|
column | The name of the column that holds an array of phone numbers. TYPE: str |
column_parsed | The name of the column that will be filled with the parsed phone numbers. TYPE: str |
column_compared | The name of the column that will be filled with the comparison results. TYPE: str |
mismo.lib.phone.PhonesDimension.block(left: ir.Table, right: ir.Table, **kwargs) -> ir.Table
Block records wherever they share a phone number.
mismo.lib.phone.PhonesDimension.compare(t: ir.Table) -> ir.Table
Add a column with the best match between all pairs of phone numbers.
mismo.lib.phone.PhonesDimension.prepare(t: ir.Table) -> ir.Table
Add a column with the parsed and normalized phone numbers.
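The blocking and comparison ideas described for this class can be illustrated without any table library. The sketch below is not how mismo implements them: it assumes records are plain dicts that map a record id to a list of already-cleaned numbers, and the helpers `block_on_shared_phone`, `best_level`, and `exact_or_else` are hypothetical names.

```python
from collections import defaultdict
from typing import Callable, Optional


def block_on_shared_phone(
    left: dict[str, list[str]], right: dict[str, list[str]]
) -> set[tuple[str, str]]:
    """Return (left_id, right_id) pairs that share at least one phone number."""
    # Inverted index: phone number -> ids of right-hand records that contain it.
    index: defaultdict[str, set[str]] = defaultdict(set)
    for right_id, phones in right.items():
        for phone in phones:
            index[phone].add(right_id)
    pairs = set()
    for left_id, phones in left.items():
        for phone in phones:
            for right_id in index.get(phone, ()):
                pairs.add((left_id, right_id))
    return pairs


def best_level(
    phones_a: list[str], phones_b: list[str], level_fn: Callable[[str, str], int]
) -> Optional[int]:
    """Best (lowest) level over all pairs of numbers; None if either list is empty."""
    levels = [level_fn(p, q) for p in phones_a for q in phones_b]
    return min(levels) if levels else None


def exact_or_else(p: str, q: str) -> int:
    """Toy comparer: 0 for an exact match, 2 otherwise."""
    return 0 if p == q else 2


left = {"a": ["9075550142", "2125559876"], "b": ["3035550199"]}
right = {"x": ["2125559876"], "y": ["4155550111"]}

print(block_on_shared_phone(left, right))                # {('a', 'x')}
print(best_level(left["a"], right["x"], exact_or_else))  # 0
```

Indexing one side by phone number avoids comparing every left record against every right record, which is the usual reason for blocking before comparing.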