Phones API
This module contains utilities, blockers, and comparers for working with phone numbers.
mismo.lib.phone.clean_phone_number
clean_phone_number(
phones: Table, *, default_area_code: str | None = None
) -> Table
clean_phone_number(
phones: ArrayValue,
*,
default_area_code: str | None = None,
) -> ArrayValue
clean_phone_number(
phones: StringValue,
*,
default_area_code: str | None = None,
) -> StringValue
clean_phone_number(numbers, *, default_area_code=None)
Extracts any 10-digit number from a string.
Drops leading 1 country code if present.
Parsing failures are returned as NULL.
Empty strings are returned as NULL.
If you supply a default_area_code, it will be prepended to 7-digit numbers.
If a number looks bogus, i.e., it contains "0000", "9999", or "12345", it is set to NULL.
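The rules above can be illustrated with a minimal pure-Python sketch that works on a single string. This is not the library's implementation, which operates on Ibis tables, arrays, and string expressions. The name `clean_phone_sketch`, the choice to strip all non-digit characters before checking the length, and the order of the checks are assumptions made for illustration.

```python
import re

# Substrings that mark a number as bogus, as listed in the rules above.
BOGUS_FRAGMENTS = ("0000", "9999", "12345")


def clean_phone_sketch(raw, default_area_code=None):
    """Hypothetical single-value version of the cleaning rules.

    Returns a 10-digit string, or None where the docs say NULL.
    """
    if raw is None:
        return None
    # Assumption: keep only the digits, ignoring punctuation and spaces.
    digits = re.sub(r"\D", "", raw)
    # Drop a leading 1 country code if present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    # Prepend the default area code to 7-digit numbers, if one was given.
    if len(digits) == 7 and default_area_code is not None:
        digits = default_area_code + digits
    # Empty strings and anything that is not 10 digits count as failures.
    if len(digits) != 10:
        return None
    # Bogus-looking numbers are set to None.
    if any(fragment in digits for fragment in BOGUS_FRAGMENTS):
        return None
    return digits


cleaned = clean_phone_sketch("1 (907) 555-0142")
local = clean_phone_sketch("555-0142", default_area_code="907")
```

Under these assumptions, both `cleaned` and `local` hold the same 10-digit string, while an empty string or a number ending in "0000" yields `None`.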
mismo.lib.phone.match_level
match_level(
p1: StringValue,
p2: StringValue,
*,
native_representation: Literal[
"integer", "string"
] = "integer",
) -> PhoneMatchLevel
Match level of two phone numbers.
Assumes the phone numbers have already been cleaned and normalized.
PARAMETER | DESCRIPTION |
---|---|
p1 | The first phone number. TYPE: StringValue |
p2 | The second phone number. TYPE: StringValue |
RETURNS | DESCRIPTION |
---|---|
level | The match level. TYPE: PhoneMatchLevel |
mismo.lib.phone.PhoneMatchLevel
Bases: MatchLevel
How closely two phone numbers match.
mismo.lib.phone.PhoneMatchLevel.ELSE
class-attribute
instance-attribute
ELSE = 2
None of the above.
mismo.lib.phone.PhoneMatchLevel.EXACT
class-attribute
instance-attribute
EXACT = 0
The numbers are exactly the same.
mismo.lib.phone.PhoneMatchLevel.NEAR
class-attribute
instance-attribute
NEAR = 1
The numbers have a small edit distance.
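The three levels above can be sketched in plain Python to show how two cleaned numbers might be classified. This is an illustration only: the names `MatchLevelSketch`, `edit_distance`, and `match_level_sketch` are hypothetical, and the threshold of one edit for NEAR is an assumption, since the reference does not state what counts as a "small" edit distance.

```python
from enum import IntEnum


class MatchLevelSketch(IntEnum):
    """Hypothetical stand-in mirroring the documented level values."""

    EXACT = 0
    NEAR = 1
    ELSE = 2


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row of costs."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ch_a == ch_b else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def match_level_sketch(p1, p2, max_near_distance=1):
    """Classify two already-cleaned phone numbers.

    Assumption: NEAR means an edit distance of at most max_near_distance.
    """
    if p1 == p2:
        return MatchLevelSketch.EXACT
    if edit_distance(p1, p2) <= max_near_distance:
        return MatchLevelSketch.NEAR
    return MatchLevelSketch.ELSE


same = match_level_sketch("9075550142", "9075550142")
typo = match_level_sketch("9075550142", "9075550143")
other = match_level_sketch("9075550142", "2125550199")
```

Here `same` is EXACT, `typo` is NEAR because a single digit differs, and `other` falls through to ELSE.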
mismo.lib.phone.PhonesDimension
Prepares, blocks, and compares sets of phone numbers.
This is useful if each record contains a collection of phone numbers. Two records are probably the same if they have a lot of phone numbers in common.
mismo.lib.phone.PhonesDimension.__init__
__init__(
column: str,
*,
column_cleaned: str = "{column}_cleaned",
column_compared: str = "{column}_compared",
)
Initialize the dimension.
PARAMETER | DESCRIPTION |
---|---|
column | The name of the column that holds an array of phone numbers. TYPE: str |
column_cleaned | The name of the column that will be filled with the parsed phone numbers. TYPE: str |
column_compared | The name of the column that will be filled with the comparison results. TYPE: str |
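The default values of `column_cleaned` and `column_compared` are templates containing `{column}`. A plausible reading, sketched below, is that the placeholder is filled in with the value of `column`. The helper `resolve_column_names` is hypothetical, and the use of `str.format` is an assumption rather than something this reference confirms.

```python
def resolve_column_names(
    column,
    column_cleaned="{column}_cleaned",
    column_compared="{column}_compared",
):
    """Hypothetical helper: fill the {column} placeholder in each template.

    Assumption: the templates are resolved with str.format.
    """
    return (
        column_cleaned.format(column=column),
        column_compared.format(column=column),
    )


cleaned_name, compared_name = resolve_column_names("phones")
```

With `column="phones"`, this sketch produces the names `phones_cleaned` and `phones_compared`. An explicit name without a placeholder passes through unchanged.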
mismo.lib.phone.PhonesDimension.compare
compare(t: Table) -> Table
Add a column with the best match between all pairs of phone numbers.
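The idea of taking the best match over all pairs can be sketched in plain Python. This is an illustration and not the library's Ibis-based implementation. The names `best_pair_level` and `toy_level` are hypothetical, "best" is assumed to mean the lowest level value (since EXACT is 0), and returning `None` for an empty collection is also an assumption.

```python
from itertools import product


def best_pair_level(phones_l, phones_r, level_fn):
    """Return the best (lowest) level over every cross pair of numbers.

    Assumption: lower level values are better, and an empty side gives None.
    """
    levels = [level_fn(a, b) for a, b in product(phones_l, phones_r)]
    return min(levels) if levels else None


def toy_level(a, b):
    """Toy comparer for the sketch: 0 for identical numbers, 2 otherwise."""
    return 0 if a == b else 2


best = best_pair_level(
    ["9075550142", "2125550199"],
    ["2125550199"],
    toy_level,
)
```

In this example the two records share one number, so `best` is the exact-match level even though the other pair does not match.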
mismo.lib.phone.PhonesDimension.prepare_for_blocking
prepare_for_blocking(t: Table) -> Table
No-op.
mismo.lib.phone.PhonesDimension.prepare_for_fast_linking
prepare_for_fast_linking(t: Table) -> Table
Add a column with the parsed and normalized phone numbers.