Phones API

This module contains utilities, blockers, and comparers for phone numbers.

mismo.lib.phone.clean_phone_number

clean_phone_number(
    phones: Table, *, default_area_code: str | None = None
) -> Table
clean_phone_number(
    phones: ArrayValue,
    *,
    default_area_code: str | None = None,
) -> ArrayValue
clean_phone_number(
    phones: StringValue,
    *,
    default_area_code: str | None = None,
) -> StringValue
clean_phone_number(numbers, *, default_area_code=None)

Extracts any 10-digit number from a string.

Drops leading 1 country code if present.

Parsing failures are returned as NULL.

Empty strings are returned as NULL.

If you supply a default_area_code, it will be prepended to 7-digit numbers.

If a number looks bogus, i.e. it contains "0000", "9999", or "12345", it is set to NULL.
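The rules above can be illustrated with a small pure-Python sketch that works on plain strings. This is not mismo's implementation, which operates on Ibis tables and expressions. The name clean_phone_sketch is hypothetical, and the choice to strip every non-digit character before counting digits is an assumption about how "extracts" works.

```python
import re

# Substrings the documentation lists as signs of a bogus number.
BOGUS_PATTERNS = ("0000", "9999", "12345")


def clean_phone_sketch(raw, default_area_code=None):
    """Illustrate the documented cleaning rules on one Python string.

    Hypothetical helper, not part of mismo. Returns None where the
    documentation says NULL.
    """
    # Empty strings (and missing values) become NULL.
    if raw is None or raw == "":
        return None
    # Assumption: keep only the digits, dropping punctuation and spaces.
    digits = re.sub(r"\D", "", raw)
    # Drop a leading 1 country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    # Prepend the default area code to 7-digit numbers, if one was given.
    if len(digits) == 7 and default_area_code is not None:
        digits = default_area_code + digits
    # Anything that is not 10 digits by now counts as a parsing failure.
    if len(digits) != 10:
        return None
    # Bogus-looking numbers are set to NULL.
    if any(pattern in digits for pattern in BOGUS_PATTERNS):
        return None
    return digits
```

For instance, under these assumptions clean_phone_sketch("+1 907-555-0142") and clean_phone_sketch("555-0142", default_area_code="907") both give "9075550142", while "907-555-0000" gives None.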

mismo.lib.phone.match_level

match_level(
    p1: StringValue,
    p2: StringValue,
    *,
    native_representation: Literal[
        "integer", "string"
    ] = "integer",
) -> PhoneMatchLevel

Match level of two phone numbers.

Assumes the phone numbers have already been cleaned and normalized.

PARAMETER DESCRIPTION
p1

The first phone number.

TYPE: StringValue

p2

The second phone number.

TYPE: StringValue

RETURNS DESCRIPTION
level

The match level.

TYPE: PhoneMatchLevel

mismo.lib.phone.PhoneMatchLevel

Bases: MatchLevel

How closely two phone numbers match.

mismo.lib.phone.PhoneMatchLevel.ELSE class-attribute instance-attribute

ELSE = 2

None of the above.

mismo.lib.phone.PhoneMatchLevel.EXACT class-attribute instance-attribute

EXACT = 0

The numbers are exactly the same.

mismo.lib.phone.PhoneMatchLevel.NEAR class-attribute instance-attribute

NEAR = 1

The numbers have a small edit distance.
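To make the three levels concrete, here is a minimal pure-Python sketch of how a pair of cleaned numbers might be classified. The documentation does not say what counts as a "small" edit distance, so the threshold of 1 used here is an assumption, as are the names MatchLevelSketch, edit_distance, and match_level_sketch. The real PhoneMatchLevel derives from mismo's MatchLevel and works on Ibis expressions; NULL handling is not covered here.

```python
from enum import IntEnum


class MatchLevelSketch(IntEnum):
    """Stand-in enum mirroring the documented values; not mismo's class."""

    EXACT = 0
    NEAR = 1
    ELSE = 2


def edit_distance(a, b):
    """Levenshtein distance, computed with the two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def match_level_sketch(p1, p2, max_near_distance=1):
    """Classify two cleaned numbers; the NEAR threshold is an assumption."""
    if p1 == p2:
        return MatchLevelSketch.EXACT
    if edit_distance(p1, p2) <= max_near_distance:
        return MatchLevelSketch.NEAR
    return MatchLevelSketch.ELSE
```

With this threshold, two numbers that differ in a single digit are classified as NEAR, and unrelated numbers fall through to ELSE.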

mismo.lib.phone.PhonesDimension

Prepares, blocks, and compares sets of phone numbers.

This is useful if each record contains a collection of phone numbers. Two records are probably the same if they have a lot of phone numbers in common.

mismo.lib.phone.PhonesDimension.__init__

__init__(
    column: str,
    *,
    column_cleaned: str = "{column}_cleaned",
    column_compared: str = "{column}_compared",
)

Initialize the dimension.

PARAMETER DESCRIPTION
column

The name of the column that holds an array of phone numbers.

TYPE: str

column_cleaned

The name of the column that will be filled with the parsed phone numbers.

TYPE: str DEFAULT: '{column}_cleaned'

column_compared

The name of the column that will be filled with the comparison results.

TYPE: str DEFAULT: '{column}_compared'
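The default values contain a {column} placeholder. A plausible reading, sketched below, is that the placeholder is filled in with the value of column using ordinary string formatting; this is an assumption, and resolve_column_names is a hypothetical helper rather than part of mismo.

```python
def resolve_column_names(
    column,
    column_cleaned="{column}_cleaned",
    column_compared="{column}_compared",
):
    """Fill the {column} placeholder in both templates (assumed behavior)."""
    return (
        column_cleaned.format(column=column),
        column_compared.format(column=column),
    )


# With the defaults, a "phones" column would yield these derived names.
cleaned_name, compared_name = resolve_column_names("phones")
```

Under this assumption, a template without a placeholder, such as "clean", is used as the column name unchanged.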

mismo.lib.phone.PhonesDimension.compare

compare(t: Table) -> Table

Add a column with the best match between all pairs of phone numbers.
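The idea of a best match over all pairs can be sketched in plain Python for two lists of cleaned numbers. It assumes that "best" means the lowest level value (EXACT = 0 is better than NEAR = 1), that an empty list yields ELSE, and that a single substituted digit counts as NEAR; pair_level and best_match are hypothetical helpers, not mismo functions.

```python
from itertools import product

# Level values as documented for PhoneMatchLevel.
EXACT, NEAR, ELSE = 0, 1, 2


def pair_level(p1, p2):
    """Simplified per-pair comparison: one substituted digit counts as NEAR."""
    if p1 == p2:
        return EXACT
    if len(p1) == len(p2) and sum(a != b for a, b in zip(p1, p2)) == 1:
        return NEAR
    return ELSE


def best_match(phones_left, phones_right):
    """Return the best (lowest) level over every left/right pair."""
    levels = [pair_level(a, b) for a, b in product(phones_left, phones_right)]
    # Assumption: with no pairs to compare, fall back to ELSE.
    return min(levels, default=ELSE)
```

Taking the minimum means one strong match between any two numbers is enough to lift the whole record pair, regardless of how many other pairs disagree.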

mismo.lib.phone.PhonesDimension.prepare_for_blocking

prepare_for_blocking(t: Table) -> Table

No-op.

mismo.lib.phone.PhonesDimension.prepare_for_fast_linking

prepare_for_fast_linking(t: Table) -> Table

Add a column with the parsed and normalized phone numbers.