Human Names API

This module contains utilities, blockers, and comparers for working with human names.

mismo.lib.name.normalize_name(name: ir.StructValue) -> ir.StructValue

Convert to uppercase, normalize whitespace, and remove non-alphanumeric characters.

PARAMETER DESCRIPTION
name

The name to normalize.

TYPE: StructValue

RETURNS DESCRIPTION
name_normed

The normalized name.

TYPE: StructValue
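To make the three normalization steps concrete, here is a minimal plain-Python sketch that applies them to a dict standing in for the name struct. It is an illustration only, not mismo's implementation (which operates on an ibis `StructValue`); the function name `normalize_name_sketch` and the choice to keep whitespace as a separator while dropping other non-alphanumeric characters are assumptions.

```python
import re


def normalize_name_sketch(name: dict) -> dict:
    """Uppercase each field, drop non-alphanumerics, and collapse whitespace."""

    def norm(value):
        if value is None:
            return None  # missing fields stay missing
        value = value.upper()
        # Assumption: whitespace is kept as a separator; any other character
        # that is not a letter or digit (including "_") is removed.
        value = re.sub(r"[^\w\s]|_", "", value)
        # Collapse runs of whitespace and trim both ends.
        return " ".join(value.split())

    return {field: norm(value) for field, value in name.items()}


print(normalize_name_sketch({"given": "  mary-jane ", "surname": "O'Neil  jr."}))
```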

mismo.lib.name.are_aliases(name1, name2)

Determine whether two names can be used interchangeably as nicknames.

The comparison is case-insensitive, and leading and trailing whitespace is stripped. Identical names return True.

mismo.lib.name.is_nickname_for(nickname, canonical)

Determine whether a name is a nickname for another name.

The comparison is case-insensitive, and leading and trailing whitespace is stripped. Identical names return True.
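The documented behavior of the two nickname helpers (case-insensitive, stripped, identical names match) can be sketched in plain Python as follows. The lookup table `NICKNAMES`, the `_sketch` function names, and the rule that two names are aliases when one is a nickname for the other or both are nicknames of the same canonical name are all assumptions made for illustration; mismo's actual nickname data and alias rule may differ.

```python
# Hypothetical lookup table: canonical name -> known nicknames.
NICKNAMES = {
    "robert": {"bob", "rob", "bobby"},
    "william": {"bill", "will", "billy"},
}


def _clean(name: str) -> str:
    # Strip whitespace from both ends and ignore case.
    return name.strip().lower()


def is_nickname_for_sketch(nickname: str, canonical: str) -> bool:
    nick, canon = _clean(nickname), _clean(canonical)
    if nick == canon:
        return True  # the same name counts as a match
    return nick in NICKNAMES.get(canon, set())


def are_aliases_sketch(name1: str, name2: str) -> bool:
    a, b = _clean(name1), _clean(name2)
    if a == b:
        return True
    # Assumption: either direction of the nickname relation counts.
    if is_nickname_for_sketch(a, b) or is_nickname_for_sketch(b, a):
        return True
    # Assumption: two nicknames of the same canonical name also count.
    return any(a in nicks and b in nicks for nicks in NICKNAMES.values())


print(is_nickname_for_sketch(" Bob", "ROBERT "))
print(are_aliases_sketch("bob", "rob"))
```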

mismo.lib.name.NameMatchLevel

Bases: MatchLevel

How closely two names match.

mismo.lib.name.NameMatchLevel.ELSE = 6 class-attribute instance-attribute

None of the above.

mismo.lib.name.NameMatchLevel.EXACT = 1 class-attribute instance-attribute

The names are exactly the same.

mismo.lib.name.NameMatchLevel.GIVEN_SURNAME = 2 class-attribute instance-attribute

The given names and the surnames both match.

mismo.lib.name.NameMatchLevel.INITIALS = 4 class-attribute instance-attribute

The first letter of the given name matches, and the surnames match.

mismo.lib.name.NameMatchLevel.NICKNAMES = 3 class-attribute instance-attribute

The given names match with nicknames, and the surnames match.

mismo.lib.name.NameMatchLevel.NULL = 0 class-attribute instance-attribute

At least one given name or surname is NULL on either side.

mismo.lib.name.NameMatchLevel.TYPO = 5 class-attribute instance-attribute

The given names are the same (forgiving typos), and the surnames match.
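As a rough illustration of how these levels relate to each other, the sketch below mirrors the documented values in an `IntEnum` and classifies a pair of normalized name dicts. It is not mismo's comparer: the check order (following the numeric values), the tiny `NICKNAME_PAIRS` set, and the use of `difflib.SequenceMatcher` with a 0.8 threshold as the typo test are all assumptions.

```python
from difflib import SequenceMatcher
from enum import IntEnum


class Level(IntEnum):
    NULL = 0
    EXACT = 1
    GIVEN_SURNAME = 2
    NICKNAMES = 3
    INITIALS = 4
    TYPO = 5
    ELSE = 6


# Hypothetical nickname pairs, stored as (nickname, canonical).
NICKNAME_PAIRS = {("BOB", "ROBERT")}


def classify(left: dict, right: dict) -> Level:
    """Return the first level (in numeric order) that the pair satisfies."""
    if any(n.get(k) is None for n in (left, right) for k in ("given", "surname")):
        return Level.NULL
    if left == right:
        return Level.EXACT
    if left["surname"] != right["surname"]:
        return Level.ELSE  # every remaining level requires matching surnames
    lg, rg = left["given"], right["given"]
    if lg == rg:
        return Level.GIVEN_SURNAME
    if (lg, rg) in NICKNAME_PAIRS or (rg, lg) in NICKNAME_PAIRS:
        return Level.NICKNAMES
    if lg[:1] == rg[:1]:
        return Level.INITIALS
    # Assumption: a similarity ratio stands in for "forgiving typos".
    if SequenceMatcher(None, lg, rg).ratio() >= 0.8:
        return Level.TYPO
    return Level.ELSE


print(classify({"given": "BOB", "surname": "SMITH"},
               {"given": "ROBERT", "surname": "SMITH"}).name)
```

Because the checks run in numeric order in this sketch, a typo that preserves the first letter is reported as INITIALS rather than TYPO.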

mismo.lib.name.NameComparer

Compare names. Assumes the names have already been normalized/featurized.

mismo.lib.name.NameComparer.__call__(pairs: ir.Table) -> ir.Table

Compare pairs of names.

PARAMETER DESCRIPTION
pairs

A table with columns self.left_column and self.right_column. Each of these columns should be a struct that has been normalized/featurized with _clean.normalize_name(raw_name_struct).

TYPE: Table

RETURNS DESCRIPTION
t

The table with the comparison results in the column self.name.

TYPE: Table

mismo.lib.name.NameDimension

Prepares, blocks, and compares based on a human name.

A name is a Struct of the type `struct<prefix: string, given: string, middle: string, surname: string, suffix: string, nickname: string>`.
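For readers who want to see the shape of such a record, here is a small plain-Python sketch that models the six fields as a `TypedDict`. The `Name` class and the `make_name` helper are hypothetical illustrations; in mismo itself the name is an ibis struct column rather than a Python dict.

```python
from typing import Optional, TypedDict


class Name(TypedDict):
    """Plain-Python stand-in for the six-field name struct."""

    prefix: Optional[str]
    given: Optional[str]
    middle: Optional[str]
    surname: Optional[str]
    suffix: Optional[str]
    nickname: Optional[str]


def make_name(**fields: Optional[str]) -> Name:
    """Build a Name, filling any unspecified field with None."""
    unknown = set(fields) - set(Name.__annotations__)
    if unknown:
        raise ValueError(f"unknown name fields: {sorted(unknown)}")
    return {key: fields.get(key) for key in Name.__annotations__}


print(make_name(given="Ada", surname="Lovelace"))
```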

mismo.lib.name.NameDimension.block(left: ir.Table, right: ir.Table, **kwargs) -> ir.Table

Block records based on the name tokens.

PARAMETER DESCRIPTION
left

The left table.

TYPE: Table

right

The right table.

TYPE: Table

RETURNS DESCRIPTION
t

The blocked table.

TYPE: Table
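The general idea of blocking on name tokens can be sketched in plain Python: index one side by token, then pair each record on the other side with every record that shares at least one token. This is only an illustration of the technique under the assumption that "name tokens" are the whitespace-separated words of the normalized name; the function `block_on_tokens` is hypothetical, and mismo's actual blocking rule and output columns may differ.

```python
from collections import defaultdict


def block_on_tokens(left, right):
    """Pair record ids that share at least one name token.

    left and right are lists of (record_id, set_of_tokens).
    """
    # Inverted index: token -> ids of right-hand records containing it.
    index = defaultdict(set)
    for rid, tokens in right:
        for tok in tokens:
            index[tok].add(rid)

    pairs = set()
    for lid, tokens in left:
        for tok in tokens:
            for rid in index.get(tok, ()):
                pairs.add((lid, rid))
    return sorted(pairs)


left = [(1, {"ADA", "LOVELACE"}), (2, {"ALAN", "TURING"})]
right = [(10, {"A", "LOVELACE"}), (11, {"GRACE", "HOPPER"}), (12, {"ALAN", "KAY"})]
print(block_on_tokens(left, right))
```

Only the candidate pairs that survive blocking need to be passed to the more expensive comparison step.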

mismo.lib.name.NameDimension.compare(t: ir.Table) -> ir.Table

Compare the left and right names.

PARAMETER DESCRIPTION
t

The table to compare.

TYPE: Table

RETURNS DESCRIPTION
t

The compared table.

TYPE: Table

mismo.lib.name.NameDimension.prepare(t: ir.Table) -> ir.Table

Add columns with the normalized name and name tokens.

PARAMETER DESCRIPTION
t

The table to prep.

TYPE: Table

RETURNS DESCRIPTION
t

The prepped table.

TYPE: Table
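A plain-Python sketch of what a preparation step of this kind might look like: for each record, derive a normalized copy of the name and a list of tokens, and attach both as new fields. The function `prepare_sketch`, the output field names `name_normed` and `name_tokens`, and the tokenization rule (splitting normalized fields on whitespace) are assumptions for illustration, not mismo's actual column names or logic.

```python
import re


def _norm(value):
    # Uppercase, drop non-alphanumerics (keeping whitespace), collapse whitespace.
    if value is None:
        return None
    return " ".join(re.sub(r"[^\w\s]|_", "", value.upper()).split())


def prepare_sketch(records, column="name"):
    """Return copies of the records with normalized-name and token fields added."""
    prepared = []
    for rec in records:
        normed = {field: _norm(value) for field, value in rec[column].items()}
        # Tokens: the distinct whitespace-separated words across all fields.
        tokens = sorted({tok for v in normed.values() if v for tok in v.split()})
        prepared.append({**rec, f"{column}_normed": normed, f"{column}_tokens": tokens})
    return prepared


rows = [{"id": 1, "name": {"given": "Mary-Jane", "middle": None, "surname": "van der Berg"}}]
print(prepare_sketch(rows)[0]["name_tokens"])
```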