Comparing API
Once records are blocked together into pairs, we can run pairwise comparisons on them.
All of the APIs revolve around the PComparer protocol.
This is simply a function that takes a table of record pairs
(e.g. with columns suffixed with _l and _r) and returns a modified
version of this table. For example, it could add a column with match scores,
add rows that were missed during the initial blocking, or remove rows
that we no longer want to consider as matched.
            mismo.compare.PComparer
    
              Bases: Protocol
A Callable that adds column(s) of features to a table of record pairs.
            mismo.compare.PComparer.__call__
__call__(pairs: Table, **kwargs) -> Table
Add column(s) of features to a table of record pairs.
For example, add a match score to each record pair, modify a score from a previous PComparer, or similar.
Implementers must expect to be called with a table of record pairs. Columns suffixed with "_l" come from the left table, columns suffixed with "_r" come from the right table, and columns with neither suffix are features of the pair itself (e.g. added by a different PComparer).
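As a rough illustration of the protocol's shape, the sketch below models a table of record pairs as a plain list of dicts. That stand-in, along with the names `Pairs`, `PairsComparer`, `add_name_exact`, and `drop_non_matches`, is an assumption made for illustration; a real PComparer receives and returns an Ibis Table.

```python
from typing import Protocol

# Stand-in for a table of record pairs: one dict per pair.
# (Assumption for illustration; mismo itself works with Ibis tables.)
Pairs = list[dict[str, object]]


class PairsComparer(Protocol):
    """Hypothetical mirror of the PComparer call signature."""

    def __call__(self, pairs: Pairs, **kwargs) -> Pairs: ...


def add_name_exact(pairs: Pairs, **kwargs) -> Pairs:
    """Add a pair-level feature; the _l and _r columns are left untouched."""
    return [{**row, "name_exact": row["name_l"] == row["name_r"]} for row in pairs]


def drop_non_matches(pairs: Pairs, **kwargs) -> Pairs:
    """Remove pairs we no longer want to consider, using a feature added earlier."""
    return [row for row in pairs if row["name_exact"]]


pairs = [
    {"record_id_l": 1, "record_id_r": 2, "name_l": "Ann", "name_r": "Ann"},
    {"record_id_l": 1, "record_id_r": 3, "name_l": "Ann", "name_r": "Bob"},
]

# Comparers compose: each one takes the output of the previous one.
scored = add_name_exact(pairs)
kept = drop_non_matches(scored)
```

Because every comparer has the same input and output type, they can be chained in any order that respects the columns each one relies on.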
Level-Based Comparers
Bin record pairs into discrete levels, based on levels of agreement.
Each LevelComparer represents a dimension, such as name, location, price, date, etc. Each one contains several MatchLevels, each of which is a level of agreement, such as exact, misspelling, within_1_km, etc.
            mismo.compare.MatchLevel
    An enum-like class for match levels.
This class is used to define the levels of agreement between two records.
Examples:
>>> import ibis
>>> from mismo.compare import MatchLevel
>>> class NameMatchLevel(MatchLevel):
...     EXACT = 0
...     NEAR = 1
...     ELSE = 2
The class acts as a container:
>>> len(NameMatchLevel)
3
>>> 2 in NameMatchLevel
True
>>> list(NameMatchLevel)
['EXACT', 'NEAR', 'ELSE']
You can access the hardcoded values:
>>> str(NameMatchLevel.EXACT)
'EXACT'
>>> int(NameMatchLevel.EXACT)
0
You can use indexing semantics to translate between strings and ints:
>>> NameMatchLevel[1]
'NEAR'
>>> NameMatchLevel["NEAR"]
1
>>> NameMatchLevel[ibis.literal(1)].execute()
'NEAR'
>>> NameMatchLevel[ibis.literal("NEAR")].execute()
1
You can construct your own values, both from python literals...
>>> NameMatchLevel("NEAR").as_integer()
1
>>> NameMatchLevel(2).as_string()
'ELSE'
>>> NameMatchLevel(3)
Traceback (most recent call last):
...
ValueError: Invalid value: 3. Must be one of {0, 1, 2}`
...And Ibis expressions
>>> import ibis
>>> levels_raw = ibis.array([0, 2, 1, 99]).unnest()
>>> levels = NameMatchLevel(levels_raw)
>>> levels.as_string().execute()
0    EXACT
1     ELSE
2     NEAR
3     None
Name: NameMatchLevel, dtype: object
>>> levels.as_integer().name("levels").execute()
0     0
1     2
2     1
3    99
Name: levels, dtype: int8
Comparisons work as you expect:
>>> NameMatchLevel.NEAR == 1
True
>>> NameMatchLevel(1) == "NEAR"
True
>>> (levels_raw == NameMatchLevel.NEAR).name("eq").execute()
0    False
1    False
2     True
3    False
Name: eq, dtype: bool
However, implicit ordering is not supported (file an issue if you think it should be):
>>> NameMatchLevel.NEAR > 0
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'NameMatchLevel' and 'int'
            mismo.compare.MatchLevel.__eq__
__eq__(
    other: int
    | str
    | NumericValue
    | StringValue
    | MatchLevel,
) -> bool | BooleanValue
            mismo.compare.MatchLevel.__init__
__init__(
    value: MatchLevel
    | int
    | str
    | StringValue
    | IntegerValue,
)
Create a new match level value.
If the given value is a python int or str, it is checked against the valid values for this class. If it is an ibis expression, we do no such check.
| PARAMETER | DESCRIPTION |
|---|---|
| value | The value of the match level. |
            mismo.compare.MatchLevel.as_integer
as_integer() -> int | IntegerValue
Convert to a python int or ibis integer, depending on the original type.
            mismo.compare.MatchLevel.as_string
as_string() -> str | StringValue
Convert to a python str or ibis string, depending on the original type.
            mismo.compare.LevelComparer
    Assigns a MatchLevel to record pairs based on one dimension, e.g. name.
            mismo.compare.LevelComparer.cases
  
      instance-attribute
  
    The cases to check for each level.
            mismo.compare.LevelComparer.levels
  
      instance-attribute
  
levels: Type[MatchLevelT]
The levels of agreement.
            mismo.compare.LevelComparer.name
  
      instance-attribute
  
name: str
The name of the comparer, e.g. "date", "address", "latlon", "price".
            mismo.compare.LevelComparer.representation
  
      class-attribute
      instance-attribute
  
representation: Literal['string', 'integer'] = 'integer'
The native representation of the levels in ibis expressions.
Integers are more performant, but strings are more human-readable.
            mismo.compare.LevelComparer.__call__
__call__(
    pairs: Table,
    *,
    representation: Literal["string", "integer"]
    | None = None,
) -> StringColumn | IntegerColumn
Label each record pair with the level that it matches.
Go through the levels in order. If a record pair matches a level, label it with that level. If none of the levels match a pair, it is labeled as "else".
| PARAMETER | DESCRIPTION |
|---|---|
| pairs | A table of record pairs. |

| RETURNS | DESCRIPTION |
|---|---|
| labels | The labels for each record pair. |
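The "go through the levels in order" rule can be sketched in plain Python as a first-match-wins loop. This is only a model of the logic described above, not mismo's implementation: the `CASES` list, the `label_pair` helper, and the dict-based rows are all hypothetical stand-ins for Ibis expressions over a table of pairs.

```python
from typing import Callable

Row = dict[str, str]

# Hypothetical cases for a "name" dimension, listed from strictest to loosest.
CASES: list[tuple[str, Callable[[Row], bool]]] = [
    ("EXACT", lambda r: r["name_l"] == r["name_r"]),
    ("NEAR", lambda r: r["name_l"].lower() == r["name_r"].lower()),
]


def label_pair(row: Row) -> str:
    """Return the first level whose condition holds, or "ELSE" if none do."""
    for level, condition in CASES:
        if condition(row):
            return level
    return "ELSE"


pairs = [
    {"name_l": "Ann", "name_r": "Ann"},
    {"name_l": "Ann", "name_r": "ann"},
    {"name_l": "Ann", "name_r": "Bob"},
]
labels = [label_pair(row) for row in pairs]
```

Order matters here: the first pair also satisfies the NEAR condition, but it is labeled EXACT because that case is checked first.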
Plotting
            mismo.compare.compared_dashboard
compared_dashboard(
    compared: Table,
    comparers: Iterable[LevelComparer],
    weights: Weights | None = None,
    *,
    width: int = 500,
) -> VBox
A dashboard for debugging compared record pairs.
Used to see which match levels are common, which are rare, and which Comparers are related to each other. For example, exact matches should appear together across all Comparers; pairs where this happens probably represent true matches.
| PARAMETER | DESCRIPTION |
|---|---|
| compared | The result of running the blocked table through the supplied comparers. |
| comparers | The LevelComparers that were used to compare the record pairs. |
| weights | The Weights used to score the comparers. If provided, the chart will be colored by the odds found from the Weights. |
| width | The width of the chart. |

| RETURNS | DESCRIPTION |
|---|---|
| VBox | The dashboard. |
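To get a feel for the kind of question the dashboard helps answer, the sketch below counts how often each combination of levels occurs across two comparers. It uses only the standard library and made-up data; the column names `name` and `address` and the rows themselves are assumptions, and this is not how compared_dashboard computes or renders anything.

```python
from collections import Counter

# Hypothetical output of two LevelComparers, one label per comparer per pair.
compared = [
    {"name": "EXACT", "address": "EXACT"},
    {"name": "EXACT", "address": "EXACT"},
    {"name": "NEAR", "address": "ELSE"},
    {"name": "ELSE", "address": "ELSE"},
    {"name": "ELSE", "address": "ELSE"},
    {"name": "ELSE", "address": "ELSE"},
]

# Count each (name level, address level) combination.
combo_counts = Counter((row["name"], row["address"]) for row in compared)

# Combinations sorted from most to least frequent.
ranked = combo_counts.most_common()
```

In this toy data, exact agreement on both dimensions occurs together, which is the pattern the description above associates with likely true matches.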