
EDA API

This module contains utilities for cleaning and preparing data for linkage.

mismo.eda.distribution_chart

distribution_chart(
    vals: Column, *, limit: int | None = None
) -> Chart

Make an Altair histogram of values.

Useful as an exploratory tool to look at what values are present in a column.

PARAMETER DESCRIPTION
vals

The values to plot.

TYPE: Column

limit

The maximum number of bars to plot. If None, defaults to 1000.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION
Chart

The histogram.
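As a rough illustration, the sketch below calls distribution_chart on a small in-memory table. The toy data, the column name "city", and the helper names expected_bars and make_chart are made up for this example. The plain-Python preview assumes that bars are value counts and that limit keeps the most frequent values, which this page does not state. It also assumes ibis and mismo are installed when make_chart is called.

```python
from collections import Counter

# Toy data; the column name "city" is made up for this example.
CITIES = ["Anchorage", "Juneau", "Anchorage", "Sitka", "Anchorage", "Juneau"]


def expected_bars(values, limit=None):
    """Plain-Python preview of the counts a value histogram summarizes.

    Assumption: each bar is one distinct value with its count, and
    `limit` keeps only the most frequent values.
    """
    return Counter(values).most_common(limit)


def make_chart(limit=None):
    """Build the chart with mismo.

    The imports are local so that ibis and mismo are only required
    when this function is actually called.
    """
    import ibis
    from mismo.eda import distribution_chart

    t = ibis.memtable({"city": CITIES})
    # `limit` is keyword-only, per the signature above.
    return distribution_chart(t["city"], limit=limit)


print(expected_bars(CITIES, limit=2))  # [('Anchorage', 3), ('Juneau', 2)]
```

In a notebook, evaluating make_chart(limit=2) as the last expression of a cell should render the returned Chart.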

mismo.eda.distribution_dashboard

distribution_dashboard(
    records: Table,
    *,
    column: str | None = None,
    limit: int | None = None,
) -> VBox

Make an ipywidget dashboard for exploring the distribution of values in a table.

PARAMETER DESCRIPTION
records

The table to plot.

TYPE: Table

column

The initial column to plot. If None, the first column is used. You can change this interactively in the returned dashboard.

TYPE: str | None DEFAULT: None

limit

The initial maximum number of values to plot. If None, defaults to 100. You can change this interactively in the returned dashboard.

TYPE: int | None DEFAULT: None

RETURNS DESCRIPTION
VBox

The dashboard.
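A minimal sketch of how the dashboard might be used, together with a plain-Python restatement of the fallbacks documented above (first column, 100 values). The helper names resolve_defaults and make_dashboard and the toy table are hypothetical and not part of mismo. The sketch assumes ibis, mismo, and ipywidgets are installed when make_dashboard is called.

```python
def resolve_defaults(columns, column=None, limit=None):
    """Hypothetical helper mirroring the documented fallbacks.

    If `column` is None the first column is used; if `limit` is None
    the initial limit is 100.
    """
    if column is None:
        column = columns[0]
    if limit is None:
        limit = 100
    return column, limit


def make_dashboard(column=None, limit=None):
    """Build the dashboard with mismo.

    The imports are local so that ibis and mismo are only required
    when this function is actually called.
    """
    import ibis
    from mismo.eda import distribution_dashboard

    # Toy records; the column names are made up for this example.
    t = ibis.memtable(
        {
            "name": ["Ann", "Bob", "Ann", "Cy"],
            "city": ["Juneau", "Sitka", "Juneau", "Nome"],
        }
    )
    # `column` and `limit` are keyword-only, per the signature above.
    return distribution_dashboard(t, column=column, limit=limit)


print(resolve_defaults(["name", "city"]))
print(resolve_defaults(["name", "city"], column="city", limit=5))
```

In a Jupyter notebook, evaluating make_dashboard(column="city") as the last expression of a cell should display the returned VBox, where the column and limit can then be changed interactively.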