
EDA API

This module contains utilities for exploratory data analysis (EDA): inspecting the values in your data before linkage.

mismo.eda.distribution_chart(vals: ir.Column, *, limit: int | None = None) -> alt.Chart

Make an Altair histogram of values.

Useful as an exploratory tool to look at what values are present in a column.

PARAMETER DESCRIPTION
vals

The values to plot.

TYPE: Column

limit

The maximum number of bars to plot. If None, defaults to 1000.

TYPE: int DEFAULT: None

RETURNS DESCRIPTION
Chart

The histogram.
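
To make the `limit` parameter concrete, here is a minimal pure-Python sketch of the kind of value-count data such a histogram displays. It is not mismo's implementation: `top_value_counts` and the sample `names` list are hypothetical. The choice to keep the most frequent values when the limit is reached is an assumption that the description above does not confirm.

```python
from collections import Counter


def top_value_counts(values, limit=None):
    """Count each distinct value and keep at most `limit` of them.

    Hypothetical helper, not part of mismo. Assumes that limit=None falls
    back to 1000 (the documented default number of bars) and that the
    most frequent values are the ones kept.
    """
    if limit is None:
        limit = 1000
    # most_common() sorts by descending count, so truncation drops rare values.
    return Counter(values).most_common(limit)


# Hypothetical sample column with repeated values and one missing value.
names = ["alice", "bob", "alice", None, "carol", "alice", "bob"]

bars = top_value_counts(names, limit=2)
print(bars)  # [('alice', 3), ('bob', 2)]
```

With mismo installed, the corresponding call on an Ibis column would follow the signature above, for example `mismo.eda.distribution_chart(t.name, limit=2)`, where `t` is a hypothetical Ibis table with a `name` column.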

mismo.eda.distribution_dashboard(records: ir.Table, *, column: str | None = None, limit: int | None = None) -> ipywidgets.VBox

Make an ipywidgets dashboard for exploring the distribution of values in a table.

PARAMETER DESCRIPTION
records

The table to plot.

TYPE: Table

column

The initial column to plot. If None, the first column is used. You can change this interactively in the returned dashboard.

TYPE: str DEFAULT: None

limit

The initial maximum number of values to plot. If None, defaults to 100. You can change this interactively in the returned dashboard.

TYPE: int DEFAULT: None

RETURNS DESCRIPTION
VBox

The dashboard.
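
Both optional parameters default to None and are resolved to concrete starting values. The sketch below restates that documented behaviour in plain Python. `resolve_dashboard_defaults` is a hypothetical helper written for illustration, not a function in mismo, and the empty-table check is an assumption added so that the sketch is well-defined.

```python
def resolve_dashboard_defaults(column_names, column=None, limit=None):
    """Return the (column, limit) pair a dashboard would start with.

    Hypothetical helper mirroring the parameter descriptions above:
    column=None selects the first column, limit=None selects 100.
    """
    if not column_names:
        # Assumption: with no columns there is nothing to plot.
        raise ValueError("the table has no columns")
    if column is None:
        column = column_names[0]
    if limit is None:
        limit = 100
    return column, limit


# Hypothetical column names of a records table.
columns = ["name", "city", "zipcode"]

print(resolve_dashboard_defaults(columns))                           # ('name', 100)
print(resolve_dashboard_defaults(columns, column="city", limit=25))  # ('city', 25)
```

With mismo installed, the equivalent call would be `mismo.eda.distribution_dashboard(t, column="city", limit=25)` on a hypothetical Ibis table `t`. Both values are only initial settings and can be changed interactively in the returned dashboard.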