MVSR Python API Reference

High-level Function Interface

mvsr(x, y, k, *, kernel=Kernel.Poly(1), algorithm=None, score=None, normalize=None, weighting=None, dtype=np.float64, keepdims=False, sortkey=None)

Run multi-variant segmented regression on input data, reducing it to k piecewise segments.

Parameters:
  • x (numpy.typing.ArrayLike) – Array-like containing the x input values. This gets transformed into the internal X matrix by the selected kernel. Values may be of any type.

  • y (numpy.typing.ArrayLike) – Array-like containing the y input values. Shape (n_samples,) or (n_variants, n_samples).

  • k (int) – Target number of segments for the regression.

  • kernel (Kernel.Raw) – Kernel used to transform x values into the internal X matrix, as well as normalize and interpolate y values. Defaults to Kernel.Poly().

  • algorithm (Algorithm | None) – Algorithm used to reduce the number of segments. If None, the algorithm will be selected automatically based on the number of samples, number of x dimensions and set k. Defaults to None.

  • score (Score | None) – Placeholder for k scoring method (not implemented yet).

  • normalize (bool | None) – Normalize y input values. If None, auto-enabled for multi-variant input data. Defaults to None.

  • weighting (numpy.typing.ArrayLike | None) – Optional per-variant weights. Defaults to None.

  • dtype (numpy.float32 | numpy.float64) – Internally used numpy data type. Defaults to numpy.float64.

  • keepdims (bool) – If set to False, return scalar values when evaluating single-variant segments. Defaults to False.

  • sortkey (Callable[[Any], Any] | None) – If the x values are not comparable, this function is used to extract a comparison key for each of them. Defaults to None.

Returns:

Regression object containing k segments.

Raises:
  • ValueError – If input dimensions of x, y, weighting are incompatible.

  • RuntimeError – If normalization is enabled but the selected kernel does not support it.
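
The shape rules for x, y and weighting can be illustrated with a small stand-alone check. This is only a sketch of the documented constraints, not the library's validation code; the helper name check_shapes and the error messages are hypothetical.

```python
def check_shapes(x, y, weighting=None):
    """Return (n_variants, n_samples) or raise ValueError (hypothetical helper)."""
    # A flat y of shape (n_samples,) is treated as a single variant.
    if isinstance(y[0], (list, tuple)):
        rows = [list(row) for row in y]
    else:
        rows = [list(y)]
    n_samples = len(x)
    for row in rows:
        if len(row) != n_samples:
            raise ValueError("every y variant needs one value per x sample")
    # Weights are per variant, so their count must match the number of rows.
    if weighting is not None and len(weighting) != len(rows):
        raise ValueError("weighting needs one entry per variant")
    return len(rows), n_samples


print(check_shapes([0, 1, 2], [5.0, 6.0, 7.0]))
print(check_shapes([0, 1, 2], [[1, 2, 3], [4, 5, 6]], [1, 2]))
```

The first call treats y as one variant with three samples; the second has two variants and one weight per variant.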

Classes

class Regression(xs, ys, kernel, starts, models, errors, keepdims, sortkey=None)

Regression consisting of multiple segments.

__call__(x)

Evaluate the regression for a given x value.

Parameters:

x (Any) – Input x value.

Returns:

Predicted y value.

Return type:

numpy.ndarray

__len__()

Get the number of segments.

Returns:

Number of segments.

Return type:

int

property segments

List of Segment objects.

Type:

list[Segment]

property starts

Input sample indices of segment starts.

Type:

numpy.ndarray

property variants

List of Regression objects for each variant.

Type:

list[Regression]

get_segment(x)

Get Segment object for a given x value.

Returns an interpolated Segment if x is in between segments.

Parameters:

x (Any) – Input x value.

Returns:

Segment corresponding to x.

Return type:

Segment

get_segment_index(x)

Get segment indices for a given x value.

Returns multiple indices if x is in between segments.

Parameters:

x (Any) – Input x value.

Returns:

Tuple of segment indices.

Return type:

tuple[int, ...]
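
One way to picture the lookup is a binary search over the first x value of each segment. The sketch below is an illustration only: the function name segment_index is hypothetical, and it assumes sorted scalar x values with each segment covering the samples from its start index up to the sample before the next start.

```python
from bisect import bisect_right


def segment_index(x, xs, starts):
    """Return the indices of the segment(s) responsible for x (sketch)."""
    # xs: sorted sample x values; starts: sample index where each segment begins.
    ends = [s - 1 for s in starts[1:]] + [len(xs) - 1]  # last sample per segment
    first_x = [xs[s] for s in starts]
    # Rightmost segment whose first sample is <= x, clamped to segment 0.
    i = max(bisect_right(first_x, x) - 1, 0)
    # x lies past the last sample of segment i but before segment i + 1 starts.
    if i + 1 < len(starts) and x > xs[ends[i]]:
        return (i, i + 1)
    return (i,)


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
starts = [0, 3]  # segments cover xs[0:3] and xs[3:6]
print(segment_index(1.5, xs, starts))
print(segment_index(2.5, xs, starts))
print(segment_index(4.0, xs, starts))
```

Here 2.5 falls in the gap between the last sample of the first segment (2.0) and the first sample of the second (3.0), so both indices are returned.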

get_segment_by_index(index)

Get Segment object for the given indices.

Returns an interpolated Segment if multiple indices are provided.

Parameters:

index (tuple[int, ...]) – Tuple of segment indices.

Returns:

Segment at the given index or interpolated segment.

Return type:

Segment

plot(ax, xs=1000, style={}, istyle=None)

Plot regression segments using matplotlib.

Parameters:
  • ax (Axes | Iterable[Axes]) – Single matplotlib Axes or iterable of Axes for each variant.

  • xs (int | numpy.typing.ArrayLike | Iterable[Any]) – Number of points to sample or array-like of explicit x values. Defaults to 1000.

  • style (dict[str, Any] | Iterable[dict[str, Any] | None]) – Matplotlib styling applied to segments. Can be provided as iterable for each variant. Defaults to {}.

  • istyle (dict[str, Any] | Iterable[dict[str, Any] | None] | None) – Matplotlib styling used for interpolated regions between segments. If None, uses default styling. Defaults to None.

Returns:

List of handles for plotted lines, per variant, per segment.

Return type:

list[list[list[matplotlib.lines.Line2D]]]

class Segment(xs, ys, kernel, model, errors, keepdims)

Regression segment.

Parameters:
  • xs (numpy.ndarray) – X input values.

  • ys (numpy.ndarray) – Y input values.

  • kernel (Kernel.Raw) – Kernel used to transform x values.

  • model (numpy.ndarray) – Model matrix describing the segment.

  • errors (numpy.ndarray) – Residual sum of squares for each segment sample.

  • keepdims (bool) – If set to False, return scalar values when evaluating single-variant segments.

__call__(x, keepdims=None)

Evaluate the segment for a given x value.

Parameters:
  • x (Any) – Input x value.

  • keepdims (bool | None) – If set to False, return scalar values when the segment only has one variant. If None, use value provided from segment initialization. Defaults to None.

Returns:

Predicted y value.

Return type:

numpy.ndarray

predict(xs, keepdims=None)

Evaluate the segment for the given x values.

Parameters:
  • xs (numpy.typing.ArrayLike) – Input x values.

  • keepdims (bool | None) – If set to False, return scalar values when the segment only has one variant. If None, use value provided from segment initialization. Defaults to None.

Returns:

Predicted y values.

Return type:

numpy.ndarray
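
The effect of keepdims can be sketched with a plain polynomial evaluation. This is an assumption-laden illustration, not the library's code: the function below is a stand-in, and the model layout (one coefficient row per variant, with entry d multiplying x**d) is assumed.

```python
def predict(model, xs, keepdims=False):
    """Evaluate a polynomial model (one coefficient row per variant) at xs."""
    # Assumed layout: model[v][d] multiplies x**d for variant v.
    ys = [[sum(c * x**d for d, c in enumerate(row)) for x in xs] for row in model]
    # With keepdims=False, a single-variant result loses its variant axis.
    if not keepdims and len(ys) == 1:
        return ys[0]
    return ys


line = [[1.0, 2.0]]  # y = 1 + 2x, single variant
print(predict(line, [0.0, 1.0, 2.0]))
print(predict(line, [0.0, 1.0], keepdims=True))
```

With keepdims left at False the single variant comes back as a flat list; with keepdims=True the variant axis is preserved.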

property model

Model matrix describing the segment.

Type:

numpy.ndarray

get_model(keepdims=None)

Get the model matrix describing the segment.

Parameters:

keepdims (bool | None) – If set to False, return scalar values when the segment only has one variant. If None, use value provided from segment initialization. Defaults to None.

Returns:

Model matrix.

Return type:

numpy.ndarray

property range

Input x value range.

Type:

tuple[Any, Any]

property samplecount

Number of samples.

Type:

int

property xs

Input x values.

Type:

numpy.ndarray

property ys

Input y values.

Type:

numpy.ndarray

property rss

Residual sum of squares, per sample.

Type:

numpy.ndarray

property mse

Mean squared error, per sample.

Type:

numpy.ndarray

plot(ax, xs=1000, style={})

Plot segment using matplotlib.

Parameters:
  • ax (Axes | Iterable[Axes]) – Single matplotlib Axes or iterable of Axes for each variant.

  • xs (int | numpy.typing.ArrayLike | Iterable[Any]) – Number of points to sample or array-like of explicit x values. Defaults to 1000.

  • style (dict[str, Any] | Iterable[dict[str, Any] | None]) – Matplotlib styling applied to segments. Can be provided as iterable for each variant. Defaults to {}.

Returns:

List of handles for plotted lines, per variant.

Return type:

list[list[matplotlib.lines.Line2D]]

Kernels

class Kernel.Raw(translation_dimension=None, model_interpolation=Interpolate.closest)

Raw Kernel to be used as a base class for other Kernel types.

Implements pass-through transformation of x values, normalization of y values and interpolation between segments.

Parameters:
  • translation_dimension (int | None) – Index of the model dimension that translates the regression along the y axis (required for normalization). Defaults to None.

  • model_interpolation (Callable[[numpy.typing.ArrayLike, list[Segment]], list[float]] | None) – Function to interpolate between neighbouring segments. Defaults to Interpolate.closest.

Raw.__call__(x)

Convert an array-like of x values into the numpy array of dimensions that forms the internal X matrix.

Parameters:

x (numpy.typing.ArrayLike) – Input x values.

Returns:

Internal X matrix to use with libmvsr.Mvsr.

Return type:

numpy.ndarray

Raw.normalize(y)

Normalize each y variant to a range of [0,1].

Parameters:

y (numpy.ndarray) – Input y values. Shape (n_variants, n_samples)

Raises:

RuntimeError – If translation_dimension has not been specified.

Returns:

Normalized y values.

Return type:

numpy.ndarray
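
Scaling each variant to [0,1] amounts to min-max normalization per row. The following is a minimal sketch under assumptions: it uses plain lists instead of numpy arrays, returns the (min, range) pairs explicitly rather than storing them on a kernel, and leaves constant rows at 0, which the library may handle differently.

```python
def normalize(y):
    """Scale each variant (row) of y to [0, 1]; also return (min, range) pairs."""
    out, params = [], []
    for row in y:
        lo, hi = min(row), max(row)
        span = (hi - lo) or 1.0  # assumption: constant rows map to all zeros
        out.append([(v - lo) / span for v in row])
        params.append((lo, span))
    return out, params


scaled, params = normalize([[2.0, 4.0, 6.0], [10.0, 10.0, 10.0]])
print(scaled)
print(params)
```

Normalizing this way keeps variants with large magnitudes from dominating the combined error when several variants are segmented together.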

Raw.denormalize(models)

Denormalize models derived from values previously normalized with normalize().

Parameters:

models (numpy.ndarray) – Models for regression segments.

Raises:

RuntimeError – If normalize() has not been called on this kernel before.

Returns:

Denormalized segment models.

Return type:

numpy.ndarray
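
Why a translation dimension is needed becomes visible when the scaling is undone on a fitted model. The sketch below assumes min-max normalization with a stored minimum and range per variant, and a model whose translation coefficient multiplies a constant 1; the function and its arguments are hypothetical.

```python
def denormalize(model, lo, span, translation_dimension=0):
    """Undo min-max scaling on one variant's coefficient row (sketch)."""
    # y = lo + span * y_norm, so every coefficient is stretched by span ...
    out = [span * c for c in model]
    # ... and only the constant (translation) term absorbs the offset lo.
    out[translation_dimension] += lo
    return out


# Normalized fit y_norm = 0.25 + 0.5x, original minimum 2.0 and range 4.0.
print(denormalize([0.25, 0.5], lo=2.0, span=4.0))
```

The result corresponds to y = 3 + 2x, which equals 2 + 4 * (0.25 + 0.5x). Without a dimension that shifts the model along the y axis there would be nowhere to add the offset back.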

Raw.interpolate(segments)

Create interpolated Segment using the provided model_interpolation.

Parameters:

segments (list[Segment]) – List of segments to be interpolated between.

Raises:

RuntimeError – If model_interpolation has not been specified.

Returns:

Interpolated segment.

Return type:

Segment

class Kernel.Poly(degree=1, model_interpolation=None)

Kernel for polynomial regression segments.

Bases: Kernel.Raw

Inherited Methods: normalize(), denormalize()

Parameters:
  • degree (int) – Polynomial degree.

  • model_interpolation (Callable[[numpy.typing.ArrayLike, list[Segment]], list[float]] | None) – Function to interpolate between neighbouring segments. If None, interpolates linearly between segment endpoints. Defaults to None.

Poly.__call__(x)

Convert an array-like of x values into the numpy array of dimensions that forms the internal X matrix.

Parameters:

x (numpy.typing.ArrayLike) – Input x values.

Returns:

Internal X matrix to use with libmvsr.Mvsr.

Return type:

numpy.ndarray
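
For a polynomial kernel, the transformation is conceptually a power expansion of each x value. The sketch below is an assumption about the general idea only: the column order (constant term first) and the orientation of the resulting matrix may differ from what the library produces.

```python
def poly_features(xs, degree=1):
    """Expand each x into [1, x, x**2, ..., x**degree] (assumed column order)."""
    return [[x**d for d in range(degree + 1)] for x in xs]


print(poly_features([2.0, 3.0], degree=2))
```

In this layout the constant column at index 0 is the one that translates a fitted model along the y axis.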

Poly.interpolate(segments)

Create interpolated Segment.

Uses model_interpolation if provided; otherwise interpolates linearly between segment endpoints.

Parameters:

segments (list[Segment]) – List of segments to be interpolated between.

Raises:

RuntimeError – If model_interpolation is set to None and more than 2 segments were provided or segments were constructed from multidimensional x values.

Returns:

Interpolated segment.

Return type:

Segment
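
Linear interpolation between segment endpoints can be pictured as a straight line from the left segment's value at its last x to the right segment's value at its first x. The helpers below are hypothetical and assume scalar x values and coefficient lists ordered from the constant term upwards; they are a sketch of the idea, not the library's implementation.

```python
def evaluate(model, x):
    """Evaluate polynomial coefficients [c0, c1, ...] at x."""
    return sum(c * x**d for d, c in enumerate(model))


def bridge(left_model, left_end, right_model, right_start):
    """Degree-1 model joining the end of the left segment to the start of the right."""
    y0 = evaluate(left_model, left_end)
    y1 = evaluate(right_model, right_start)
    slope = (y1 - y0) / (right_start - left_end)
    return [y0 - slope * left_end, slope]


# Left segment y = 1 + x ends at x = 2; right segment y = 10 - x starts at x = 4.
gap = bridge([1.0, 1.0], 2.0, [10.0, -1.0], 4.0)
print(gap)
print(evaluate(gap, 3.0))
```

A single straight line is only well defined between exactly two segments over a one-dimensional x, which is consistent with the RuntimeError conditions listed above.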

Enums

class Algorithm(*values)

Algorithm used to reduce the number of segments.

GREEDY = 0

Fast Greedy Algorithm (\(O(n \log n)\)).
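
The general idea behind a greedy reduction is bottom-up merging: start with many small segments and repeatedly merge the neighbouring pair whose merge increases the error least. The sketch below is not the library's algorithm. It assumes constant (mean) segments and rescans all boundaries on every step, which is slower than the stated \(O(n \log n)\); reaching that bound would typically require a priority queue of merge costs.

```python
def sse(values):
    """Sum of squared errors around the mean (cost of one constant segment)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def greedy_segments(y, k):
    """Merge neighbouring segments until only k remain; return start indices."""
    bounds = list(range(len(y))) + [len(y)]  # every sample starts as a segment
    while len(bounds) - 1 > k:
        best_i, best_cost = None, None
        for i in range(1, len(bounds) - 1):  # candidate boundary to remove
            lo, mid, hi = bounds[i - 1], bounds[i], bounds[i + 1]
            extra = sse(y[lo:hi]) - sse(y[lo:mid]) - sse(y[mid:hi])
            if best_cost is None or extra < best_cost:
                best_i, best_cost = i, extra
        del bounds[best_i]  # the cheapest merge wins
    return bounds[:-1]


print(greedy_segments([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], 2))
```

For this input the two remaining segments start at sample indices 0 and 3, splitting the low values from the high ones. Greedy merging is fast but is not guaranteed to find the split with the lowest total error.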

DP = 1

Dynamic Program (\(O(n^2)\)).
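
A dynamic program finds the split with the lowest total error by building the best solution for each prefix of the data. The sketch below illustrates that idea for constant (mean) segments only. It is not the library's implementation and is not optimized: it recomputes segment costs from scratch, so it does not reach the complexity stated above.

```python
def sse(values):
    """Sum of squared errors around the mean (cost of one constant segment)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def dp_segments(y, k):
    """Optimal split of y into k constant segments; returns start indices."""
    n = len(y)
    inf = float("inf")
    # best[j][i]: minimal cost of covering y[:i] with j segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):  # the last segment is y[s:i]
                cost = best[j - 1][s] + sse(y[s:i])
                if cost < best[j][i]:
                    best[j][i], cut[j][i] = cost, s
    # Walk the stored cut points backwards to recover the segment starts.
    starts, i = [], n
    for j in range(k, 0, -1):
        i = cut[j][i]
        starts.append(i)
    return starts[::-1]


print(dp_segments([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], 2))
```

On this small input the optimal starts are 0 and 3. The trade-off against a greedy approach is exactness versus running time.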

class Score(*values)

Scoring method used to determine the number of segments.

EXACT = 0

Output the exact number of segments provided.

Interpolation

Interpolate.left(_x, segments)

Always use the leftmost (first) Segment for interpolating.

Interpolate.right(_x, segments)

Always use the rightmost (last) Segment for interpolating.

Interpolate.closest(x, segments)

Use the Segment that is closest to x for interpolating.

Interpolate.linear(x, segments)

Interpolate linearly between Segments based on x.

Interpolate.smooth(x, segments)

Interpolate smoothly (using a cubic function) between Segments based on x.
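
The five strategies differ only in how they weight the neighbouring segments at a given x. The sketch below expresses them as weights for a left and a right segment. It is a simplification under assumptions: it takes the gap boundaries as numbers instead of Segment objects, sends an exact tie in "closest" to the right, and uses the smoothstep polynomial 3t^2 - 2t^3 as the cubic, which may not match the library's choice.

```python
def weights(kind, x, left_end, right_start):
    """Weights [left, right] for blending two neighbouring segments at x."""
    # t runs from 0 at the end of the left segment to 1 at the start of the right.
    t = (x - left_end) / (right_start - left_end)
    if kind == "left":
        return [1.0, 0.0]
    if kind == "right":
        return [0.0, 1.0]
    if kind == "closest":
        return [1.0, 0.0] if t < 0.5 else [0.0, 1.0]
    if kind == "linear":
        return [1.0 - t, t]
    if kind == "smooth":
        s = 3 * t**2 - 2 * t**3  # assumed cubic: smoothstep
        return [1.0 - s, s]
    raise ValueError(f"unknown interpolation kind: {kind}")


for kind in ("left", "right", "closest", "linear", "smooth"):
    print(kind, weights(kind, 2.5, 2.0, 4.0))
```

At x = 2.5 in a gap from 2.0 to 4.0, the linear weights are 0.75 and 0.25, while the smooth variant stays closer to the left segment because the cubic is flat near both ends.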