apunim package

Module contents

class apunim.ApunimResult(apunim, pvalue, support)

Bases: tuple

Container for the result of the Aposteriori Unimodality (apunim) test for a single factor level.

Attributes:
apunim (float): The apunim statistic for the factor.
  • apunim > 0: Increased polarization due to group differences.

  • apunim < 0: Decreased polarization due to group differences.

  • apunim ≈ 0: Polarization explained by chance.

  • NaN indicates that the statistic could not be computed.

pvalue (float): The p-value associated with the apunim statistic.

Reflects the statistical significance of the observed polarization relative to randomized partitions. NaN indicates that the p-value could not be computed.

support (int): The number of observations (annotations) for the factor.

apunim

Alias for field number 0

pvalue

Alias for field number 1

support

Alias for field number 2
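Because ApunimResult is a tuple subclass with named fields, a result can be read by attribute, by index, or by unpacking. The sketch below uses a stand-in namedtuple with the documented field order so that it runs without the package installed. The `describe` helper, the 0.05 threshold, and the example values are hypothetical illustrations, not part of the library.

```python
import math
from collections import namedtuple

# Stand-in that mirrors the documented field order of apunim.ApunimResult.
ApunimResult = namedtuple("ApunimResult", ["apunim", "pvalue", "support"])


def describe(result, alpha=0.05):
    """Hypothetical helper: turn one result into a short verbal reading."""
    # NaN marks a statistic or p-value that could not be computed.
    if math.isnan(result.apunim) or math.isnan(result.pvalue):
        return "not computable"
    if result.pvalue >= alpha:
        return "no significant effect"
    if result.apunim > 0:
        return "increased polarization"
    return "decreased polarization"


# Made-up values, for illustration only.
example = ApunimResult(apunim=0.3, pvalue=0.01, support=120)

# Field access by name, by index, and by unpacking are equivalent.
stat, pvalue, support = example
reading = describe(example)
undefined = describe(ApunimResult(float("nan"), float("nan"), 0))
```

Checking for NaN before comparing matters because every comparison against NaN evaluates to False, which would otherwise send an uncomputable result down the wrong branch.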

apunim.aposteriori_unimodality(annotations: Collection[float], factor_group: Collection[FactorType], comment_group: Collection[FactorType], num_bins: int | None = None, iterations: int = 100, alpha: float | None = 0.05, two_sided: bool = True, seed: int | None = None) → dict[FactorType, ApunimResult]

Perform the Aposteriori Unimodality (apunim) test for group-wise polarization.

This test evaluates whether differences between annotator groups (e.g., gender, age) contribute significantly to the polarization observed in a dataset, as measured by Distance From Unimodality (DFU).

The test compares the observed DFU of each factor level to the distribution of DFU values obtained by randomly partitioning annotations according to group sizes (apriori randomization). The apunim statistic quantifies the relative increase or decrease in polarization attributable to group differences.

Generally:

  • apunim > 0: increased polarization due to group differences.

  • apunim < 0: decreased polarization due to group differences.

  • apunim ≈ 0: polarization explained by chance.
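The randomization idea described above can be sketched generically: compute a polarization measure on the annotations of one group, then compare it with the same measure on random subsets of equal size. This is only an illustration of the general technique. It uses population variance as a stand-in for DFU, works on a single pool of annotations, and does not reproduce the library's apunim statistic, its aggregation across comments, or its p-value computation. The function name and the toy data are assumptions.

```python
import random
from statistics import mean, pvariance


def randomized_baseline(annotations, group_size, iterations, seed=None):
    """Stand-in statistic (population variance) on random subsets of a fixed size."""
    rng = random.Random(seed)
    return [
        pvariance(rng.sample(annotations, group_size))
        for _ in range(iterations)
    ]


# Toy data: group "a" gives low scores, group "b" gives high scores.
annotations = [1, 1, 2, 5, 5, 4, 1, 5]
groups = ["a", "a", "a", "b", "b", "b", "a", "b"]

# Observed value for group "a" alone.
group_a = [x for x, g in zip(annotations, groups) if g == "a"]
observed = pvariance(group_a)

# Values expected if group membership were assigned at random.
baseline = randomized_baseline(
    annotations, group_size=len(group_a), iterations=200, seed=0
)
expected = mean(baseline)
```

In this toy case the annotations inside group "a" are more homogeneous than random subsets of the same size, which is the kind of gap between observed and randomized values that the test builds on.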

Parameters:
  • annotations (Collection[float]) – A list of annotation scores, where each element corresponds to an annotation (e.g., a toxicity score) made by an annotator. Values need not be discrete.

  • factor_group (Collection[FactorType]) – A list indicating the group assignment (e.g., 'male', 'female') of the annotator who produced each annotation. For example, if two annotations were made by a male and a female annotator respectively, the provided factor_group would be ["male", "female"].

  • comment_group (Collection[FactorType]) – A list of comment identifiers, where each element associates an annotation with a specific comment in the discussion.

  • num_bins (int | None) – The number of bins to use when computing the DFU polarization metric. For discrete data, use the number of possible annotation levels; for example, an annotation task on a 1-5 Likert scale should use 5 bins. Pass None to create as many bins as there are distinct values in the annotations. WARNING: If set to None, check that every possible value appears at least once in the provided annotations.

  • iterations (int) – The number of randomized groups compared against the original groups. A larger number makes the method more accurate, but also more computationally expensive.

  • alpha (float | None) – The target significance level. Used to apply p-value correction for multiple comparisons. Pass None to disable p-value correction.

  • two_sided (bool) – Whether to test for both decreased and increased polarization (True) or only for increased polarization (False). Defaults to True.

  • seed (int | None) – The random seed. Use None for non-deterministic output.
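The three input collections are parallel: element i of each one describes the same annotation. A minimal sketch of deriving them from long-format records follows. The record layout, the field names, and the 1-5 scale are assumptions made for illustration, and the final call is left as a comment because a toy dataset this small is not a realistic input for the test.

```python
# Hypothetical long-format records: one row per (comment, annotator) pair.
records = [
    {"comment": "c1", "gender": "male", "score": 1},
    {"comment": "c1", "gender": "female", "score": 5},
    {"comment": "c1", "gender": "male", "score": 2},
    {"comment": "c1", "gender": "female", "score": 5},
    {"comment": "c2", "gender": "male", "score": 3},
    {"comment": "c2", "gender": "female", "score": 3},
]

# Parallel collections: index i refers to the same annotation in all three.
annotations = [r["score"] for r in records]
factor_group = [r["gender"] for r in records]
comment_group = [r["comment"] for r in records]

# With num_bins=None the bins follow the distinct observed values, so check
# whether every level of the (assumed) 1-5 scale actually occurs in the data.
scale = set(range(1, 6))
missing_levels = sorted(scale - set(annotations))
num_bins = len(scale) if missing_levels else None

# The call itself, using the documented signature:
# results = apunim.aposteriori_unimodality(
#     annotations, factor_group, comment_group, num_bins=num_bins, seed=42
# )
```

Here level 4 never occurs, so the sketch falls back to an explicit num_bins of 5 instead of relying on the observed values.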

Returns:

Dictionary mapping each factor level to an ApunimResult namedtuple containing the apunim statistic, its p-value, and the support.

Return type:

dict[FactorType, ApunimResult]

Raises:

ValueError

  • If input lists differ in length.

  • If annotations is empty.

  • If factor_group has fewer than 2 unique groups.

  • If comment_group has fewer than 2 unique comments.

  • If iterations < 1.

  • If num_bins < 2.

  • If alpha is not in the range [0,1].

  • If no valid polarized comments are found (all DFU ≤ 0.01 or fewer than 2 annotator groups per comment).

  • If _apriori_polarization_stat finds inconsistent group sizes vs. annotations.
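The argument checks listed above can be mirrored in a small pre-flight function, which is useful for failing early with a clear message before a long run. This is a hypothetical helper written from the documented conditions, not the library's own validation code. It covers only the checks that depend on the arguments alone, and the error messages are made up.

```python
def check_inputs(annotations, factor_group, comment_group,
                 num_bins=None, iterations=100, alpha=0.05):
    """Hypothetical pre-flight check mirroring the documented ValueError cases."""
    if not (len(annotations) == len(factor_group) == len(comment_group)):
        raise ValueError("input lists differ in length")
    if len(annotations) == 0:
        raise ValueError("annotations is empty")
    if len(set(factor_group)) < 2:
        raise ValueError("factor_group has fewer than 2 unique groups")
    if len(set(comment_group)) < 2:
        raise ValueError("comment_group has fewer than 2 unique comments")
    if iterations < 1:
        raise ValueError("iterations < 1")
    if num_bins is not None and num_bins < 2:
        raise ValueError("num_bins < 2")
    if alpha is not None and not 0 <= alpha <= 1:
        raise ValueError("alpha is not in the range [0, 1]")


# A valid call passes silently.
check_inputs([1, 5, 2, 4], ["m", "f", "m", "f"], ["c1", "c1", "c2", "c2"])

# A mismatched call fails on the first violated condition.
try:
    check_inputs([1, 2], ["m"], ["c1", "c2"])
except ValueError as err:
    message = str(err)
```

The data-dependent conditions (no polarized comments, inconsistent group sizes) are left out because they can only be detected while the statistic is being computed.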

Note

The test is relatively robust even with a small number of annotations per comment. The p-value estimation is parametric (Student's t-test).

apunim.dfu(x: Collection[float], bins: int, normalized: bool = True) → float

Compute the Distance From Unimodality (DFU) for a sequence of annotations.

DFU measures how much a distribution deviates from being unimodal. The normalized DFU (nDFU) rescales the value to the range [0, 1].

  • DFU/nDFU = 0 indicates a unimodal or flat distribution.

  • Higher DFU/nDFU values indicate stronger multimodality or polarization.

  • nDFU = 1 indicates the maximum possible polarization.

Parameters:
  • x (Collection[float]) – Sequence of annotation values (e.g., ratings, scores). Values need not be discrete, but discrete annotations should use a number of bins equal to the number of distinct values.

  • bins (int) – Number of bins to use for histogramming. For discrete data, it is recommended to use the number of distinct annotation levels.

  • normalized (bool) – If True, returns the normalized DFU (nDFU). If False, returns the raw DFU.

Raises:

ValueError – If x is empty or bins < 2.

Returns:

DFU or normalized DFU (nDFU) statistic for the sequence.

Return type:

float

Note

DFU is computed based on the maximum difference between the histogram peak and its neighbors. For details on the methodology and usage, see the original paper: Pavlopoulos and Likas 2024.
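One reading of the description above is: build a relative-frequency histogram, locate the tallest bin, walk outwards in both directions, and record the largest increase between adjacent bins. A unimodal histogram only falls away from its peak, so the result is 0. The sketch below implements that reading in plain Python. It is an assumption about the method, not the code behind apunim.dfu, and the equal-width binning over the observed range and the edge-case handling may differ from the library.

```python
def sketch_dfu(values, bins, normalized=True):
    """Illustrative DFU: largest rise found while walking away from the peak."""
    if len(values) == 0 or bins < 2:
        raise ValueError("values must be non-empty and bins must be >= 2")

    # Equal-width histogram over the observed range, as relative frequencies.
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    hist = [c / len(values) for c in counts]

    peak = hist.index(max(hist))
    max_rise = 0.0
    # Walk right from the peak: any rise signals a second mode.
    for i in range(peak, bins - 1):
        max_rise = max(max_rise, hist[i + 1] - hist[i])
    # Walk left from the peak.
    for i in range(peak, 0, -1):
        max_rise = max(max_rise, hist[i - 1] - hist[i])

    # Normalizing by the peak height rescales the result to [0, 1].
    return max_rise / hist[peak] if normalized else max_rise


polarized = sketch_dfu([1, 1, 1, 5, 5, 5], bins=5)
unimodal = sketch_dfu([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5)
raw = sketch_dfu([1, 1, 1, 5, 5, 5], bins=5, normalized=False)
```

The fully split sample reaches the maximum normalized value, while the single-peaked sample yields 0, matching the interpretation given for DFU/nDFU.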

See also

Credits: Original code and concept adapted from John Pavlopoulos: https://github.com/ipavlopoulos/ndfu