Expectation Reference
This page gives an overview of all available datapact expectations.
- class datapact.Asserter(parent: SeriesTest, critical: bool)
- be_between(minimum: float, maximum: float)
checks if all values are between minimum and maximum (inclusive).
- Parameters:
minimum – if there’s a value lower than this, it will fail.
maximum – if there’s a value higher than this, it will fail.
Examples
>>> dp.age.should.be_between(0, 150)
- be_positive()
checks if all values are 0 or higher.
Examples
>>> dp.age.should.be_positive()
- be_negative()
checks if all values are 0 or smaller.
Examples
>>> dp.debt.should.be_negative()
- not_be_na()
checks if all values are non-na.
Examples
>>> dp.user_id.must.not_be_na()
- be_one_of(*allowed_values)
checks if values are in the allowed values.
Examples
>>> dp.state.must.be_one_of("active", "sleeping", "inactive")
- be_date()
checks if all values are ISO8601-compliant dates. datetimes will be rejected.
Examples
>>> dp.day.must.be_date()
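To make the date/datetime distinction concrete, here is a minimal sketch of such a check using only the standard library. `is_iso_date` is a hypothetical helper, not part of datapact, and datapact's own parsing rules may differ (for example in which ISO 8601 variants it accepts).

```python
from datetime import date


def is_iso_date(value) -> bool:
    """Hypothetical helper: True if value is a string holding an ISO 8601 date."""
    if not isinstance(value, str):
        return False
    try:
        # date.fromisoformat raises ValueError for strings with a time component.
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


print(is_iso_date("2024-01-31"))           # a plain calendar date
print(is_iso_date("2024-01-31T10:00:00"))  # a datetime, not a plain date
```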
- be_datetime()
checks if all values are ISO8601-compliant datetimes.
Examples
>>> dp.timestamp.must.be_datetime()
- be_unix_epoch()
checks if all values are unix epoch-compliant timestamps.
Examples
>>> dp.timestamp.must.be_unix_epoch()
- be_normal_distributed(alpha: float = 0.05)
checks if the series is normally distributed.
uses scipy.stats.normaltest under the hood.
- Parameters:
alpha – sensitivity of the test. low value = more sensitive.
Examples
>>> dp.salary.should.be_normal_distributed(alpha=0.1)
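The underlying test can be tried directly with scipy. The sketch below is not datapact's implementation: `looks_normal` is a hypothetical helper, and it assumes the expectation fails when the p-value drops below alpha.

```python
import numpy as np
from scipy import stats


def looks_normal(values, alpha: float = 0.05) -> bool:
    """Hypothetical helper: True if normality is not rejected at level alpha."""
    # normaltest combines skewness and kurtosis into a single statistic.
    _, p_value = stats.normaltest(values)
    # Assumption: the check fails when the p-value is below alpha.
    return bool(p_value >= alpha)


rng = np.random.default_rng(seed=0)
skewed = rng.exponential(scale=1.0, size=1000)  # strongly right-skewed data
print(looks_normal(skewed, alpha=0.05))
```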
- match_sample(sample, alpha=0.05)
checks if the series is from the same distribution as sample using a Kolmogorov-Smirnov test.
- Parameters:
sample – list-like sample to compare to
Examples
>>> dp.age.should.match_sample(reference_sample)
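A two-sample Kolmogorov-Smirnov comparison can be sketched with `scipy.stats.ks_2samp`. Whether datapact calls this exact function is an assumption, as are the hypothetical helper `same_distribution` and the rule that the check fails when the p-value is below alpha.

```python
import numpy as np
from scipy import stats


def same_distribution(series, sample, alpha: float = 0.05) -> bool:
    """Hypothetical helper: True if the two samples are not distinguishable."""
    result = stats.ks_2samp(series, sample)
    # Assumption: the check fails when the p-value is below alpha.
    return bool(result.pvalue >= alpha)


reference = np.linspace(0.0, 1.0, 200)  # deterministic stand-in for a sample
shifted = reference + 10.0              # same shape, very different location

print(same_distribution(reference, reference))
print(same_distribution(shifted, reference))
```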
- match_cdf(cdf: Callable, args, N=20, alpha=0.05)
checks if the series is from the distribution given by cdf using a Kolmogorov-Smirnov test.
- Parameters:
cdf – callable used to calculate the CDF.
args – Distribution parameters, given to cdf.
Examples
>>> dp.wins.should.match_cdf(scipy.stats.binom.cdf, args=(10, 0.5))
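The parameter names mirror `scipy.stats.kstest`, so a one-sample check against a CDF callable might look like the sketch below. `matches_cdf` is a hypothetical helper, and the way `args` is forwarded to the callable is an assumption based on scipy's behaviour, not on datapact's source.

```python
import numpy as np
from scipy import stats


def matches_cdf(values, cdf, args=(), alpha: float = 0.05) -> bool:
    """Hypothetical helper: True if values are compatible with cdf(x, *args)."""
    # kstest evaluates cdf(x, *args) and compares it to the empirical CDF.
    result = stats.kstest(values, cdf, args=args)
    # Assumption: the check fails when the p-value is below alpha.
    return bool(result.pvalue >= alpha)


# Deterministic data: evenly spaced quantiles of a standard normal.
values = stats.norm.ppf(np.linspace(0.005, 0.995, 200))

print(matches_cdf(values, stats.norm.cdf, args=(0.0, 1.0)))  # loc=0, scale=1
print(matches_cdf(values, stats.norm.cdf, args=(5.0, 1.0)))  # loc=5, scale=1
```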
- be_binomial_distributed(n, p=0.5, N=20, alpha=0.05)
checks if the series is binomially distributed using a Kolmogorov-Smirnov test.
- Parameters:
n – number of draws
p – probability of success
Examples
>>> dp.heads.should.be_binomial_distributed(n=10)
- be_poisson_distributed(l, N=20, alpha=0.05)
checks if the series is Poisson distributed using a Kolmogorov-Smirnov test.
- Parameters:
l – lambda (rate parameter) of the Poisson distribution
Examples
>>> dp.new_covid_cases.should.be_poisson_distributed(10)
- have_average_between(minimum: float, maximum: float)
checks if average is between min and max.
Examples
>>> dp.size.should.have_average_between(4, 5)
- have_variance_between(minimum: float, maximum: float)
checks if variance is in given range.
Examples
>>> dp.size.should.have_variance_between(150, 170)
- have_median_between(minimum: float, maximum: float)
checks if median is in given range.
Examples
>>> dp.size.should.have_median_between(150, 170)
- have_percentile_between(p: float, minimum: float, maximum: float)
checks if percentile / quantile is in given range.
Examples
>>> dp.size.should.have_percentile_between(.95, 10, 20)
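In the example above, p is passed as a fraction (.95) rather than a percentage. A rough equivalent in plain pandas is sketched below. `percentile_between` is a hypothetical helper, and the inclusive bounds and pandas' default linear interpolation are assumptions about how datapact behaves.

```python
import pandas as pd


def percentile_between(series: pd.Series, p: float,
                       minimum: float, maximum: float) -> bool:
    """Hypothetical helper: True if the p-quantile lies in [minimum, maximum]."""
    value = series.quantile(p)  # pandas interpolates linearly by default
    return bool(minimum <= value <= maximum)


sizes = pd.Series(range(1, 101))  # the integers 1..100
print(sizes.quantile(0.95))
print(percentile_between(sizes, 0.95, 10, 20))
print(percentile_between(sizes, 0.95, 90, 100))
```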
- have_no_outliers(alpha=0.05)
verifies that the series has no outliers using Grubbs' test.
Examples
>>> dp.size.should.have_no_outliers()
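For readers unfamiliar with Grubbs' test, the following is a minimal sketch of the standard two-sided variant. It is not taken from datapact: `grubbs_has_outlier` is a hypothetical helper, and datapact may use a one-sided or iterated version.

```python
import numpy as np
from scipy import stats


def grubbs_has_outlier(values, alpha: float = 0.05) -> bool:
    """Two-sided Grubbs' test: True if the most extreme value is an outlier."""
    x = np.asarray(values, dtype=float)
    n = x.size
    if n < 3:
        raise ValueError("Grubbs' test needs at least 3 values")
    # Test statistic: largest absolute deviation in units of the sample std.
    g = np.max(np.abs(x - x.mean())) / x.std(ddof=1)
    # Critical value from the t distribution with n - 2 degrees of freedom.
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    return bool(g > g_crit)


print(grubbs_has_outlier([10, 11, 9, 10, 12, 10, 11, 9, 10, 50]))
print(grubbs_has_outlier([10, 11, 9, 10, 12, 10, 11, 9, 10, 11]))
```

Grubbs' test assumes the data are approximately normal, so a skewed series may be flagged even without genuine outliers.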
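- fulfill(custom_assertion: Callable[[Series], Optional[str]])
checks if the series passes your custom validator. the callable receives the series and returns an error message on failure, or None if the check passes.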
Examples
>>> def custom_assertion(series: pandas.Series):
...     if series.max() > 100:
...         return "too high"
>>> dp.user_id.must.fulfill(custom_assertion)
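A custom validator is an ordinary function, so it can be exercised on a plain pandas Series before it is handed to fulfill. The sketch below assumes, as the signature and the example suggest, that returning a string marks a failure and returning None a pass. `ids_are_unique` is a hypothetical validator.

```python
from typing import Optional

import pandas as pd


def ids_are_unique(series: pd.Series) -> Optional[str]:
    """Return an error message if the series has duplicates, else None."""
    # duplicated() flags every repeat after the first occurrence.
    duplicate_count = int(series.duplicated().sum())
    if duplicate_count > 0:
        return f"{duplicate_count} duplicated value(s)"
    return None


print(ids_are_unique(pd.Series([1, 2, 3])))
print(ids_are_unique(pd.Series([1, 2, 2])))
```

It would then be passed on as `dp.user_id.must.fulfill(ids_are_unique)`.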