SigmaClip

class astropy.stats.SigmaClip(sigma=3.0, sigma_lower=None, sigma_upper=None, maxiters=5, cenfunc='median', stdfunc='std')[source] [edit on github]

Bases: object

Class to perform sigma clipping.

The data will be iterated over, each time rejecting values that are less or more than a specified number of standard deviations from a center value.

Clipped (rejected) pixels are those where:

data < cenfunc(data [,axis=int]) - (sigma_lower * stdfunc(data [,axis=int]))
data > cenfunc(data [,axis=int]) + (sigma_upper * stdfunc(data [,axis=int]))

Invalid data values (i.e. NaN or inf) are automatically clipped.

For a functional interface to sigma clipping, see sigma_clip().

Note

scipy.stats.sigmaclip provides a subset of the functionality in this class. Also, its input data cannot be a masked array and it does not handle data that contains invalid values (i.e. NaN or inf). Also note that it uses the mean as the centering function.

If your data is a ndarray with no invalid values and you want to use the mean as the centering function with axis=None and iterate to convergence, then scipy.stats.sigmaclip is ~25-30% faster than the equivalent settings here (s = SigmaClip(cenfunc='mean', maxiters=None); s(data, axis=None)).

Parameters:
sigma : float, optional

The number of standard deviations to use for both the lower and upper clipping limit. These limits are overridden by sigma_lower and sigma_upper, if input. The default is 3.

sigma_lower : float or None, optional

The number of standard deviations to use as the lower bound for the clipping limit. If None then the value of sigma is used. The default is None.

sigma_upper : float or None, optional

The number of standard deviations to use as the upper bound for the clipping limit. If None then the value of sigma is used. The default is None.

maxiters : int or None, optional

The maximum number of sigma-clipping iterations to perform or None to clip until convergence is achieved (i.e., iterate until the last iteration clips nothing). If convergence is achieved prior to maxiters iterations, the clipping iterations will stop. The default is 5.

cenfunc : {‘median’, ‘mean’} or callable, optional

The statistic or callable function/object used to compute the center value for the clipping. If set to 'median' or 'mean' then having the optional bottleneck package installed will result in the best performance. If using a callable function/object and the axis keyword is used, then it must be callable that can ignore NaNs (e.g. numpy.nanmean) and has an axis keyword to return an array with axis dimension(s) removed. The default is 'median'.

stdfunc : {‘std’} or callable, optional

The statistic or callable function/object used to compute the standard deviation about the center value. If set to 'std' then having the optional bottleneck package installed will result in the best performance. If using a callable function/object and the axis keyword is used, then it must be callable that can ignore NaNs (e.g. numpy.nanstd) and has an axis keyword to return an array with axis dimension(s) removed. The default is 'std'.

Examples

This example uses a data array of random variates from a Gaussian distribution. We clip all points that are more than 2 sample standard deviations from the median. The result is a masked array, where the mask is True for clipped data:

>>> from astropy.stats import SigmaClip
>>> from numpy.random import randn
>>> randvar = randn(10000)
>>> sigclip = SigmaClip(sigma=2, maxiters=5)
>>> filtered_data = sigclip(randvar)

This example clips all points that are more than 3 sigma relative to the sample mean, clips until convergence, returns an unmasked ndarray, and modifies the data in-place:

>>> from astropy.stats import SigmaClip
>>> from numpy.random import randn
>>> from numpy import mean
>>> randvar = randn(10000)
>>> sigclip = SigmaClip(sigma=3, maxiters=None, cenfunc='mean')
>>> filtered_data = sigclip(randvar, masked=False, copy=False)

This example sigma clips along one axis:

>>> from astropy.stats import SigmaClip
>>> from numpy.random import normal
>>> from numpy import arange, diag, ones
>>> data = arange(5) + normal(0., 0.05, (5, 5)) + diag(ones(5))
>>> sigclip = SigmaClip(sigma=2.3)
>>> filtered_data = sigclip(data, axis=0)

Note that along the other axis, no points would be clipped, as the standard deviation is higher.

Methods Summary

__call__(data[, axis, masked, …]) Perform sigma clipping on the provided data.

Methods Documentation

__call__(data, axis=None, masked=True, return_bounds=False, copy=True)[source] [edit on github]

Perform sigma clipping on the provided data.

Parameters:
data : array-like or MaskedArray

The data to be sigma clipped.

axis : None or int or tuple of int, optional

The axis or axes along which to sigma clip the data. If None, then the flattened data will be used. axis is passed to the cenfunc and stdfunc. The default is None.

masked : bool, optional

If True, then a MaskedArray is returned, where the mask is True for clipped values. If False, then a ndarray and the minimum and maximum clipping thresholds are returned. The default is True.

return_bounds : bool, optional

If True, then the minimum and maximum clipping bounds are also returned.

copy : bool, optional

If True, then the data array will be copied. If False and masked=True, then the returned masked array data will contain the same array as the input data (if data is a ndarray or MaskedArray). The default is True.

Returns:
result : flexible

If masked=True, then a MaskedArray is returned, where the mask is True for clipped values. If masked=False, then a ndarray is returned.

If return_bounds=True, then in addition to the (masked) array above, the minimum and maximum clipping bounds are returned.

If masked=False and axis=None, then the output array is a flattened 1D ndarray where the clipped values have been removed. If return_bounds=True then the returned minimum and maximum thresholds are scalars.

If masked=False and axis is specified, then the output ndarray will have the same shape as the input data and contain np.nan where values were clipped. If return_bounds=True then the returned minimum and maximum clipping thresholds will be be ndarrays.