Chauvenet's Criterion: A Python Implementation

A Python 3 implementation of Chauvenet's criterion.

Chauvenet's criterion is used here to filter outliers out of normally distributed data.
It is a means of assessing whether one piece of experimental data (an outlier) from a set of observations is likely to be spurious.
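
To make the rule concrete: for n observations, a value is treated as suspect when fewer than 0.5 observations would be expected to deviate that far from the mean, which puts a probability of 1/(4n) in each tail of the normal distribution. The sketch below only computes the resulting cutoff in standard deviations. The helper name chauvenet_threshold is hypothetical, and it uses the standard library's NormalDist as a stand-in for scipy's norm.ppf.

```python
from statistics import NormalDist

def chauvenet_threshold(n):
    """Largest |z| tolerated for a sample of size n (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Each tail gets probability 1 / (4n), so both tails together give 1 / (2n).
    return NormalDist().inv_cdf(1 - 1 / (4 * n))

if __name__ == '__main__':
    # The cutoff widens slowly as the sample size grows.
    for n in (5, 10, 100):
        print(n, round(chauvenet_threshold(n), 2))
```

For n = 10 the cutoff is about 1.96 standard deviations, so a value further than that from the mean would be flagged.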

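import numpy as np
from scipy.stats import norm

def Chauvenet(v):
    """Repeatedly drop values that fail Chauvenet's criterion."""
    l = len(v)

    # Too few samples to judge: return an empty list.
    # Checking first also avoids dividing by zero on empty input.
    if l < 5:
        return []

    std = np.std(v)
    avg = np.average(v)
    # Two-sided test: each tail of the normal distribution gets 0.25 / l,
    # so fewer than 0.5 values are expected outside the limits.
    z = np.abs(norm.ppf(0.25 / l))

    Xmin = avg - (z * std)
    Xmax = avg + (z * std)

    nv = [i for i in v if Xmin <= i <= Xmax]     # values kept
    bv = [i for i in v if Xmin > i or i > Xmax]  # values rejected

    #print('nv:', nv)
    #print('bv:', bv)

    # Nothing was rejected, so the remaining data is the final result.
    if not bv:
        return v

    # Otherwise re-run the test on the values that were kept.
    return Chauvenet(nv)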
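
The function above re-applies the test until no value is rejected, which can strip a data set down quite far. For comparison, here is a minimal sketch of a single-pass variant that applies the test once and also reports what it rejected. The name chauvenet_single_pass and the (kept, rejected) return shape are assumptions made for illustration, and it uses only the standard library instead of numpy and scipy.

```python
from statistics import NormalDist, fmean, pstdev

def chauvenet_single_pass(values):
    """Apply the criterion once; return (kept, rejected). Hypothetical helper."""
    values = list(values)
    n = len(values)
    if n < 2:
        return values, []

    mean = fmean(values)
    std = pstdev(values)  # population standard deviation, like np.std's default
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return values, []

    # Cutoff in standard deviations: probability 1 / (4n) in each tail.
    z_crit = NormalDist().inv_cdf(1 - 1 / (4 * n))

    kept, rejected = [], []
    for x in values:
        if abs(x - mean) <= z_crit * std:
            kept.append(x)
        else:
            rejected.append(x)
    return kept, rejected

if __name__ == '__main__':
    data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1, 10.0, 15.0]
    kept, rejected = chauvenet_single_pass(data)
    print("kept:", kept)
    print("rejected:", rejected)
```

Returning the rejected values separately makes it easier to inspect what was discarded before deciding whether another pass is justified.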

Example

if __name__ == '__main__':
    l = [43, 1, 1, 40, 40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    print("source:", l)
    print("result:", Chauvenet(l))

'''
Output:

source: [43, 1, 1, 40, 40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 5, 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
result: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
'''