# Merging Means and Variances
Mean
$$
\mathrm{mean}=\sum_{i=1}^n{\frac{x_i}{n}}=\frac{x_1+x_2+\cdots+x_n}{n}=\frac{\mathrm{sum_1}}{n}
$$
where $\mathrm{sum_1}=x_1+x_2+\cdots+x_n$.
Variance
$$
\begin{alignat}{2}
\mathrm{Var} & = \frac{\sum_{i=1}^n{(x_i-\mathrm{mean})^2}}{n} \\
& = \frac{\sum_{i=1}^n{(x_i^2-2\,\mathrm{mean}\,x_i+\mathrm{mean}^2)}}{n} \\
& = \frac{\mathrm{sum_2}}{n}-2\,\mathrm{mean}\sum_{i=1}^n{\frac{x_i}{n}}+\mathrm{mean}^2 \\
& = \frac{\mathrm{sum_2}}{n}-\mathrm{mean}^2
\end{alignat}
$$
where $\mathrm{sum_2}=x_1^2+x_2^2+\cdots+x_n^2$. This is the population variance (divisor $n$, not $n-1$).
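
As a minimal sketch of the identity above, the following Python snippet computes the mean and the population variance from the two running sums only. The function name `mean_and_variance` and the sample data are illustrative assumptions, not part of the original note.

```python
def mean_and_variance(xs):
    """Return (mean, population variance) using only sum_1 and sum_2."""
    n = len(xs)
    sum_1 = sum(xs)                  # x_1 + x_2 + ... + x_n
    sum_2 = sum(x * x for x in xs)   # x_1^2 + x_2^2 + ... + x_n^2
    mean = sum_1 / n
    var = sum_2 / n - mean ** 2      # Var = sum_2 / n - mean^2
    return mean, var


# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
mean, var = mean_and_variance(data)
print(mean, var)
```

This form needs only one pass over the data, but subtracting two nearly equal quantities can lose floating-point precision when the mean is large relative to the spread.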
Merging the mean and variance
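Let two arrays be $A=(x_1,x_2,\dots,x_m)$ and $B=(y_1,y_2,\dots,y_n)$. Array A has $m$ elements, mean $\mathrm{mean_A}$, and variance $\mathrm{Var_A}$; array B has $n$ elements, mean $\mathrm{mean_B}$, and variance $\mathrm{Var_B}$.
The mean of the merged array is then
$$
\begin{alignat}{2}
\mathrm{mean_{merge}} & = \frac{\sum_{i=1}^m{x_i}+\sum_{j=1}^n{y_j}}{m+n} \\
& = \frac{m\,\mathrm{mean_A}+n\,\mathrm{mean_B}}{m+n}
\end{alignat}
$$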
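As a quick check with hypothetical arrays $A=(1,2,3)$ and $B=(4,5,6,7)$, which have 3 and 4 elements and means 2 and 5.5, the weighted form gives
$$
\mathrm{mean_{merge}}=\frac{3\cdot 2+4\cdot 5.5}{3+4}=\frac{28}{7}=4
$$
which agrees with the direct computation $(1+2+\cdots+7)/7=4$.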
and the variance of the merged array is
$$
\begin{alignat}{2}
\mathrm{Var_{merge}} & = \frac{\sum_{i=1}^m{(x_i-\mathrm{mean_{merge}})^2}+\sum_{j=1}^n{(y_j-\mathrm{mean_{merge}})^2}}{m+n} \\
& = \frac{\sum_{i=1}^m{(x_i^2-2\,\mathrm{mean_{merge}}\,x_i+\mathrm{mean_{merge}}^2)}+\sum_{j=1}^n{(y_j^2-2\,\mathrm{mean_{merge}}\,y_j+\mathrm{mean_{merge}}^2)}}{m+n} \\
& = \frac{\mathrm{sum_2}}{m+n}-2\,\mathrm{mean_{merge}}\,\frac{\sum_{i=1}^m{x_i}+\sum_{j=1}^n{y_j}}{m+n}+\mathrm{mean_{merge}}^2 \\
& = \frac{\mathrm{sum_2}}{m+n}-\mathrm{mean_{merge}}^2 \\
& = \frac{(\mathrm{Var_A}+\mathrm{mean_A}^2)m+(\mathrm{Var_B}+\mathrm{mean_B}^2)n}{m+n}-\mathrm{mean_{merge}}^2
\end{alignat}
$$
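where $\mathrm{sum_2}=x_1^2+x_2^2+\cdots+x_m^2+y_1^2+y_2^2+\cdots+y_n^2=\mathrm{sum_A}+\mathrm{sum_B}$, with $\mathrm{sum_A}=x_1^2+x_2^2+\cdots+x_m^2=(\mathrm{Var_A}+\mathrm{mean_A}^2)m$ and $\mathrm{sum_B}=y_1^2+y_2^2+\cdots+y_n^2=(\mathrm{Var_B}+\mathrm{mean_B}^2)n$. Both expressions follow from rearranging the single-array identity $\mathrm{Var}=\mathrm{sum_2}/n-\mathrm{mean}^2$.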
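
The merge formulas above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the names `summarize` and `merge_stats`, the `(count, mean, variance)` tuple layout, and the sample arrays are hypothetical and not taken from the original code.

```python
def summarize(xs):
    """Return (count, mean, population variance) computed directly."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return n, mean, var


def merge_stats(a, b):
    """Merge two (count, mean, population variance) summaries."""
    m, mean_a, var_a = a
    n, mean_b, var_b = b
    total = m + n
    mean_merge = (m * mean_a + n * mean_b) / total
    # Recover each array's sum of squares: sum_A = (Var_A + mean_A^2) * m.
    sum_2 = (var_a + mean_a ** 2) * m + (var_b + mean_b ** 2) * n
    var_merge = sum_2 / total - mean_merge ** 2
    return total, mean_merge, var_merge


# Hypothetical sample arrays, chosen only for illustration.
a = [1, 3, 5, 7, 9]
b = [2, 4, 6, 8]
merged = merge_stats(summarize(a), summarize(b))
direct = summarize(a + b)
print(merged)   # merged from the two summaries
print(direct)   # computed from the concatenated data
```

The two printed tuples should agree up to floating-point rounding. Only the three summary values per array are needed, so the raw data does not have to be kept around once each part has been summarized.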
Python code

```python
import math
```

Output

```
5.0 6.666666666666667
5.0 8.0
5.0 5.0
(9, 5.0, 6.666666666666668)
```