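T-test

Prerequisites - Hypothesis testing, p-value

The t-test is an inferential statistical test used to determine whether there is a significant difference between the means of two groups, which may be related in certain features.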

$t=\frac{\text { difference between group means }}{\text { variability within groups }}$

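If the t-value is large (in absolute value) => the two groups likely belong to different populations.
If the t-value is small => the two groups likely belong to the same population.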

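Terms involved

Degrees of freedom (df) - the number of independent values that are free to vary when an estimate is computed from the sample groups. [Eq-2]

$d f=\sum\left(n_{S}-1\right)$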

where,
df = degrees of freedom
n_S = size of sample S

Suppose we have two samples, A and B. The df would be calculated as

df = (n_A - 1) + (n_B - 1)
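
As a minimal sketch of Eq-2, the sum of (n_S - 1) over the samples can be written as a one-line helper. The function name `degrees_of_freedom` is illustrative, not something from the original text.

```python
def degrees_of_freedom(*sample_sizes):
    """Sum (n_S - 1) over every sample size passed in, as in Eq-2."""
    return sum(n - 1 for n in sample_sizes)


# Two samples of 10 values each, as in df = (n_A - 1) + (n_B - 1).
print(degrees_of_freedom(10, 10))  # 18
```

Passing a single size, e.g. `degrees_of_freedom(25)`, gives n - 1 for a one-group test.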

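Significance level (α) - the probability of rejecting the null hypothesis when it is true. In simple terms, it is the risk, expressed as a percentage, of concluding that a difference exists between two groups when in fact it does not.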
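
One way to see what α means is a small simulation, sketched here under stated assumptions: both groups are drawn from the same normal distribution (so the null hypothesis is true), each has 10 values, and the two-tailed critical value for df = 18 at α = 0.05 is taken as 2.101. The helper names (`t_statistic`, `false_positive_rate`) are illustrative, and the statistic is the equal-variance two-sample t used for independent samples.

```python
import math
import random


def t_statistic(a, b):
    """Equal-variance two-sample t statistic for lists a and b."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in a)  # within-group sum of squares
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = (n_a - 1) + (n_b - 1)
    return (mean_a - mean_b) / math.sqrt((1 / n_a + 1 / n_b) * (ss_a + ss_b) / df)


def false_positive_rate(trials=4000, n=10, t_table=2.101, seed=0):
    """Fraction of trials that wrongly report a difference when none exists."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]  # same distribution as a
        if abs(t_statistic(a, b)) > t_table:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()
print(rate)  # should land close to alpha = 0.05
```

The printed rate is a random estimate, so it will not be exactly 0.05, but it should stay near it.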

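There are three types of t-tests, which fall into dependent and independent categories.

Independent sample t-test: compares the means of two groups.
Paired sample t-test: compares the means of the same group at different times (for example, one year apart).
One sample t-test: compares the mean of a single group against a known mean.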

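1. Independent sample t-test

The independent sample t-test, also called the unpaired sample t-test, is used to determine whether the difference found between two groups is actually significant or just the result of random chance.

We can use it when:

The population mean or standard deviation is unknown. (Information about the population is unknown.)
The two samples are separate/independent, e.g. boys and girls (the two are independent of each other).

Formula used [Eq-3]: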

$t=\frac{\mu_{A}-\mu_{B}}{\sqrt{\left[\frac{1}{n_{A}}+\frac{1}{n_{B}}\right] *\left[\left(\sum A^{2}-\frac{\left(\sum A\right)^{2}}{n_{A}}\right)+\left(\sum B^{2}-\frac{\left(\sum B\right)^{2}}{n_{B}}\right)\right] *\left[\frac{1}{d f}\right]}}$

where,
t = t-value 
A = Sample of A
B = Sample of B
μA = Mean of sample A
μB = Mean of sample B
nA = sample size of A
nB = sample size of B 
df = degree of freedom
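
The formula above can be sketched in plain Python as follows. This is a minimal illustration, not a library implementation: the function name `independent_t` and the small input lists are made up for the example, and the code assumes both samples have at least two values.

```python
import math


def independent_t(a, b):
    """Return (t, df) for two independent samples, term by term as in Eq-3."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    # Sum of squares minus (square of sum) / n, for each sample.
    ss_a = sum(x * x for x in a) - sum(a) ** 2 / n_a
    ss_b = sum(x * x for x in b) - sum(b) ** 2 / n_b
    df = (n_a - 1) + (n_b - 1)
    denominator = math.sqrt((1 / n_a + 1 / n_b) * (ss_a + ss_b) * (1 / df))
    return (mean_a - mean_b) / denominator, df


# Hypothetical data, only to show the call.
t_value, df_value = independent_t([1, 2, 3], [2, 4, 6])
print(round(t_value, 3), df_value)  # -1.549 4
```

The sign of t only reflects which sample has the larger mean; its magnitude is what gets compared with the table value.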

Steps involved

Step 1 - Find the sum of all values in each sample.
Step 2 - Square the sums found in Step 1.
Step 3 - Find the sum of the squares of the individual values in each sample.
Step 4 - Calculate the mean of each sample.
Step 5 - Find the degrees of freedom (df) using Eq-2.
Step 6 - Insert the values found in Steps 1-5 into Eq-3 to get the calculated t-value.
Step 7 - Use df and α (take α = 0.05 if not given) to look up the table value of t in the
      two-tailed t-table.
Step 8 - Compare the values of t found in Step 6 and Step 7.

Interpreting the results

If |tcal| > ttable => p < α (0.05) => significant difference found between the two groups.
If |tcal| < ttable => p > α (0.05) => no significant difference between the two groups.
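
The decision rule can be sketched as a tiny helper. It assumes a two-tailed test, so the sign of the calculated t is ignored by taking its absolute value; the name `is_significant` is illustrative.

```python
def is_significant(t_cal, t_table):
    """Two-tailed decision: significant when |t_cal| exceeds the table value."""
    return abs(t_cal) > t_table


print(is_significant(0.99, 2.10))   # False: no significant difference
print(is_significant(-3.50, 2.10))  # True: a negative t is judged by its magnitude
```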

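Example problem (step by step)

Suppose we are given two independent samples A and B with the following values. We need to perform an independent sample t-test on this data.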

Sample A	Sample B
1	1
2	2
4	2
4	3
5	3
5	4
6	5
7	6
8	7
8	7

Step 1 - 
∑A = 1 + 2 + 4 + 4 + 5 + 5 + 6 + 7 + 8 + 8 = 50
∑B = 1 + 2 + 2 + 3 + 3 + 4 + 5 + 6 + 7 + 7 = 40

Step 2 -
(∑A)² = (50)² = 2500
(∑B)² = (40)² = 1600

Step 3 -
∑A² = 1² + 2² + 4² + 4² + 5² + 5² + 6² + 7² + 8² + 8² = 300
∑B² = 1² + 2² + 2² + 3² + 3² + 4² + 5² + 6² + 7² + 7² = 202

Step 4 -
n = 10
μA = (∑A / n) = 50/10 = 5
μB = (∑B / n) = 40/10 = 4

Step 5 - 
df = (nA - 1) + (nB - 1) = (10-1) + (10-1) = 18 [using Eq-2]

Step 6 - Substitute the values found above into Eq-3 to get the calculated value of t.
     We get, tcal = 0.99
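
For reference, the substitution in Step 6 works out as follows, using the sums, means and df from Steps 1-5:

$t=\frac{5-4}{\sqrt{\left[\frac{1}{10}+\frac{1}{10}\right] *\left[\left(300-\frac{2500}{10}\right)+\left(202-\frac{1600}{10}\right)\right] *\left[\frac{1}{18}\right]}}=\frac{1}{\sqrt{0.2 * 92 * \frac{1}{18}}} \approx \frac{1}{1.011} \approx 0.99$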

Step 7 - Let α = 0.05 and df = 18, and look up the two-tailed t-table.
     (See the table below)
     We get, ttable = 2.10

df \ α (two-tailed)	0.20	0.10	0.05	. .
∞	1.282	1.645	1.960	. .
1	3.078	6.314	12.706	. .
2	1.886	2.920	4.303	. .
:	:	:	:	. .
8	1.397	1.860	2.306	. .
9	1.383	1.833	2.262	. .
:	:	:	:	. .
18	1.330	1.734	2.101	. .
19	1.328	1.729	2.093	. .
20	1.325	1.725	2.086	. .
:	:	:	:	. .
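
The table lookup can be sketched as a nested dictionary. This assumes only the finite-df rows visible in the table above; the rows hidden behind the ":" placeholders are deliberately left out rather than filled in, and `T_TABLE` and `critical_t` are illustrative names.

```python
# Two-tailed critical values copied from the rows shown in the table above.
T_TABLE = {
    1: {0.20: 3.078, 0.10: 6.314, 0.05: 12.706},
    2: {0.20: 1.886, 0.10: 2.920, 0.05: 4.303},
    8: {0.20: 1.397, 0.10: 1.860, 0.05: 2.306},
    9: {0.20: 1.383, 0.10: 1.833, 0.05: 2.262},
    18: {0.20: 1.330, 0.10: 1.734, 0.05: 2.101},
    19: {0.20: 1.328, 0.10: 1.729, 0.05: 2.093},
    20: {0.20: 1.325, 0.10: 1.725, 0.05: 2.086},
}


def critical_t(df, alpha=0.05):
    """Look up the table value of t; fails loudly for rows not copied here."""
    try:
        return T_TABLE[df][alpha]
    except KeyError:
        raise KeyError(f"no table entry for df={df}, alpha={alpha}") from None


print(critical_t(18))  # 2.101
```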

Step 8 - 
0.99 < 2.10 (tcal < ttable by 1.11)
=> no significant difference found between two groups.
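
The whole worked example can be re-checked with a short script. This is a sketch that follows Steps 1-8 using only the standard library; the variable names are illustrative, and the table value 2.101 is the df = 18, α = 0.05 entry from the table.

```python
import math

sample_a = [1, 2, 4, 4, 5, 5, 6, 7, 8, 8]
sample_b = [1, 2, 2, 3, 3, 4, 5, 6, 7, 7]

sum_a, sum_b = sum(sample_a), sum(sample_b)        # Step 1: 50, 40
sq_sum_a, sq_sum_b = sum_a ** 2, sum_b ** 2        # Step 2: 2500, 1600
sum_sq_a = sum(x * x for x in sample_a)            # Step 3: 300
sum_sq_b = sum(x * x for x in sample_b)            # Step 3: 202
n_a, n_b = len(sample_a), len(sample_b)
mean_a, mean_b = sum_a / n_a, sum_b / n_b          # Step 4: 5.0, 4.0
df = (n_a - 1) + (n_b - 1)                         # Step 5: 18

# Step 6: Eq-3, with the bracketed sum-of-squares terms computed first.
spread = (sum_sq_a - sq_sum_a / n_a) + (sum_sq_b - sq_sum_b / n_b)
t_cal = (mean_a - mean_b) / math.sqrt((1 / n_a + 1 / n_b) * spread / df)

t_table = 2.101                                    # Step 7: from the table
significant = abs(t_cal) > t_table                 # Step 8

print(round(t_cal, 2), df, significant)  # 0.99 18 False
```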

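2. Paired sample t-test

The paired sample t-test, also called the dependent sample t-test, is used to determine whether the mean of the differences between two samples is 0. The test is carried out on dependent samples, usually for a specific group of people or things. Here each entity is measured twice, yielding a pair of observations.

We can use it when:

Two similar (twin-like) samples are given. [E.g., marks obtained in English and Math (two subjects)]
The dependent variable (data) is continuous.
The observations are independent of one another.
The dependent variable is approximately normally distributed.

Formula used [Eq-4]: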

$t=\frac{\left(\sum D\right) / N}{\sqrt{\frac{\sum D^{2}-\left(\frac{(\Sigma D)^{2}}{N}\right)}{(N)(N-1)}}}$

where, 
t = t-value
D = difference between the two samples (A-B)
N = sample size (same as n)
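
The paired formula can be sketched in Python as below. It is a minimal illustration that assumes the two lists are already aligned pair by pair; the function name `paired_t` and the sample readings are hypothetical.

```python
import math


def paired_t(a, b):
    """Return (t, df) for paired samples, following Eq-4 with D = A - B."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d_sq = sum(d * d for d in diffs)
    denominator = math.sqrt((sum_d_sq - sum_d ** 2 / n) / (n * (n - 1)))
    return (sum_d / n) / denominator, n - 1


# Hypothetical before/after readings for four subjects.
t_value, df_value = paired_t([10, 12, 9, 11], [8, 11, 7, 10])
print(round(t_value, 3), df_value)  # 5.196 3
```

If every difference is identical the denominator is zero, so real data with no variation in D would need separate handling.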

Steps involved

Step 1 - Find the difference D = A - B for each pair and sum the differences. [∑D = ∑(A-B)]
Step 2 - Square each D from Step 1 and sum the squares. [∑D²]
Step 3 - Square the sum of D. [(∑D)²]
Step 4 - Put the values found in Steps 1-3 into Eq-4 and find the t-value.
Step 5 - Find the degrees of freedom (df) using Eq-2.

NOTE: Here, df is calculated for the data as a whole, not for each sample separately, because the two samples A and B are paired (twin-like), so there is a single set of N differences.

So, df = N - 1

Step 6 - Use df and α (take α = 0.05 if not given) to look up the table value of t in the
      two-tailed t-table.
Step 7 - Compare the values of t found in Step 4 and Step 6.

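Interpreting the results

Same as for the independent sample t-test, using the absolute value |tcal| in the comparison.

Example problem (step by step)

Consider the following example. Marks in Math and SST (out of 25) are taken for a sample of 10 students. We need to perform a paired sample t-test on this data.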

Student no.	Math	SST	Step 1 (D = Math - SST)	Step 2 (D²)
1	4	15	-11	121
2	4	16	-12	144
3	7	14	-7	49
4	16	14	2	4
5	20	22	-2	4
6	11	22	-11	121
7	13	23	-10	100
8	9	18	-9	81
9	11	18	-7	49
10	15	19	-4	16
Sum –			(∑D) = -71	∑D² = 689

Step 1 and Step 2 - as shown in the table above.

Step 3 - (∑D)² = (-71)² = 5041

Step 4 - Putting the values into Eq-4, we get
     tcal = -4.95
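
For reference, the substitution in Step 4 works out as follows, with ∑D = -71, ∑D² = 689 and N = 10:

$t=\frac{(-71) / 10}{\sqrt{\frac{689-\left(\frac{5041}{10}\right)}{(10)(9)}}}=\frac{-7.1}{\sqrt{\frac{184.9}{90}}} \approx \frac{-7.1}{1.433} \approx-4.95$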

Step 5 - df = n - 1 = 10 - 1 = 9

Step 6 - Using df = 9 and α = 0.05 in the two-tailed table, we get
     ttable = 2.26

Step 7 - |tcal| > ttable (the absolute value of tcal exceeds 2.26 by about 2.7)
=> significant difference found between the two groups.
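
The paired example can be re-checked with a short script. This sketch uses the marks from the table and compares the absolute value of the calculated t with the two-tailed table value for df = 9 (2.262); the variable names are illustrative.

```python
import math

math_marks = [4, 4, 7, 16, 20, 11, 13, 9, 11, 15]
sst_marks = [15, 16, 14, 14, 22, 22, 23, 18, 18, 19]

diffs = [m - s for m, s in zip(math_marks, sst_marks)]  # D = Math - SST
n = len(diffs)
sum_d = sum(diffs)                     # Step 1: -71
sum_d_sq = sum(d * d for d in diffs)   # Step 2: 689
sq_sum_d = sum_d ** 2                  # Step 3: 5041

# Step 4: Eq-4.
t_cal = (sum_d / n) / math.sqrt((sum_d_sq - sq_sum_d / n) / (n * (n - 1)))
df = n - 1                             # Step 5: 9
t_table = 2.262                        # Step 6: from the table
significant = abs(t_cal) > t_table     # Step 7: two-tailed, so use |t_cal|

print(round(t_cal, 2), df, significant)  # -4.95 9 True
```

Because the test is two-tailed, the negative sign only says that the Math marks are lower on average; the magnitude of t decides significance.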

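3. One sample t-test

The one sample t-test is one of the most widely used t-tests. It compares the sample mean of the data with a specific given value, typically the true/population mean.

We can use it when:

The sample size is small (less than 30).
The data is collected randomly.
The data is approximately normally distributed.

Formula used [Eq-5]: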

$t=\frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n}}}$

where,
t = t-value
x_bar = sample mean
μ = true/population mean
σ = standard deviation of the sample
n = sample size
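
The one sample formula can be sketched directly from summary statistics. The function name `one_sample_t` and the numbers in the call are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def one_sample_t(sample_mean, population_mean, std_dev, n):
    """t = (x_bar - mu) / (sigma / sqrt(n)), as in the formula above."""
    return (sample_mean - population_mean) / (std_dev / math.sqrt(n))


# Hypothetical values: mean 52 vs. 50, standard deviation 4, 16 observations.
print(one_sample_t(52, 50, 4, 16))  # 2.0
```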

Steps involved

Step 1 - Define the null (h0) and alternative (h1) hypotheses.
Step 2 - Calculate the sample mean (if not given).
     [population mean, standard deviation and n are given]
Step 3 - Put the values from Step 2 into Eq-5 and calculate the t-value (tcal).
Step 4 - Calculate the degrees of freedom (df), as in the paired sample t-test.
Step 5 - Take α = 0.05 if not given. Use df and α to find ttable from the one-tailed t-table.
Step 6 - Compare the values of t found in Step 3 and Step 5.

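Interpreting the results

Same as for the independent sample t-test.

Example problem (step by step)

Consider the following example. The weights of 25 obese people were measured before they enrolled in a nutrition camp, and the population mean weight before the camp was found to be 45 kg. After the camp, the same 25 people had a sample mean of 75 and a standard deviation of 25. Did the nutrition camp work?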

Step 1 - h0 -> μ = 45 (sample mean is true mean)
      h1 -> μ ≠ 45 (sample mean is not true mean)

Step 2 - Given,
      x_bar = 75
      μ = 45
      σ = 25
      n = 25

Step 3 - Putting the values from Step 2 into Eq-5, we get
     tcal = 6
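
For reference, the substitution in Step 3 works out as:

$t=\frac{75-45}{\frac{25}{\sqrt{25}}}=\frac{30}{5}=6$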

Step 4 - df = n - 1 = 24

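Step 5 - Using df = 24 and α = 0.05 in the one-tailed table, we get
     ttable = 1.711
     (For the two-sided h1 stated in Step 1, the two-tailed value is 2.064; tcal exceeds both.)

Step 6 - 6 > 1.711 (tcal > ttable)
=> significant difference found between the sample mean and the population mean.
=> the nutrition camp significantly impacted the weights and it was a success.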

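The types of t-test discussed above are widely used by experts in hospital research to extract important information from medical data about the effects of various drugs and medications on a population, and to help them draw conclusions from it. However, it is up to the individual to determine which t-test will give the best results and to make sure that all of that test's assumptions are satisfied.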