Experimental Design & Causal Inference
- Correlation: variables move together.
- Causation: changing \(X\) changes \(Y\).
- Randomization helps isolate causal effects by breaking confounding.
- Each unit has potential outcomes \(Y(1)\) and \(Y(0)\).
- Fundamental problem: we never observe both for the same unit.
- Causal effect for a unit: \(Y(1) - Y(0)\).
\[ ATE = \mathbb{E}[Y(1) - Y(0)] \]
- Often estimated in experiments via difference in group means.
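The difference-in-means estimator of the ATE can be sketched in a small simulation; all numbers here (sample size, effect of 2.0) are illustrative assumptions, not part of the notes:

```python
# Sketch: estimating the ATE in a simulated randomized experiment.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Potential outcomes; the true unit-level effect is assumed to be 2.0
y0 = rng.normal(10, 2, n)          # Y(0)
y1 = y0 + 2.0                      # Y(1); true ATE = 2.0

# Random assignment breaks confounding
treated = rng.integers(0, 2, n).astype(bool)
y_obs = np.where(treated, y1, y0)  # only one potential outcome is observed

# Difference in group means estimates E[Y(1) - Y(0)]
ate_hat = y_obs[treated].mean() - y_obs[~treated].mean()
print(round(ate_hat, 2))
```

Note that the code never uses both `y0` and `y1` for the same unit when estimating, which mirrors the fundamental problem of causal inference.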
“A third variable affects both treatment assignment and the outcome.”
(Confounding.)
- Leads to biased estimates if not controlled via randomization/design.
“Treatment and control groups differ systematically because assignment is not random.”
(Selection bias.)
- Example: self-selection into treatment.
- Compare means across \(k\) groups (one factor with \(k\) levels).
- Tests whether at least one group mean differs.
- F-statistic:
\[ F = \frac{MS_{between}}{MS_{within}} \]
- Under \(H_0\):
\[ F \sim F_{k-1,\; N-k} \]
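A one-way ANOVA on \(k=3\) groups can be run directly with `scipy.stats.f_oneway`; the data below are simulated assumptions (one group mean shifted):

```python
# Sketch: one-way ANOVA across three illustrative groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1 = rng.normal(5.0, 1.0, 30)
g2 = rng.normal(5.0, 1.0, 30)
g3 = rng.normal(6.0, 1.0, 30)   # shifted mean

F, p = stats.f_oneway(g1, g2, g3)
# Under H0, F ~ F(k-1, N-k) = F(2, 87) here
print(F, p)
```

The p-value is the upper tail of the \(F_{k-1,\,N-k}\) distribution evaluated at the observed statistic.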
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \]
- \(\alpha_i\): main effect of factor A level \(i\)
- \(\beta_j\): main effect of factor B level \(j\)
- \((\alpha\beta)_{ij}\): interaction effect
- ANOVA is regression with indicator (dummy) variables + interactions.
- Example structure:
\[ Y = \beta_0 + \beta_1 D_{A2} + \beta_2 D_{A3} + \beta_3 D_{B2} + \beta_4(D_{A2}D_{B2}) + \beta_5(D_{A3}D_{B2}) + \varepsilon \]
- Dummies represent factor levels relative to a reference (base) level.
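The dummy-coded regression above can be fit by ordinary least squares; this is a minimal sketch with assumed true coefficients and simulated data (factor A with 3 levels, factor B with 2):

```python
# Sketch: ANOVA as regression with dummy variables (illustrative data).
import numpy as np

rng = np.random.default_rng(2)
A = rng.integers(1, 4, 120)   # factor A: levels 1..3
B = rng.integers(1, 3, 120)   # factor B: levels 1..2

# Dummies relative to the reference levels A1 and B1
d_a2 = (A == 2).astype(float)
d_a3 = (A == 3).astype(float)
d_b2 = (B == 2).astype(float)

# Assumed true coefficients, matching the equation's structure
y = 1.0 + 0.5*d_a2 + 1.5*d_a3 + 0.8*d_b2 + 0.3*d_a2*d_b2 + 0.6*d_a3*d_b2 \
    + rng.normal(0, 0.1, 120)

X = np.column_stack([np.ones(120), d_a2, d_a3, d_b2, d_a2*d_b2, d_a3*d_b2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))
```

Each estimated coefficient is a contrast against the reference cell (A1, B1), which is why the intercept estimates that cell's mean.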
- Three joint F-tests:
- Main effect of A
- Main effect of B
- Interaction effect \(A\times B\)
If factor A has \(a\) levels, factor B has \(b\) levels, total \(N\) observations:
\[ F_A \sim F_{a-1,\; N-ab} \]
\[ F_B \sim F_{b-1,\; N-ab} \]
\[ F_{AB} \sim F_{(a-1)(b-1),\; N-ab} \]
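The degrees of freedom and the corresponding 5% critical values can be computed from these formulas; the design sizes \(a=3\), \(b=2\), \(N=60\) are illustrative assumptions:

```python
# Sketch: two-way ANOVA degrees of freedom and 5% critical values.
from scipy import stats

a, b, N = 3, 2, 60
df_A  = (a - 1,           N - a*b)   # (2, 54)
df_B  = (b - 1,           N - a*b)   # (1, 54)
df_AB = ((a - 1)*(b - 1), N - a*b)   # (2, 54)

crit = {name: stats.f.ppf(0.95, *df)
        for name, df in {"A": df_A, "B": df_B, "AB": df_AB}.items()}
print(crit)
```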
- Each regression coefficient has a t-test:
\[ t = \frac{\hat{\beta}}{SE(\hat{\beta})} \]
- For a single coefficient test:
\[ F = t^2 \]
- Main effects in ANOVA are usually joint tests → use F.
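The \(F = t^2\) relationship can be checked numerically: if \(T \sim t_{df}\), then \(T^2 \sim F_{1,\,df}\), so the two-sided t-test p-value equals the F-test p-value (the statistic value 2.1 and \(df=40\) below are arbitrary):

```python
# Sketch: a squared t statistic with df degrees of freedom follows F(1, df),
# so the single-coefficient t-test and F-test give identical p-values.
from scipy import stats

t_val, df = 2.1, 40
p_t = 2 * stats.t.sf(abs(t_val), df)   # two-sided t-test p-value
p_f = stats.f.sf(t_val**2, 1, df)      # F-test p-value with F = t^2
print(p_t, p_f)
```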
- Independence:
“Observations are independent across units.”
- Homoskedasticity:
“Residual variance is equal across groups.”
\[ Var(\varepsilon) = \sigma^2 \]
- Normality (exact finite-sample inference):
“Residuals are approximately normal.”
\[ \varepsilon \sim N(0,\sigma^2) \]
- Full factorial tests all combinations of factor levels.
- If \(k\) factors each have \(L\) levels:
\[ L^k \]
- If levels differ across factors (A, B, C):
\[ a \times b \times c \]
- 3 factors, 2 levels each:
\[ 2^3 = 8 \]
- Can estimate main effects and interactions (including the 3-way interaction).
A full factorial design tests all combinations of factor levels so that you can estimate main effects and interaction effects independently.
For 3 factors (A, B, C) at 2 levels:
\[ 2^3 = 8 \]
So you run 8 experiments:
| Run | A | B | C |
|---|---|---|---|
| 1 | − | − | − |
| 2 | + | − | − |
| 3 | − | + | − |
| 4 | + | + | − |
| 5 | − | − | + |
| 6 | + | − | + |
| 7 | − | + | + |
| 8 | + | + | + |
From the same runs you compute interaction columns:
\[ AB = A\times B,\quad AC = A\times C,\quad BC = B\times C,\quad ABC = A\times B\times C \]
These allow estimation of:
- 3 main effects (A, B, C)
- 3 two-way interactions (AB, AC, BC)
- 1 three-way interaction (ABC)
- 1 intercept
Because all effect columns are mutually orthogonal, no aliasing exists in the full factorial design.
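The run table and interaction columns above can be generated programmatically; a minimal sketch using the \(\pm 1\) coding:

```python
# Sketch: enumerating the 2^3 full factorial and checking that C and AB
# are distinct, orthogonal columns (no aliasing).
from itertools import product

runs = list(product([-1, 1], repeat=3))   # 8 runs, one (A, B, C) tuple each
print(len(runs))                          # 8

col_C  = [c for _, _, c in runs]
col_AB = [a*b for a, b, _ in runs]

print(col_C == col_AB)                    # False: no aliasing
print(sum(c*ab for c, ab in zip(col_C, col_AB)))  # 0: orthogonal columns
```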
A fractional factorial design uses only a fraction of the full factorial runs to save time/cost. The reduction is achieved with a generator.
For 3 factors (A, B, C), a half fraction is:
\[ 2^{3-1} = 4 \text{ runs} \]
A commonly used generator is:
\[ C = AB \]
This means factor C is defined by the interaction of A and B, so C is not independently set — its level is derived.
Using this generator, the 4 runs become:
| Run | A | B | C (defined) |
|---|---|---|---|
| 1 | − | − | + |
| 2 | + | − | − |
| 3 | − | + | − |
| 4 | + | + | + |
Only 4 combinations are tested instead of 8.
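The half fraction can be built by setting A and B freely and deriving C from the generator; a minimal sketch (run order may differ from the table above, but the set of runs is the same):

```python
# Sketch: building the 2^(3-1) half fraction with generator C = AB.
from itertools import product

half = [(a, b, a*b) for a, b in product([-1, 1], repeat=2)]
for run in half:
    print(run)
```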
Aliasing occurs when two effects are indistinguishable — they share the same column in the design matrix.
In the \(2^{3-1}\) fractional design with generator:
\[ C = AB \]
the columns for C and AB are identical. This means:
\[ C \equiv AB \]
They are confounded — the experiment cannot tell them apart.
From the generator, you derive the alias structure:
Multiply both sides of the generator by each factor:
\[ C = AB \]
Multiply by A \(\rightarrow AC = B\)
Multiply by B \(\rightarrow BC = A\)
Multiply by C \(\rightarrow ABC = I\)
So the key aliases are:
| Effect | Aliased with |
|---|---|
| A | BC |
| B | AC |
| C | AB |
| ABC | I |
In fractional factorials, effects are aliased due to fewer runs than parameters. This is why generators are chosen carefully — to control which effects are confounded.
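The alias table can be verified numerically: within the 4 retained runs, each main-effect column is identical to its alias column, and the ABC column is constant (aliased with the intercept \(I\)):

```python
# Sketch: verifying the alias structure of the C = AB half fraction.
from itertools import product

half = [(a, b, a*b) for a, b in product([-1, 1], repeat=2)]

def col(f):
    """Evaluate an effect column over the 4 retained runs."""
    return [f(a, b, c) for a, b, c in half]

print(col(lambda a, b, c: a) == col(lambda a, b, c: b*c))  # A == BC
print(col(lambda a, b, c: b) == col(lambda a, b, c: a*c))  # B == AC
print(col(lambda a, b, c: c) == col(lambda a, b, c: a*b))  # C == AB
print(col(lambda a, b, c: a*b*c))                          # ABC == I: all +1
```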
- 3 factors, 3 levels each:
\[ 3^3 = 27 \]
- Tests all 27 combinations; can estimate main effects + interactions.
- More factors/levels → more cells → fewer observations per cell (if \(N\) fixed) → lower power per effect.
- Higher-order interactions are often small in practice.
- Random assignment:
“Units are randomly assigned to treatment combinations (cells).”
- Independence:
“Observations are independent across units.”
- SUTVA:
“Each unit’s outcome depends only on its own treatment assignment, and the treatment is consistently defined.”
- Homoskedasticity:
“Residual variance is equal across cells/groups.”
- Normality (exact finite-sample inference):
“Residuals are approximately normal.”
- Block on a nuisance factor, randomize treatment within each block.
- Example: pricing experiment where revenue varies by time of day (Morning, Afternoon, Evening).
- Regression:
\[ Y = \beta_0 + \beta_1\,Discount + \gamma_1\,Afternoon + \gamma_2\,Evening + \varepsilon \]
Where:
- \(Y\): outcome (e.g., revenue)
- \(Discount\): treatment dummy (1=discount, 0=standard)
- \(Afternoon\), \(Evening\): block dummies (Morning is reference)
- \(\beta_1\): treatment effect holding time-of-day constant
- Block coefficients absorb nuisance variation to reduce residual variance and increase power.
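The blocked regression above can be fit by least squares; all effect sizes below (treatment effect 5, block shifts 10 and 20) are illustrative assumptions:

```python
# Sketch: randomized block design as a regression with block dummies.
import numpy as np

rng = np.random.default_rng(3)
n = 600
block = rng.integers(0, 3, n)        # 0=Morning, 1=Afternoon, 2=Evening
discount = rng.integers(0, 2, n)     # treatment randomized across units

afternoon = (block == 1).astype(float)
evening   = (block == 2).astype(float)

# Assumed true model: beta1 = 5, block shifts 10 and 20
y = 100 + 5*discount + 10*afternoon + 20*evening + rng.normal(0, 4, n)

X = np.column_stack([np.ones(n), discount, afternoon, evening])
b0, b1, g1, g2 = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(b1, 1))   # treatment effect holding time of day constant
```

Dropping the block dummies from `X` would push the time-of-day variation into the residual, inflating the standard error of the treatment effect.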
- Randomly assign units to Control (A) and Treatment (B).
- Difference-in-means estimator:
\[ \hat{\tau} = \bar{Y}_T - \bar{Y}_C \]
\[ SE(\hat{\tau}) = \sqrt{\frac{s_T^2}{n_T} + \frac{s_C^2}{n_C}} \]
- Test statistic:
\[ t = \frac{\hat{\tau}}{SE(\hat{\tau})} \]
- Small samples → use \(t\)-distribution; large samples → approx normal.
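The difference-in-means estimator and its standard error can be computed by hand and cross-checked against Welch's t-test (which uses exactly this SE); the data are simulated with an assumed lift of 0.8:

```python
# Sketch: A/B test via difference in means (illustrative simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
control   = rng.normal(10.0, 3.0, 500)
treatment = rng.normal(10.8, 3.0, 500)   # assumed true lift of 0.8

tau_hat = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1)/500 + control.var(ddof=1)/500)
t_stat = tau_hat / se

# Same statistic via Welch's t-test (no equal-variance assumption)
res = stats.ttest_ind(treatment, control, equal_var=False)
print(round(tau_hat, 2), round(t_stat, 2))
```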
t-Test
What it is
A hypothesis test for whether a population mean (or a difference in means) differs from a hypothesized value when the population variance is unknown.
The test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
Under the assumptions below, this statistic follows a t-distribution.
Assumptions
- Independence: observations are independent of one another (or, for a paired test, the pairs are independent).
- Normality: the population is normally distributed:
\[ X_i \sim N(\mu, \sigma^2) \]
For paired tests, the differences are normally distributed:
\[ D_i \sim N(\mu_D, \sigma_D^2) \]
- Continuous outcome: the variable is measured on a continuous (interval or ratio) scale.
- Equal variances (pooled two-sample t-test only):
\[ \sigma_1^2 = \sigma_2^2 \]
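The one-sample statistic can be computed by hand and cross-checked against `scipy.stats.ttest_1samp`; the eight data points are made up for illustration:

```python
# Sketch: one-sample t-test computed manually and via scipy (toy data).
import numpy as np
from scipy import stats

x = np.array([5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.4, 4.7])
mu0 = 5.0

# t = (x_bar - mu0) / (s / sqrt(n)), with the sample standard deviation
t_manual = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))
res = stats.ttest_1samp(x, mu0)
print(round(t_manual, 3), round(res.statistic, 3))
```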