Probability for ML — Interview Grill
50 questions on probability fundamentals, distributions, Bayes, limit theorems. Drill until you can answer 35+ cold.
A. Probability basics
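1. State the three probability axioms. $P(A) \ge 0$. $P(\Omega) = 1$. Countable additivity for disjoint events.
2. What's the inclusion-exclusion principle for two sets? $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
3. Define conditional probability. $P(A \mid B) = P(A \cap B) / P(B)$ for $P(B) > 0$.
4. State Bayes' theorem. $P(A \mid B) = P(B \mid A)\,P(A) / P(B)$.
5. Define independence vs uncorrelated. Independent: $p(x, y) = p(x)\,p(y)$. Uncorrelated: $\mathrm{Cov}(X, Y) = 0$. Independence ⟹ uncorrelated, but not vice versa (except for jointly Gaussian).
6. What's the law of total probability? For partition $\{B_i\}$: $P(A) = \sum_i P(A \mid B_i)\,P(B_i)$.
7. Conditional independence: define. $X \perp Y \mid Z$ iff $p(x, y \mid z) = p(x \mid z)\,p(y \mid z)$. NOT the same as unconditional independence.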
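Question 5 is easier to remember with a concrete case in hand. The sketch below uses an illustrative example that is not in the original list: X uniform on {-1, 0, 1} and Y = X². Exact fractions show the covariance is zero even though Y is a deterministic function of X.

```python
from fractions import Fraction

# Illustrative example (an assumption, not from the question list):
# X is uniform on {-1, 0, 1} and Y = X**2 is fully determined by X.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in support)            # E[X]
e_y = sum(p * x**2 for x in support)         # E[Y] = E[X^2]
e_xy = sum(p * x * x**2 for x in support)    # E[XY] = E[X^3]
cov_xy = e_xy - e_x * e_y                    # Cov(X, Y) = E[XY] - E[X]E[Y]

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = p          # X = 0 forces Y = 0, so the joint probability is 1/3
p_product = p * p    # P(X=0) * P(Y=0) = 1/3 * 1/3

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "vs product of marginals =", p_product)
```

The covariance vanishes because it only measures linear association; the joint probability still fails to factorize, so the variables are dependent.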
B. Random variables
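8. Define expectation. $E[X] = \sum_x x\,p(x)$ or $\int x\,f(x)\,dx$.
9. Linearity of expectation: when does it hold? Always, even for dependent variables. $E[aX + bY] = a\,E[X] + b\,E[Y]$.
10. Variance formula: two equivalent forms? $\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$.
11. Variance of a sum? $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$.
12. Covariance formula? $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y]$.
13. Variance of $\bar{X}$ for $n$ iid samples? $\sigma^2 / n$.
14. State the law of total expectation. $E[X] = E[E[X \mid Y]]$ (tower property).
15. State the law of total variance. $\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}(E[X \mid Y])$.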
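The law of total variance (question 15) can be checked by hand on a tiny mixture. The sketch below assumes a hypothetical two-coin setup: Y picks one of two coins with equal probability, and X is a Bernoulli flip whose success probability depends on Y. The specific probabilities are illustrative only.

```python
from fractions import Fraction as F

# Hypothetical setup: Y chooses a coin, X | Y=y ~ Bernoulli(q_y).
p_y = {0: F(1, 2), 1: F(1, 2)}             # P(Y = y)
q = {0: F(1, 4), 1: F(3, 4)}               # P(X = 1 | Y = y)

cond_mean = q                              # E[X | Y=y] = q_y for a Bernoulli
cond_var = {y: q[y] * (1 - q[y]) for y in q}   # Var(X | Y=y) = q_y (1 - q_y)

# Tower property: E[X] = E[E[X | Y]]
mean_x = sum(p_y[y] * cond_mean[y] for y in p_y)

# The two pieces of the decomposition
expected_cond_var = sum(p_y[y] * cond_var[y] for y in p_y)                 # E[Var(X|Y)]
var_cond_mean = sum(p_y[y] * (cond_mean[y] - mean_x) ** 2 for y in p_y)    # Var(E[X|Y])

# Direct computation: marginally X ~ Bernoulli(mean_x)
var_x = mean_x * (1 - mean_x)

print("Var(X) =", var_x)
print("E[Var(X|Y)] + Var(E[X|Y]) =", expected_cond_var + var_cond_mean)
```

Here the within-coin variance contributes 3/16 and the between-coin variance contributes 1/16, which together give the marginal variance of 1/4.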
C. Common distributions
16. Bernoulli mean and variance? Mean $p$, variance $p(1-p)$.
17. Binomial mean and variance? $np$, $np(1-p)$. Sum of $n$ iid Bernoullis.
18. Poisson mean and variance? Both $\lambda$. Variance equals mean: the Poisson signature.
19. When does Binomial → Poisson? $n \to \infty$, $p \to 0$, $np = \lambda$ fixed. Used for rare events.
20. Geometric mean and variance? Mean $1/p$, variance $(1-p)/p^2$. Number of trials until first success.
21. Exponential mean and variance? $1/\lambda$, $1/\lambda^2$.
22. Gaussian: fully specified by what? Mean $\mu$ and variance $\sigma^2$. (Multivariate: mean vector and covariance matrix.)
23. What's the memoryless property? $P(X > s + t \mid X > s) = P(X > t)$. Only geometric (discrete) and exponential (continuous) have it.
24. Sum of independent Gaussians? Gaussian. Means add, variances add.
25. Sum of independent Poissons? Poisson. Rates add.
26. Beta distribution: what does it model? A probability (range $[0, 1]$). Conjugate prior for Bernoulli/Binomial.
27. Gamma: what does it model? A positive continuous quantity. For integer shape, it is the sum of iid exponentials. Conjugate prior for a Poisson rate.
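The Binomial-to-Poisson limit from question 19 can be seen numerically. The sketch below compares the two probability mass functions for illustrative values (n = 1000, p = 0.003, chosen here as an assumption) with the Poisson rate matched to the Binomial mean.

```python
import math

def binom_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.003     # illustrative: many trials, rare success
lam = n * p            # matched rate, approximately 3

# Largest pointwise difference between the two pmfs over k = 0..20
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(f"largest pmf gap for k = 0..20: {max_gap:.2e}")
```

Raising n while holding n·p fixed should shrink the gap further, which is the sense in which the Poisson approximates rare-event counts.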
D. Multivariate Gaussian
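28. Density of multivariate Gaussian? $p(x) = (2\pi)^{-d/2}\,|\Sigma|^{-1/2} \exp\!\left(-\tfrac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)$.
29. Affine transform of Gaussian? $X \sim \mathcal{N}(\mu, \Sigma) \Rightarrow AX + b \sim \mathcal{N}(A\mu + b,\ A \Sigma A^\top)$.
30. Marginal of multivariate Gaussian? Gaussian. Just take the corresponding subvector of $\mu$ and submatrix of $\Sigma$.
31. Conditional of multivariate Gaussian? Gaussian. $x_1 \mid x_2 \sim \mathcal{N}\!\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\ \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)$.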
32. Uncorrelated jointly Gaussian = independent. True? Yes. This is special to Gaussians.
33. If $X$ and $Y$ are both Gaussian individually, is $(X, Y)$ jointly Gaussian? Not necessarily. Marginal Gaussianity doesn't imply joint. (Counterexample: $X \sim \mathcal{N}(0, 1)$, $Y = SX$, where $S = \pm 1$ randomly, independent of $X$.)
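The sign-flip counterexample in question 33 is easy to simulate. The sketch below assumes X is standard normal and the random sign is drawn independently of X with equal probabilities. If (X, Y) were jointly Gaussian, X + Y would itself be Gaussian, so the event X + Y = 0 would have probability 0 or 1; here it happens about half the time.

```python
import random

random.seed(0)     # fixed seed so the run is repeatable
n = 20_000

pairs = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    s = random.choice((-1.0, 1.0))   # sign flip, drawn independently of x
    pairs.append((x, s * x))         # y = s * x is again standard normal

# Fraction of draws where X + Y is exactly zero (happens whenever s == -1)
frac_sum_zero = sum(1 for x, y in pairs if x + y == 0.0) / n

# Sample estimate of E[XY]; both variables have mean 0, so this estimates Cov(X, Y)
sample_cov = sum(x * y for x, y in pairs) / n

print(f"P(X + Y = 0) is about {frac_sum_zero:.3f}")
print(f"sample Cov(X, Y) is about {sample_cov:.3f}")
```

The same simulation shows X and Y are roughly uncorrelated yet clearly dependent, since |Y| always equals |X|.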
E. Limit theorems
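34. State the weak law of large numbers. For iid $X_i$ with finite mean $\mu$: $\bar{X}_n \xrightarrow{p} \mu$.
35. State the central limit theorem. For iid $X_i$ with mean $\mu$, finite variance $\sigma^2$: $\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$.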
36. When does CLT fail? Infinite variance (heavy tails like Cauchy). Strongly dependent data without mixing conditions.
37. CLT convergence rate? Berry-Esseen: $O(1/\sqrt{n})$, with constant depending on third moment. Skewed distributions need larger $n$.
38. Why is Gaussian everywhere in stats? CLT — sums of many small effects approach Gaussian. So sample means, regression residuals, etc. tend to be approximately Gaussian.
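A quick simulation makes the central limit theorem concrete. The sketch below assumes Exponential(1) data, a skewed distribution with mean 1 and standard deviation 1, and standardizes sample means of size n = 50. The sample size, repetition count, and seed are illustrative choices.

```python
import math
import random
import statistics

random.seed(0)          # fixed seed so the run is repeatable

n = 50                  # samples per mean (illustrative)
reps = 5000             # number of sample means drawn
mu = sigma = 1.0        # Exponential(rate=1): mean 1, standard deviation 1

z_scores = []
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    # Standardize: sqrt(n) * (xbar - mu) / sigma should be close to N(0, 1)
    z_scores.append(math.sqrt(n) * (xbar - mu) / sigma)

z_mean = statistics.fmean(z_scores)
z_var = statistics.pvariance(z_scores)
coverage = sum(abs(z) < 1.96 for z in z_scores) / reps   # roughly 0.95 if Gaussian

print(f"mean of z: {z_mean:.3f}, variance of z: {z_var:.3f}")
print(f"fraction with |z| < 1.96: {coverage:.3f}")
```

Swapping in a heavy-tailed source with infinite variance would be one way to watch the approximation break down, as question 36 describes.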
F. Bayes applications
39. Disease prevalence 1%, test sensitivity 99%, specificity 99%. P(disease | positive)? $\frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.5$. Even a 99%-accurate test gives only a 50% posterior at 1% prevalence.
40. What's the base rate fallacy? Ignoring prior probability when interpreting test results. The classic Bayesian error.
41. What's naive Bayes' assumption? Features conditionally independent given class: $p(x_1, \dots, x_d \mid y) = \prod_{i=1}^{d} p(x_i \mid y)$.
42. Why does naive Bayes work despite the assumption being wrong? Need only correct relative ordering of class probabilities; absolute values can be miscalibrated.
43. Sequential Bayes update: what happens to the posterior after multiple iid observations? Posterior after $n$ observations ∝ prior × likelihood of all $n$ observations. This gives the same result as applying Bayes one observation at a time, using each posterior as the next prior.
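The disease-test numbers from question 39 and the sequential update from question 43 fit in one short calculation. The sketch below uses exact fractions and assumes, for illustration, that a second positive test is conditionally independent of the first given disease status. The helper name `posterior` is hypothetical.

```python
from fractions import Fraction as F

def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                 # P(+ | D) P(D)
    false_pos = (1 - specificity) * (1 - prior)    # P(+ | not D) P(not D)
    return true_pos / (true_pos + false_pos)

prior = F(1, 100)            # 1% prevalence
sens = spec = F(99, 100)     # 99% sensitivity and specificity

# Sequential: the posterior after the first positive becomes the next prior.
after_one = posterior(prior, sens, spec)
after_two = posterior(after_one, sens, spec)

# Batch: use the joint likelihood of two positives in a single update.
joint_pos = sens**2 * prior
joint_neg = (1 - spec) ** 2 * (1 - prior)
batch_two = joint_pos / (joint_pos + joint_neg)

print("after one positive:", after_one)
print("after two positives (sequential):", after_two)
print("after two positives (batch):", batch_two)
```

One positive moves the probability from 1% to 50%, and a second moves it to 99%. The sequential and batch routes agree exactly.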
G. Calculations to do fast
44. . ? .
45. . ? .
46. $E[|X|]$ for $X \sim \mathcal{N}(0, 1)$? $\sqrt{2/\pi}$. (Half-normal mean.)
47. Variance of sum of $n$ iid Bernoulli($p$)? $np(1-p)$.
48. Roll a fair die until you get a 6. Expected number of rolls? $1/p = 6$. (Geometric distribution.)
49. Two iid uniform . ? . Or .
50. $X, Y$ iid $\mathcal{N}(0, 1)$. Distribution of $X^2 + Y^2$? $\chi^2_2$ = Exp(1/2) (rate $1/2$, so mean 2).
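For question 48, the answer of 6 expected rolls can be confirmed two ways. The sketch below simulates rolling until the first 6 and also sums the geometric series for the mean, truncated at 499 terms (an arbitrary cutoff where the remaining tail is negligible).

```python
import random

random.seed(0)     # fixed seed so the run is repeatable

def rolls_until_six():
    """Roll a fair die until a 6 appears; return the number of rolls used."""
    rolls = 1
    while random.randint(1, 6) != 6:
        rolls += 1
    return rolls

trials = 100_000
average_rolls = sum(rolls_until_six() for _ in range(trials)) / trials

# Exact mean of a geometric distribution: sum over k of k (1-p)^(k-1) p = 1/p
p = 1 / 6
series = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 500))

print(f"simulated average: {average_rolls:.3f}")
print(f"truncated series:  {series:.6f}")
```

Both numbers should land close to 1/p = 6, matching the geometric mean from question 20.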
Quick fire
51. Bernoulli variance? $p(1-p)$.
52. Poisson variance equals? Mean.
53. Memoryless distributions? Geometric, Exponential.
54. Conjugate of Bernoulli? Beta.
55. CLT requires what about variance? Finite.
56. Linearity of expectation requires? Nothing. It always holds.
57. Independence implies? Uncorrelated.
58. Cov = 0 implies independent? Only for jointly Gaussian.
59. 95% CI z-value? 1.96.
60. Variance of sample mean of $n$ iid? $\sigma^2 / n$.
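The 1.96 z-value and the variance of the sample mean combine into the usual confidence interval for a mean. The sketch below recovers the z-value from the standard normal quantile function and applies it to made-up numbers. The helper `mean_ci` and its inputs are hypothetical, and it assumes the population standard deviation is known.

```python
from statistics import NormalDist

# Two-sided 95% interval leaves 2.5% in each tail, so use the 97.5% quantile.
z = NormalDist().inv_cdf(0.975)

def mean_ci(xbar, sigma, n, z=z):
    """Interval xbar +/- z * sigma / sqrt(n), assuming sigma is known."""
    half_width = z * sigma / n ** 0.5    # z times the standard error of the mean
    return xbar - half_width, xbar + half_width

# Made-up inputs purely for illustration
low, high = mean_ci(xbar=10.0, sigma=2.0, n=100)

print(f"z-value: {z:.4f}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Because the standard error is the standard deviation divided by the square root of n, quadrupling the sample size halves the interval width.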
Self-grading
If you can't answer 1-15, you don't know basic probability. If you can't answer 16-35, you'll get tripped up on Bayes/distribution questions. If you can't answer 36-50, frontier-lab interview probability problems will go past you.
Aim for 40+/60 cold.