Advanced Statistics for Physics - Course program


Preliminary version of the course program. As the course progresses, the program becomes final and the font color changes from gray to black. 

lesson
date
lesson topics
total time

Part 1: Probability theory and probability models


1
5/10/2022
Introduction to the course. Review of the basic concepts of probability. Basic combinatorics.
Set theory representation of probability space. Addition law for probabilities of incompatible events.
2
2
7/10/2022
Addition law for probabilities of non-mutually exclusive events: the inclusion-exclusion principle. Application of the inclusion-exclusion principle. Conditional probabilities. Independent events. Statistical independence and dimensional reduction.
Bayes' theorem and basic introduction to Bayesian inference.
4
3
12/10/2022
Elementary example of Bayesian inference.
Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models. Discrete and continuous forms of gambler's ruin. Proof of the first Borel-Cantelli lemma.
6
4
14/10/2022
Proof of the second Borel-Cantelli lemma.
The transition from sample space to random variables. Review of basic concepts and definitions on discrete and continuous random variables. Uniform distribution.
Buffon's needle. Expectation, dispersion and their properties. Higher moments. Chebyshev's inequality. From Chebyshev's inequality to the weak law of large numbers.
8
5
19/10/2022
The generalized Chebyshev's inequality. Strong law of large numbers. Brief review of common probability models: 1. the uniform distribution; 2. the Bernoulli distribution; 3. the binomial distribution; 4. the geometric distribution; 5. the negative binomial distribution.
10
6
21/10/2022
Distributions (ctd.): 5. the negative binomial distribution (ctd.); application of the negative binomial distribution to carcinogenesis. 6. The hypergeometric distribution; application of the hypergeometric distribution to opinion polling. 7. The multinomial distribution. 8. The Poisson distribution. The Poisson process as a memoryless random point process. Example: cell plating statistics in biology.
12
7
26/10/2022
Distributions (ctd.): 8. the Poisson distribution (ctd.); Poisson survival probability in radiobiology. 9. The exponential distribution. Example: paralyzable and non-paralyzable detectors. Example: coincidence counting. 10. The De Moivre-Laplace theorem and the normal distribution. Properties of the normal distribution.
14
8
28/10/2022
Distributions (ctd.): Error function and the cumulative distribution function of the normal distribution. High-rate limit of the Poisson distribution. Transformations of random variables. Sum of random variables (convolution). Generic transformations of multivariate distributions.
16
9
28/10/2022
Approximate transformation of random variables: general formulas for error propagation. Linear orthogonal transformations of random variables. The multivariate normal distribution. The log-normal distribution. The gamma distribution. The Cauchy-Lorentz distribution. The Landau distribution. The Rayleigh distribution.
18
10
11/11/2022
Other important probability distributions. Example of a complex model used to set up a null hypothesis: the distribution of nearest-neighbor distances. Overview of Jaynes' solution of Bertrand's paradox.
Introduction to generating functions.
20
11
16/11/2022
Generating and characteristic functions of some common distributions. Poisson distribution as a limiting case of the binomial distribution. PGF of the Galton-Watson branching process. Photomultiplier noise.
22
12
18/11/2022
Photomultiplier noise (ctd.). More on the moments of a distribution. Skewness and kurtosis. Mode and median. Properties of characteristic functions. Characteristic functions of some common distributions. Characteristic function of the Gaussian distribution. Series expansion of a generic characteristic function.
24
13
18/11/2022
The Central Limit Theorem (CLT). Further comments on the CLT. The Berry-Esseen theorem. Introduction to descriptive and exploratory statistics. Visualization in statistics (see this beautiful example due to Hans Rosling).
26
14
23/11/2022
The Delbrück-Luria experiment: an example of approximate statistical analysis of a branching process (slides). Multiplicative processes. Power laws.
Quick review of sample mean, sample variance, estimate of covariance and correlation coefficient.
28

Part 2: Statistical inference


15
25/11/2022
Linear fits and the correlation coefficient. Schwarz's inequality and Pearson's correlation coefficient. Broad discussion of Exploratory Data Analysis (EDA, link to the dedicated NIST website).
PDF of sample mean for exponentially distributed data.
30
16
30/11/2022
Box plots. Outliers. Violin plots. Rug plots. Kernel density plots. Order statistics.
Introduction to the Monte Carlo method. Early history of the Monte Carlo method. Pseudorandom numbers. Uniformly distributed pseudorandom numbers.
32
17
2/12/2022
Transformation method. Example: generation of exponentially distributed pseudorandom numbers.
The Box-Muller method for the generation of pairs of Gaussian variates. Transformation method and the transformation of differential cross sections. Acceptance-rejection method. Monte Carlo method examples: 1. generation of angles in e+e- -> mu+mu- scattering; 2. generation of angles in Bhabha scattering; 3. the structure of a complete MC program to simulate low-energy electron transport.
34
18
2/12/2022
Statistical bootstrap.
Review of the Bayesian approach to statistical inference.
36
19
7/12/2022
Maximum likelihood method 1. Connection with Bayes' theorem. The maximum likelihood (ML) principle. Example with exponentially distributed data. Point estimators. Properties of estimators. Transformations of estimators. Consistency of ML estimators. Asymptotic optimality of ML estimators. Bartlett's identities. Cramér-Rao-Fisher bound. Variance of ML estimators. Introduction to Shannon's entropy. Information measures based on Shannon's entropy: Kullback-Leibler divergence.
38
20
14/12/2022
Maximum likelihood method 2. Information measures based on Shannon's entropy: Kullback-Leibler divergence, Jeffreys distance, Jensen-Shannon divergence. Fisher information. Non-uniqueness of the likelihood function. Efficiency and Gaussianity of ML estimators. Graphical method for the variance of ML estimators. The Expectation-Maximization (EM) algorithm.
40
21
16/12/2022
Maximum likelihood method 3. The Expectation-Maximization (EM) algorithm (ctd.) (slides). More properties of the likelihood function. Extended maximum likelihood. Examples. Introduction to ML with binned data.
Example with two counting channels.
42
22
21/12/2022
Introduction to confidence intervals. Credible intervals in the Bayesian perspective. Confidence intervals for the sample mean of exponentially distributed samples. Confidence intervals and confidence level. Detailed analysis of the Neyman construction of confidence intervals (link to the Neyman paper).
Example of ML with binned data: power laws. Chi-square and its relation to ML. Brief overview of chi-square and least squares fits. The chi-square distribution in the frame of multidimensional geometry.
44
23
23/12/2022
More on Neyman's construction of confidence intervals. Confidence intervals for the correlation coefficient of a bivariate Gaussian distribution from MC simulation. Hypothesis tests, significance level. Examples. Critical region and acceptance region. Errors of the first and of the second kind. Chi-square as a test statistic.
46
24
23/12/2022
The Neyman-Pearson lemma. Fisher's p-value and rejection of the null hypothesis. An anecdote about Fermi and multiparametric fits (check Dyson's paper and this paper on "fitting an elephant").
48

 Edoardo Milotti - Dec. 2022