RESEARCH STARTER
Distribution (mathematics)
In mathematics, a distribution is a function that describes the probabilities of the different outcomes of a random variable. For continuous random variables, this is represented by a probability density function, where the area under the curve over a specific interval gives the probability of the variable falling within that interval. Common examples of continuous distributions include the normal distribution, exponential distribution, and t-distribution. For discrete random variables, probability distributions give the probability associated with each possible value, with the Poisson and binomial distributions being notable examples.
Mathematicians often rely on assumptions about these distributions to conduct analyses, using parametric methods that test hypotheses about parameters, which are numerical characteristics that define the population or model. The normal (or Gaussian) distribution is particularly significant because of its properties: it is symmetric, so the mean, median, and mode are the same, and fixed ranges around the mean cover known percentages of the data (about 95% within ±2 standard deviations). Overall, understanding distributions is crucial for statistical analysis and data interpretation in various fields.
Authored By: Holt, Martin P., MSc
Published In: 2022
Full Article
A probability distribution is a mathematical description of how probability is spread over the values of a random variable. For a continuous random variable, it is given by a formula whose graph is a curve (the probability density function), and the area under the curve over a particular interval gives the probability that the variable is found within that interval. Examples include the normal distribution, the exponential distribution, and the t-distribution. For a discrete random variable, the formula instead gives the probability of each possible value of the variable. Examples include the Poisson distribution and the binomial distribution.
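The two cases can be illustrated with a minimal sketch using only the Python standard library. The specific choices here (a standard normal distribution, the interval from -1 to 1, and ten trials with success probability 0.5) are assumptions made for illustration and do not come from the article.

```python
from math import comb
from statistics import NormalDist

# Continuous case: the area under the standard normal curve between -1 and 1
# is the difference of the cumulative distribution function at the endpoints.
std_normal = NormalDist(mu=0.0, sigma=1.0)
p_interval = std_normal.cdf(1.0) - std_normal.cdf(-1.0)


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Discrete case: each possible value has its own probability.
p_five_of_ten = binomial_pmf(5, 10, 0.5)

print(round(p_interval, 4))     # 0.6827
print(round(p_five_of_ten, 4))  # 0.2461
```

For the discrete distribution, the probabilities of all possible values (0 through 10 here) sum to 1, just as the total area under a density curve equals 1.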
Overview
Many statistical methods begin with an assumption: that the empirical distribution of the data closely approximates a theoretical distribution defined by one or more parameters. Methods that rely on this assumption are called parametric methods. A parameter is a numerical characteristic of a population or model, such as the probability of a "death" in a binomial distribution. Parametric methods test hypotheses about the parameters of a population described by a particular distribution; Student's t-test is one example.
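As a rough illustration of a parametric method, the sketch below computes the one-sample t statistic, which compares a sample mean with a hypothesised population mean under the assumption that the data come from a normal distribution. The sample values and the hypothesised mean of 170 are made up for this example, and the function name `one_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / standard_error


# Hypothetical heights in centimetres (illustrative data only).
heights_cm = [172.0, 168.5, 181.0, 175.5, 169.0, 178.0]

t_stat = one_sample_t(heights_cm, mu0=170.0)
degrees_of_freedom = len(heights_cm) - 1

print(round(t_stat, 2), degrees_of_freedom)
```

The statistic would then be compared with a t-distribution on n - 1 degrees of freedom to obtain a p-value, a step that needs t-distribution tables or a statistics library and is left out of this sketch.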
The Normal Distribution
When analysing data, there is a choice between methods that make distributional assumptions, as above, and those that make no such assumptions (called distribution-free or non-parametric methods). For example, suppose that the continuous random variable of interest is height, and that the mean and standard deviation of the height of adult men are known. If the distribution of height in the population is assumed to follow a specific probability distribution (here, the normal distribution), then the probability that an adult man is more than six feet tall can be calculated.
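The height calculation can be sketched as follows. The article does not give a mean or standard deviation, so the values below (69 inches and 3 inches) are placeholders chosen only to make the example runnable.

```python
from statistics import NormalDist

# Hypothetical parameters for adult male height, in inches (not from the article).
MEAN_HEIGHT_IN = 69.0
SD_HEIGHT_IN = 3.0

height_dist = NormalDist(mu=MEAN_HEIGHT_IN, sigma=SD_HEIGHT_IN)

# P(height > 72 inches) is the area under the curve to the right of six feet.
p_over_six_feet = 1.0 - height_dist.cdf(72.0)

print(round(p_over_six_feet, 4))  # 0.1587
```

With these placeholder values, six feet lies exactly one standard deviation above the mean, so the result is the familiar upper-tail area of about 16%. Different parameters would give a different probability.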
Similarly, if it is known from observation that the proportion of babies being female is 0.58, then it is possible to work out (using the binomial distribution, and assuming the births are independent) the probability of a woman with three children having three sons. The mean and standard deviation in the first example, and the value of 0.58 in the second example, are all examples of parameters. Every such theoretical distribution is described by one or more parameters. For continuous variables, the normal distribution is the most widely used model in parametric methods.
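The three-sons probability is a binomial calculation: with three independent births, it is the chance of zero girls out of three. A minimal sketch, taking the article's figure of 0.58 as given and assuming independence between births:

```python
from math import comb

P_GIRL = 0.58          # proportion stated in the text
P_BOY = 1.0 - P_GIRL   # 0.42


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Three sons means zero girls in three births.
p_three_sons = binomial_pmf(0, 3, P_GIRL)

print(round(p_three_sons, 4))  # 0.0741
```

This is the same as multiplying the probability of a boy by itself three times, 0.42 × 0.42 × 0.42, which is about 0.074.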
The Gaussian Distribution
A Gaussian distribution, which is another name for the normal distribution, has the following properties:
Property 1: The curve is bell-shaped and symmetric about the mean.
Property 2: The mean, median, and mode coincide.
Property 3: The limits from (mean – 2SD) to (mean + 2SD) cover the measurements of about 95% of subjects. These are referred to as ±2SD limits or sometimes as 2-sigma limits.
Another often-cited property of a Gaussian distribution is that the limits from (mean – 3SD) to (mean + 3SD) cover almost all (99.7%) of the subjects. These 3-sigma limits are rarely used in health and medicine; one exception is in Z-scores. A Z-score can be calculated as (height – mean)/SD, where the mean and SD are calculated for healthy reference individuals of a given age or weight. Properties such as these are part of why the normal distribution is of fundamental importance.
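The 2-sigma and 3-sigma coverage figures, and the Z-score formula, can be checked with a short sketch. The reference mean and SD used in the Z-score example (110 cm and 5 cm) are hypothetical values, not data from the article.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1


def coverage(k):
    """Fraction of a normal population within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def z_score(value, ref_mean, ref_sd):
    """Number of SDs by which value lies above (+) or below (-) the reference mean."""
    return (value - ref_mean) / ref_sd


print(round(coverage(2), 4))  # 0.9545
print(round(coverage(3), 4))  # 0.9973

# Hypothetical reference group: mean height 110 cm, SD 5 cm.
print(z_score(102.5, ref_mean=110.0, ref_sd=5.0))  # -1.5
```

Because coverage depends only on the number of SDs from the mean, the same percentages apply to any normal distribution, whatever its mean and SD.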
Bibliography
Blitzstein, Joseph K., and Jessica Hwang. Introduction to Probability. Boca Raton, FL: Chapman, 2015.
Forbes, Catherine, et al. Statistical Distributions. Hoboken, NJ: Wiley, 2011.
Indrayan, Abhaya. Medical Biostatistics. Boca Raton, FL: Chapman, 2013.
Related Articles
- Contribution to an open problem of Harkness and Shantaram. Published In: Mathematical Methods in the Applied Sciences, 2024, v. 47, n. 13, p. 11086. Authored By: Jedidi, Wissem; Bouzeffour, Fethi; Harthi, Nouf. Publication Type: Academic Journal.
- Decision making using similarity to a reference distribution. Published In: IMA Journal of Management Mathematics, 2025, v. 36, n. 1, p. 67. Authored By: Baker, Rose D; McHale, Ian G. Publication Type: Academic Journal.
- Discrete search stability for a hidden target on M-intervals with a known probabilistic distributed effort. Published In: Discrete Mathematics, Algorithms & Applications, 2025, v. 17, n. 2, p. 1. Authored By: El-Hadidy, Mohamed Abd Allah. Publication Type: Academic Journal.
- Extending the Dixon and Coles model: an application to women's football data. Published In: Journal of the Royal Statistical Society: Series C (Applied Statistics), 2025, v. 74, n. 1, p. 167. Authored By: Michels, Rouven; Ötting, Marius; Karlis, Dimitris. Publication Type: Academic Journal.
- Simple closed-form estimation of a binary latent variable model. Published In: Econometrics Journal, 2025, v. 28, n. 2, p. 198. Authored By: Hu, Yingyao; Li, Jingrong; Shiu, Ji-Liang; Shum, Matthew. Publication Type: Academic Journal.