Sampling distribution

A sampling distribution, or finite-sample distribution, is the probability distribution of a given statistic based on a random sample. Sampling distributions are important in statistics because they provide a major simplification on the route to statistical inference. More specifically, they allow analytical considerations to be based on the sampling distribution of a statistic rather than on the joint probability distribution of all the individual sample values.
Table of Contents
1 Introduction
2 Standard Error
3 Examples
4 Statistical Inference
5 Notes
6 Sources
7 References
Introduction
The sampling distribution of a statistic is the distribution of that statistic, considered as a random variable, when derived from a random sample of size n. It may be considered as the distribution of the statistic over all possible samples of the given size drawn from the same population. The sampling distribution depends on the distribution underlying the population, the statistic being considered, the sampling procedure employed, and the sample size used. There is often considerable interest in whether the sampling distribution can be approximated by an asymptotic distribution, which corresponds to the limiting case either as the number of random samples of finite size taken from an infinite population tends to infinity, or as a single "sample" of infinite size is taken from the population.
For example, consider a normal population with mean μ and variance σ². Suppose we repeatedly take samples of a given size from this population and calculate the arithmetic mean x̄ for each sample; this statistic is called the sample mean. Each sample has its own mean, and the distribution of these means is called the "sampling distribution of the sample mean". This distribution is normal, 𝒩(μ, σ²/n) (n being the sample size), since the underlying population is normal, although sampling distributions may also often be close to normal even when the population distribution is not (see the central limit theorem). An alternative to the sample mean is the sample median. When calculated from the same population, it has a different sampling distribution from that of the mean, and is generally not normal (though it may be close for large sample sizes).
The mean of a sample from a population having a normal distribution is an example of a simple statistic taken from one of the simplest statistical populations. For other statistics and other populations the formulas are more complicated, and often do not exist in closed form. In such cases the sampling distributions may be approximated through Monte Carlo simulations,[1] bootstrap methods, or asymptotic distribution theory.
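As a minimal sketch of the Monte Carlo approach mentioned above (the population, sample size, and function names here are illustrative assumptions, not from the source), the sampling distribution of the sample median can be approximated by repeatedly drawing samples and recording the statistic:

```python
import random
import statistics

def sampling_distribution(statistic, draw, n, reps=10_000):
    """Approximate the sampling distribution of `statistic` by Monte Carlo:
    draw `reps` independent samples of size n and evaluate the statistic
    on each one."""
    return [statistic([draw() for _ in range(n)]) for _ in range(reps)]

random.seed(0)
# Sampling distribution of the sample median for an Exponential(1) population
# (population median = ln 2 ≈ 0.693).
medians = sampling_distribution(statistics.median,
                                lambda: random.expovariate(1.0), n=15)
print(statistics.mean(medians))   # centre of the sampling distribution
print(statistics.stdev(medians))  # its spread (a standard error)
```

The list `medians` is an empirical stand-in for the exact sampling distribution; any functional of it (mean, quantiles, spread) can then be read off directly.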
Standard error
The standard deviation of the sampling distribution of a statistic is called the standard error of that quantity. For the case where the statistic is the sample mean and the samples are uncorrelated, the standard error is

σ_x̄ = σ / √n,

where σ is the standard deviation of the population distribution of that quantity and n is the sample size (the number of items in the sample).
An important implication of this formula is that the sample size must be quadrupled (multiplied by 4) in order to halve the standard error. When designing statistical studies in which cost is a factor, this can play a role in understanding the trade-off between costs and benefits.
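The σ/√n relationship can be checked empirically; a minimal simulation sketch (the population parameters σ = 2 and the sample sizes are illustrative assumptions):

```python
import random
import statistics

def empirical_se_of_mean(sigma, n, reps=20_000, seed=1):
    """Empirical standard error of the sample mean: the standard deviation
    of `reps` simulated sample means from a normal(0, sigma) population."""
    rng = random.Random(seed)
    return statistics.stdev(
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(reps))

se_25 = empirical_se_of_mean(sigma=2.0, n=25)    # theory: 2 / sqrt(25)  = 0.4
se_100 = empirical_se_of_mean(sigma=2.0, n=100)  # theory: 2 / sqrt(100) = 0.2
print(se_25, se_100)  # quadrupling n roughly halves the standard error
```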

















Examples

Population: Normal, 𝒩(μ, σ²)
Statistic: Sample mean X̄ from a sample of size n
Sampling distribution: X̄ ~ 𝒩(μ, σ²/n)

Population: Bernoulli(p)
Statistic: Sample proportion of "successful trials", X̄
Sampling distribution: nX̄ ~ Binomial(n, p)

Population: Two independent normal populations, 𝒩(μ₁, σ₁²) and 𝒩(μ₂, σ₂²)
Statistic: Difference between the sample means, X̄₁ − X̄₂
Sampling distribution: X̄₁ − X̄₂ ~ 𝒩(μ₁ − μ₂, σ₁²/n₁ + σ₂²/n₂)

Population: Any absolutely continuous distribution F with density f
Statistic: Median X₍ₖ₎ from a sample of size n = 2k − 1, where the sample is ordered from X₍₁₎ to X₍ₙ₎
Sampling distribution: f_{X₍ₖ₎}(x) = ((2k − 1)! / ((k − 1)!)²) · f(x) · (F(x)(1 − F(x)))^(k−1)

Population: Any distribution with distribution function F
Statistic: Maximum M = max X_k from a random sample of size n
Sampling distribution: F_M(x) = P(M ≤ x) = ∏ P(X_k ≤ x) = (F(x))^n
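The distribution of the sample maximum, F_M(x) = (F(x))^n, can be checked by simulation. A minimal sketch for Uniform(0, 1) draws, where F(x) = x (the sample size and evaluation point are illustrative):

```python
import random

def empirical_max_cdf(x, n, reps=100_000, seed=2):
    """Empirical P(M <= x), where M is the maximum of n Uniform(0, 1) draws."""
    rng = random.Random(seed)
    hits = sum(max(rng.random() for _ in range(n)) <= x for _ in range(reps))
    return hits / reps

# For Uniform(0, 1), F(x) = x, so the theory predicts F_M(x) = x**n.
n, x = 5, 0.8
print(empirical_max_cdf(x, n), x ** n)  # the two values should nearly agree
```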

Statistical inference
In the theory of statistical inference, the idea of a sufficient statistic provides the basis for selecting a statistic (as a function of the sample data points) in such a way that no information is lost by replacing the full probabilistic description of the sample with the sampling distribution of the selected statistic.

In frequentist inference (for example, in developing a statistical hypothesis test or a confidence interval), the availability of the sampling distribution of a statistic, or of its approximation in the form of an asymptotic distribution, may provide a ready-made formulation of the procedure, whereas constructing the procedure starting from the joint distribution of the sample would be far less obvious.

In Bayesian inference, when the sampling distribution of a statistic is available, one may consider replacing the final outputs of such procedures, in particular the conditional distributions of any unknown quantities given the full sample data, with the conditional distributions of those quantities given the sample statistic. Such a procedure involves the sampling distribution of the statistic. The results are identical provided that the statistic chosen is sufficient.
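As an illustration of the frequentist case, a confidence interval for the population mean can be read directly off the sampling distribution of the sample mean. This is a minimal sketch assuming a normal population with known σ (the z = 1.96 factor is the standard normal 95% quantile; the data are simulated):

```python
import math
import random
import statistics

def mean_confidence_interval(sample, sigma, z=1.96):
    """Interval x̄ ± z·σ/√n, built directly from the normal sampling
    distribution of the sample mean (population sigma assumed known)."""
    half_width = z * sigma / math.sqrt(len(sample))
    xbar = statistics.fmean(sample)
    return xbar - half_width, xbar + half_width

rng = random.Random(3)
sample = [rng.gauss(10.0, 2.0) for _ in range(50)]  # true mean 10, sigma 2
lo, hi = mean_confidence_interval(sample, sigma=2.0)
print(lo, hi)  # an interval that brackets the true mean about 95% of the time
```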
Notes
1. Mooney, 1999, p. 2.
Sources
Mooney, Christopher Z. (1999). Monte Carlo Simulation. Thousand Oaks, Calif.: Sage. ISBN 9780803959439.
Merberg, A. and S.J. Miller (2008). "The Sample Distribution of the Median." Course Notes for Math 162: Mathematical Statistics, on the web at http://web.williams.edu/Mathematics/sjmiller/public_html/BrownClasses/162/Handouts/MedianThm04.pdf, pgs 1–9.




References
Generating sampling distributions in Excel
Mathematica demonstration showing the sampling distribution of various statistics (for example, Σx²) for a normal population

