2022-09-06
Sampling is an important part of probability and statistics, and it appears in almost every probability scenario.
Sampling WITH replacement: A method of sampling where an item may be sampled more than once. Sampling with replacement generally produces independent events. Usually, but not always, scenarios like this do not distinguish identical items.
Sampling WITHOUT replacement: A method of sampling where an item may not be sampled more than once. Sampling without replacement generally produces dependent events. Usually, but not always, scenarios like this do distinguish identical items.
Rolling a die multiple times - sampling with replacement.
Drawing a card from a standard deck, putting it back, reshuffling, and drawing again - sampling with replacement.
Drawing three balls from a jar - sampling without replacement.
Drawing five cards from a standard deck - sampling without replacement.
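The two sampling methods above can be sketched in Python. This is a minimal illustration, not part of the original notes: the ten-card `deck`, the seed, and the variable names are assumptions chosen for the example. It uses the standard library's `random.choices` (draws with replacement) and `random.sample` (draws without replacement).

```python
import random

rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable

# A small deck of ten distinct cards labelled 1..10 (hypothetical example data).
deck = list(range(1, 11))

# With replacement: every draw sees the full deck, so a card can repeat.
with_replacement = rng.choices(deck, k=5)

# Without replacement: a drawn card leaves the deck, so no card can repeat.
without_replacement = rng.sample(deck, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

The draws without replacement are always five different cards, while the draws with replacement may contain duplicates.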
A fair coin is idealized as follows:
\(P(H) = \frac{1}{2}\)
\(P(T) = \frac{1}{2}\)
\(P(H \cup T) = P(H) + P(T) = \frac{1}{2} + \frac{1}{2} = 1\)
\(P(H \cap T) = 0\)
\(P(\emptyset) = 0\)
Let’s say we flip a fair coin three times.
Question: How many possible outcomes are there?
The sample space for three coin flips is written as
\[S = \{HHH,HHT,HTT,TTT,TTH,THH,HTH,THT\}\]
Question: What is the probability of getting \(HHH\)?
If we consider that order matters, every outcome in the sample space is equally likely, so the probability of getting any single outcome, including \(HHH\), is \(\frac{1}{8}\).
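As a check on the sample space for three flips, the outcomes can be enumerated in Python. This is a minimal sketch; the names `sample_space` and `p_each` are assumptions made for the example, and the outcomes are printed in the order `itertools.product` generates them, not the order written in \(S\).

```python
from fractions import Fraction
from itertools import product

# All ordered sequences of three flips, each flip being H or T.
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]

# For a fair coin every ordered outcome is equally likely.
p_each = Fraction(1, len(sample_space))

print(sample_space)       # the eight ordered outcomes
print(len(sample_space))  # size of the sample space
print(p_each)             # probability of any single outcome, e.g. HHH
```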
Now suppose that order does not matter and we are only interested in the number of heads.
Question: If we flip the coin three times, what is the probability of getting exactly two heads in any order?
Question: If we flip the coin three times, what is the probability of getting at least two heads in any order?
Question: If we flip the coin three times, what is the probability of getting at most two heads in any order?
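One way to answer the three questions above is to count the favourable outcomes among the eight equally likely ones. The sketch below does this by brute force; the helper `prob` and the variable names are assumptions made for the example, not notation from the notes.

```python
from fractions import Fraction
from itertools import product

# The eight equally likely ordered outcomes of three fair coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

def prob(event):
    """Probability = favourable outcomes / total outcomes.

    `event` is a function that takes the number of heads in an outcome
    and returns True when the outcome belongs to the event.
    """
    favourable = [o for o in outcomes if event(o.count("H"))]
    return Fraction(len(favourable), len(outcomes))

p_exactly_two = prob(lambda heads: heads == 2)   # HHT, HTH, THH
p_at_least_two = prob(lambda heads: heads >= 2)  # the three above plus HHH
p_at_most_two = prob(lambda heads: heads <= 2)   # everything except HHH

print(p_exactly_two, p_at_least_two, p_at_most_two)  # 3/8 1/2 7/8
```

Note that "at most two heads" is the complement of "three heads", so its probability is \(1 - \frac{1}{8} = \frac{7}{8}\).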
The sample space for tossing a fair coin three times contains \(2^3 = 8\) outcomes.
The sample space grows quickly as we increase the number of tosses.
For example:
\(2^2 = 4 \longrightarrow\) two tosses
\(2^3 = 8 \longrightarrow\) three tosses
\(2^4 = 16 \longrightarrow\) four tosses
\(2^5 = 32 \longrightarrow\) five tosses
\(\vdots\)
\(2^{20} = 1048576 \longrightarrow\) twenty tosses
\(\vdots\)
\(2^{200} \approx 1.606938 \times 10^{60} \longrightarrow\) two hundred tosses
The total number of possible outcomes in the sample space doubles with each additional toss: for \(n\) tosses there are \(2^n\) outcomes.
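The growth of the sample space can be checked directly, since Python integers are exact at any size. This is a small sketch; the function name `sample_space_size` is an assumption made for the example.

```python
def sample_space_size(n_tosses):
    """Number of ordered outcomes when a coin is tossed n_tosses times."""
    return 2 ** n_tosses

for n in (2, 3, 4, 5, 20, 200):
    size = sample_space_size(n)
    # Show the exact count and a rounded scientific-notation form.
    print(f"{n:>3} tosses: {size} (about {float(size):.6e})")
```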
We cannot predict the outcome of a single fair coin toss, but we do know what to expect over many tosses: in 20 tosses, for example, the expected number of heads is 10, which is half of 20.
Suppose that we simulate tossing a coin with an increasing number of tosses (trials) and record the cumulative number of heads.
For a small number of tosses we will see a lot of variation, but as the number of tosses increases, the proportion of heads gets closer to \(\frac{1}{2}\); that is, the number of heads gets closer to the expected value relative to the number of tosses.
Source Code: R - Law of Large Numbers Simulations
If a fair coin is tossed a large number of times, the proportions of heads and tails should be approximately equal. This is called the law of large numbers.
The law of large numbers is a theorem that describes the result of performing the same trial many times. As more trials are performed, the mean of the results obtained tends to get closer to the expected value. We will return to this idea in a few weeks.
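The coin-tossing simulation described above can be sketched as follows. This is an independent Python sketch under stated assumptions, not the R source code the notes link to: the function name `proportion_of_heads`, the seed, and the list of toss counts are all choices made for the example.

```python
import random

def proportion_of_heads(n_tosses, rng):
    """Toss a simulated fair coin n_tosses times; return the fraction of heads."""
    # rng.random() is uniform on [0, 1), so it is below 0.5 half of the time.
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(2022)  # arbitrary seed, used only for repeatability

for n in (10, 100, 1_000, 10_000, 100_000):
    p = proportion_of_heads(n, rng)
    print(f"{n:>7} tosses: proportion of heads = {p:.4f}")
```

With few tosses the proportion can be far from \(\frac{1}{2}\); with many tosses it typically settles close to \(\frac{1}{2}\), which is the behaviour the law of large numbers describes.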