Sunday, December 15, 2013

Negative binomial and geometric distributions pt. 3: variance.

Alright, now we have to take the derivative we got last time and differentiate it again. Last time we had:

$$\frac{\partial M_{x}(t)}{\partial t}=r\left( \frac{Pe^{t}}{1-(1-P)e^{t}}\right)^{r-1}\left(\frac{Pe^{t}}{1-(1-P)e^{t}}+\frac{P(1-P)e^{2t}}{(1-(1-P)e^{t})^{2}}\right)$$

The second derivative of this is a mess (yeah, I'm not typing that up at all; of course I used Wolfram).

Setting $t=0$ reduces this to:

$$\frac{-r(P-1)+r^{2}}{P^2}$$

Now, using the fact that $\sigma^2=E[x^2]-\mu^2$, we have:

$$\sigma^2=\frac{-r(P-1)+r^{2}}{P^2}-\frac{r^2}{P^2}=\frac{-r(P-1)}{P^2}=\frac{r(1-P)}{P^2}$$

Now that we have the variance of the negative binomial, we can use it to find the variance of the geometric, which is just the special case $r=1$. That is simply:

$$\frac{1-P}{P^2}$$
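
If you want a quick empirical check of this in R (just a sketch; note that R's rnbinom() counts the number of failures rather than the total number of trials, but shifting by the constant $r$ doesn't change the variance, so the two agree):

>r=3; P=.3
>x=rnbinom(100000,size=r,prob=P)
>var(x)

The simulated variance should land close to $\frac{r(1-P)}{P^2}\approx 23.3$ for these example values.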

Work is finallyyyyy done. Now moving on to the next discrete distribution: the hypergeometric. That leaves only two more of the most useful discrete distributions: the Poisson, and the uniform (trivial, but I'll make a post about it anyways).

Tuesday, December 10, 2013

Binomial distribution in R.

I haven't put much emphasis on using R on my blog yet, but since we have a pretty good grasp of some discrete distributions, it's probably time we start using it. I'll start with the binomial distribution. All distributions work pretty similarly in R, in the sense that each one comes with four commands: the "d" command gives you the density, the "p" command gives you the cumulative probability, the "q" command gives the inverse cumulative probability (I'll talk about this later), and the "r" command generates random numbers from the distribution. I'll use d to start off with, since it's the easiest to grasp.

In R, placing a "d" before the binomial name binom() gives the density at your chosen point (for continuous distributions, which we'll cover later, it gives the height of the density at that point). For instance, let's say we want to know the probability of getting exactly 4 successes in 10 trials with a probability of success of .4. Plugging this in would be:

>dbinom(4,10,.4)
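
In case you want to see where this number comes from, it's just the binomial pmf evaluated by hand; the following should give the same thing:

>choose(10,4)*.4^4*.6^6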

Which is approximately equal to .25. What about the probability of getting 4 or fewer? Well we could add them:

>dbinom(0,10,.4)+dbinom(1,10,.4)+dbinom(2,10,.4)+dbinom(3,10,.4)+dbinom(4,10,.4)
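
(A slightly tidier way to write that same sum, if you don't feel like typing five terms, is >sum(dbinom(0:4,10,.4)), which works because dbinom() happily takes a whole vector of points.)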

Or we can take the much simpler route and use the cumulative distribution command for the binomial, which is:

>pbinom(4,10,.4)

For this particular case. Either way gives you approximately .633 probability. Now we also have the inverse cumulative function, also known as the quantile function. I might talk about this more in a different post, but as a quick primer I'll give the definition for a continuous, monotonically increasing distribution. If $F(x)=p$ is the cumulative probability at $x$, then we can take the inverse and get $F^{-1}(p)=x$; in words, we reverse the order and start with a probability, then find the $x$ that satisfies it. For discrete distributions it's slightly more involved, since they're discontinuous, but for our purposes understanding the process intuitively in the continuous case is enough to try it out in R. Since the probability of getting 4 or fewer is approximately .633, plugging that into the inverse cumulative function:

>qbinom(.633,10,.4)

 Should give us 4. Check this for yourself.
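
Because the binomial is discrete, qbinom() returns the smallest $x$ whose cumulative probability is at least the value you give it. A quick way to convince yourself:

>qbinom(.5,10,.4)
>qbinom(.7,10,.4)

The first gives 4 (since $F(3)\approx .382 < .5 \le F(4)\approx .633$) and the second gives 5 (since $F(4)\approx .633 < .7 \le F(5)\approx .834$).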

Now for our last command, the randomly generated binomial command. Let's say one experiment has 100 trials, the probability of success is .4, and we run 5 of these experiments. Well, in R we'd put:


>x=rbinom(5,100,.4)

Where x is an arbitrary variable we're assigning this all to, rbinom() is the randomly generated binomial distribution command, 5 is the total number of experiments we run, 100 is the number of trials in each experiment, and .4 is the probability of success (notice the first argument plays a different role here than it did earlier). If we check the value of x, it gives us the success counts from these five randomly run experiments.
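
Since rbinom() draws random numbers, the exact values change from run to run; a minimal sketch for inspecting them yourself (the set.seed() call is optional and only makes your own draws repeatable):

>set.seed(1)
>x=rbinom(5,100,.4)
>x

Each printed number is a success count out of the 100 trials in that experiment.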



This certainly makes sense. If we have 100 trials and a .4 probability of success, we should get around 40 successes, and indeed the values oscillate around 40.



Graphically, we can look at the histogram of the randomly generated values we just made. To do that, you simply put:

>hist(x)

Where hist() is the histogram command. What we can see from the graph is that it doesn't look like much. Obviously the values cluster together around 38-40. There's a strange gap between 32-36, with another value appearing in the 30-32 bin. Why does this look so strange? Well, we only have 5 experiments total. What we need to do is run more before we get a distribution that looks more reasonable. Let's do the exact same process, but now instead of 5 experiments of 100 trials, we'll run 100.
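
In case you're following along, the commands are the same as before with the first argument bumped up (your exact draws will of course differ from mine):

>x=rbinom(100,100,.4)
>hist(x)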






Now that looks much more reasonable. We can also test how close it is to the theoretical mean by using the mean command, mean(). The mean of the 5 experiments we ran earlier was 38.2; the mean of the 100 experiments equals 39.68, which is closer to the theoretical average of 40.
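
In code that's just:

>mean(x)

which you can compare against the theoretical mean $np=100\times .4=40$.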

You should definitely play around with this and get a better feel for it; you'll be using these commands for the other distributions.

Addendum:

Keep in mind the Bernoulli distribution, which only takes one trial. It's easy to use our tools here to make it: if you want randomly generated data for a Bernoulli distribution, just set the number of trials equal to one. It would look like this:

>x=rbinom(100,1,.5)

So that's a hundred experiments of a single trial each, which gives a string of 0s and 1s.

Now here's another good command we can use to play around with this. Let's say we want to sum up all those outcomes of 1. Put:

>sum(x==1)

This sums up all the individual data points that equal one, and it gives 51 in this case, which is about what you would expect from a distribution like this with a probability of .5. A tally of all the outcomes can be found by putting >table(x), which gives you the total numbers of 0s and 1s.
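
As a quick sketch of that, together with the mean command from earlier (which for 0/1 data is just the proportion of 1s):

>table(x)
>mean(x)

table(x) shows the counts of 0s and 1s, and mean(x) should hover near .5 for this example.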



Sunday, December 1, 2013

Negative binomial and geometric distribution pt. 2: MGF, variance, and mean.

As I said in the last post, the geometric distribution is a special case of the negative binomial when $r=1$. Therefore, we'll begin by finding the MGF, variance, and mean of the negative binomial, and this will automatically give us the cases for the geometric. Since we can use the MGF to find the variance and the mean, we'll start with that. This gives:

$$E[e^{tx}]=\sum_{x=r}^{\infty}e^{tx}\binom{x-1}{r-1}P^{r}(1-P)^{x-r}$$


Now some explaining. Because the number of successes, $r$, is a set number, it's not our random variable. Instead, our random variable is $x$, the number of trials it takes to get $r$ successes. That's why $e$ is raised to $tx$. Why does it sum over $x=r$ to $\infty$? Well, those are the possible values of $x$: either there are $0$ failures (so $x=r$), or the number of failures goes on toward $\infty$ (in which case the probabilities get very small). What we have so far is a good start, but we'll need to make a slight change. We'll turn $e^{tx}$ into $e^{tx+tr-tr}$, which will become useful soon. Split into parts, that becomes $e^{t(x-r)}e^{tr}$. Rearranging this in our equation changes it to:


$$\sum_{x=r}^{\infty}\binom{x-1}{r-1}(Pe^{t})^{r}((1-P)e^{t})^{x-r}=(Pe^{t})^{r}\sum_{x=r}^{\infty}\binom{x-1}{r-1}((1-P)e^{t})^{x-r}$$

Where I've just pulled out the term that isn't being summed over. What we want to do next is get rid of that ugly binomial coefficient to simplify this. What can we do? Well we know from the binomial theorem that:

$$\sum_{k=0}^{n}\binom{n}{k}x^{n-k}y^{k}=(x+y)^n$$

If we can shape part of our equation into that form, then we can simplify it greatly. Even better, if we can make it so that $x+y=1$, then that whole part of the equation becomes $1$. Let's compare what we have to what we want. In the binomial theorem the exponents add up to the total number of trials, $n-k+k=n$; in ours they must add up to $x$, so with one exponent being $x-r$, the other must be $r$. Therefore, the factors to the right of the binomial coefficient in our original equation must be of the form $a^{x-r}b^{r}$. Furthermore, $a+b$ must equal $1$. We know that our first term, $(1-P)e^{t}$, is raised to $x-r$, which means it fits the position of the $a$ term. Now what we need is a "$b$" term that is raised to $r$. Since we know the $a$ term, we want $a+b=1$, or:

$$(1-P)e^{t}+b=1\Leftrightarrow b=1-(1-P)e^{t}$$
So we need this term on the right side of the binomial coefficient:
$$(1-(1-P)e^{t})^{r}$$
Well, on the right side we can have:
$$\binom{x-1}{r-1}((1-P)e^{t})^{x-r}\frac{(1-(1-P)e^{t})^{r}}{(1-(1-P)e^{t})^{r}}$$
since that extra factor is just $1$ multiplying everything. The full equation would then be:

$$(Pe^{t})^{r}\sum_{x=r}^{\infty}\binom{x-1}{r-1}((1-P)e^{t})^{x-r}\frac{(1-(1-P)e^{t})^{r}}{(1-(1-P)e^{t})^{r}}$$

The summation only affects $x$, so we can pull the denominator out:

$$\frac{(Pe^{t})^{r}}{(1-(1-P)e^{t})^{r}}\sum_{x=r}^{\infty}\binom{x-1}{r-1}((1-P)e^{t})^{x-r}(1-(1-P)e^{t})^{r}$$

Well, we know the sum of the binomial coefficient times everything to the right of it becomes $1$ (it's exactly the $(x+y)^n$ form we set up, with $x+y=1$), so all we are left with is:
$$\frac{(Pe^{t})^{r}}{(1-(1-P)e^{t})^{r}}=\left( \frac{Pe^{t}}{1-(1-P)e^{t}}\right) ^r$$
Which is the MGF of the negative binomial distribution. Setting $r=1$ would then give us the MGF of the geometric distribution, which is:
$$\frac{Pe^{t}}{1-(1-P)e^{t}}$$
(Fairly easy to check that yourself.) Now that we have the MGF, we can focus on the mean and variance. Starting with the mean, we want the first moment, which we get by taking the first derivative of the MGF with respect to $t$ and then setting $t$ equal to $0$. The derivative is:


$$\frac{\partial M_{x}(t)}{\partial t}=r\left( \frac{Pe^{t}}{1-(1-P)e^{t}}\right)^{r-1}\left(\frac{Pe^{t}}{1-(1-P)e^{t}}+\frac{P(1-P)e^{2t}}{(1-(1-P)e^{t})^{2}}\right)$$

(You have no idea how long that took)

Now setting $t=0$, we get $\frac{r}{P}$. Setting $r=1$ for the geometric case, we get $\frac{1}{P}$. Well, we have the means of both distributions. Now it's time to differentiate twice... I'll do that in another post. Too tired. As for now, there's something I should mention. If you look at the denominator, setting $t=\ln\left(\frac{1}{1-P}\right)$ makes it zero, so the MGF is only defined for $t<\ln\left(\frac{1}{1-P}\right)$. Obviously, it's important we avoid something like that, but I didn't deem it necessary to mention earlier. Then a stroke of conscience reminded me I should.
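
If you want a quick numerical sanity check of the geometric mean in R (a sketch; note that R's rgeom() counts the number of failures before the first success rather than the total number of trials, so we add 1):

>P=.3
>mean(rgeom(100000,P))+1

This should land close to $\frac{1}{P}\approx 3.33$ for this example value of P.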