Econometric Nonsense: Negative binomial and geometric distribution pt. 2 MGF, variance, and mean.

As I said in the last post, the geometric distribution is a special case of the negative binomial when $r=1$. Therefore, we'll begin by finding the MGF, variance, and mean of the negative binomial, and this will automatically give us the cases for the geometric. Since we can use the MGF to find the variance and the mean, we'll start with that. By definition, the MGF is the expected value of $e^{tx}$, which gives:

$$E[e^{tx}]=\sum_{x=r}^{\infty}e^{tx}\left( {\begin{array}{*{20}c} x-1 \\ r-1 \\
\end{array}} \right) P^{r}(1-P)^{x-r}$$
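If you'd like to see this sum as actual numbers, here is a minimal Python sketch that evaluates it by brute force. The names `nb_pmf` and `mgf_by_sum`, the cutoff `n_terms`, and the example values $r=3$, $P=0.4$ are all my own assumptions for illustration. The infinite sum is truncated, so this only approximates the MGF, and only for modest $t$ where the series converges.

```python
from math import comb, exp

def nb_pmf(x, r, p):
    """Probability that the r-th success arrives on trial x (x = r, r+1, ...)."""
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

def mgf_by_sum(t, r, p, n_terms=500):
    """Approximate E[e^{tx}] by summing the first n_terms terms of the series.

    Assumes (1 - p) * e^t < 1, otherwise the true sum diverges and this
    truncated value is meaningless.
    """
    return sum(exp(t * x) * nb_pmf(x, r, p) for x in range(r, r + n_terms))

print(mgf_by_sum(0.0, r=3, p=0.4))  # at t = 0 this is just the total probability
print(mgf_by_sum(0.1, r=3, p=0.4))
```

At $t=0$ the result should be very close to $1$, since every MGF equals $1$ there.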

Now some explaining. Because the number of successes, $r$, is a set number, it's not our random variable. Instead, our random variable is $x$, the number of trials it takes to get $r$ successes. That's why $e$ is raised to $tx$. Why does it sum over $x=r$ to $\infty$? Well, those are the possible values of $x$. Either there are $0$ failures ($x=r$, so $x-r=0$), or the number of failures can run on toward $\infty$ (in which case the probabilities get very small). What we have so far is a good start, but we'll need to make a slight change. We'll turn $e^{tx}$ into $e^{tx+tr-tr}$, which will become useful soon. Split into parts, that becomes $e^{t(x-r)}e^{tr}$. Pairing $e^{tr}$ with $P^{r}$ and $e^{t(x-r)}$ with $(1-P)^{x-r}$ changes our equation to:

$$\sum_{x=r}^{\infty}\left( {\begin{array}{*{20}c}x-1 \\ r-1 \\
\end{array}} \right) (Pe^{t})^{r}((1-P)e^{t})^{x-r}=(Pe^{t})^{r}\sum_{x=r}^{\infty}\left( {\begin{array}{*{20}c}x-1 \\ r-1 \\
\end{array}} \right)((1-P)e^{t})^{x-r}$$

Where I've just pulled out the term that isn't being summed over. What we want to do next is get rid of that ugly binomial coefficient to simplify this. What can we do? Well we know from the binomial theorem that:

$$\sum_{k=0}^{n}\left( {\begin{array}{*{20}c}n \\ k \\
\end{array}} \right)x^{n-k}y^k=(x+y)^n$$

If we can form part of the equation into that then we can simplify it greatly. Even better, if we can make it so that $x+y=1$, then that whole part of the equation would become $1$. Let's compare what we have to what we want to make it. In the binomial theorem the exponents add up to the total number of trials, $n-k+k=n$, so in ours we must have $x-r+r=x$; that is, the other exponent must be $r$. Therefore, the terms to the right of the binomial coefficient in our equation must be in the form $a^{x-r}b^{r}$. Furthermore, $a+b$ must equal $1$. We know that our first term, $(1-P)e^{t}$, is raised to $x-r$, which means it fits the position of the $a$ term. Now what we need is a "$b$" term that is raised to $r$. Since we know the $a$ term, we want it so that $a+b=1$, or:

$$(1-P)e^{t}+b=1\Leftrightarrow b=1-(1-P)e^{t}$$
So we need this term on the right side of the binomial coefficient:
$$(1-(1-P)e^{t})^{r}$$
Well, on the right side we can have:
$$\left( {\begin{array}{*{20}c}x-1 \\ r-1 \\ \end{array}} \right)((1-P)e^{t})^{x-r}\frac{(1-(1-P)e^{t})^{r}}{(1-(1-P)e^{t})^{r}}$$
Since that fraction is just $1$, multiplying by it changes nothing. The full equation is then:

$$(Pe^{t})^{r}\sum_{x=r}^{\infty}\left( {\begin{array}{*{20}c}x-1 \\ r-1 \\
\end{array}} \right)((1-P)e^{t})^{x-r}\frac{(1-(1-P)e^{t})^{r}}{(1-(1-P)e^{t})^{r}}$$

The summation only affects $x$, so we can pull the denominator out:

$$\frac{(Pe^{t})^{r}}{(1-(1-P)e^{t})^{r}}\sum_{x=r}^{\infty}\left( {\begin{array}{*{20}c}x-1 \\ r-1 \\
\end{array}} \right)((1-P)e^{t})^{x-r}(1-(1-P)e^{t})^{r}$$
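The sum that remains can be sanity-checked numerically before leaning on it. Below is a minimal Python sketch, assuming $(1-P)e^{t}<1$ so that the series converges. The function name `normalising_sum`, the cutoff `n_terms`, and the example values are hypothetical choices of mine, and the infinite sum is truncated.

```python
from math import comb, exp

def normalising_sum(t, r, p, n_terms=2000):
    """Truncated sum of C(x-1, r-1) * a^(x-r) * b^r over x = r, r+1, ...

    Here a = (1 - p) * e^t and b = 1 - a, so a + b = 1 by construction.
    """
    a = (1 - p) * exp(t)
    if a >= 1:
        raise ValueError("need (1 - p) * e^t < 1 for the series to converge")
    b = 1 - a
    return sum(comb(x - 1, r - 1) * a**(x - r) * b**r
               for x in range(r, r + n_terms))

# A few (t, r, p) combinations; each result should be very close to 1.
for t, r, p in [(0.1, 3, 0.4), (-0.5, 2, 0.3), (0.0, 1, 0.5)]:
    print(t, r, p, normalising_sum(t, r, p))
```

Every combination that satisfies the convergence condition should print a value that is $1$ up to rounding.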

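Well, the binomial coefficient and everything to the right of it sums to $1$. (Strictly, this isn't the binomial theorem itself, since we're summing over $x$ with $r$ fixed. What's left inside the sum is exactly the negative binomial probability function with success probability $1-(1-P)e^{t}$, and those probabilities sum to $1$ as long as $(1-P)e^{t}<1$.) So all we are left with is: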
$$\frac{(Pe^{t})^{r}}{(1-(1-P)e^{t})^{r}}=\left( \frac{Pe^{t}}{1-(1-P)e^{t}}\right) ^r$$
Which is the MGF of the negative binomial distribution. Setting $r=1$ would then give us the MGF of the geometric distribution, which is:
$$\frac{Pe^{t}}{1-(1-P)e^{t}}$$
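One way to gain some confidence in these closed forms is to simulate the experiment itself and average $e^{tx}$ over many runs. The sketch below does that in Python. The names `mgf_closed_form`, `draw_trials`, and `mgf_by_simulation`, the seed, the number of draws, and the example values $t=0.1$, $P=0.4$ are all assumptions of mine, and a Monte Carlo average only agrees with the formula up to sampling noise.

```python
import random
from math import exp

def mgf_closed_form(t, r, p):
    """(P e^t / (1 - (1 - P) e^t))^r, valid only when (1 - P) e^t < 1."""
    return (p * exp(t) / (1 - (1 - p) * exp(t))) ** r

def draw_trials(r, p, rng):
    """Run success/failure trials until the r-th success; return the trial count."""
    trials, successes = 0, 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

def mgf_by_simulation(t, r, p, n_draws=50_000, seed=0):
    """Average e^{t x} over n_draws simulated values of x."""
    rng = random.Random(seed)
    return sum(exp(t * draw_trials(r, p, rng)) for _ in range(n_draws)) / n_draws

for r in (1, 3):  # r = 1 is the geometric case
    print(r, mgf_closed_form(0.1, r, 0.4), mgf_by_simulation(0.1, r, 0.4))
```

The two numbers on each line should be close but not identical, since the second one is a random estimate.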
(Fairly easy to check that yourself). Now that we have the MGF, we can focus on the mean and variance. Starting with the mean, we want the first moment. That's equivalent to the first derivative of the MGF with respect to $t$, and then setting $t$ equal to $0$. The derivative is:

$$\frac{\partial M_{x}(t)}{\partial t}=r\left( \frac{Pe^{t}}{1-(1-P)e^{t}}\right)^{r-1}\left(\frac{Pe^{t}}{1-(1-P)e^{t}}+\frac{P(1-P)e^{2t}}{(1-(1-P)e^{t})^{2}}\right)$$

(You have no idea how long that took)

Now setting $t=0$: the fraction $\frac{Pe^{t}}{1-(1-P)e^{t}}$ becomes $\frac{P}{P}=1$, and the second bracket becomes $1+\frac{P(1-P)}{P^{2}}=\frac{1}{P}$, so we get $\frac{r}{P}$. Setting $r=1$ for the geometric case, we get $\frac{1}{P}$. So we have the means of both distributions. Now it's time to differentiate twice... I'll do that in another post. Too tired. For now, there's something I should mention. If you look at the denominator, setting $t=\ln(\frac{1}{1-P})$ makes it zero. So the MGF only exists for $t<\ln(\frac{1}{1-P})$, which is fine for our purposes since we only need it near $t=0$. Obviously it's important we avoid something like that, but I didn't deem it necessary to mention earlier. Then a stroke of conscience reminded me I should.
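If you don't trust the differentiation (I barely do), the mean can be checked numerically by taking a finite-difference slope of the MGF at $t=0$. This is a minimal Python sketch. The names `mgf` and `mean_from_mgf`, the step size `h`, and the example values are my own assumptions, and a central difference is only an approximation to the derivative.

```python
from math import exp, log

def mgf(t, r, p):
    """Negative binomial MGF; refuses values of t at or beyond the zero denominator."""
    q = 1 - p
    if q * exp(t) >= 1:
        raise ValueError("MGF is only finite for t < ln(1 / (1 - p))")
    return (p * exp(t) / (1 - q * exp(t))) ** r

def mean_from_mgf(r, p, h=1e-5):
    """Central-difference approximation to M'(0)."""
    return (mgf(h, r, p) - mgf(-h, r, p)) / (2 * h)

print(mean_from_mgf(3, 0.4), 3 / 0.4)  # negative binomial: compare with r / P
print(mean_from_mgf(1, 0.4), 1 / 0.4)  # geometric: compare with 1 / P
print(log(1 / (1 - 0.4)))              # the value of t where the denominator hits zero
```

The step `h` has to stay well below $\ln(\frac{1}{1-P})$, otherwise `mgf` is being evaluated where the MGF doesn't exist.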

Sunday, December 1, 2013
