# Random Variables



Understanding random variables is essential for a deeper understanding of statistics. This section introduces the key terminology related to random variables.

\defs{Definitions}

• {\bf Random variable:} An outcome or observation whose value is determined by a process that is not predetermined, so the value cannot be predicted with certainty. Random variables are often denoted by capital letters, and the possible values that a random variable can take by lower case letters.
1. {\bf Categorical random variable:} A random variable that results in a categorical (non-numeric) response, such as gender (male or female) or opinion (strongly disagree, disagree, ..., strongly agree).
• {\bf Dummy coding:} Dummy coding turns a variable with two or more outcomes into one or more variables whose only possible values are 0 and 1. Categorical variables are often dummy coded for analysis purposes. For example, males might be assigned the value 0 and females the value 1. If there are more than two categories, several dummy variables are needed to capture all the information. The dummy coded data can then be treated as a numerical random variable.
2. {\bf Numerical random variable:} A random variable that results in a numerical response. Examples include the height, weight, age, or income of a randomly selected individual.
1. {\bf Discrete random variable:} A random variable that takes countable values, typically integers, such as the number of heads observed when flipping a coin four times, $x=0,1,2,3$ or $4$. For an example, see Table~\ref{contdisc1}.
2. {\bf Continuous random variable:} A random variable that can take any value within an interval, such as income. For an example, see Table~\ref{contdisc1}.
• {\bf Cumulative distribution function (c.d.f.):} The probability $P(X \leq x)$, where $X$ is a random variable and $x$ is a real number. The c.d.f. is often denoted by a capital $F$, i.e. $F(x)=P(X \leq x)$.
• {\bf Probability distribution function (p.d.f.):}
1. For a discrete random variable, it is the probability of a particular value occurring, $f(x)=P(X=x)$.
• The probability distribution function has the following properties:
1. $f(x_i) \geq 0, \quad \forall i.$
2. $\sum_{\forall i} f(x_i)=1$
2. For a continuous random variable, $P(X=x)=0$, so the definition differs from the discrete case. The p.d.f. of a continuous random variable is a curve described by a function $f(x)$. The area under the curve over a given interval is the probability that the random variable falls within that interval.
• The probability distribution function has the following properties:
1. $f(x) \geq 0$
2. $\int_{-\infty}^{\infty}{f(x)dx}=1$
3. $F(b)-F(a)=P(a\leq X\leq b) = \int_{a}^{b}{f(x)dx}$, which is the area under the curve $f(x)$ from $a$ to $b$, $a\leq b$.
• Note: $P(X=b)=F(b)-F(b)=\int_{b}^{b}{f(x)dx}=0$; that is, the probability of a continuous random variable equaling a specific constant, say $b$, is zero.
• {\bf Expectation:} The expectation of a random variable $X$ is its mean value (a weighted mean) over the sample space, or population, of possible outcomes. The {\em expected value} can also be interpreted as the mean value that would be obtained from an infinite number of observations of the random variable.
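The discrete definitions above can be made concrete with the coin-flipping example, where $X$ is the number of heads in four flips. The following is a minimal Python sketch, assuming a fair coin (the text does not say the coin is fair); the names `pmf`, `cdf`, and `expectation` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb

# Assumption: a fair coin, so each of the 2**4 flip sequences is equally likely.
N_FLIPS = 4

def pmf(x):
    """f(x) = P(X = x): probability of exactly x heads in N_FLIPS flips."""
    if x not in range(N_FLIPS + 1):
        return Fraction(0)
    return Fraction(comb(N_FLIPS, x), 2 ** N_FLIPS)

def cdf(x):
    """F(x) = P(X <= x): sum of f over all possible values up to x."""
    return sum((pmf(k) for k in range(N_FLIPS + 1) if k <= x), Fraction(0))

def expectation():
    """E[X]: each possible value weighted by its probability."""
    return sum((k * pmf(k) for k in range(N_FLIPS + 1)), Fraction(0))

values = range(N_FLIPS + 1)
probabilities = [pmf(x) for x in values]
```

Under the fair-coin assumption, the probabilities are non-negative and sum to 1, `cdf(2)` is $11/16$, and `expectation()` is 2. Exact fractions are used so that these checks do not depend on floating-point rounding.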

\begin{table}
\centering
\begin{tabular}{|c|c|}\hline
Discrete & Continuous\\\hline
0& 736.1918273\\
1& 759.5668806\\
2& 812.7593044\\
3& 562.2359305\\
4& 798.2952718\\\hline
\end{tabular}
\caption{Example of Discrete and Continuous Data}
\label{contdisc1}
\end{table}
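The properties of a continuous p.d.f. listed in the definitions can also be checked numerically. The sketch below assumes, purely for illustration, that $X$ follows an exponential distribution with rate 1; the function names and the midpoint-rule integrator `area` are hypothetical helpers, not part of the text.

```python
import math

# Assumption: X is exponential with rate 1, chosen only for illustration.
def f(x):
    """p.d.f. of the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def F(x):
    """c.d.f. of the same distribution, F(x) = P(X <= x)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def area(a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = 0.5, 2.0
prob_interval = area(a, b)      # P(a <= X <= b) as an area under f
prob_from_cdf = F(b) - F(a)     # the same probability from the c.d.f.
total_area = area(0.0, 50.0)    # f is zero below 0 and negligible beyond 50
prob_point = area(b, b)         # zero-width interval, so P(X = b) = 0
```

Here `prob_interval` and `prob_from_cdf` should agree up to the integration error, `total_area` should be close to 1, and `prob_point` is exactly zero because the interval has no width.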

\defl{Examples of Categorical, Continuous and Discrete Data.}

• Categorical:
1. Gender
2. Blood Type
3. Marital Status
4. Eye Color
5. Political Party
• Discrete:
1. Number of people using the ATM at a certain location within the past hour.
2. Number of brothers or sisters a person has.
3. Number of times a person won at roulette within the past 20 spins.
• Continuous:
1. Income
2. Age
3. Height
4. Weight
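Dummy coding of a categorical variable with more than two categories, such as blood type, might look like the following sketch. It assumes one common convention, in which one category is treated as a baseline and each remaining category gets its own 0/1 variable; the text does not prescribe this convention, and the function `dummy_code` and the five observations are hypothetical.

```python
def dummy_code(values, baseline):
    """Return one 0/1 column per category other than the baseline."""
    categories = sorted(set(values) - {baseline})
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }

# Hypothetical observations of blood type for five people.
blood_types = ["A", "O", "B", "AB", "O"]
coded = dummy_code(blood_types, baseline="O")
```

With four blood types and `"O"` as the baseline, three dummy variables result. A person with type O is identified by having 0 in all three columns, so no information is lost.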

## Binomial

\defl{Binomial Distribution has the following properties:}

• There are a fixed number of trials or observations, $n$, determined in advance.
• Each trial can take on one of two possible outcomes, labeled ``success'' and ``failure''.
• Each trial’s outcome is determined independently of all the other trials.
• The probability of a success and that of a failure remains the same from …