Statistics+EAGSR7101

**[[image:Huck_Book.jpg width="63" height="68"]][|Reading Statistics and Research- Schuyler W. Huck]**
Chapter 2 Notes- Descriptive Statistics, The Univariate Case (pgs. 19-48)


 * [|Univariate-]only one variable is involved
 * [|Bivariate-]two variables are involved
 * Picture techniques- include frequency distributions, stem-and-leaf displays, histograms and bar graphs
 * Distributional shape- what it means when data are said to be normal, skewed, bimodal, or rectangular
 * Central tendency- a single value that summarizes the typical score in a data set (mode, median, or mean)
 * Frequency distribution- shows how many people (or animals or objects) were similar. (Measurement is based on the same variable and they ended up being in the same category or having the same score.)


 * **Three Kinds of Frequency Distributions**
 * Simple (ungrouped)
 * Grouped
 * Cumulative- shows how many measured objects ended up with any given score or any lower score (or how many scores fell in a given score interval or any lower interval).
 * **__Symbols__**
 * //f = frequency//
 * //N = total number of scores//
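
The three kinds of frequency distribution can be sketched in a few lines of Python. The scores and the interval width of 3 are made up for illustration and do not come from the chapter.

```python
from collections import Counter

# Hypothetical quiz scores (illustrative only)
scores = [7, 8, 8, 9, 9, 9, 10, 10, 12, 15]

# Simple (ungrouped): frequency f of each distinct score
simple = dict(sorted(Counter(scores).items()))

# Grouped: frequency f of each score interval; the width of 3 is an assumption
width = 3
grouped = dict(sorted(Counter((s // width) * width for s in scores).items()))

# Cumulative: how many scores fall at or below each score
cumulative = {}
running = 0
for score, f in simple.items():
    running += f
    cumulative[score] = running

N = len(scores)  # total number of scores
```

In the grouped result each key is the lower bound of its interval, so the key 6 stands for the interval 6-8. The last cumulative frequency always equals N.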


 * Stem-and-leaf display- a grouped frequency distribution that loses no information, because every original score can still be read from the display.
 * Histograms- vertical columns (or thin lines) used to indicate how many times any given score appears in the data set.
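
A stem-and-leaf display can be sketched as follows, assuming non-negative whole-number scores where the tens digit is the stem and the ones digit is the leaf. The function name `stem_and_leaf` and the sample scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Group scores by tens digit (stem) and list the ones digits (leaves)."""
    rows = defaultdict(list)
    for s in sorted(scores):
        rows[s // 10].append(s % 10)
    # Each stem maps to a string of its leaves in ascending order
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(rows.items())}

display = stem_and_leaf([52, 47, 61, 58, 45, 69, 63, 50])
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Because each leaf is kept, every original score can be rebuilt from its row (stem 4 with leaves 5 and 7 gives 45 and 47), which is why no information is lost.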


 * **Horizontal Axis of Histogram vs. Bar Graph:**
 * Histogram- the horizontal axis is labeled with numerical values that represent a QUANTITATIVE variable.
 * Bar Graph- the horizontal axis is labeled with categories that represent a QUALITATIVE variable.


 * **Distributional Shape of Data**
 * **Normal distribution-** most of the scores will be clustered near the middle of the continuum of observed scores. There will be a gradual and symmetrical decrease in frequency in both directions away from the middle area of scores.
 * **Skewed distributions-** most of the scores end up being high or low, with a small percentage of scores strung out in one direction away from the rest. They are not symmetrical.
 * Positively skewed- a skewed distribution in which most scores are at the lower end and the tail points toward the upper end of the score continuum
 * Negatively skewed- a skewed distribution in which most scores are at the upper end and the tail points toward the lower end of the score continuum
 * Multimodal- the scores congregate around more than one point along the continuum
 * Bimodal- two places where scores are grouped along the continuum
 * Trimodal- three places where scores are grouped along the continuum
 * Unimodal- distributions having just one "hump"
 * Rectangular- Scores are fairly evenly distributed without any clustering
 * **Kurtosis-** how peaked or flat a distribution is compared with the normal distribution; a set of scores can be nonnormal even though there is only one mode and no skewness in the data.
 * Leptokurtic and Platykurtic- terms denoting distributional shapes that are, respectively, more peaked and less peaked than the normal distribution
 * Mesokurtic- a distributional shape that is neither overly peaked nor flat


 * **To properly interpret coefficients of skewness and kurtosis:**
 * 1) Normal- Both indicators will turn out equal to zero
 * 2) A skewness value <0 shows that a distribution is negatively skewed.
 * 3) A skewness value >0 shows that a distribution is positively skewed.
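
The notes do not say how the coefficients are computed. The sketch below assumes the common moment-based formulas, with 3 subtracted from the kurtosis so that a normal distribution scores zero. Statistical packages often apply small-sample adjustments, so their values may differ slightly. The function name is hypothetical.

```python
def skewness_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (an assumed formula choice)."""
    n = len(scores)
    mean = sum(scores) / n
    # Second, third, and fourth central moments
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3  # subtract 3 so a normal shape gives zero
    return skew, kurt

# Made-up data sets: symmetric, tail to the right, tail to the left
symmetric_skew, _ = skewness_and_kurtosis([1, 2, 3, 4, 5])
positive_skew, _ = skewness_and_kurtosis([1, 1, 1, 2, 10])
negative_skew, _ = skewness_and_kurtosis([1, 9, 10, 10, 10])
```

With these inputs the symmetric set gives a skewness of zero, the set with one very high score gives a value above zero, and the set with one very low score gives a value below zero.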

Measures of Central Tendency

 * 1) Mode- the most frequently occurring score
 * 2) Median- the midpoint of a distribution (divides the distribution into two equal parts)
 * 3) Mean- the summation of the scores divided by the number of scores in the distribution; also known as the average
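
As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The scores are invented and deliberately include one high score.

```python
import statistics

# Hypothetical scores with one unusually high value (illustrative only)
scores = [2, 3, 3, 3, 4, 5, 6, 8, 20]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # midpoint of the ordered scores
mean = statistics.mean(scores)      # sum of scores divided by N

print(mode, median, mean)
```

Here the mode is 3, the median is 4, and the mean is 6. The single high score pulls the mean upward while leaving the mode and median alone, which is the usual pattern in a positively skewed distribution.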

Measures of Variability

 * **Measure of Variability**- a measure of how much the scores differ from one another
 * If the scores are very similar they are **homogeneous**- there is little dispersion and little variability
 * If the scores are very different they are **heterogeneous**- there is a high degree of dispersion and variability
 * **1. Range**- the simplest measure of variability; the difference between the highest and lowest score
 * **2. Interquartile Range**- a measure of how much spread exists among the middle 50% of the scores; the distance between the upper and lower quartiles
 * **Upper Quartile**- the numerical value that separates the top 25% of scores from the bottom 75%
 * **Lower Quartile**- the numerical value that separates the bottom 25% of scores from the top 75%
 * **3. Semi-interquartile range**- one half the size of the interquartile range
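
The first three measures can be sketched with the standard library. Several conventions exist for locating quartiles; the `method="inclusive"` option of `statistics.quantiles` is used here as an assumption, and other conventions can give slightly different values. The scores are made up.

```python
import statistics

# Hypothetical scores (illustrative only)
scores = [1, 3, 4, 5, 6, 7, 8, 9, 13]

score_range = max(scores) - min(scores)

# Lower quartile, median, upper quartile under the "inclusive" convention
lower_q, _, upper_q = statistics.quantiles(scores, n=4, method="inclusive")

iqr = upper_q - lower_q  # spread of the middle 50% of scores
semi_iqr = iqr / 2       # half the interquartile range

print(score_range, lower_q, upper_q, iqr, semi_iqr)
```

For this data the range is 12 while the interquartile range is only 4, because the range depends entirely on the two most extreme scores.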


 * **Box and Whisker Plot**- a visual representation of the degree of variability; a rectangle (the box) stretches from the lower quartile to the upper quartile, and the whiskers are lines extending from each end of the box, drawn no longer than 1.5 times the height of the box.


 * **4. Standard deviation**- an indicator of the amount of dispersion (represented by the single letter s)
 * **5. Variance**- an indicator of the amount of dispersion, equal to the square of the standard deviation
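
The notes do not say whether the sum of squared deviations is divided by N or by N - 1. The sketch below shows both versions from Python's `statistics` module, with invented scores. Either way, the variance is the square of the matching standard deviation.

```python
import statistics

# Hypothetical scores (illustrative only); their mean is 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Dividing by N (treating the scores as the whole population)
pop_variance = statistics.pvariance(scores)
pop_sd = statistics.pstdev(scores)

# Dividing by N - 1 (treating the scores as a sample)
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

print(pop_variance, pop_sd, sample_variance, sample_sd)
```

For this data the squared deviations from the mean sum to 32, so the N version gives a variance of 4 and a standard deviation of 2, while the N - 1 version gives slightly larger values.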

**Standard Scores**
When researchers want to focus on individual scores instead of all the scores as a set, they usually convert the raw scores into standard scores. Two of these are **T-scores** and **z-scores**. Each indicates how many standard deviations a particular raw score lies above or below the group mean.
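
A minimal sketch of the conversion, assuming the usual conventions that z-scores have a mean of 0 and a standard deviation of 1, and that T-scores rescale z to a mean of 50 and a standard deviation of 10. The function name, the data, and the choice of the N-denominator standard deviation are assumptions.

```python
import statistics

def standard_scores(scores):
    """Return (z_scores, t_scores) for a list of raw scores."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # N-denominator SD, an assumed choice
    z = [(x - mean) / sd for x in scores]
    t = [50 + 10 * value for value in z]  # T = 50 + 10z
    return z, t

# Hypothetical raw scores with mean 5 and standard deviation 2
z_scores, t_scores = standard_scores([2, 4, 4, 4, 5, 5, 7, 9])
```

A raw score of 9 lies two standard deviations above the mean of 5, so its z-score is 2 and its T-score is 70. A raw score of 2 lies 1.5 standard deviations below the mean, giving a z-score of -1.5 and a T-score of 35.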


 * **Outliers**- scores that lie far away from the rest of the scores.
 * The data should be examined for outliers. Any outliers found should either be discarded, or the analysis should be run twice, once with the outliers included and once with them excluded, and the two sets of results compared.
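
The notes do not give a rule for deciding what counts as an outlier. The sketch below assumes one common convention, flagging any score more than 1.5 interquartile ranges beyond the quartiles, and then compares the mean with and without the flagged scores. The function name, the cutoff `k`, and the data are illustrative.

```python
import statistics

def split_outliers(scores, k=1.5):
    """Split scores into (kept, flagged) using an assumed k * IQR rule."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [x for x in scores if low <= x <= high]
    flagged = [x for x in scores if x < low or x > high]
    return kept, flagged

# Hypothetical scores with one extreme value
scores = [1, 3, 4, 5, 6, 7, 8, 9, 40]
kept, flagged = split_outliers(scores)

mean_with = statistics.mean(scores)   # outlier included
mean_without = statistics.mean(kept)  # outlier excluded
print(flagged, mean_with, mean_without)
```

Only the score of 40 is flagged. Dropping it lowers the mean from about 9.2 to 5.375, the kind of difference worth reporting when the two analyses are compared.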