
# Scores and Scales {#scoresScales}

Assessments yield information.
The information is encoded in scores or in other [types of data](#dataTypes).\index{data!type}
It is important to consider which [type of data](#dataTypes) you have because the data type restricts which options are available for analysis.\index{data!type}

## Getting Started {#gettingStarted-scoresScales}

### Load Libraries {#loadLibraries-scoresScales}

First, we load the `R` packages used for this chapter:

```{r}
library("petersenlab") #to install: install.packages("remotes"); remotes::install_github("DevPsyLab/petersenlab")
library("dplyr")
library("tidyverse")
library("tinytex")
library("knitr")
library("rmarkdown")
library("bookdown")
```

### Prepare Data {#prepareData-scoresScales}

#### Simulate Data {#simulateData-scoresScales}

For this example, we simulate data below.\index{simulate data}
For reproducibility, we set the seed below.
Using the same seed will yield the same answer every time.
There is nothing special about this particular seed.

```{r}
set.seed(52242)

rawData <- rnorm(
  n = 1000,
  mean = 200,
  sd = 30)

scores <- data.frame(
  rawData = rawData,
  rawDataNoMissing = rawData)
```

#### Add Missing Data {#addMissingData-scoresScales}

Adding missing data to dataframes makes examples more representative of real-life data and helps you get in the habit of writing code that accounts for missing data.

```{r}
scores$rawData[c(5,10)] <- NA
```

## Data Types {#dataTypes}

There are four general data types: [nominal](#nominalData), [ordinal](#ordinalData), [interval](#intervalData), and [ratio](#ratioData).\index{data!type}
Depending on the use of the variable, the data could fall into more than one category.\index{data!type}
The type of data influences what kinds of data analysis you can do.\index{data!type}
For instance, parametric statistical analysis (e.g., *t* test, analysis of variance [ANOVA], and linear regression) assumes that data are [interval](#intervalData) or [ratio](#ratioData).\index{data!type}

### Nominal {#nominalData}

Nominal data are distinct categories.\index{data!nominal}
They are categorical and unordered.\index{data!nominal}
Nominal data make no quantitative claims.\index{data!nominal}
Nominal data represent things that we can name (e.g., cat and dog).\index{data!nominal}
Nominal data can be represented with numbers.\index{data!nominal}
For example, zip codes are nominal.\index{data!nominal}
Numbers that represent a participant's sex, race, or ethnicity are also nominal.\index{data!nominal}
Categorical data are often dummy-coded into binary (i.e., dichotomous) variables, which represent nominal data.\index{data!nominal}\index{data!dichotomous}
Higher numbers of nominal data do not reflect higher (or lower) levels of the construct because the numbers represent categories that do not have an order.\index{data!nominal}
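
As a minimal sketch of dummy coding, the example below stores a hypothetical nominal variable as an unordered factor and uses base R's `model.matrix()` to create binary indicator variables.
The variable `pet` and its values are illustrative assumptions; they are not part of this chapter's data.

```{r}
# Hypothetical nominal variable (illustrative values only)
pet <- factor(c("cat", "dog", "bird", "dog", "cat"))

# Factors store categories as integer codes, but the codes carry no order
# or magnitude; they merely label the categories (alphabetical by default)
levels(pet)
as.integer(pet)

# Dummy-code the categories into binary (0/1) indicator variables;
# removing the intercept (- 1) yields one indicator column per category
petDummies <- model.matrix(~ pet - 1)
petDummies
```

Each row has exactly one indicator equal to 1, which identifies the category without implying that any category is "higher" than another.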

### Ordinal {#ordinalData}

Ordinal data are ordered categories: they have a name and an order.\index{data!ordinal}
They make no claim about the conceptual distance between the ranks, only that higher values represent higher (or lower) levels of the construct.\index{data!ordinal}
For example, ranks following a race are ordinal: the person with rank 1 finished before the person with rank 2, who finished before the person with rank 3.\index{data!ordinal}
Ordinal data make a limited claim because the conceptual distance between adjacent numbers is not necessarily the same.\index{data!ordinal}
For instance, the person who finished the race first might have finished 10 minutes before the second-place finisher; whereas the third-place finisher might have finished 1 second after the second-place finisher.\index{data!ordinal}

That is, just because the numbers have the same *mathematical* distance does not mean that they represent the same *conceptual* distance on the construct.\index{data!ordinal}
For example, if the respondent is asked how many drinks they had in the past day, and the options are 0 = 0 drinks; 1 = 1–2 drinks; 2 = 3 or more drinks, the scale is ordinal.\index{data!ordinal}
Even though the numbers have the same mathematical distance (0, 1, 2), they do not represent the same conceptual distance.\index{data!ordinal}
Most data in psychology are ordinal data even though they are often treated as if they were [interval](#intervalData) data.\index{data!ordinal}\index{data!interval}
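
One way to represent ordinal data in `R` is with an ordered factor.
The sketch below uses the response options from the drinking example; the responses themselves are made up for illustration.

```{r}
# Response options from the drinking example, from lowest to highest
drinkLevels <- c("0 drinks", "1-2 drinks", "3 or more drinks")

# Hypothetical responses stored as an ordered factor
drinksPastDay <- factor(
  c("1-2 drinks", "0 drinks", "3 or more drinks", "1-2 drinks"),
  levels = drinkLevels,
  ordered = TRUE)

# Order comparisons are meaningful for ordinal data
drinksPastDay[1] > drinksPastDay[2]
max(drinksPastDay)

# The underlying integer codes are equally spaced (1, 2, 3), but the
# categories they label are not: "3 or more drinks" has no upper bound
as.integer(drinksPastDay)
```

Storing the variable as an ordered factor preserves the ordering while discouraging arithmetic that would treat the codes as if they were interval data.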

### Interval {#intervalData}

Interval data are ordered and have meaningful distances (i.e., equal spacing between intervals).\index{data!interval}
You can add and subtract interval data (e.g., 4 is 2 units above 2), but you cannot meaningfully multiply or divide interval data (e.g., a score of 4 does not represent twice as much as a score of 2).\index{data!interval}
Examples of interval data are temperatures in Fahrenheit and Celsius—100 degrees Fahrenheit is not twice as hot as 50 degrees Fahrenheit.\index{data!interval}
A person's number of years of education is interval, whereas educational attainment (e.g., high school degree, college degree, graduate degree) is only ordinal.\index{data!interval}
Although much data in psychology involve numbers that have the same mathematical distance between intervals, the intervals likely do not represent the same conceptual distance.\index{data!interval}
For example, the difference in severity of two people who have two symptoms and four symptoms of depression, respectively, may not be the same difference in depression severity as two people who have four symptoms and six symptoms, respectively.\index{data!interval}

### Ratio {#ratioData}

Ratio data are ordered, have meaningful distances, and have a true (absolute) zero that represents absence of the construct.\index{data!ratio}
With ratio data, multiplicative relationships are true.\index{data!ratio}
An example of ratio data is temperature in kelvins: 100 kelvins is twice as hot as 50 kelvins.\index{data!ratio}
There is a dream of having ratio scales in psychology, but we still do not have a true zero with psychological constructs—what does total absence of depression mean (apart from a dead person)?\index{data!ratio}
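
The difference between interval and ratio data can be illustrated with temperature conversions.
In the sketch below, the ratio of two Fahrenheit temperatures changes when the same temperatures are expressed in Celsius, because both scales have an arbitrary zero point.
By contrast, the ratio of two Kelvin temperatures is preserved when they are expressed in Rankine, another scale that starts at absolute zero.
The particular temperatures are chosen only for illustration.

```{r}
# Interval scale: Fahrenheit and Celsius have arbitrary zero points
tempF <- c(50, 100)
tempC <- (tempF - 32) * 5 / 9

ratioF <- tempF[2] / tempF[1] # 2 in Fahrenheit
ratioC <- tempC[2] / tempC[1] # not 2 in Celsius, so the ratio is not meaningful

# Differences, by contrast, are preserved up to the unit conversion factor
diffF <- tempF[2] - tempF[1]
diffC <- tempC[2] - tempC[1]

# Ratio scale: Kelvin and Rankine both start at absolute zero
tempK <- c(50, 100)
tempR <- tempK * 9 / 5 # Rankine = Kelvin * 9/5

ratioK <- tempK[2] / tempK[1]
ratioR <- tempR[2] / tempR[1] # the same ratio survives the change of units

c(ratioF = ratioF, ratioC = ratioC, ratioK = ratioK, ratioR = ratioR)
```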

## Score Transformation {#scoreTransformation}

There are a number of score transformations, depending on the goal.\index{data!transformation}
Some score transformations (e.g., log transform) seek to make data more normally distributed to meet assumptions of particular analysis approaches.\index{data!transformation}
Score transformations alter the original (raw) data.\index{data!transformation}\index{data!raw}
If you change the data, it can change the results.\index{data!transformation}
Score transformations are not neutral.\index{data!transformation}
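
As a rough sketch of how a log transform can change the shape of a distribution, the example below builds a deterministic, positively skewed set of scores and compares the skewness before and after the transformation.
The scores and the helper function `sampleSkewness()` are hypothetical; they are not part of this chapter's data or of any package.

```{r}
# Deterministic, positively skewed example scores: evenly spaced quantiles
# of a log-normal distribution (illustrative values only)
skewedScores <- exp(qnorm(ppoints(1000)))

# Hypothetical helper: sample skewness (third standardized moment)
sampleSkewness <- function(x) {
  mean((x - mean(x))^3) / (mean((x - mean(x))^2)^(3 / 2))
}

skewnessRaw <- sampleSkewness(skewedScores)
skewnessLog <- sampleSkewness(log(skewedScores))

# The log transform removes the skew here, but it also changes the spacing
# between scores, so results based on the transformed scores can differ
c(raw = skewnessRaw, log = skewnessLog)
```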

### Raw Scores {#rawScores}

Raw scores are the original data, or they may be aggregations (e.g., sums or means) of multiple items.\index{data!raw}
Raw scores are the purest because they are closest to the original operation (e.g., behavior).\index{data!raw}
A disadvantage of raw scores is that they are scale dependent, and therefore they may not be comparable across different measures with different scales.\index{data!raw}
An example histogram of raw scores is in Figure \@ref(fig:rawScores).\index{data!raw}\index{histogram}

```{r rawScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Raw Scores."}
hist(scores$rawData,
     xlab = "Raw Scores",
     main = "Histogram of Raw Scores")
```

### Norm-Referenced Scores {#norm}

Norm-referenced scores are scores that are referenced to some norm.\index{norm-referenced}\index{norm}
A norm is a standard of comparison.\index{norm}
For instance, you may be interested in how well a child performed relative to other children of the same sex, age, grade, or ethnicity.\index{norm-referenced}\index{norm}
However, interpretation of norm-referenced scores depends on the measure and on the normative sample.\index{norm-referenced}\index{norm}
A person's norm-referenced score can vary widely depending on which norms are used.\index{norm-referenced}\index{norm}
Which reference group should you use?\index{norm-referenced}\index{norm}
Age?\index{norm-referenced}\index{norm}
Sex?\index{norm-referenced}\index{norm}
Age and sex?\index{norm-referenced}\index{norm}
Grade?\index{norm-referenced}\index{norm}
Ethnicity?\index{norm-referenced}\index{norm}
The optimal reference group depends on the purpose of the assessment.\index{norm-referenced}\index{norm}
The quality of the norms also depends on the representativeness of the reference group compared to the population of interest.\index{norm-referenced}\index{norm}
Pros and cons of group-based norms are discussed in Section \@ref(withinGroupNorming).\index{norm-referenced}\index{norm}

A standard normal distribution on various norm-referenced scales is depicted in Figure \@ref(fig:normedDistribution), as adapted from @Bandalos2018.\index{norm-referenced}

```{r normedDistribution, echo = FALSE, results = "hide", out.width = "100%", fig.align = "center", fig.cap = "Various Norm-Referenced Scales."}
standardNormal <- data.frame(zScore = -4:4)
standardNormal$Tscore <- seq(from = 10, to = 90, by = 10)
standardNormal$standardScore <- seq(from = 40, to = 160, by = 15)
standardNormal$cumulativePercent <- substr(as.character(pnorm(standardNormal$zScore) * 100), 1, 4)
standardNormal$cumulativePercentFigure <- c("0","0.1","2.3","15.9","50","84.1","97.7","99.9","100")

standardNormal$nonCumulativePercent[1] <- round((pnorm(-3) - pnorm(-4)) * 100, 2)
standardNormal$nonCumulativePercent[2] <- round((pnorm(-2) - pnorm(-3)) * 100, 2)
standardNormal$nonCumulativePercent[3] <- round((pnorm(-1) - pnorm(-2)) * 100, 2)
standardNormal$nonCumulativePercent[4] <- round((pnorm(0) - pnorm(-1)) * 100, 2)
standardNormal$nonCumulativePercent[5] <- round((pnorm(1) - pnorm(0)) * 100, 2)
standardNormal$nonCumulativePercent[6] <- round((pnorm(2) - pnorm(1)) * 100, 2)
standardNormal$nonCumulativePercent[7] <- round((pnorm(3) - pnorm(2)) * 100, 2)
standardNormal$nonCumulativePercent[8] <- round((pnorm(4) - pnorm(3)) * 100, 2)

standardNormal$nonCumulativePercentFigure <- paste(standardNormal$nonCumulativePercent, "%", sep = "")
standardNormal$midpointsZ <- c(-3.5,-2.5,-1.5,-0.5,0.5,1.5,2.5,3.5,NA)

standardNormal$nonCumulativePercent[9] <- NA
standardNormal$nonCumulativePercentFigure[9] <- NA

standardNormalPercentiles <- data.frame(percentileRank = c(NA,1,5,10,20,30,40,50,60,70,80,90,95,99,NA))
standardNormalPercentiles$percentileRankZ <- qnorm(standardNormalPercentiles$percentileRank / 100)
standardNormalPercentiles$percentileRankZ[1] <- -4
standardNormalPercentiles$percentileRankZ[nrow(standardNormalPercentiles)] <- 4

standardNormalStanines <- data.frame(stanine = 1:9)
standardNormalStanines$staninePercent <- c("4%","7%","12%","17%","20%","17%","12%","7%","4%")
standardNormalStanines$stanineCumulativeStartingPercent <- c(0,4,11,23,40,60,77,89,96)
standardNormalStanines$stanineCumulativeEndingPercent <- c(4,11,23,40,60,77,89,96,100)
standardNormalStanines$stanineCumulativeStartingPercentZ <- qnorm(standardNormalStanines$stanineCumulativeStartingPercent/100)
standardNormalStanines$stanineCumulativeEndingPercentZ <- qnorm(standardNormalStanines$stanineCumulativeEndingPercent/100)
standardNormalStanines[standardNormalStanines == -Inf] <- -4
standardNormalStanines[standardNormalStanines == Inf] <- 4
standardNormalStanines$midpointsZ <- rowMeans(standardNormalStanines[,c("stanineCumulativeStartingPercentZ","stanineCumulativeEndingPercentZ")])

par(xpd = NA,
    mgp = c(3, 0.5 , 0), # gap between axis label and axis
    mar = c(12, 8, 0, 0) + 0.1) # plot margins: default of the form c(bottom, left, top, right): c(5, 4, 4, 2) + 0.1
x <- seq(-4, 4, length = 10000)
y <- dnorm(x, mean = 0, sd = 1)

plot(x, y, type = "l", lwd = 8, axes = FALSE, xlab = NA, ylab = NA, col = "gray")

segments(x0 = -4, y0 = 0, y1 = 1)
segments(x0 = -3, y0 = 0, y1 = 1)
segments(x0 = -2, y0 = 0, y1 = 1)
segments(x0 = -1, y0 = 0, y1 = 1)
segments(x0 = 0, y0 = 0, y1 = 1)
segments(x0 = 1, y0 = 0, y1 = 1)
segments(x0 = 2, y0 = 0, y1 = 1)
segments(x0 = 3, y0 = 0, y1 = 1)
segments(x0 = 4, y0 = 0, y1 = 1)

axis(side = 1, at = standardNormal$zScore, pos = 0)
axis(side = 1, at = standardNormal$zScore, pos = -.06, labels = standardNormal$Tscore)
axis(side = 1, at = standardNormal$zScore, pos = -.12, labels = standardNormal$standardScore)
axis(side = 1, at = standardNormal$zScore, pos = -.18, labels = standardNormal$cumulativePercentFigure)
axis(side = 1, at = standardNormalPercentiles$percentileRankZ, pos = -.24, labels = standardNormalPercentiles$percentileRank, cex.axis = 0.5)

rect(xleft = standardNormalStanines$stanineCumulativeStartingPercentZ, ybottom = -.36, xright = standardNormalStanines$stanineCumulativeEndingPercentZ, ytop = -.31)
mtext(standardNormalStanines$stanine, side = 1, line = 9, at = standardNormalStanines$midpointsZ)

text(0.05, .02, labels = "Standard Deviations")
text(standardNormal$midpointsZ, .06, labels = standardNormal$nonCumulativePercentFigure)

mtext("Percent of cases \n under segment of \n the normal curve", side = 2, line = 0, at = .05, las = 1)
mtext("z scores", side = 2, line = 0, at = -.033, las = 1)
mtext("T scores", side = 2, line = 0, at = -.0934, las = 1)
mtext("Standard scores", side = 2, line = 0, at = -.1538, las = 1)
mtext("Cumulative percent", side = 2, line = 0, at = -.2142, las = 1)
mtext("Percentile ranks", side = 2, line = 0, at = -.2746, las = 1)
mtext("Stanines", side = 2, line = 0, at = -.335, las = 1)
```

#### Percentile Ranks {#percentileRanks}

A percentile rank reflects what percent of people a person scored higher than, in a given group (i.e., norm).\index{data!percentile rank}\index{norm}
Percentile ranks are frequently used for tests of intellectual/cognitive ability, academic achievement, academic aptitude, and grant funding.\index{data!percentile rank}\index{intellectual assessment}\index{achievement!testing}\index{aptitude!testing}
They seem like interval data, but they are not intervals because the conceptual spacing between the numbers is not equal.\index{data!percentile rank}\index{data!interval}
The difference in ability for two people who scored at the 99th and 98th percentile, respectively, is not the same as the difference in ability for two people who scored at the 49th and 50th percentile, respectively.\index{data!percentile rank}
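Percentile ranks indicate only a person's standing relative to a baseline (i.e., the norm); because the intervals are not equal, differences between percentile ranks should not be computed by subtraction.\index{data!percentile rank}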

Percentile ranks have unusual effects.\index{data!percentile rank}
There are lots of people in the middle of a distribution, so a very small difference in raw scores gets expanded out in percentiles.\index{data!percentile rank}
For instance, a raw score of 20 may have a percentile rank of 50, but a raw score of 24 may have a percentile rank of 68.\index{data!percentile rank}\index{data!raw}
However, a larger raw score change at the ends of the distribution may have a smaller percentile change.\index{data!percentile rank}\index{data!raw}
For example, a raw score of 120 may have a percentile rank of 97, whereas a raw score of 140 may have a percentile rank of 99.\index{data!percentile rank}\index{data!raw}
Thus, percentile ranks stretch out differences for some people but constrict differences for others.\index{data!percentile rank}
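
The stretching and constricting can be sketched with `pnorm()`, assuming a hypothetical norm in which scores are normally distributed with a mean of 100 and a standard deviation of 15.
The norm values and the helper function `percentileFromScore()` are assumptions made for illustration.

```{r}
# Hypothetical norm: normally distributed scores (mean = 100, SD = 15)
normMean <- 100
normSD <- 15

# Percentile rank implied by a score under the assumed normal norm
percentileFromScore <- function(score) {
  pnorm(score, mean = normMean, sd = normSD) * 100
}

# The same 5-point raw score difference, in the middle versus the tail
changeMiddle <- percentileFromScore(105) - percentileFromScore(100)
changeTail <- percentileFromScore(135) - percentileFromScore(130)

round(c(middle = changeMiddle, tail = changeTail), 1)
```

Under these assumptions, the same raw score difference corresponds to a much larger change in percentile rank near the middle of the distribution than in the upper tail.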

Here is an example of how to calculate percentile ranks using the `dplyr` package from the `tidyverse` [@R-tidyverse; @tidyverse2019]:\index{data!percentile rank}

```{r}
scores$percentileRank <- 
  percent_rank(scores$rawDataNoMissing) * 100
```
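
To make explicit what this calculation does, the base `R` sketch below follows the documented definition of `percent_rank()`: the minimum rank of each score, rescaled to range from 0 to 1.
The function `percentRankBase()` and the example scores are hypothetical and are shown only for illustration.

```{r}
# Hypothetical helper mirroring the definition of dplyr::percent_rank():
# (minimum rank - 1) / (number of non-missing scores - 1)
percentRankBase <- function(x) {
  minRank <- rank(x, ties.method = "min", na.last = "keep")
  (minRank - 1) / (sum(!is.na(x)) - 1)
}

# Example scores with a tie and a missing value
exampleScores <- c(10, 20, 20, 30, 40, NA)

examplePercentileRanks <- percentRankBase(exampleScores) * 100
examplePercentileRanks
```

Under this definition, the lowest score receives a percentile rank of 0, the highest receives 100, tied scores share the lower rank, and missing scores remain missing.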

A histogram of percentile rank scores is in Figure \@ref(fig:percentileRanks).\index{data!percentile rank}\index{histogram}

```{r percentileRanks, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Percentile Ranks."}
hist(scores$percentileRank,
     xlab = "Percentile Ranks",
     main = "Histogram of Percentile Ranks")
```

#### Deviation (Standardized) Scores {#standardizedScores}

Deviation or standardized scores are transformations of raw scores onto a standardized metric (i.e., a metric with a set mean and standard deviation) using some norm.\index{data!standardized}\index{norm}
The norm could be a comparison group, or it could be the sample itself.\index{data!standardized}\index{norm}
With deviation scores, you have similar challenges as with [percentile ranks](#percentileRanks), including which reference group to use, but there are additional challenges.\index{data!standardized}\index{data!percentile rank}
Deviation scores are more informative when the scores are normally distributed compared to when the scores are skewed.\index{data!standardized}\index{data!not normally distributed}
If scores are skewed, two *z* scores of the same magnitude on opposite sides of the mean can have different probabilities.\index{data!standardized}\index{data!\textit{z} score}

Many constructs we study in psychology are not normally distributed.\index{data!not normally distributed}
For example, the frequency of hallucinations among people would show a positively skewed distribution with a truncation at zero, representing a floor effect—i.e., most people do not show hallucinations.\index{data!not normally distributed}
For instance, consider a hypothetical distribution of hallucinations.
It might follow a folded distribution, as depicted in Figure \@ref(fig:foldedDistribution):\index{data!not normally distributed}\index{histogram}

```{r foldedDistribution, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Hallucinations (Raw Score)."}
hist(rbinom(100000, 300, .01),
     breaks = 8,
     xlab = "Hallucinations (Raw Score)",
     main = "Histogram of Hallucinations (Raw Score)")
```

Now consider the same distribution converted to a standardized (*z*) score, as depicted in Figure \@ref(fig:foldedDistributionZscore):\index{data!standardized}\index{data!\textit{z} score}\index{histogram}

```{r foldedDistributionZscore, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Hallucinations (z Score)."}
hist(scale(rbinom(100000, 300, .01)),
     breaks = 8,
     xlab = "Hallucinations (z Score)",
     main = "Histogram of Hallucinations (z Score)")
```

Thus, you can compute a deviation score, but it may not be meaningful if the data and underlying construct are not normally distributed.\index{data!standardized}\index{data!not normally distributed}
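
As a sketch of how skew distorts the meaning of *z* scores, the example below uses the same binomial distribution as the histograms of hallucinations and compares the probability of scoring at least two standard deviations below the mean with the probability of scoring at least two standard deviations above the mean.
The variable names are illustrative.

```{r}
# Binomial distribution used in the hallucination example
nTrials <- 300
probability <- .01

distributionMean <- nTrials * probability
distributionSD <- sqrt(nTrials * probability * (1 - probability))

# Raw scores located two standard deviations below and above the mean
scoreLow <- distributionMean - 2 * distributionSD
scoreHigh <- distributionMean + 2 * distributionSD

# Probability of scoring at or below the low score (z = -2)
probabilityLow <- pbinom(scoreLow, size = nTrials, prob = probability)

# Probability of scoring at or above the high score (z = +2);
# counts are whole numbers, so this is P(X > ceiling(scoreHigh) - 1)
probabilityHigh <- pbinom(
  ceiling(scoreHigh) - 1,
  size = nTrials,
  prob = probability,
  lower.tail = FALSE)

c(zMinus2 = probabilityLow, zPlus2 = probabilityHigh)
```

In a normal distribution, both probabilities would equal `pnorm(-2)`.
Here, a score two standard deviations below the mean is impossible because counts cannot be negative, whereas a score two standard deviations above the mean is not.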

##### *z* scores {#zScores}

The *z* score is the most common standardized score, and it can help put scores from different measures with different scales onto the same playing field.\index{data!standardized}\index{data!\textit{z} score}
*z* scores have a mean of zero and a standard deviation of one.\index{data!standardized}\index{data!\textit{z} score}
To get a *z* score that uses the sample as its own norm, subtract the mean from all scores and divide by the standard deviation.\index{data!standardized}\index{data!\textit{z} score}
Every *z* score represents how far that person's score is from the (normed) average, represented in standard deviation units.\index{data!standardized}\index{data!\textit{z} score}
With a standard normal curve, approximately 68% of scores fall within one standard deviation of the mean.\index{data!standardized}\index{data!\textit{z} score}
Approximately 95% of scores fall within two standard deviations of the mean.\index{data!standardized}\index{data!\textit{z} score}
Approximately 99.7% of scores fall within three standard deviations of the mean.\index{data!standardized}\index{data!\textit{z} score}

The area under a normal curve within one standard deviation of the mean is calculated below using the `pnorm()` function, which calculates the cumulative distribution function for a normal curve.\index{data!standardized}

```{r}
stdDeviations <- 1

pnorm(stdDeviations) - pnorm(stdDeviations * -1)
```

The area under a normal curve within one standard deviation of the mean is depicted in Figure \@ref(fig:zScoreDensity1SD).\index{data!standardized}

```{r zScoreDensity1SD, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Density of Standard Normal Distribution. The blue region represents the area within one standard deviation of the mean.", fig.scap = "Density of Standard Normal Distribution: One Standard Deviation of the Mean."}
x <- seq(-4, 4, length = 200)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l",
     xlab = "z Score",
     ylab = "Normal Density")

x <- seq(stdDeviations * -1, stdDeviations, length = 100)
y <- dnorm(x, mean = 0, sd = 1)
polygon(c(stdDeviations * -1, x, stdDeviations),
        c(0, y, 0),
        col = "blue")
```

The area under a normal curve within two standard deviations of the mean is calculated below:\index{data!standardized}

```{r}
stdDeviations <- 2

pnorm(stdDeviations) - pnorm(stdDeviations * -1)
```

The area under a normal curve within two standard deviations of the mean is depicted in Figure \@ref(fig:zScoreDensity2SD).\index{data!standardized}

```{r zScoreDensity2SD, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Density of Standard Normal Distribution. The blue region represents the area within two standard deviations of the mean.", fig.scap = "Density of Standard Normal Distribution: Two Standard Deviations of the Mean."}
x <- seq(-4, 4, length = 200)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l",
     xlab = "z Score",
     ylab = "Normal Density")

x <- seq(stdDeviations * -1, stdDeviations, length = 100)
y <- dnorm(x, mean = 0, sd = 1)
polygon(c(stdDeviations * -1, x, stdDeviations),
        c(0, y, 0),
        col = "blue")
```

The area under a normal curve within three standard deviations of the mean is calculated below:\index{data!standardized}

```{r}
stdDeviations <- 3

pnorm(stdDeviations) - pnorm(stdDeviations * -1)
```

The area under a normal curve within three standard deviations of the mean is depicted in Figure \@ref(fig:zScoreDensity3SD).\index{data!standardized}

```{r zScoreDensity3SD, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Density of Standard Normal Distribution. The blue region represents the area within three standard deviations of the mean.", fig.scap = "Density of Standard Normal Distribution: Three Standard Deviations of the Mean."}
x <- seq(-4, 4, length = 200)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l",
     xlab = "z Score",
     ylab = "Normal Density")

x <- seq(stdDeviations * -1, stdDeviations, length = 100)
y <- dnorm(x, mean = 0, sd = 1)
polygon(c(stdDeviations * -1, x, stdDeviations),
        c(0, y, 0),
        col = "blue")
```

Alternatively, if you want to determine the *z* score associated with a particular percentile in a normal distribution, you can use the `qnorm()` function.\index{data!standardized}\index{data!\textit{z} score}\index{data!percentile rank}

For instance, the *z* score associated with the 37th percentile is:\index{data!standardized}\index{data!percentile rank}

```{r}
qnorm(.37)
```
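
The `pnorm()` and `qnorm()` functions are inverses of each other, so a *z* score can be converted back to a percentile.
The short sketch below illustrates the round trip; the variable names are illustrative.

```{r}
# z score at the 37th percentile of a standard normal distribution
zAt37thPercentile <- qnorm(.37)

# pnorm() reverses the conversion: from z score back to percentile
percentileRecovered <- pnorm(zAt37thPercentile) * 100
percentileRecovered

# Percentiles for several z scores at once
round(pnorm(c(-2, -1, 0, 1, 2)) * 100, 1)
```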

*z* scores are calculated using Equation \@ref(eq:zScore):\index{data!standardized}\index{data!\textit{z} score}

\begin{equation}
z = \frac{x - \mu}{\sigma}
(\#eq:zScore)
\end{equation}

where $x$ is the observed score, $\mu$ is the mean observed score, and $\sigma$ is the standard deviation of observed scores.\index{data!\textit{z} score}

```{r}
scores$zScore <- scale(scores$rawData)

all.equal(
  as.vector(scores$zScore),
  (scores$rawData - mean(scores$rawData, na.rm = TRUE)) / 
    sd(scores$rawData, na.rm = TRUE))
```
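
The `scale()` function uses the sample as its own norm.
When the norm is instead an external comparison group, the same equation can be applied with the normative mean and standard deviation.
The sketch below assumes hypothetical normative values (a mean of 200 and a standard deviation of 30) and made-up raw scores.

```{r}
# Hypothetical normative values (assumed for illustration only)
normativeMean <- 200
normativeSD <- 30

# Example raw scores, including a missing value
exampleRawScores <- c(170, 200, 245, NA)

# z scores relative to the external norm rather than the sample
zScoresNormed <- (exampleRawScores - normativeMean) / normativeSD
zScoresNormed

# z scores relative to the sample itself, for comparison
zScoresSample <- as.vector(scale(exampleRawScores))
zScoresSample
```

The two sets of *z* scores differ because the sample mean and standard deviation differ from the assumed normative values.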

An example histogram of *z* scores is in Figure \@ref(fig:zScores).\index{data!standardized}\index{data!\textit{z} score}\index{histogram}

(ref:zScores) Histogram of *z* Scores.

```{r zScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "(ref:zScores)"}
hist(scores$zScore,
     xlab = "z Scores",
     main = "Histogram of z Scores")
```

##### *T* scores {#tScores}

*T* scores have a mean of 50 and a standard deviation of 10.\index{data!standardized}\index{data!\textit{T} score}
*T* scores are frequently used with personality and symptom measures, where clinical cutoffs are often set at 70 (i.e., two standard deviations above the mean).\index{data!standardized}\index{data!\textit{T} score}\index{personality assessment}
For the Minnesota Multiphasic Personality Inventory (MMPI), you would examine peaks (elevations $\ge$ 70) and absences ($\le$ 30).\index{data!\textit{T} score}\index{Minnesota Multiphasic Personality Inventory}

*T* scores are calculated using Equation \@ref(eq:TScore):\index{data!standardized}\index{data!\textit{T} score}

\begin{equation}
T = 50 + 10z
(\#eq:TScore)
\end{equation}

where $z$ is the *z* score.\index{data!\textit{z} score}

```{r}
scores$tScore <- 50 + 10*scale(scores$rawData)
```

An example histogram of *T* scores is in Figure \@ref(fig:tScores).\index{data!\textit{T} score}\index{histogram}

(ref:tScores) Histogram of *T* Scores.

```{r tScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "(ref:tScores)"}
hist(scores$tScore,
     xlab = "T Scores",
     main = "Histogram of T Scores")
```

##### Standard scores {#standardScores}

Standard scores have a mean of 100 and a standard deviation of 15.\index{data!standardized}\index{data!standard score}
Standard scores are frequently used for tests of intellectual ability, academic achievement, and cognitive ability.\index{data!standardized}\index{data!standard score}\index{intellectual assessment}\index{achievement!testing}
Intellectual disability is generally considered an I.Q. standard score less than 70 (two standard deviations below the mean), whereas giftedness is generally considered a score of 130 or higher (two standard deviations above the mean).\index{data!standardized}\index{data!standard score}\index{intellectual assessment}

Standard scores with a mean of 100 and standard deviation of 15 are calculated using Equation \@ref(eq:standardScore):\index{data!standardized}\index{data!standard score}

\begin{equation}
\text{standard score} = 100 + 15z
(\#eq:standardScore)
\end{equation}

where $z$ is the *z* score.\index{data!\textit{z} score}

```{r}
scores$standardScore <- 100 + 15*scale(scores$rawData)
```

An example histogram of standard scores is in Figure \@ref(fig:standardScores).\index{data!standardized}\index{data!standard score}\index{histogram}

```{r standardScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Standard Scores."}
hist(scores$standardScore,
     xlab = "Standard Scores",
     main = "Histogram of Standard Scores")
```

##### Scaled scores {#scaledScores}

Scaled scores are raw scores that have been converted to a standardized metric.\index{data!standardized}\index{data!scaled score}
The particular metric of the scaled score depends on the measure.\index{data!scaled score}
On tests of intellectual or cognitive ability, scaled scores commonly have a mean of 10 and a standard deviation of 3.\index{data!standardized}\index{intellectual assessment}\index{data!scaled score}

Using this metric, scaled scores can be calculated using Equation \@ref(eq:scaleScore):\index{data!standardized}\index{data!scaled score}

\begin{equation}
\text{scaled score} = 10 + 3z
(\#eq:scaleScore)
\end{equation}

where $z$ is the *z* score.\index{data!\textit{z} score}

```{r}
scores$scaledScore <- 10 + 3*scale(scores$rawData)
```

An example histogram of scaled scores is in Figure \@ref(fig:scaledScores).\index{data!standardized}\index{data!scaled score}\index{histogram}

```{r scaledScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Scaled Scores."}
hist(scores$scaledScore,
     xlab = "Scaled Scores",
     main = "Histogram of Scaled Scores")
```
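
*T* scores, standard scores, and scaled scores share the same form: a new mean plus a new standard deviation multiplied by the *z* score.
As a sketch, the hypothetical helper function `rescaleZ()` below (not part of any package) places a set of example *z* scores on each metric.

```{r}
# Hypothetical helper: place z scores on a metric with a chosen mean and SD
rescaleZ <- function(z, newMean, newSD) {
  newMean + newSD * z
}

# Example z scores (illustrative values only)
exampleZ <- c(-2, -1, 0, 1, 2)

scoreMetrics <- data.frame(
  zScore = exampleZ,
  tScore = rescaleZ(exampleZ, newMean = 50, newSD = 10),
  standardScore = rescaleZ(exampleZ, newMean = 100, newSD = 15),
  scaledScore = rescaleZ(exampleZ, newMean = 10, newSD = 3))

scoreMetrics
```

Because each metric is a linear transformation of the *z* score, the metrics convey the same information and differ only in their units.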

##### Stanine scores {#stanineScores}

Stanine scores (short for STANdard Nine) have a mean of 5 and a standard deviation of 2.\index{data!standardized}\index{data!stanine score}
The scores range from 1 to 9.\index{data!standardized}\index{data!stanine score}
Stanine scores are calculated using the bracketed proportions in Table \@ref(tab:stanineCalculation).\index{data!standardized}\index{data!stanine score}
The lowest 4% receive a stanine score of 1, the next 7% receive a stanine score of 2, etc.\index{data!standardized}\index{data!stanine score}

```{r, include = FALSE}
stanineConversionTable <- data.frame(stanine = 1:9)
stanineConversionTable$bracketedPercent <- c(4,7,12,17,20,17,12,7,4)
stanineConversionTable$bracketedPercentTable <- paste(stanineConversionTable$bracketedPercent, "%", sep = "")
stanineConversionTable$bracketedPercent <- NULL
stanineConversionTable$cumulativePercent <- c(4,11,23,40,60,77,89,96,100)
stanineConversionTable$cumulativePercentTable <- paste(stanineConversionTable$cumulativePercent, "%", sep = "")
stanineConversionTable$cumulativePercent <- NULL
stanineConversionTable$zScore <- c("below −1.75","−1.75 to −1.25","−1.25 to −0.75","−0.75 to −0.25","−0.25 to +0.25","+0.25 to +0.75","+0.75 to +1.25","+1.25 to +1.75","above +1.75")
stanineConversionTable$standardScore <- c("below 74","74 to 81","81 to 89","89 to 96","96 to 104","104 to 111","111 to 119","119 to 126","above 126")

names(stanineConversionTable) <- c("Stanine","Bracketed Percent","Cumulative Percent","z Score","Standard Score")
```

```{r stanineCalculation, echo = FALSE}
kable(stanineConversionTable,
      caption = "Calculating Stanine Scores",
      booktabs = TRUE)
```

```{r}
scores$stanineScore <- NA
scores$stanineScore[which(scores$percentileRank <= 4)] <- 1
scores$stanineScore[
  which(scores$percentileRank > 4 & scores$percentileRank <= 11)] <- 2
scores$stanineScore[
  which(scores$percentileRank > 11 & scores$percentileRank <= 23)] <- 3
scores$stanineScore[
  which(scores$percentileRank > 23 & scores$percentileRank <= 40)] <- 4
scores$stanineScore[
  which(scores$percentileRank > 40 & scores$percentileRank <= 60)] <- 5
scores$stanineScore[
  which(scores$percentileRank > 60 & scores$percentileRank <= 77)] <- 6
scores$stanineScore[
  which(scores$percentileRank > 77 & scores$percentileRank <= 89)] <- 7
scores$stanineScore[
  which(scores$percentileRank > 89 & scores$percentileRank <= 96)] <- 8
scores$stanineScore[which(scores$percentileRank > 96)] <- 9
```
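
A more compact alternative is the base `R` function `cut()`, which assigns each value to a bin defined by a vector of break points.
The sketch below applies the same cumulative percent boundaries to a small set of made-up percentile ranks; the variable names are illustrative.

```{r}
# Upper cumulative percent bound for stanines 1 through 8
stanineUpperBounds <- c(4, 11, 23, 40, 60, 77, 89, 96)

# Example percentile ranks, including boundary values and a missing value
examplePercentiles <- c(0, 4, 4.1, 11, 50, 96, 96.1, 100, NA)

# Intervals are closed on the right, e.g., (4, 11], matching "> 4 & <= 11";
# labels = FALSE returns the bin number, which is the stanine score
exampleStanines <- cut(
  examplePercentiles,
  breaks = c(-Inf, stanineUpperBounds, Inf),
  labels = FALSE,
  right = TRUE)

exampleStanines
```

Missing percentile ranks yield missing stanine scores, as in the step-by-step approach.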

A histogram of stanine scores is in Figure \@ref(fig:stanineScores).\index{data!standardized}\index{data!stanine score}\index{histogram}

```{r stanineScores, out.width = "100%", fig.align = "center", class.source = "fold-hide", fig.cap = "Histogram of Stanine Scores."}
hist(scores$stanineScore,
     xlab = "Stanine Scores",
     main = "Histogram of Stanine Scores",
     breaks = seq(from = 0.5, to = 9.5, by = 1),
     xlim = c(0,10),
     xaxp = c(1,9,8))
```

## Conclusion {#conclusion-scoresScales}

It is important to consider which [type of data](#dataTypes) you have because the data type restricts which options are available for analysis.\index{data!type}
It is also important to consider whether the data were [transformed](#scoreTransformation) because score transformations are not neutral—they can impact the results.\index{data!transformation}

## Suggested Readings {#readings-scoresScales}

@Childs2021: https://dzchilds.github.io/stats-for-bio/data-transformations.html (archived at https://perma.cc/3YEB-QW2V)

## Exercises {#exercises-scoresScales}