
Standard Scores in R

Updated: 2017-07-05 16:02:41
Statistical Analysis with R Essentials For Dummies
The R function for calculating standard scores is called scale(). Supply a vector of scores, and scale() returns the z-scores as a one-column matrix along with, helpfully, the mean and the standard deviation, which it attaches as attributes.
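As a quick check on what scale() computes, the sketch below builds z-scores by hand from the sample mean and the sample standard deviation and compares them with the result of scale(). The vector scores is made-up illustration data, not part of Cars93.

```r
# Hypothetical scores, used only for illustration
scores <- c(50, 60, 70, 80, 90)

# z-score by hand: distance from the mean in units of the sample SD
z.manual <- (scores - mean(scores)) / sd(scores)

# scale() returns a one-column matrix; as.vector() drops the matrix
# shape and the attributes so the two results can be compared
z.scale <- as.vector(scale(scores))

# Compare the two sets of z-scores within numerical tolerance
all.equal(z.manual, z.scale)
```

Wrapping the call in as.vector() is a convenient habit whenever you want plain z-scores to store in a data frame column.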

To show scale() in action, isolate a subset of the Cars93 data frame. (It's in the MASS package. On the Packages tab, check the box next to MASS if it's unchecked.)

Specifically, create a vector of the horsepowers of 8-cylinder cars from the USA:

> Horsepower.USA.Eight <- Cars93$Horsepower[Cars93$Origin == "USA" & Cars93$Cylinders == 8]

> Horsepower.USA.Eight
[1] 200 295 170 300 190 210
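If you prefer code to the Packages tab, the following sketch loads MASS with library() and builds the same vector with with(), which lets you refer to columns without repeating Cars93$. It assumes the MASS package is installed, as it is in a standard R installation.

```r
library(MASS)  # makes the Cars93 data frame available

# with() evaluates the expression inside Cars93, so Origin and Cylinders
# resolve to columns of the data frame
Horsepower.USA.Eight <- with(Cars93, Horsepower[Origin == "USA" & Cylinders == 8])

Horsepower.USA.Eight
```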

And now for the z-scores:

> scale(Horsepower.USA.Eight)
           [,1]
[1,] -0.4925263
[2,]  1.2089283
[3,] -1.0298278
[4,]  1.2984785
[5,] -0.6716268
[6,] -0.3134259
attr(,"scaled:center")
[1] 227.5
attr(,"scaled:scale")
[1] 55.83458

That last value is s, the sample standard deviation, not σ, the population standard deviation. If you have to base your z-scores on σ, divide each element in the vector by the square root of (N-1)/N:

> N <- length(Horsepower.USA.Eight)
> scale(Horsepower.USA.Eight)/sqrt((N-1)/N)
           [,1]
[1,] -0.5395356
[2,]  1.3243146
[3,] -1.1281198
[4,]  1.4224120
[5,] -0.7357303
[6,] -0.3433408
attr(,"scaled:center")
[1] 227.5
attr(,"scaled:scale")
[1] 55.83458

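Notice that the scaled:scale attribute still shows s. Dividing the result rescales the z-scores but leaves the attributes that scale() attached unchanged.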
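One way to have the output report σ rather than s is to compute σ yourself and hand it to scale() through its scale argument. The sketch below does that with the six horsepower values typed in directly so that it runs on its own; the names hp, sigma, and z.pop are illustrative.

```r
# The six horsepower values shown above, entered directly
hp <- c(200, 295, 170, 300, 190, 210)
N <- length(hp)

# Population standard deviation: divide the sum of squares by N, not N - 1
sigma <- sqrt(sum((hp - mean(hp))^2) / N)

# Supplying a number for scale= makes scale() divide by that value
z.pop <- scale(hp, center = TRUE, scale = sigma)

# The attributes now record the values actually used
attr(z.pop, "scaled:center")
attr(z.pop, "scaled:scale")

# Compare with the divide-by-sqrt((N-1)/N) approach
all.equal(as.vector(z.pop), as.vector(scale(hp)) / sqrt((N - 1) / N))
```

Both routes give the same z-scores. The difference is only in the bookkeeping: here the scaled:scale attribute holds σ, so anyone reading the output later sees the divisor that was actually applied.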

About This Article

This article is from the book: Statistical Analysis with R Essentials For Dummies

About the book author:

Joseph Schmuller, PhD, is a cognitive scientist and statistical analyst. He creates online learning tools and writes books on the technology of data science. His books include R All-in-One For Dummies and R Projects For Dummies.