What Are the Key Properties of a Dataset?

Updated: 2016-03-26 07:28:11
Statistics for Big Data For Dummies

Before performing any statistical analysis, you need to understand the nature of the data being analyzed. You can use exploratory data analysis (EDA) to identify the properties of a dataset and determine the most appropriate statistical methods to apply to it. EDA techniques let you investigate several types of properties, including the following:

  • The center of the data

  • The spread among the members of the data

  • The skewness of the data

  • The probability distribution the data follows

  • The correlation among the variables in the dataset

  • Whether or not the parameters of the data are constant over time

  • The presence of outliers in the data
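Several of the properties listed above can be estimated in a few lines of code. The following is a minimal Python sketch, not the book's own method: the function name `describe`, the sample values, the moment-based skewness formula, and the 1.5 × IQR rule for flagging outliers are all illustrative assumptions.

```python
import statistics

def describe(values):
    """Summarize center, spread, skewness, and outliers for one numeric variable.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    stdev = statistics.pstdev(values)  # population standard deviation

    # Quartiles (default "exclusive" method) and the interquartile range
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1

    # Moment-based skewness: positive means a longer right tail
    skew = sum((x - mean) ** 3 for x in values) / (n * stdev ** 3) if stdev else 0.0

    # Assumed convention: flag points beyond 1.5 * IQR from the quartiles
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in values if x < low or x > high]

    return {
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "iqr": iqr,
        "skewness": skew,
        "outliers": outliers,
    }

# Made-up sample with one unusually large value
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 30]
summary = describe(sample)
print(summary)
```

In this sample the single large value pulls the mean above the median and produces positive skewness, which is the kind of signal EDA is meant to surface before a method is chosen.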

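Another key question EDA answers is "Does the data conform to our assumptions?" Identifying the properties of a dataset is very important, because many statistical procedures are sensitive to the assumptions you make about the data. For example, a procedure that assumes normally distributed data can give misleading results when the data is strongly skewed.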
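One rough way to ask whether data conforms to an assumption is to compare it with what that assumption predicts. The sketch below assumes the assumption in question is normality and uses the empirical rule (roughly 68%, 95%, and 99.7% of normal data fall within one, two, and three standard deviations of the mean). The function name `empirical_rule_check` and the simulated samples are hypothetical, and this informal check is no substitute for a formal test.

```python
import random
import statistics

def empirical_rule_check(values):
    """Return the share of values within 1, 2, and 3 standard deviations of the mean.

    Hypothetical helper for illustration only.
    """
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    n = len(values)
    within = {}
    for k in (1, 2, 3):
        count = sum(1 for x in values if abs(x - mean) <= k * stdev)
        within[k] = count / n
    return within

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated data: one roughly normal sample, one strongly right-skewed sample
normal_like = [rng.gauss(50, 10) for _ in range(10_000)]
skewed = [rng.expovariate(1.0) for _ in range(10_000)]

normal_shares = empirical_rule_check(normal_like)
skewed_shares = empirical_rule_check(skewed)
print("normal-like:", normal_shares)
print("skewed:     ", skewed_shares)
```

Shares that sit far from the normal benchmarks, as they do for the skewed sample, suggest that methods relying on normality should be applied with caution.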

About This Article

This article is from the book: Statistics for Big Data For Dummies

About the book authors:

Alan Anderson, PhD, teaches finance, economics, statistics, and math at Fordham and Fairfield universities as well as at Manhattanville and Purchase colleges. Outside of the academic environment, he has many years of experience working as an economist, risk manager, and fixed income analyst. Alan received his PhD in economics from Fordham University and an MS in financial engineering from Polytechnic University.

David Semmelroth has two decades of experience translating customer data into actionable insights across the financial services, travel, and entertainment industries. David has consulted for Cedar Fair, Wachovia, National City, and TD Bank.