Algorithmic Data Structure

Updated: 2017-07-17 16:51:56
Structure is an essential element in making algorithms work. Before you can work with data, you must understand its content. A search algorithm works only when you understand the dataset well enough to know what to search for.

Looking for words when the dataset contains numbers is an impossible task that always results in errors. Yet, search errors due to a lack of understanding of dataset content are a common occurrence even with the best search engines.

Humans make assumptions about dataset content that cause algorithms to fail. Consequently, the better you can see and understand the content through structured formatting, the easier it becomes to perform algorithm-based tasks successfully.

However, even looking at the content is often error-prone, because humans and computers read the same data differently. For example, if you attempt to search for a number formatted as a string when the dataset contains the numbers formatted as integers, the search will fail.
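As a minimal sketch of this mismatch in Python (the list `dataset` and the helper `find_value` are hypothetical names chosen for illustration), searching an integer list for a string finds nothing until the search term is converted to the dataset's type:

```python
# Hypothetical dataset that stores its numbers as integers.
dataset = [3, 1, 4, 1, 5, 9]


def find_value(data, term):
    """Return the index of the first element equal to term, or -1 if absent."""
    for index, value in enumerate(data):
        if value == term:
            return index
    return -1


# The string "1" never compares equal to the integer 1, so the search fails.
string_result = find_value(dataset, "1")

# Converting the search term to the dataset's type lets the search succeed.
integer_result = find_value(dataset, int("1"))

print(string_result, integer_result)
```

The search logic is identical in both calls; only the type of the search term changes the outcome.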

Computers don't automatically translate between strings and integers as humans do. In fact, computers see everything as numbers, and strings are only an interpretation imposed on the numbers by a programmer. Therefore, when searching for "1" (the string), the computer sees it as a request for the number 49 when using ASCII characters. To find the numeric value 1, you must search for a 1 as an integer value.
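The difference between the character and the number can be inspected directly. The following Python sketch (variable names are illustrative assumptions) uses the built-in `ord()` and `str.encode()` to show the value stored for the string "1" and compares it with the integer 1:

```python
# ord() returns the code point behind a character; for ASCII "1" it is 49.
code_of_string_one = ord("1")

# Encoding the string exposes the byte value that is actually stored.
stored_bytes = list("1".encode("ascii"))

# The integer 1 and the string "1" are different values, so they are not equal.
same_value = (1 == "1")

# An explicit conversion is what bridges the two representations.
converted = int("1")

print(code_of_string_one, stored_bytes, same_value, converted)
```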

Structure also enables you to discover nuanced data details. For example, a telephone number can appear in the form (555)555-1212. If you perform a search or other algorithm task using the form 1(555)555-1212, the search might fail because of the addition of a 1 at the beginning of the search term. These sorts of issues cause significant problems because most people see the two forms as equal, but the computer doesn't. The computer sees two completely different forms and even sees them as being two different lengths. Trying to impose a single form on the people who enter data rarely works and generally results in frustration that makes the algorithm even harder to use, so imposing structure through data manipulation becomes even more important.
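One way to impose structure through data manipulation is to normalize both the stored value and the search term before comparing them. The sketch below is only an illustration: the function name `normalize_phone` and the rule it applies (ten-digit numbers with an optional leading country code of 1) are assumptions, not a general-purpose phone number parser.

```python
import re


def normalize_phone(raw, country_code="1"):
    """Reduce a phone number to bare digits without a leading country code.

    Assumes ten-digit national numbers and a one-digit country code;
    both are illustrative assumptions.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith(country_code):
        digits = digits[len(country_code):]  # strip the country code
    return digits


stored = "(555)555-1212"
search_term = "1(555)555-1212"

# Compared as raw strings, the two forms differ in content and in length.
raw_match = (stored == search_term)
lengths = (len(stored), len(search_term))

# After normalization, both forms reduce to the same ten digits.
normalized_match = (normalize_phone(stored) == normalize_phone(search_term))

print(raw_match, lengths, normalized_match)
```

Normalizing at comparison time leaves people free to type either form while the algorithm always sees a single structure.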

About This Article

About the book author:

John Paul Mueller is a freelance author and technical editor. He has writing in his blood, having produced 100 books and more than 600 articles to date. The topics range from networking to home security and from database management to heads-down programming. John has provided technical services to both Data Based Advisor and Coast Compute magazines.

Luca Massaron is a data scientist specialized in organizing and interpreting big data and transforming it into smart data by means of the simplest and most effective data mining and machine learning techniques. Because of his job as a quantitative marketing consultant and marketing researcher, he has been involved in quantitative data since 2000 with different clients and in various industries, and is one of the top 10 Kaggle data scientists.