AI has tried different approaches along the way, moving from simple algorithms to symbolic reasoning based on logic and then to expert systems. In recent years, the field turned to neural networks and, in their most mature form, deep learning. As this methodological shift happened, data went from being the information processed by predetermined algorithms to being what molds the algorithm into something useful for the task. Data turned from just the raw material that fueled the solution into the artisan of the solution itself, as shown here.
Thus, a photo of your kittens has become increasingly useful not just because of its affective value (it depicts your cute little cats) but because it could become part of the learning process of an AI that is discovering more general concepts, such as what characteristics denote a cat or what defines cute.
On a larger scale, a company like Google feeds its algorithms with freely available data, such as the content of websites or the text of publicly available documents and books. Google's spider software crawls the web, jumping from website to website and retrieving web pages along with their text and images. Even though Google gives part of that data back to users as search results, it also extracts other kinds of information from the data using its AI algorithms, which learn from it how to achieve other objectives.
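To make the idea of a spider jumping from page to page more concrete, here is a minimal sketch of a crawl loop. It is not how Google's crawler is built: the tiny in-memory "web" (TOY_WEB), the example addresses, and the function name crawl are all hypothetical, chosen so that the example runs without touching the network. The sketch only shows the general idea of following links and collecting each page's text once.

```python
from collections import deque

# Hypothetical in-memory "web": each address maps to (page text, outgoing links).
TOY_WEB = {
    "a.example": ("Cats are cute", ["b.example", "c.example"]),
    "b.example": ("Kittens play", ["a.example", "d.example"]),
    "c.example": ("Dogs bark", []),
    "d.example": ("Cats sleep a lot", ["c.example"]),
}


def crawl(start, web, max_pages=100):
    """Breadth-first crawl: follow links from page to page, visiting each once."""
    seen = {start}          # addresses already queued, to avoid revisiting
    queue = deque([start])  # addresses waiting to be fetched
    pages = {}              # address -> retrieved text
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to retrieve
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


if __name__ == "__main__":
    for url, text in crawl("a.example", TOY_WEB).items():
        print(url, "->", text)
```

The collected text is the kind of raw material that a learning algorithm could later process; a real crawler would also have to fetch pages over the network, parse their links, and respect each site's crawling rules.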
Algorithms that process words can help Google AI systems understand and anticipate your needs even when you express them not as a set of keywords but in plain, unclear natural language, the language we speak every day (and yes, everyday language is often unclear). If you pose questions, not just chains of keywords, to the Google search engine, you’ll notice that it tends to answer correctly. Since 2013, with the introduction of the Hummingbird update, Google has been better able to understand synonyms and concepts, something that goes beyond the initial data that it acquired, and this is the result of an AI process. An even more advanced algorithm exists in Google, named RankBrain, which learns directly from millions of queries every day and can answer ambiguous or unclear search queries, even those expressed in slang or colloquial terms or simply riddled with errors. RankBrain doesn’t handle all queries, but it learns from data how to answer queries better. It already handles 15 percent of the engine’s queries, and in the future, this percentage could become 100 percent.