Is Data Science Dead? Long Live Business Science
90 days ago
What Are The Bad Things About Machine Learning?
- Believing we are at the center of the universe: Copernicus and Galileo dashed these hopes at great personal risk, since earlier such "blasphemers" had been burned alive at the stake. By current estimates, we are one tiny planet revolving around one star among a few hundred billion in our Milky Way galaxy, which is itself one of hundreds of billions, perhaps trillions, of galaxies. Much as we might care about the next US Presidential Election, or getting the maximum number of hits on Instagram, the universe does not care. We are absolutely inconsequential in the grand scheme of things.
- Assuming that humans are "divine" creatures, distinct from other animals on the planet: Darwin dashed these hopes. The Kansas School Board and similar bodies in other US states still try, 150 years after Darwin, to throw him out of the school curriculum in favor of "intelligent creation" (whatever that means), yet it is established beyond any reasonable doubt that we share common ancestors with the apes, and substantial genetic similarity with millions of other species on the planet. That simple but astonishing fact, strangely enough, may be the single biggest hope for extending human life on this planet, since breakthrough medicines for treating our most basic ailments can be developed by recognizing and exploiting the genetic similarities among species.
- Freud claimed the mantle for this last advance: dispelling the vain hope that humans are rational beings. Freud thought his life's work decisively disproved the illusion that humans act rationally. Much of the work in AI, economics, psychology, and other areas still clings to this illusion, despite substantial evidence against it. Behavioral economics is finally realizing that it might be better to study how humans actually make decisions, rather than assuming they act rationally. AI has yet to make this intellectual leap, and work in reinforcement learning, to take one area, still assumes that humans act to maximize expected utility and therefore behave rationally.
- Believing that every problem in AI, science, business, and entertainment is best approached through an ML solution: this is a "paradigm" best exemplified by deep learning. Whether it is computer vision, speech recognition, NLP, digital marketing, robotics, social interactions, financial investing, you name it, the DL paradigm assumes that a network can be designed to fit some unknown function, given enough data. With every iteration, it seems that the "promised land" is just around the corner, and with a little bit more math, and a lot of GPU power, the magical solution will appear.
- Evaluating a desired machine learning model based on its fit to some training or test data: another common and hard-to-avoid misapprehension among ML researchers is that the method that produces the lowest error on training data (or on test data drawn from the same distribution) must be the best approach. So, hypothetically, if a deep learning (or whatever) approach performs, say, 3% better on MNIST than some other approach, e.g., a random forest, then the lower-error approach is taken to be intrinsically better. In other words, fit to training or test data trumps all other criteria.
- Assuming that for every given problem, there has to be an ML solution, regardless of what computational learning theory has shown: more than five decades ago, Gold showed that broad classes of grammars, including the context-free grammars, cannot be identified in the limit from positive examples alone, even given arbitrary computation and data. While linguists took Gold's work to heart and sought to develop models that don't assume grammars can be induced from data, ML researchers largely ignored the results from their own community, and have continued to search for the "magic solution", e.g., LSTMs and other sequence models.
- Wanting a completely domain-independent solution to the problem of how the brain works, according to which any approach that is developed in one domain (e.g., convolutional networks for vision) must automatically also be the best approach in every other domain (deep reinforcement learning, speech recognition, robotics, and so on). The fact that decades of neuroscience research have produced significant evidence for the modularity of the brain has done little to dampen the hope that there is one magical architecture, and one magical solution, that will produce the best result in each and every domain.
- But a theory is not like an airline or bus timetable. We are not interested simply in the accuracy of its predictions. A theory also serves as a base for thinking. It helps us to understand what is going on by enabling us to organize our thoughts. Faced with a choice between a theory which predicts well but gives us little insight into how the system works and one which gives us this insight but predicts badly, I would choose the latter, and I am inclined to think that most economists would do the same.
- If you torture the data enough, nature will always confess.
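
The last point, and the earlier one about judging a model purely by its fit, can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions, not anything taken from the discussion above: the function names `pearson` and `torture`, the sample sizes, and the random seed are all illustrative choices. The target is pure noise, and so are the thousand candidate "predictors". The code searches for the candidate that correlates best with the target, then measures how strongly fresh noise correlates with fresh noise on average.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def torture(n_samples=20, n_features=1000, n_fresh=200, seed=0):
    """Search pure noise for a 'predictor' of a pure-noise target.

    All parameters are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)

    def noise():
        return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    # A target with no structure at all.
    target = noise()

    # "Torture the data": try many unrelated features and keep the best fit.
    best_in_sample = max(abs(pearson(noise(), target)) for _ in range(n_features))

    # Baseline: the average |correlation| between fresh, unselected noise pairs.
    typical_fresh = statistics.fmean(
        abs(pearson(noise(), noise())) for _ in range(n_fresh)
    )

    return {"best_in_sample": best_in_sample, "typical_fresh": typical_fresh}


if __name__ == "__main__":
    result = torture()
    print(f"best |r| found by searching: {result['best_in_sample']:.2f}")
    print(f"typical |r| on fresh noise:  {result['typical_fresh']:.2f}")
```

Under these assumptions the selected feature looks like a strong predictor even though nothing real is being measured. The search procedure itself, not nature, produces the "confession", which is why a good fit on the data at hand is weak evidence by itself.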