An Interesting & Faster Method To Drastically Improve Data Science And Machine Learning Efficiency

By Kimberly Cook | Feb 18, 2019

For every business, data science is the foundation of a successful transformation into an AI-powered enterprise. Data analytics and machine learning drive smarter business decisions, greater operational efficiency, and higher levels of customer satisfaction.

Streamlining the data science workflow is essential to ensuring that organizations can mine oceans of data for valuable insights and predictions that can power the business. Unfortunately, today's enterprise machine learning is built on a legacy architecture that was never designed for the unique demands of ingesting, preparing, and ultimately training ML algorithms with speed and efficiency, attributes that are native to the world of GPUs.
GPU-accelerated data science enables a new, faster method to drastically improve the speed and efficiency of data science initiatives:

  • Oak Ridge National Laboratory saw a 215X speedup using RAPIDS with XGBoost
  • A large retailer expects to reduce the error rate of its inventory forecasting models by 4%, which could net hundreds of millions to nearly a billion dollars in savings
  • A streaming media company saw a 24X speedup in ML training, netting an estimated $1.5M in savings on legacy infrastructure, now replaced with a single GPU node

Data science talent is hard to find, and perhaps harder to retain if practitioners aren't adequately equipped to do their best work with the best tools available. If your valued innovators spend an appreciable part of their day waiting on CSV ingestion, data analysis, data preparation, or model training, they're likely bored or taking too many coffee breaks while they wait, neither of which is great from a talent retention perspective.

With the advent of GPU-accelerated data science, talent no longer needs to feel restricted by their platform. Previously, a data science workflow could involve wrangling gigabyte-plus CSVs and training models over the course of hours or days, making machine learning development slow and cumbersome. GPU computing is now helping the enterprise get easier insights and faster predictions from machine learning models.
At GTC Europe, NVIDIA launched a new data science platform for the enterprise, built on RAPIDS, our solution for GPU-accelerating the end-to-end machine learning workflow. This platform enables an accelerated data science pipeline that dramatically reduces model development time, yields more accurate models, boosts data scientists' productivity, and lowers the total cost of ownership (TCO) of machine learning by eliminating the infrastructure sprawl associated with enterprise scale.

Built on NVIDIA CUDA, the platform combines our GPU-optimized suite of libraries for supercharging data preparation and ML training with a growing ecosystem of GPU-accelerated visualization solutions. Now data scientists can benefit from an end-to-end workflow that's faster, more productive, and yields models with the best accuracy possible, at the lowest cost.
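To make the "accelerated, end-to-end workflow" concrete, here is a minimal sketch of the ingest-prepare-aggregate pattern the article describes. It is written with pandas so it runs on any CPU; a key design point of RAPIDS is that its cuDF library mirrors much of the pandas API, so on a machine with RAPIDS installed the same workflow can often be GPU-accelerated by swapping the import for `import cudf as pd`. The data and column names below are hypothetical illustrations, not NVIDIA's code.

```python
import io

# On a RAPIDS-equipped GPU machine, `import cudf as pd` is a near drop-in
# swap for many common operations; here we use CPU pandas so the sketch runs anywhere.
import pandas as pd

# Hypothetical inventory data standing in for a large CSV ingest.
csv_data = io.StringIO(
    "store,units_sold\n"
    "A,10\n"
    "A,20\n"
    "B,5\n"
)

# Ingest: read_csv is one of the steps RAPIDS accelerates on the GPU.
df = pd.read_csv(csv_data)

# Prepare/aggregate: average units sold per store, a typical feature-prep step
# before handing data off to a trainer such as XGBoost.
avg_units = df.groupby("store")["units_sold"].mean()
print(avg_units)
```

The same dataframe could then be passed to a GPU-enabled trainer (e.g., XGBoost, which the Oak Ridge result above used with RAPIDS) without leaving GPU memory, which is where much of the reported speedup comes from.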

Source: HOB