Nand Kishor Contributor

Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched the Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes, researches, and shares knowledge about Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, and related topics.

How to Create a Data Strategy for Machine Learning?

By Nand Kishor | Jun 3, 2017

Summary

Machine-learning-powered AI (MLpAI) can help deliver systems with more automation and less human intervention, but success requires a data strategy that can handle the complexity of real-world data. This research guides technical professionals involved in MLpAI in developing a data strategy to support successful deployments.

Table of Contents

Problem Statement

Introducing MLpAI and Its Limitations

The Gartner Approach

The Guidance Framework

Data Strategy for ML Process Framework

Prework: Building a Rationalization Framework for MLpAI

Defining the End Objective

Defining the Means Objectives

Providing Assessment and Governance to Support the Data Strategy

Defining Influencers Critical to the Success of the Data Strategy

Step 1: Build Problem or Task Taxonomy

Step 2: Design Data Science Pipeline

2.1 Constructing Batch Data Science Pipelines

2.2 Constructing Online Data Science Pipelines

Step 3: Enable Data Science Workflows

3.1 Enabling Supervised Learning Workflows

3.2 Enabling Unsupervised Learning Workflows

Step 4: Create Data Science Stages

4.1 Critical Stages of Preprocessing

4.2 Supporting Computationally Intensive Training Stages

Step 5: Integration

Step 6: Refine With Storage

6.1 Using Memory

6.2 Using Distributed File Systems

6.3 Using Distributed Data Stores (Persistent Data Store)

6.4 Using Relational Databases

Step 7: Operationalization and Maintenance

7.1 Compute-Intensive vs. Data-Intensive Components in Workflows

7.2 Securing Data Science Pipelines

Follow-Up

Introducing DevOps to MLpAI and Vice Versa

Risks and Pitfalls

Risk No. 1: Building DS Pipelines Can Be Especially Challenging When Dealing With Big Data Without the Right Tools

Risk No. 2: Poor Data Quality Will Significantly Impact Performance and Accuracy

Risk No. 3: Techniques for Securing DS Pipelines Are Still in Their Infancy

Pitfall: Bounded Rationality Exists Even Within MLpAI Applications

Source: Gartner