30 Data Science Punchlines All Data Scientists Should Use

By Kimberly Cook | Dec 25, 2018

A holiday reading list condensed into 30 quotes

For those who like brain food on vacation, here's a handy index of all my articles from 2018 boiled down to 30 (occasionally cheeky) punchlines to help you avoid/cause awkward silences at family events and holiday parties.

Sections: Data Science and Analytics, ML/AI Concepts, How Not To Fail At ML/AI, Data Science Leadership, Technology, Statistics.

Bonus: Videos, podcasts, foreign language translations for your non-English-speaking friends and family to enjoy, and an end-to-end deep learning tutorial for the Pythonistas among you.

Data Science and Analytics
What on earth is data science? A quick tour of data science, data engineering, statistics, analytics, ML, and AI.
Data science is the discipline of making data useful.

What Great Data Analysts Do, and Why Every Organization Needs Them. Good analysts are a prerequisite for effectiveness in your data endeavors. It's dangerous to have them quit on you, but that's exactly what they'll do if you under-appreciate them.

Each of the three data science disciplines has its own excellence. Statisticians bring rigor, ML engineers bring performance, and analysts bring speed.
Secret Paragraphs from HBR's Analytics. A collection of musings omitted from the article above. Let's talk about hybrid roles, the nature of research, Bat Signals, data charlatans, and awesome analysts!

Buyer beware: there are many data charlatans out there posing as data scientists. There's no magic that makes certainty out of uncertainty.
Top 10 roles in AI and data science. A guide to the job titles, in hiring order.

If a researcher is your first hire, you probably won't have the right environment to make good use of them.

ML/AI Concepts
The simplest explanation of machine learning you'll ever read. Machine learning is a thing-labeler where you explain your task with examples instead of instructions.

Machine learning is a new programming paradigm, a new way of communicating your wishes to a computer. It's exciting because it allows you to automate the ineffable.
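
To make "examples instead of instructions" concrete, here is a minimal sketch. The fruit data, the function names, and the choice of a nearest-neighbor rule are all assumptions made for illustration; they do not come from the original articles. The first function is labeled by a hand-written rule, the second by copying the label of the most similar example.

```python
# Hypothetical toy data: (weight in grams, diameter in cm) -> label.
examples = [
    ((150.0, 7.0), "apple"),
    ((170.0, 7.5), "apple"),
    ((120.0, 6.0), "apple"),
    ((1200.0, 20.0), "melon"),
    ((1500.0, 22.0), "melon"),
]


def label_by_instructions(weight, diameter):
    """Traditional programming: a human spells out the rule explicitly."""
    return "melon" if weight > 500 else "apple"


def label_by_examples(item, examples):
    """Thing-labeling from examples: copy the label of the closest example."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, nearest_label = min(examples, key=lambda ex: squared_distance(item, ex[0]))
    return nearest_label


print(label_by_instructions(160.0, 7.2))            # rule written by hand
print(label_by_examples((160.0, 7.2), examples))    # rule implied by the examples
print(label_by_examples((1300.0, 21.0), examples))
```

Nobody wrote a threshold in the second function; change the examples and the labeling behavior changes with them.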
Are you using the term 'AI' incorrectly? With poorly defined terms, there's not really such a thing as using them correctly. We can all be winners, but here's a quick guide to the alphabet soup of AI, ML, DL, RL, and HLI.

If you're worried that there's a human-like intelligence lurking in every cupboard, breathe easy. All those industry AI applications are too busy solving real business problems.

Explaining supervised learning to a kid (or your boss). My goal here is to get humans of all stripes comfy with some basic terminology: instance, label, feature, model, algorithm, and supervised learning.

Don't be intimidated by jargon. For example, a model is just a fancy word for "recipe."


Machine learning: Is the emperor wearing clothes? A beginner-friendly look at the core concepts, including algorithms and loss functions, via pictures and cat memes.

Don't hate machine learning for being simple. Levers are simple too, but they can move the world.

Unsupervised learning demystified. Unsupervised learning helps you find inspiration in data by grouping similar things together for you. The results are a Rorschach card to help you dream.

Think of unsupervised learning as a mathematical version of making "birds of a feather flock together."
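
As a rough illustration of "grouping similar things together," the sketch below runs a bare-bones k-means on one-dimensional numbers. The data, the starting centers, and the function name `kmeans_1d` are hypothetical, and k-means is only one of many grouping methods, picked here because it is short.

```python
def kmeans_1d(values, centers, iterations=10):
    """Alternate between assigning values to the nearest center and re-centering."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to whichever center is currently closest.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers


values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]       # hypothetical unlabeled data
groups, centers = kmeans_1d(values, centers=[0.0, 5.0])
print(groups)
print(centers)
```

No labels go in, only the numbers. What the resulting groups mean is left to the human looking at them.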

Explainable AI won't deliver. Here's why. Many people are drawn to XAI because they think it's a good basis for trust. It isn't, and getting caught up in the trust hype might mean you'll miss out on something XAI is great for: inspiration.

If you refuse to trust decision-making to something whose process you don't understand, then you should fire all your human workers, because no one knows how the brain (with its hundred billion neurons!) makes decisions.

How Not To Fail At ML/AI
Why businesses fail at machine learning. Many businesses don't realize that applied ML is a very different discipline from ML algorithms research.

Imagine trying to start a restaurant by hiring folks who've been building microwave parts their whole lives but have never cooked a thing... what could possibly go wrong?

Advice for finding AI use cases. My brainstorming trick for finding opportunities to apply AI starts with imagining that AI is a hoax...

A common mistake businesses make is to assume machine learning is magic, so it's okay to skip thinking about what it means to do the task well.
The first step in AI might surprise you. What's the right way to start an AI project? Get an AI degree? No. Hire an AI wizard? Nope. Pick an awesome algorithm? Not that either. Dive into the data? Wrong again! Here's how to do it better.

Never ask a team of PhDs to "Go sprinkle machine learning over the top of the business so... good things happen."

Is your AI project a nonstarter? A (reality) checklist you should go through before you hire any engineers or get any data for an applied ML/AI project.

Don't waste your time on AI for AI's sake. Be motivated by what it will do for you, not by how sci-fi it sounds.
Getting started with AI? Start here! A detailed guide to the decision-maker's role and responsibilities in an applied ML/AI project.

Just because you can do something, doesn't mean it's a good use of anyone's time. We humans fall in love with what we have poured effort into... even if it is a pile of poisonous rubbish.

Whose fault is it when AI makes mistakes? The point of ML/AI is that you're expressing your wishes using examples instead of instructions. For it to work, the examples have to be relevant.

If you use a tool where it hasn't been verified safe, any mess you make is your fault. AI is a tool like any other.

Data Science Leadership
Data-Driven? Think again. For a decision to be data-driven, it has to be the data (as opposed to something else entirely) that drive it. Seems so straightforward, and yet it's so rare in practice because decision-makers lack a key psychological habit.
The more ways there are to slice the data, the more your analysis is a breeding ground for confirmation bias. The antidote is setting your decision criteria in advance.

Is data science a bubble? Learn more about the people calling themselves "data scientists" and why the industry is playing a dangerous game.

"I think you might be hiring data scientists the way a drug lord buys a tiger for his backyard," I told him. "You don't know what you want with the tiger, but all the other drug lords have one."

Data Science Leaders: There are too many of you. What's the plan for training decision-makers with the skills to make data science teams successful? Hope is not a strategy!

...a pro-math subculture where it's fashionable to display disdain for anything that smells like "soft" skills. It's all chest-thumping about how hardcore you are for staying up all night proving some theorem or coding in your sixth language.
Rethinking Fast and Slow in Data Science. Is it possible for product development teams to reconcile rapid iteration with the slow-moving behemoth of the deep research process, or must they pick one?

Inspiration is cheap, but rigor is expensive.
Interview: Advice for data scientists. Candid answers to a fellow data scientist's questions. Topics include: favorite resources, careers, statistics education, and data science leadership.

Useful is worth more than complicated. Data quality is worth more than method quality. Communication skills are worth more than yet another programming language.

Technology
9 Things You Should Know About TensorFlow. TensorFlow might be your new best friend if you have a lot of data and/or you're after the state-of-the-art in AI. It's not a data science Swiss Army Knife, it's the industrial lathe. Here's what's new with it.

With TensorFlow Hub, you can engage in a more efficient version of the time-honored tradition of helping yourself to someone else's code and calling it your own (otherwise known as professional software engineering).
What do you call AI without the boring bits? Kubeflow is about giving data scientists the experience they'd have if they got rid of all the fiddly bits they don't like. It's a ski lift for your mountain of chores.

Congratulations on waiting it out long enough to have the infrastructure taken care of for you, kind of like you don't need to build your own computer anymore.
5 Bite-Sized Data Science Summaries. 5 favorite talks from Google Cloud Next SF 2018. 5 video summaries. 5 minutes or less.

AI spent over half a century being more hype than happening. So, why now? Many people don't realize that the story of today's applied AI is actually a story about The Cloud.

Statistics
Don't waste your time on statistics. How to determine whether you need statistics and what to do if you don't.
Statistics is the science of changing your mind.
Never start with a hypothesis. Starting with hypotheses instead of actions is a common mistake among those who learn the math without absorbing any of the philosophy. Let's look at how to use statistics for decision-making.

Hypotheses are like cockroaches. When you see one, it's never just the one. There's always more hiding somewhere nearby.
Statistics for people in a hurry. Ever wished someone would just tell you what the point of statistics is and what the jargon means in plain English? Let me try to grant that wish for you in 8 minutes!

The math is all about building a toy model of the null hypothesis universe. That's how you get the p-value.
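
One way to picture the "toy model of the null hypothesis universe" is to simulate it. The sketch below assumes a made-up scenario (60 heads in 100 coin flips, with "the coin is fair" as the null hypothesis) and a hypothetical helper named `simulated_p_value`. It estimates a two-sided p-value by counting how often the simulated fair-coin world produces a result at least as far from 50/50 as the one observed.

```python
import random


def simulated_p_value(observed_heads, flips, simulations=10_000, seed=0):
    """Estimate a two-sided p-value by simulating a world where the coin is fair."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    expected = flips / 2
    observed_gap = abs(observed_heads - expected)
    at_least_as_extreme = 0
    for _ in range(simulations):
        # One run of the toy null universe: flip a fair coin `flips` times.
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if abs(heads - expected) >= observed_gap:
            at_least_as_extreme += 1
    return at_least_as_extreme / simulations


p = simulated_p_value(observed_heads=60, flips=100)
print(f"Estimated p-value: {p:.3f}")
```

The estimate is the share of simulated null worlds that look at least as surprising as the data. A small value means the observed result would be unusual if the null hypothesis were true.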
Populations: You're doing it wrong. A statistical approach only makes sense when there's a mismatch between the information you want (population) and the information you have (sample). What happens if the project's leader doesn't know what information they want?

In the Icarus-like leap from sample to population, expect a big splat if you don't know where you're aiming.

Statistics Savvy Self-Test. Will you pass this small quiz that checks your statistical expertise? You might not if you believed what they told you in STAT101...

If you had facts, you wouldn't need statistics.
Incompetence, delegation, and population. If the decision-maker doesn't have the right skills, your whole statistical project is doomed. When is it appropriate for the statistician to make a fuss and when should they meekly follow orders?

If your goal is to persuade people using data, you may as well throw rigor out the window (since that's where it belongs) and make pretty graphs instead.

Podcasts
30min GCP Podcast about Decision Intelligence
65min DataCamp podcast about making data science useful

Hands-on Deep Learning Tutorial
My end-to-end deep learning tutorial: a screenshot walkthrough and (mostly Python) code from last year's Supercomputing conference.


Source: HOB