### Ultimate Python Quickstart Guide For Data Science

- Install Anaconda
- Open Jupyter Notebook
- Start New Notebook
- Try Math Calculations
- Import Data Science Libraries
- Import Your Dataset
- Explore Your Data
- Clean Your Dataset
- Engineer Features
- Train a Simple Model
- Next Steps

- First, we imported Python's math module, which provides convenient functions (e.g. math.sqrt()) and math constants (e.g. math.pi).
- Second, 2*2*2*2... or "two to the fourth"... is written as 2**4. If you write 2^4, you'll get a very different output, because ^ is Python's bitwise XOR operator (2^4 evaluates to 6, not 16)!
- Finally, the text following a hash sign (#) is called a comment. Just as the name implies, these text snippets are not run as code.
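
The notes above can be tried out with a short cell like the following. This is a minimal sketch rather than the guide's original cell; the variable names (`root`, `power`, `xor`) are only illustrative.

```python
import math

# math.sqrt() is a function from the math module; math.pi is a constant
root = math.sqrt(25)
print(root)       # 5.0
print(math.pi)

# Exponentiation uses **, not ^
power = 2 ** 4
print(power)      # 16

# ^ is bitwise XOR: 0b010 ^ 0b100 == 0b110
xor = 2 ^ 4
print(xor)        # 6
```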

- First, we imported the Pandas library. We also gave it the alias of pd. This means we can invoke the library with pd. You'll see this in action shortly.
- Next, we imported the pyplot module from the matplotlib library. Matplotlib is the main plotting library for Python. There's no need to bring in the entire library, so we just imported a single module. Again, we gave it an alias of plt.
- Oh yeah, and the %matplotlib inline command? That's specific to Jupyter Notebook. It simply tells the notebook to display our plots inside the notebook, instead of in a separate window.
- Finally, we imported a basic linear regression algorithm from scikit-learn. Scikit-learn has a buffet of algorithms to choose from. At the end of this guide, we'll point you to a few resources for learning more about these algorithms.
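
Put together, the imports described above would look roughly like this. It's a sketch that assumes pandas, matplotlib, and scikit-learn are installed (Anaconda ships with all three). The `%matplotlib inline` line is shown as a comment because it only works inside a Jupyter notebook.

```python
# Pandas, under the alias pd
import pandas as pd

# Only the pyplot module from matplotlib, under the alias plt
from matplotlib import pyplot as plt

# A basic linear regression algorithm from scikit-learn
from sklearn.linear_model import LinearRegression

# Jupyter-only magic command; uncomment it when running in a notebook:
# %matplotlib inline
```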

- df is where we stored the data. It's called a "dataframe," and it's also a Python object, like the variables from Step 4.
- .isnull() is called a method, which is just a fancy term for a function attached to an object. This method looks through our entire dataframe and labels any cell with a missing value as True. (Tip: Try running df.head().isnull() and see what you get!)
- Finally, .sum() is a method that adds up the True values in each column. Well... technically, it sums any numbers, while treating True as 1 and False as 0.
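
To see how the two methods chain together, here is a small sketch on a made-up dataframe. The column names and values are hypothetical, not the guide's dataset.

```python
import pandas as pd

# Hypothetical toy data with two missing cells
df = pd.DataFrame({
    "X1": [1.0, float("nan"), 3.0],
    "X2": [4.0, 5.0, float("nan")],
    "Y1": [7.0, 8.0, 9.0],
})

# .isnull() labels each cell: True if the value is missing, False otherwise
missing_mask = df.isnull()
print(missing_mask)

# .sum() adds up each column, counting True as 1 and False as 0
missing_counts = df.isnull().sum()
print(missing_counts)
```

Each entry in `missing_counts` is the number of missing values in that column.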

- Numerical ones are pretty self-explanatory... For example, "number of years of education" would be a numerical feature.
- Categorical features are those that have classes instead of numeric values.... For example, "highest education level" would be a categorical feature, and the classes could be: ['high school', 'some college', 'college', 'some graduate', 'graduate'].
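
One way to tell the two kinds of features apart in pandas is to split the columns by dtype. The sketch below uses a made-up dataframe based on the two examples above. The column names are hypothetical, and `select_dtypes` is just one option for this, not necessarily what the guide itself uses.

```python
import pandas as pd

# Hypothetical data: one numerical feature and one categorical feature
df = pd.DataFrame({
    "years_of_education": [12, 14, 16, 18],
    "highest_education_level": ["high school", "some college", "college", "graduate"],
})

# Numerical features are stored with a numeric dtype
numerical = df.select_dtypes(include="number").columns.tolist()

# Everything else is treated as categorical here
categorical = df.select_dtypes(exclude="number").columns.tolist()

# The classes of a categorical feature are its distinct values
classes = sorted(df["highest_education_level"].unique())

print(numerical)
print(categorical)
print(classes)
```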

- The variables to drop... (e.g. ['Y1', 'Y2'])
- Whether to drop from the index (axis=0) or the columns (axis=1)
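
Here is a minimal sketch of both arguments in action, using pandas' `DataFrame.drop`. The dataframe is a made-up stand-in, and only the 'Y1' and 'Y2' names come from the example above.

```python
import pandas as pd

# Hypothetical toy data with one input column and two target columns
df = pd.DataFrame({
    "X1": [1, 2, 3],
    "Y1": [10, 20, 30],
    "Y2": [5, 6, 7],
})

# axis=1 drops columns: remove both target variables
X = df.drop(["Y1", "Y2"], axis=1)
print(X.columns.tolist())

# axis=0 drops from the index: remove the row labeled 0
without_first_row = df.drop([0], axis=0)
print(without_first_row.index.tolist())
```

Note that `.drop()` returns a new dataframe and leaves `df` itself unchanged.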