
### 100 Plus Commonly Asked Data Science Interview Questions to Get a Job as a Data Scientist

1. General
2. Big Data
3. Python
4. R
5. SQL

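- What are the different types of sorting algorithms available in the R language?
- Answer. Insertion, bubble, and selection sort are common algorithms to implement by hand. R's built-in `sort()` also accepts `method = "radix"`, `"quick"`, or `"shell"`.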
- What are the different data objects in R?
- What packages are you most familiar with? What do you like or dislike about them?
- How do you access the element in the 2nd column and 4th row of a matrix named M?
- What is the command used to store R objects in a file?
- What is the best way to use Hadoop and R together for analysis?
- How do you split a continuous variable into different groups/ranks in R?
- Write a function in the R language to replace the missing values in a vector with the mean of that vector.
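
The last question above asks for R, where the usual building blocks are `is.na()` and `mean(x, na.rm = TRUE)`. As a language-neutral illustration of the same logic, here is a minimal Python sketch. The function name `fill_missing_with_mean` is hypothetical, and it assumes missing values are represented as `None` or NaN.

```python
import math


def fill_missing_with_mean(values):
    """Return a copy of `values` with None/NaN entries replaced by the mean of the rest."""

    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    present = [v for v in values if not is_missing(v)]
    if not present:
        raise ValueError("cannot compute a mean: every value is missing")
    mean = sum(present) / len(present)
    # Keep observed values untouched; substitute the mean only where data is missing.
    return [mean if is_missing(v) else v for v in values]


filled = fill_missing_with_mean([1.0, None, 3.0, float("nan"), 5.0])
print(filled)  # [1.0, 3.0, 3.0, 3.0, 5.0]
```

The mean is computed from the observed values only, which is the point an interviewer typically checks for.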

- What is the purpose of the group functions in SQL? Give some examples of group functions.
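- Group functions are necessary to get summary statistics of a dataset. COUNT, MAX, MIN, AVG, and SUM are all group functions. DISTINCT is a keyword rather than a group function, although it can be combined with one, as in COUNT(DISTINCT column).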
- Tell me the difference between an inner join, left join/right join, and union.
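
To make the SQL questions above concrete, the sketch below runs group functions, an inner join, a left join, and a union against an in-memory SQLite database using Python's standard `sqlite3` module. The `customers` and `orders` tables and their rows are invented for illustration. A right join is omitted because it mirrors the left join with the table roles swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 50.0);
""")

# Group functions: one summary row per customer.
summary = conn.execute("""
    SELECT customer_id, COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount)
    FROM orders GROUP BY customer_id ORDER BY customer_id
""").fetchall()

# INNER JOIN keeps only customers that have at least one matching order.
inner = conn.execute("""
    SELECT c.name, o.amount FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.id
""").fetchall()

# LEFT JOIN keeps every customer; a customer without orders gets NULL (None).
left = conn.execute("""
    SELECT c.name, o.amount FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id, o.id
""").fetchall()

# UNION stacks the rows of two queries and removes duplicate rows.
union = conn.execute("""
    SELECT customer_id FROM orders UNION SELECT id FROM customers ORDER BY 1
""").fetchall()

print(summary)
print(inner)
print(left)
print(union)
```

Joins combine columns from two tables side by side, while a union appends rows from two queries that share the same column layout.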

- Tell me about how you designed the model you created for a past employer or client.
- What are your favorite data visualization techniques?
- How would you effectively represent data with 5 dimensions?
- How is kNN different from k-means clustering?

kNN, or k-nearest neighbors, is a classification algorithm in which k is an integer giving the number of neighboring data points that influence the classification of a given observation. K-means is a clustering algorithm in which k is an integer giving the number of clusters to create from the given data. The two solve different tasks: kNN is supervised and needs labeled training data, while k-means is unsupervised and works on unlabeled data.
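
The contrast can be sketched in a few lines of plain Python. This is a toy illustration rather than production code: the helper names are hypothetical, the sample points are invented, and the k-means part uses a naive initialization (the first k points) with a fixed number of iterations.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k):
    """Supervised: majority vote among the labels of the k nearest training points."""
    order = sorted(range(len(train_points)), key=lambda i: sq_dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def kmeans(points, k, iterations=10):
    """Unsupervised: group unlabeled points into k clusters."""
    centroids = [tuple(p) for p in points[:k]]  # naive initialization (assumption)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
    return centroids, assignments


points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
labels = ["low", "low", "high", "high"]

# kNN needs the labels: here k is the number of neighbors that vote.
print(knn_predict(points, labels, (2.0, 1.0), k=3))  # low

# k-means ignores the labels: here k is the number of clusters to form.
centroids, assignments = kmeans(points, k=2)
print(assignments)  # [0, 0, 1, 1]
```

Note that `knn_predict` takes labels as input and returns a label, whereas `kmeans` never sees labels and returns arbitrary cluster indices.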

Answer. Recall is the percentage of actual positives that the model labels as positive. Precision is the percentage of positive predictions that are correct. The ROC curve plots recall (the true positive rate) against the false positive rate, which equals 1 minus specificity; specificity is the percentage of actual negatives that the model labels as negative. Recall, precision, and the ROC curve are all used to judge how useful a given classification model is.
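
These definitions translate directly into counts from a confusion matrix. The sketch below is a minimal illustration with invented labels and scores. The function names are hypothetical, labels are assumed to be 0 or 1, and both classes are assumed to be present so that no denominator is zero in the ROC sweep.

```python
def classification_metrics(y_true, y_pred):
    """Compute precision, recall, and specificity from binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "precision": tp / (tp + fp),    # share of positive predictions that are correct
        "recall": tp / (tp + fn),       # share of actual positives that are found
        "specificity": tn / (tn + fp),  # share of actual negatives that are found
    }


def roc_points(y_true, scores):
    """One (false positive rate, recall) point per score threshold, strictest first."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        m = classification_metrics(y_true, y_pred)
        points.append((1.0 - m["specificity"], m["recall"]))
    return points


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

roc = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(roc)
```

Lowering the threshold moves along the ROC curve from the origin toward the top-right corner, trading specificity for recall.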

- Teamwork
- Leadership
- Conflict Management
- Problem-solving
- Failure

- Tell me about a time when you took initiative.
- Tell me about a time when you had to overcome a dilemma.
- Tell me about a time when you resolved a conflict.
- Tell me about a time you failed and what you learned from it.
- Tell me about (a job on your resume). Why did you choose to do it and what do you like most about it?
- Tell me about a challenge you have overcome while working on a group project.
- When you encounter a tedious, boring task, how would you deal with it and motivate yourself to complete it?
- What have you done in the past to make a client satisfied/happy?
- What have you done in your previous job that you are really proud of?
- What do you do when your personal life is running over into your work life?

- How would you come up with a solution to identify plagiarism?
- How many "useful" votes will a Yelp review receive?
- How do you detect individual paid accounts shared by multiple users?
- You are about to send one million emails. How do you optimize delivery? How do you optimize response?
- You have a dataset containing 100K rows and 100 columns, with one of those columns being our dependent variable for a problem we'd like to solve. How can we quickly identify which columns will be helpful in predicting the dependent variable? Identify two techniques and explain them to me as though I were 5 years old.
- How would you detect bogus reviews or bogus Facebook accounts used for bad purposes?
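
For the question above about a dataset with 100 columns, one possible quick screen is to rank columns by the absolute Pearson correlation with the dependent variable. The sketch below shows only that one technique, on invented data with hypothetical column names. It assumes numeric columns and captures only linear relationships, so a second technique (for example, feature importances from a tree-based model) would still be needed to answer the question in full.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_columns(columns, target):
    """Sort column names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy data: three candidate columns and one dependent variable.
columns = {
    "hours_studied": [1, 2, 3, 4, 5],
    "shoe_size": [42, 38, 44, 39, 41],
    "constant": [7, 7, 7, 7, 7],
}
target = [52, 58, 61, 70, 74]

ranking = rank_columns(columns, target)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In the "explain it to a five-year-old" spirit of the question: the score says how closely a column moves up and down together with the thing being predicted.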