Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes about, researches and shares knowledge on Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, etc.

### 100 Asked Data Science Interview Questions and Answers for 2018

- Technical data scientist interview questions based on data science programming languages such as Python, R, etc.
- Technical data scientist interview questions based on statistics, probability, math, machine learning, etc.
- Practical-experience or role-based data scientist interview questions based on the projects you have worked on and how they turned out.

- Understand the problem statement and the data before giving the answer. A missing value can be assigned a default such as the mean, minimum or maximum value, so getting into the data is important.
- If it is a categorical variable, the missing value is assigned a default value.
- If the data follow a normal distribution, assign the mean value.
- Whether missing values should be treated at all is another important point to consider. If 80% of the values for a variable are missing, you can answer that you would drop the variable instead of treating the missing values.
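The points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `treat_missing` is hypothetical, missing entries are represented as `None`, the 0.8 cut-off is taken from the 80% example, and using the most frequent category as the default for categorical variables is an assumption, since the answer does not say which default to use.

```python
from statistics import mean, mode

# Assumed cut-off, taken from the "80% of the values are missing" example.
MISSING_DROP_THRESHOLD = 0.8


def treat_missing(values, categorical=False):
    """Return a filled copy of `values`, or None if the variable should be dropped.

    `values` is a non-empty list in which missing entries are None.
    """
    present = [v for v in values if v is not None]
    missing_fraction = (len(values) - len(present)) / len(values)
    if missing_fraction >= MISSING_DROP_THRESHOLD:
        return None  # too sparse: drop the variable instead of imputing
    # Assumption: most frequent category for categorical data, mean otherwise.
    default = mode(present) if categorical else mean(present)
    return [default if v is None else v for v in values]


ages = [20.0, None, 40.0]
print(treat_missing(ages))  # the gap is filled with the mean of 20.0 and 40.0
```

Whether the mean, the minimum or the maximum is the right default depends on the problem statement, which is why the answer stresses understanding the data first.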

- The training set is used to fit the parameters, i.e. the weights.
- The test set is used to assess the performance of the model, i.e. to evaluate its predictive power and generalization.
- The validation set is used to tune the hyperparameters.
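As a rough sketch of how the three sets might be produced, the snippet below shuffles the rows once and cuts them into three parts. The function name `split_dataset` and the 60/20/20 proportions are assumptions made for illustration; they do not come from the article.

```python
import random


def split_dataset(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle `rows` and split them into training, validation and test lists.

    The proportions are illustrative defaults; whatever is left after the
    training and validation parts becomes the test set.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed so the split is repeatable
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * validation)
    train_set = rows[:n_train]
    validation_set = rows[n_train:n_train + n_val]
    test_set = rows[n_train + n_val:]
    return train_set, validation_set, test_set


train_set, validation_set, test_set = split_dataset(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

The test set is held back until the end so that the tuning done on the validation set does not leak into the final performance estimate.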

- There are two companies manufacturing electronic chips. Company A manufactures defective chips with a probability of 20% and good-quality chips with a probability of 80%. Company B manufactures defective chips with a probability of 80% and good chips with a probability of 20%. If you get just one electronic chip, what is the probability that it is a good chip?
- Suppose that you now get a pack of 2 electronic chips coming from the same company, either A or B. When you test the first electronic chip, it appears to be good. What is the probability that the second electronic chip you received is also good?
- On a dating site, users can select 5 out of 24 adjectives to describe themselves. A match is declared between two users if they match on at least 4 adjectives. If Brad and Angelina randomly pick adjectives, what is the probability that they will form a match?
- A coin is tossed 10 times and the results are 2 tails and 8 heads. How will you analyse whether the coin is fair or not? What is the p-value for the same?
- In continuation of the above question, if each of 10 coins is tossed 10 times (100 tosses are made in total), will you modify your approach to testing the fairness of the coin or continue with the same?
- An ant is placed on an infinitely long twig. The ant can move one step backward or one step forward with the same probability during discrete time steps. Find out the probability with which the ant will return to the starting point.
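Several of the questions above reduce to short calculations, sketched below with exact fractions. The assumptions are loud ones: the chip questions are taken to mean that the chip (or pack) is equally likely to come from company A or B, which the question does not state; the dating question uses the 5-of-24, at-least-4 version; and the coin question is answered with an exact two-sided binomial test, which is one reasonable choice rather than the only one. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Chip questions. Assumption (not stated in the question): the chip or pack
# is equally likely to come from company A or from company B.
prior = Fraction(1, 2)
good_a, good_b = Fraction(4, 5), Fraction(1, 5)

p_good = prior * good_a + prior * good_b              # 1/2
p_both_good = prior * good_a**2 + prior * good_b**2   # both chips from one company
p_second_good_given_first = p_both_good / p_good      # 17/25


def match_probability(pool=24, picks=5, needed=4):
    """Probability that two random selections of `picks` adjectives out of
    `pool` share at least `needed` adjectives (hypergeometric count)."""
    favourable = sum(
        comb(picks, k) * comb(pool - picks, picks - k)
        for k in range(needed, picks + 1)
    )
    return Fraction(favourable, comb(pool, picks))


def two_sided_binomial_p(heads, tosses):
    """Exact two-sided p-value under a fair coin: the total probability of
    head counts at least as far from tosses/2 as the observed count."""
    distance = abs(2 * heads - tosses)
    extreme = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(2 * k - tosses) >= distance
    )
    return Fraction(extreme, 2**tosses)


print(p_good, p_second_good_given_first)  # 1/2 17/25
print(match_probability())                # 4/1771
print(two_sided_binomial_p(8, 10))        # 7/64
```

With 8 heads in 10 tosses the p-value of 7/64 (about 0.11) is above the conventional 0.05 level, so this sample alone would not reject fairness under the test chosen here.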

- Which is your favourite machine learning algorithm and why?
- In which data science libraries for Python and R do your strengths lie?
- What kind of data is important for specific business requirements, and how, as a data scientist, will you go about collecting that data?
- Tell us about the biggest data set you have processed to date and for what kind of analysis.
- Which data scientists do you admire the most and why?
- Suppose you are given a data set. What will you do with it to find out whether it suits the business needs of your project?
- What were the business outcomes or decisions for the projects you worked on?
- What unique skills do you think you can add to our data science team?
- Which are your favorite data science startups?
- Why do you want to pursue a career in data science?
- What have you done to upgrade your skills in analytics?
- What has been the most useful business insight or development you have found?
- How will you explain an A/B test to an engineer who does not know statistics?
- When does parallelism help your algorithms run faster and when does it make them run slower?
- How can you ensure that you don't analyse something that ends up producing meaningless results?
- How would you explain to the senior management in your organization why a particular data set is important?
- Is more data always better?
- What are your favourite imputation techniques to handle missing data?
- What are your favorite data visualization tools?
- Explain the life cycle of a data science project.

- Check whether the chosen model is appropriate. Start from the point where you did the univariate or bivariate analysis, analysed the distribution of the data and the correlation of the variables, and built the linear model. Linear regression assumes that the errors (residuals) are approximately normally distributed; if that assumption is badly violated, conclusions drawn from the regression cannot be trusted. This is an inductive approach to finding out whether an analysis using linear regression will yield meaningless results.
- Another way is to create training and test data sets by sampling the data multiple times. Predict on all of those data sets to find out whether the resulting models are similar and perform well.
- By looking at the p-value, the R-squared value and the fit of the function, and by analysing how the treatment of missing values could have affected the results, data scientists can judge whether an analysis will produce meaningless results.
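The repeated-sampling check described above could look roughly like the sketch below. It is a minimal illustration, not a prescribed method: the function names, the one-variable least-squares line, the 30% test fraction and the 20 repeats are all assumptions. The idea is simply to refit on many random train/test splits and see whether the fitted coefficients and the test R-squared stay similar.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the line on the given points."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def repeated_split_check(xs, ys, n_repeats=20, test_fraction=0.3, seed=0):
    """Fit on random training subsets and score each fit on the held-out rest.

    Returns a list of (slope, intercept, test_r_squared) tuples, one per split.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, round(len(indices) * test_fraction))
    results = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        score = r_squared([xs[i] for i in test], [ys[i] for i in test], slope, intercept)
        results.append((slope, intercept, score))
    return results


xs = list(range(20))
ys = [2 * x + 1 for x in xs]  # noise-free toy data, for illustration only
slopes = [slope for slope, _, _ in repeated_split_check(xs, ys)]
print(min(slopes), max(slopes))
```

If the slopes or the test scores vary widely from split to split, that is a warning sign that the model is unstable and that conclusions drawn from any single fit may be meaningless.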