Data Scientist Senior Interview Questions

3,383 data scientist senior interview questions shared by candidates

1) Scan through the steps below and import the Python libraries you need (don't worry, you can import them later if you forget one)
2) Load the data from the CSV file
3) Run basic commands to understand the data
4) Bin the following features:
   a) 'currentterm' into [0 to 11], [11 and more]
   b) 'mrr_entry' into [0 to 14.99], [14.99 to 500], [500 to 5K], [5K and more]
   c) 'account_age' into [0 to 90], [90 to 180], [180 to 360], [360 and more]
   d) 'days_left_in_term' into [0 to 30], [30 to 360], [360 and more]
5) Set 'churn_next_90' as your target column
6) Set 'zoom_account_no' as an ID column; this should not be a feature
7) Set 'ahs_date' as a date column; this should not be a feature
8) Treat the binned features from step (4) and the following features as categorical features:
   a) 'sales_group'
   b) 'employee_count'
   c) 'coreproduct'
9) Perform feature selection using your preferred method and ML algorithm. Choose 10 features and continue to step (10).
10) Divide the new data frame (with 10 features) into test and train subsets
11) Use a different algorithm from step (9) and perform cross-validation for parameter tuning. Print out the results.
12) Based on the results from (11), fit your model on the train subset
13) Test your fitted model using the test subset
14) Print feature importance, accuracy score (roc_auc_score), and confusion matrix (crosstab) from step (13)
15) Save your trained model using pickle

Senior Data Scientist

Interviewed at Zoom Communications

3.6
Aug 10, 2017
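
The steps in this exercise can be sketched roughly as follows. This is only one possible approach, not the interviewer's expected solution. The real CSV is not available, so the data here is synthetic: the category values, the extra 'seats' column, and the rule that generates 'churn_next_90' are all assumptions made for illustration. The choice of a random forest for selection and gradient boosting for the final model is likewise arbitrary, and the raw numeric columns are dropped after binning, which the task does not explicitly require.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Steps 1-2: synthetic stand-in for the CSV (values are made up).
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "zoom_account_no": np.arange(n),
    "ahs_date": pd.Timestamp("2017-01-01"),
    "currentterm": rng.integers(0, 36, n),
    "mrr_entry": rng.uniform(0, 8000, n),
    "account_age": rng.integers(0, 720, n),
    "days_left_in_term": rng.integers(0, 500, n),
    "sales_group": rng.choice(["online", "direct", "partner"], n),
    "employee_count": rng.choice(["1-10", "11-50", "51+"], n),
    "coreproduct": rng.choice(["basic", "pro", "business"], n),
    "seats": rng.integers(1, 200, n),  # hypothetical extra numeric feature
})
# Hypothetical target: younger accounts churn more often.
p_churn = 1 / (1 + np.exp((df["account_age"] - 180) / 120))
df["churn_next_90"] = (rng.uniform(size=n) < p_churn).astype(int)

# Step 4: bin the four numeric features at the edges given in the task.
bins = {
    "currentterm": [0, 11, np.inf],
    "mrr_entry": [0, 14.99, 500, 5000, np.inf],
    "account_age": [0, 90, 180, 360, np.inf],
    "days_left_in_term": [0, 30, 360, np.inf],
}
for col, edges in bins.items():
    df[col + "_bin"] = pd.cut(df[col], bins=edges, include_lowest=True)

# Steps 5-8: target, ID and date are excluded; categoricals are one-hot encoded.
target, id_col, date_col = "churn_next_90", "zoom_account_no", "ahs_date"
cat_cols = [c + "_bin" for c in bins] + ["sales_group", "employee_count", "coreproduct"]
X = df.drop(columns=[target, id_col, date_col] + list(bins))
X = pd.get_dummies(X, columns=cat_cols, dtype=float)
y = df[target]

# Step 9: keep the 10 features with the highest random-forest importance.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top10 = pd.Series(rf.feature_importances_, index=X.columns).nlargest(10).index

# Step 10: stratified train/test split on the reduced frame.
X_train, X_test, y_train, y_test = train_test_split(
    X[top10], y, test_size=0.25, stratify=y, random_state=0
)

# Steps 11-12: tune a different algorithm with cross-validation.
# GridSearchCV refits the best configuration on the full train subset.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)
print(search.best_params_, round(search.best_score_, 3))
model = search.best_estimator_

# Steps 13-14: evaluate on the held-out test subset.
proba = model.predict_proba(X_test)[:, 1]
pred = model.predict(X_test)
auc = roc_auc_score(y_test, proba)
importances = pd.Series(model.feature_importances_, index=top10).sort_values(ascending=False)
confusion = pd.crosstab(y_test.to_numpy(), pred, rownames=["actual"], colnames=["predicted"])
print(importances, auc, confusion, sep="\n")

# Step 15: serialize with pickle (pickle.dump to an open file works the same way).
blob = pickle.dumps(model)
restored = pickle.loads(blob)
```

Note that the task's ordering selects features on the full data before splitting, which lets test rows influence the selection. Splitting first and selecting on the train subset only would avoid that leakage, at the cost of deviating from the stated step order.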

The technical rounds and the case study focused on traditional ML. How would you deal with columns containing hundreds of categories? How would you deal with class imbalance? How does XGBoost handle NaN values? What is the difference between oversampling and class weights? What hyperparameters did you use in your models? How did you decide between one-hot encoding and target encoding? Which loss function and which scoring function did you use, and why were they chosen? Then there were questions about one of the projects you had worked on. Behavioural round: How would you deal with a low performer on your team? What challenges have you faced? What do you consider a failure? There were also hypothetical scenarios on linking your models to business KPIs. How would you manage a project?

Senior Lead Data Scientist

Interviewed at CARS24.com

4.3
Jul 11, 2024
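
One way to explore the question about oversampling versus class weights is the small experiment below. It is a sketch on made-up two-dimensional data, not an answer given in the interview. It assumes scikit-learn's LogisticRegression and an arbitrary weight of k = 9 for the minority class. Duplicating every minority row so that it appears k times gives the same weighted loss as setting the minority class weight to k, so the two fits should agree up to solver tolerance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up imbalanced data: 900 negatives, 100 positives.
rng = np.random.default_rng(0)
n_neg, n_pos, k = 900, 100, 9
X = np.vstack([
    rng.normal(0.0, 1.0, (n_neg, 2)),
    rng.normal(1.5, 1.0, (n_pos, 2)),
])
y = np.array([0] * n_neg + [1] * n_pos)


def fit(features, labels, **kwargs):
    # Tight tolerance so the two equivalent problems converge to the same point.
    return LogisticRegression(tol=1e-8, max_iter=1000, **kwargs).fit(features, labels)


# Baseline without any rebalancing.
plain = fit(X, y)

# Option A: class weights. Each positive row counts k times in the loss.
weighted = fit(X, y, class_weight={0: 1, 1: k})

# Option B: deterministic oversampling. Each positive row appears k times.
X_over = np.vstack([X, np.repeat(X[y == 1], k - 1, axis=0)])
y_over = np.concatenate([y, np.ones((k - 1) * n_pos, dtype=int)])
oversampled = fit(X_over, y_over)

print("plain      ", plain.coef_, plain.intercept_)
print("weighted   ", weighted.coef_, weighted.intercept_)
print("oversampled", oversampled.coef_, oversampled.intercept_)
```

In practice the two approaches diverge once the oversampling is random (sampling with replacement) or synthetic, or when the learner cannot take weights. Oversampling also has to be applied inside each training fold only, otherwise copies of the same row can end up on both sides of a validation split.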

The conversation with the HR representative was very pleasant and professional. She asked about data science in general, why I was looking for another job, etc. However, the conversation with the IT person did not go as well. Most of the questions were unrelated to data science (they seemed to come from his own field). Before the interview, I looked up who I would be speaking with and found that he was an IT engineer at the company (essentially lower in the hierarchy than the position on offer). This was not a problem for me, but during the interview he was surprisingly offensive.

Senior Data Scientist

Interviewed at Amazon

3.5
Nov 18, 2019

Glassdoor has 3,383 interview questions and reports from Data scientist senior interviews.