Data Consultant Interview Questions

1,913 data consultant interview questions shared by candidates


Consultant Data Scientist

Interviewed at Tredence

3.9
Jan 4, 2024

From the very beginning, I was asked to share my screen and open Excel.

1) Write the numbers 1 to 5 and calculate the standard deviation without using built-in functions. What is standard deviation?
2) Explain Type 1 and Type 2 errors. If a patient has cancer but is predicted as not having cancer, which type of error is this?
3) Using the first three columns of the given data, derive the last column (Total Salary) using SQL queries as well as pandas.

data = """EID Month Salary Total Salary
1 1 10000 10000
1 2 12000 22000
1 3 14000 36000
2 1 16000 16000
2 2 18000 34000
2 3 20000 54000"""

4) Central Limit Theorem.
5) The 4 basic assumptions of linear regression.
6) Suppose there are 10k observations, of which 500 are loan defaulters (target = 1) and 9,500 are non-defaulters. With a 70:30 train-test split, the model ends up training mostly on non-defaulters. How do you mitigate this? How do you handle it before the train-test split? (Hint: stratification; don't say SMOTE/RandomOverSampler/class_weight.)
7) What is the correct evaluation metric for the above?
8) Feature scaling: which models work with feature scaling and which do not? Linear Regression or Decision Tree?
9) What is stationarity in time series forecasting?
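For anyone preparing, questions 1, 3 and 6 can be sketched in a few lines of Python. This is only one possible way to answer them, not what the interviewer necessarily expected. The function names (`std_dev`, `running_totals`, `stratified_split`) and the table name `salaries` are made up for illustration. The SQL part assumes an engine with window functions (SQLite 3.25 or newer here), and the post does not say whether population or sample standard deviation was wanted, so both are shown.

```python
import math
import random
import sqlite3
from collections import defaultdict


def std_dev(values, sample=False):
    """Q1: standard deviation by hand (no statistics helpers)."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n  # n - 1 gives the sample version
    return math.sqrt(squared_devs / divisor)


# Q3: first three columns of the data from the question.
ROWS = [
    (1, 1, 10000), (1, 2, 12000), (1, 3, 14000),
    (2, 1, 16000), (2, 2, 18000), (2, 3, 20000),
]


def running_totals(rows):
    """Q3: per-employee cumulative salary via a SQL window function."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE salaries (eid INTEGER, month INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", rows)
    query = """
        SELECT eid, month, salary,
               SUM(salary) OVER (PARTITION BY eid ORDER BY month) AS total_salary
        FROM salaries
        ORDER BY eid, month
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Q6: split row indices so each class keeps the same train/test ratio."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


if __name__ == "__main__":
    print(std_dev([1, 2, 3, 4, 5]), std_dev([1, 2, 3, 4, 5], sample=True))
    for row in running_totals(ROWS):
        print(row)
    labels = [1] * 500 + [0] * 9500
    train, test = stratified_split(labels)
    print(sum(labels[i] for i in train), sum(labels[i] for i in test))
```

In pandas, the same running total is usually written as `df.sort_values(["EID", "Month"]).groupby("EID")["Salary"].cumsum()`, and with scikit-learn the stratified split is normally done by passing `stratify=y` to `train_test_split` rather than hand-rolling it.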


Glassdoor has 1,913 interview questions and reports from data consultant interviews.