Briefly describe why the models you used in production are relevant.
Sr Data Scientist Interview Questions
3,368 sr data scientist interview questions shared by candidates
What is an autoregressive model?
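For reference, an autoregressive model of order p predicts the next value of a series as a linear combination of its previous p values plus an intercept and a noise term. A minimal sketch of the one-step-ahead prediction follows; the function name `ar_forecast` and its arguments are hypothetical, and the coefficients are assumed to be already fitted.

```python
def ar_forecast(history, coeffs, intercept=0.0):
    """One-step-ahead AR(p) prediction.

    coeffs[0] multiplies the most recent value, coeffs[1] the one before it,
    and so on. The noise term is omitted because its expected value is zero.
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("history must contain at least p values")
    # Pair each coefficient with the matching lagged observation.
    return intercept + sum(phi * history[-1 - i] for i, phi in enumerate(coeffs))


# AR(2) example with illustrative coefficients: 0.5 * 3.0 + 0.25 * 2.0
prediction = ar_forecast([1.0, 2.0, 3.0], [0.5, 0.25])
```

In practice the coefficients would be estimated from data, for example by least squares on the lagged values.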
- Technical screen is SQL/Python
- "Data Challenge" is a take-home assignment using a music playlist dataset
- Final round can vary by team but definitely includes a presentation for your data challenge
First round:
* Experience, and why did you leave your previous jobs?
* Why General Mills?
* Explain PCA to a junior DS.
* Tell me about a time that you had a conflict with your manager.
* Tell me about a time when you worked with other teams to get your job done.
* How do you explain the coefficients of a linear regression to a business person?
* Tell me about a time you had to convince your manager to listen to you.
* What are your long-term career plans?
Codility:
* Implement Mutual Info probability distribution using Python
Final round:
Hiring Manager:
* How do you deal with bad data? How did you handle it?
* How do you deal with missing values?
* Is more data always good?
* What is cross-validation? Explain.
* Which tool do you prefer, R or Python?
* What do you do when the distributions of the train and test data are different?
* What is the size of the data that you worked with? Number of observations and number of features?
* Your experience writing production-level code?
* Experience with GCP?
* Around Betty Crocker's website: recipe recommendations? How do you ask them questions? How do you see if your model is performing well or not? How do you deploy a model?
* How do you motivate team members?
* How did you help underperformers?
2 Sr. DS: provided a Python notebook, which is almost the same as https://github.com/andreagrandi/ml-pima-notebook/blob/master/PimaIndiansDiabetes.ipynb
Data team manager: 1 question per General Mills value, around helping team members, a situation where you did the right thing, etc.
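The exact wording of the Codility mutual information task is not given, so the following is only one plausible reading: estimate the mutual information between two discrete variables from paired samples, using empirical probabilities. The function name `mutual_information` is hypothetical, and the result is in nats.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in nats from paired samples of two discrete variables.

    Uses I(X; Y) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y))),
    with all probabilities taken as empirical frequencies.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (px[x] * py[y]).
        mi += p_xy * math.log(count * n / (px[x] * py[y]))
    return mi
```

Two sanity checks: identical variables give the entropy of the variable, and independent variables give zero.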
HR asked no questions.
Codility questions were around the usage of some popular Python libraries, unlike other companies, which test data structures and algorithms. This was unexpected, since not everyone uses these libraries.
What is voting? What are the classification models that can be used?
How to work with an imbalanced dataset, what a sparse matrix is, how a recommendation system works, why NumPy is faster
Questions about how to optimize the code/calculation, very CS-based.
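On the sparse matrix question, the core idea is to store only the non-zero entries. A rough sketch in plain Python using coordinate (row, column, value) triples follows; the helper names `to_coo` and `coo_matvec` are made up for illustration, and real work would normally use a library such as SciPy.

```python
def to_coo(dense):
    """Return (row, col, value) triples for the non-zero entries of a 2-D list."""
    return [
        (i, j, v)
        for i, row in enumerate(dense)
        for j, v in enumerate(row)
        if v != 0
    ]


def coo_matvec(triples, vector, n_rows):
    """Multiply a matrix stored as coordinate triples by a dense vector."""
    result = [0] * n_rows
    for i, j, v in triples:
        # Only stored (non-zero) entries contribute, so zeros cost nothing.
        result[i] += v * vector[j]
    return result


dense = [
    [0, 0, 3],
    [0, 0, 0],
    [1, 0, 0],
]
triples = to_coo(dense)
product = coo_matvec(triples, [1, 2, 3], n_rows=3)
```

Storage and the matrix-vector product both scale with the number of non-zero entries rather than with the full matrix size, which is the usual motivation for the format.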
What is the difference between traditional NLP and LLMs/Transformers?