Basics of Spark coding and its architecture.
Senior Data Engineer Interview Questions
2,560 senior data engineer interview questions shared by candidates
SQL and Python scripting questions.
Very broad system design questions about a system that tracks and reports website traffic.
Redis, load balancing algorithm, SQL, LeetCode Hard x 3.
How would you encourage business teams to change their mindset from relying on a central data team to developing and managing their own data projects in a data-mesh context?
Python coding question: 1. Reverse a string in Python. 2. Take an integer input from the user and check whether the given number is prime.
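One possible answer to the two Python tasks above is sketched below. The function names (reverse_string, is_prime, prompt_and_check) are illustrative and not part of the original question. The primality test uses trial division up to the square root, which is an assumption about what an interviewer would accept for small inputs.

```python
def reverse_string(text: str) -> str:
    """Return the characters of text in reverse order using slicing."""
    return text[::-1]


def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the square root of n."""
    if n < 2:
        return False          # 0, 1 and negatives are not prime
    if n < 4:
        return True           # 2 and 3 are prime
    if n % 2 == 0:
        return False          # even numbers above 2 are not prime
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def prompt_and_check() -> None:
    """Hypothetical helper: read an integer from the user and report primality."""
    number = int(input("Enter an integer: "))
    verdict = "prime" if is_prime(number) else "not prime"
    print(f"{number} is {verdict}")
```

Calling prompt_and_check() runs the interactive part. Keeping input handling separate from is_prime makes the check easy to test without a terminal.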
Why are you switching jobs? Why did you leave your first job?
Consider an employee DataFrame with columns empID, department, and salary. Get the minimum, maximum, and average salary and the employee count for each department. Write it as a single statement.
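A minimal sketch of one way to answer the question above, assuming a pandas DataFrame (the question does not say whether pandas or PySpark is meant). The sample rows and the output column names are made up for illustration. The aggregation itself is a single groupby(...).agg(...) statement.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
employees = pd.DataFrame(
    {
        "empID": [1, 2, 3, 4, 5],
        "department": ["eng", "eng", "ops", "ops", "ops"],
        "salary": [100.0, 120.0, 80.0, 90.0, 100.0],
    }
)

# Single statement: named aggregation yields one row per department.
summary = employees.groupby("department").agg(
    min_salary=("salary", "min"),
    max_salary=("salary", "max"),
    avg_salary=("salary", "mean"),
    employee_count=("empID", "count"),
)

print(summary)
```

If the interviewer means PySpark, the same shape applies: groupBy("department").agg(...) with min, max, avg, and count from pyspark.sql.functions.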
Tell me something about yourself.
A. Core Data Engineering Concepts
   SQL (joins, window functions, performance tuning)
   Data Modeling (star vs snowflake, normalization)
   ETL/ELT pipelines (batch vs streaming, orchestration tools like Airflow)
B. Apache Spark / PySpark
   Catalyst Optimizer & Tungsten
   Narrow vs Wide transformations
   Joins (broadcast, sort-merge), Skew handling
   AQE (Adaptive Query Execution)
   Partitioning, Predicate Pushdown
   Execution Plan (DAG → Stage → Tasks)
   Spark UI and Job Debugging
   SCD Type 2 Implementation in PySpark
C. AWS
   S3, Glue, Athena, Lambda, EMR, Redshift
   Event-driven design (S3 → EventBridge → Lambda)
   Security: IAM roles, bucket policies, encryption
   CI/CD in AWS (CodePipeline, CloudFormation)
D. Python
   Writing modular, reusable code
   Working with Pandas, Boto3 (for AWS interaction)
   Exception handling, logging
   Lambda functions and decorators
E. Kafka / Streaming
   Kafka topic partitioning, consumer groups
   Offset management
   Integration with Spark Structured Streaming
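As an illustration of the Python topics in section D (exception handling, logging, decorators), here is a small sketch of a decorator that logs any exception raised by a pipeline step and then re-raises it. The names log_failures and parse_amount, and the logger name "pipeline", are hypothetical and only serve the example.

```python
import functools
import logging

logger = logging.getLogger("pipeline")  # hypothetical logger name


def log_failures(func):
    """Decorator: log the traceback of any exception, then re-raise it."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the message plus the active traceback
            logger.exception("step %s failed", func.__name__)
            raise

    return wrapper


@log_failures
def parse_amount(raw: str) -> float:
    """Hypothetical pipeline step: convert a raw string field to a float."""
    return float(raw)
```

Re-raising after logging keeps the failure visible to the caller (for example an orchestrator that retries the task) while still leaving a record in the logs.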