Big Data Engineer Interview Questions

1,784 big data engineer interview questions shared by candidates

1. What is the difference between Delta Lake and a data lake?
2. How was the CDC process implemented in your project?
3. What are the key differences between Parquet and ORC, and which would you choose over the other?
4. Have you worked with any reporting tools?
5. Have you ever consumed files?
6. Facts and dimensions.
7. Have you worked on SCD? Explain how many types of SCDs you have worked on.
8. What is normalization, and why is it required?
9. What is structured and unstructured data? Have you worked with them?
10. What issues have you faced, and how did you resolve them?

Big Data Engineer

Interviewed at Indium Software

3.9
Mar 21, 2024

Question 1: We have a stores table with the columns (Store_id, store_nm, Product). Write a query to find the stores that sell either both tea and coffee or both coffee and jam.
Question 2: We have an orders table with the columns (Orderid, Orderdt, custid, Endloc). Write a SQL query to return the customers who placed an order within 12 days.
PySpark Question 3: We have two CSV files. One contains department data with the columns dept_name, dept_id; the second contains student data with the columns studentname, stud_id, deptid, total_marks_secured, year. We need to "return the top 5 stds for each dept for each year" in the output format deptname, studid, stud_name, year, total marks.
Question 4: What are facts and dimensions in a data warehouse?

Big Data Engineer

Interviewed at Indium Software

3.9
Mar 21, 2024
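
One possible approach to Question 1 above, sketched in Python with the standard-library sqlite3 module so the SQL can be run as-is. The table and column names come from the question; the sample rows, the function name stores_with_required_pairs, and the assumption that the table holds one row per store and product are illustrative only. The condition "(tea and coffee) or (coffee and jam)" reduces to "sells coffee and at least one of tea or jam", which is what the HAVING clause checks.

```python
import sqlite3

# Hypothetical sample rows (not from the interview): one row per store/product pair.
SAMPLE_ROWS = [
    (1, "Store A", "tea"),
    (1, "Store A", "coffee"),
    (2, "Store B", "coffee"),
    (2, "Store B", "jam"),
    (3, "Store C", "tea"),
    (3, "Store C", "jam"),
    (4, "Store D", "coffee"),
]

# A store qualifies if it sells coffee AND at least one of tea or jam.
QUERY = """
SELECT Store_id, store_nm
FROM stores
GROUP BY Store_id, store_nm
HAVING SUM(CASE WHEN LOWER(Product) = 'coffee' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN LOWER(Product) IN ('tea', 'jam') THEN 1 ELSE 0 END) > 0
ORDER BY Store_id
"""


def stores_with_required_pairs(rows):
    """Load rows into an in-memory stores table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE stores (Store_id INTEGER, store_nm TEXT, Product TEXT)"
        )
        conn.executemany("INSERT INTO stores VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Store C (tea and jam, no coffee) and Store D (coffee only) are excluded.
    print(stores_with_required_pairs(SAMPLE_ROWS))
```

The conditional aggregation scans the table once; a self-join or INTERSECT version would also work but needs one branch per product pair.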

Glassdoor has 1,784 interview questions and reports from Big Data Engineer interviews.