PySpark Questions and Answers

  1. What is PySpark?

  2. What are the key features of PySpark?

  3. How does PySpark handle data?

  4. What is an RDD?

  5. How do you create an RDD in PySpark?

  6. What is the difference between map() and flatMap() in PySpark?

  7. What are actions and transformations in PySpark?

  8. Explain the concept of lazy evaluation in PySpark.

  9. How does Spark optimize execution plans?

  10. What is a DataFrame in PySpark?

  11. How do you perform joins in PySpark?