The Spark Column class defines a variety of column methods that are vital for manipulating DataFrames. This blog post demonstrates how to instantiate Column objects and covers the commonly used […]

Spark Datasets / DataFrames are filled with null values, and you should write code that gracefully handles these null values. You don’t want to write code that throws NullPointerExceptions – […]

Spark broadcast joins are perfect for joining a large DataFrame with a small DataFrame. Broadcast joins cannot be used when joining two large DataFrames, because the broadcast side has to fit in memory on every executor. This post explains how to do […]