Aggregations with Spark (groupBy, cube, rollup)
Spark has a variety of aggregate functions to group, cube, and rollup DataFrames. This post will explain how to use aggregate functions with Spark. Check out Beautiful Spark Code for […]
The Spark Column class defines a variety of column methods that are vital for manipulating DataFrames. This blog post demonstrates how to instantiate Column objects and covers the commonly used […]
This book will teach you how to be a proficient Apache Spark programmer with minimal effort. Other books focus on the theoretical underpinnings of Spark. This book skips the theory […]
Spark Datasets / DataFrames are filled with null values and you should write code that gracefully handles these null values. You don’t want to write code that throws NullPointerExceptions – […]
Spark supports DateType and TimestampType columns and defines a rich API of functions to make working with dates and times easy. This blog post demonstrates how to make DataFrames […]
Spark code will run faster with certain data lakes than others. For example, Spark will run slowly if the data lake uses gzip compression and has unequally sized files (especially […]
Spark runs slowly when it reads data from a lot of small files in S3. You can make your Spark code run faster by creating a job that compacts small […]
Spark broadcast joins are perfect for joining a large DataFrame with a small DataFrame. Broadcast joins cannot be used when joining two large DataFrames. This post explains how to do […]
This blog post explains how to filter duplicate records from Spark DataFrames with the dropDuplicates() and killDuplicates() methods. It also demonstrates how to collapse duplicate records into a single row […]
Spark programmers only need to know a small subset of the Scala API to be productive. Scala has a reputation for being a difficult language to learn, and that scares […]