Introduction to SBT for Spark Programmers
SBT is an interactive build tool that is used to run tests and package your projects as JAR files. SBT lets you create a project in a text editor and […]
Spark DataFrame schemas are defined as a collection of typed columns. The entire schema is stored as a StructType and individual columns are stored as StructFields. This blog post explains […]
Spark has a variety of aggregate functions to group, cube, and rollup DataFrames. This post will explain how to use aggregate functions with Spark. Check out Beautiful Spark Code for […]
The Spark Column class defines a variety of column methods that are vital for manipulating DataFrames. This blog post demonstrates how to instantiate Column objects and covers the commonly used […]
This book will teach you how to be a proficient Apache Spark programmer with minimal effort. Other books focus on the theoretical underpinnings of Spark. This book skips the theory […]
Spark Datasets / DataFrames are filled with null values and you should write code that gracefully handles these null values. You don’t want to write code that throws NullPointerExceptions – […]