Just Enough Scala for Spark Programmers
Spark programmers only need to know a small subset of the Scala API to be productive. Scala has a reputation for being a difficult language to learn, and that scares […]
sbt-assembly makes it easy to shade dependencies in your Spark projects when you create fat JAR files. This blog post will explain why it’s useful to shade dependencies and will […]
Spark SQL functions make it easy to perform DataFrame analyses. This post will show you how to use the built-in Spark SQL functions and how to build your own SQL […]
Spark DataFrames are similar to tables in relational databases – they store data in columns and rows and support a variety of operations to manipulate the data. Here’s an example […]
Spark codebases can easily become a collection of order-dependent custom transformations (see this blog post for background on custom transformations). Your library will be difficult to use if many […]