Type 2 Slowly Changing Dimension Upserts with Delta Lake
This post explains how to perform type 2 upserts for slowly changing dimension tables with Delta Lake. We’ll start out by covering the basics of type 2 SCDs and when […]
This post explains how to perform type 2 upserts for slowly changing dimension tables with Delta Lake. We’ll start out by covering the basics of type 2 SCDs and when […]
Delta lakes prevent data with incompatible schema from being written, unlike Parquet lakes which allow for any data to get written. Let’s demonstrate how Parquet allows for files with incompatible […]
Delta makes it easy to update certain disk partitions with the replaceWhere option. Selectively applying updates to certain partitions isn’t always possible (sometimes the entire lake needs the update), but […]
This blog posts explains how to update a table column and perform upserts with the merge command. We explain how to use the merge command and what the command does […]
Delta lakes are versioned so you can easily revert to old versions of the data. In some instances, Delta lake needs to store multiple versions of the data to enable […]
This post explains how to compact small files in Delta lakes with Spark. Data lakes can accumulate a lot of small files, especially when they’re incrementally updated. Small files cause […]
Delta Lake is a wonderful technology that adds powerful features to Parquet data lakes. This blog post demonstrates how to create and incrementally update Delta lakes. We will learn how […]