Using Delta Lake merge to update columns and perform upserts
This blog post explains how to update a table column and perform upserts with the merge command. We explain how to use the merge command and what the command does […]
Delta lakes are versioned, so you can easily revert to old versions of the data. In some instances, Delta Lake needs to store multiple versions of the data to enable […]
This post explains how to compact small files in Delta lakes with Spark. Data lakes can accumulate a lot of small files, especially when they’re incrementally updated. Small files cause […]
This post describes how to programmatically compact Parquet files in a folder. Incremental updates frequently result in lots of small files that can be slow to read. It’s best to […]