MungingData - Piles of precious data

The Virtuous Content Cycle for Developer Advocates

mrpowers September 2, 2022 0

This post explains how to scale developer advocacy by creating content in a way that answers current user questions and makes it easier to generate additional content in the future. […]

DevRel Driven Development

mrpowers June 14, 2022 0

DevRel Driven Development is driving software development from developer advocacy activities like creating documentation, writing blog posts, and producing videos. Developer advocates frequently identify public interface warts when creating content. […]

Convert streaming CSV data to Delta Lake with different latency requirements

mrpowers June 4, 2022 0

This blog post explains how to incrementally convert streaming CSV data into Delta Lake with different latency requirements. A streaming CSV data source is used because it’s easy to demo, […]

Install PySpark, Delta Lake, and Jupyter Notebooks on Mac with conda

mrpowers June 1, 2022 0

This blog post explains how to install PySpark, Delta Lake, and Jupyter Notebooks on a Mac. This setup will let you easily run Delta Lake computations on your local machine […]

Ultra-cheap international real estate markets in 2022

mrpowers January 1, 2022 0

This post explains how to identify ultra-cheap international real estate markets and when you can capitalize on deeply discounted prices. Let’s borrow Andrew Henderson’s definition of an ultra-cheap real estate […]

Read multiple CSVs into pandas DataFrame

mrpowers December 28, 2021 0

This post explains how to read multiple CSVs into a pandas DataFrame. pandas filesystem APIs make it easy to load multiple files stored in a single directory or in nested […]

Scale big data pandas workflows with Dask

mrpowers December 27, 2021 0

pandas is a great DataFrame library for datasets that fit comfortably in memory, but it throws out-of-memory exceptions for datasets that are too large. This post shows how pandas […]

Writing NumPy Array to Text Files

mrpowers December 24, 2021 0

This post explains the different ways to save a NumPy array to text files. After showing the different syntax options, the post will teach you some better ways to write […]

Content creators making more than $50,000 a month

mrpowers December 19, 2021 0

This post demonstrates how much money you can make as a content creator and contrasts the content creation and restaurant business models. Content creators can make a lot of money […]

Reading Delta Lakes into Dask DataFrames

mrpowers December 13, 2021 0

This post explains how to read Delta Lakes into Dask DataFrames. It shows how you can leverage powerful data lake management features like time travel, versioned data, and schema evolution […]