This repo supplements the YouTube video on PySpark for Glue. It includes a CloudFormation template that creates the S3 bucket, Glue tables, IAM roles, and CSV data files. Below are the schemas ...
pre-commit is a handy development tool that automates setting up pre-commit hooks. After installation and configuration, pre-commit will run your hooks before you commit any change. CREATE OR REPLACE ...
Read this SQL tutorial to learn when to use SELECT, JOIN, subselects and UNION to access multiple tables with a single statement. It’s sometimes difficult to know which SQL syntax to use when ...
SQL is neither the fastest nor the most elegant way to talk to databases, but it is the best way we have. Here’s why. Today, Structured Query Language is the standard means of manipulating and querying ...