A GitHub project now offers an Azure Databricks medallion architecture pipeline built with PySpark, Python, and SQL. It processes e-commerce data through Bronze, Silver, and Gold layers, adding ...
Implemented pandas-based cleaning rules in data_preprocessing.py, added transformations for salesorder.csv → clean_salesorder.csv, and tested the pipeline across multiple DAG runs.
This project implements an ETL pipeline that collects weather data for São Paulo every hour, processes it, and stores it in a PostgreSQL database for later analysis.