10+ years designing reliable, scalable data platforms across enterprise data warehouses and modern cloud-native pipelines on AWS, Snowflake, and Apache Iceberg.
Get in touch →

I'm Nguyen Le. I design data pipelines that are reliable, scalable, and portable across Snowflake and Redshift from a single dbt codebase.
My background spans the full data stack, from raw ingestion and orchestration to dimensional modeling, dbt transformation layers, and stakeholder-facing dashboards. I have delivered at scale inside United and Continental Airlines through a major merger, and independently through ZenClarity Consulting.
I specialize in modernizing legacy pipelines, replacing brittle ETL/ELT with cost-aware, idempotent, observable systems built on AWS, Snowflake, Airflow, and Apache Iceberg.
End-to-end data engineering platform built on AWS, featuring a production-grade Iceberg Migration Framework as the core ingestion layer. V2 introduces cost-aware engine routing between Glue and EMR based on data volume, idempotent orchestration via Airflow with a DynamoDB audit trail, and a full dbt transformation stack on Snowflake and Redshift with 35.6M clean records across staging, intermediate, and mart layers.
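The cost-aware routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual logic: the threshold value, engine names, and function name are all illustrative assumptions.

```python
# Hypothetical sketch of cost-aware engine routing: small batches run on
# Glue (serverless, fast startup), large ones on EMR (cheaper at scale).

GLUE_MAX_GB = 50.0  # assumed cutoff, not from the project


def choose_engine(input_size_gb: float) -> str:
    """Pick the execution engine for a batch based on its data volume."""
    if input_size_gb < 0:
        raise ValueError("input size must be non-negative")
    return "glue" if input_size_gb <= GLUE_MAX_GB else "emr"
```

In practice the orchestrator (Airflow, per the description) would call a check like this per run and branch the DAG accordingly.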
Full medallion architecture implemented as staging, intermediate, and mart layers in dbt, with a dedicated DQ quarantine layer for multi-reason failure tracking and a clean separation between infrastructure and modeling concerns. The dbt transformation layer is architected for multi-engine portability: a single codebase deploys to both Snowflake and Redshift using target-aware Jinja conditionals. The architecture also includes a delta ingestion layer for ongoing monthly loads, completing the full ingest-to-mart pipeline cycle.
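Target-aware portability of this kind typically branches on dbt's `target.type` Jinja variable inside a shared model. As a rough illustration, the sketch below simulates that branch in Python for one real engine divergence (current-timestamp syntax); the helper name is hypothetical.

```python
# Hypothetical sketch: one shared model emits engine-specific SQL.
# In dbt this would be a Jinja conditional on target.type; here the
# same decision is shown as a plain Python function.

def now_expr(target_type: str) -> str:
    """Return the current-timestamp expression for the active warehouse."""
    if target_type == "snowflake":
        return "current_timestamp()"  # Snowflake syntax
    if target_type == "redshift":
        return "getdate()"            # Redshift syntax
    raise ValueError(f"unsupported target: {target_type}")
```

Isolating these divergences behind small conditionals is what lets one codebase deploy cleanly to both engines.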
View on GitHub →

Delivered data engineering solutions across mission-critical airline operations, including Cargo Operations and Revenue Management, through the United-Continental merger. Led the migration of 170+ data landing zones to a secure SFTP platform, consolidating internal file shares, FTP connections, and database links from internal teams, external partners, and global vendors into a compliant, standardized ingestion layer.
Maintained SOX-aligned audit reporting infrastructure and operated the Enterprise Data Warehouse at 99.9% availability across mission-critical airline operations. Drove ETL performance improvements of 30-65% across critical operational systems, including the progressive offload of Teradata workloads to Hadoop and EMR/Spark on S3.
Available for Senior Data Engineering and Analytics Engineering roles in Southern California and remote. Open to full-time and contract opportunities.
Actively exploring Senior Data Engineer and Analytics Engineer roles where I can bring production-grade pipeline architecture, modern data stack expertise, and a track record of delivering at enterprise scale. Based in Orange County, CA. Available for hybrid Southern California and remote positions.