January 15, 2025 active

Security Vulnerability Data Pipelines

Enterprise data pipelines capturing vulnerability signals across 70,000+ applications at Macquarie Group

Applications monitored: 70,000+
Signal sources: endpoints, containers, servers
Infrastructure: EC2, Lambda, S3
dbt, Airflow, Apache Iceberg, AWS, Trino/Starburst, Grafana

The Problem

Macquarie Group needed org-wide visibility into security vulnerabilities across endpoints, containers, and servers. Existing monitoring was fragmented: no single pipeline captured the full picture across 70,000+ applications.

What I Built

Designed and deployed security-critical data pipelines using dbt, Airflow, and Apache Iceberg on AWS. The system captures vulnerability signals from endpoints, containers, and servers and makes them queryable for security and risk stakeholders.
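
As a rough illustration of what capturing signals from several sources into one queryable shape can involve, here is a minimal Python sketch. It is not the production code: the `VulnerabilitySignal` record, the `FIELD_MAPS` table, and every field name in it are hypothetical, and real scanner payloads will look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnerabilitySignal:
    """Hypothetical unified record; one row per finding."""
    app_id: str
    source: str      # "endpoint", "container", or "server"
    cve_id: str
    severity: str    # lower-cased so downstream filters are consistent


# Hypothetical per-source field names (assumption, for illustration only).
FIELD_MAPS = {
    "endpoint": {"app_id": "application", "cve_id": "cve", "severity": "risk"},
    "container": {"app_id": "app", "cve_id": "vulnerability_id", "severity": "severity"},
    "server": {"app_id": "owner_app", "cve_id": "cve_id", "severity": "level"},
}


def normalize(source: str, raw: dict) -> VulnerabilitySignal:
    """Map one raw finding from a given source onto the unified record."""
    mapping = FIELD_MAPS[source]  # raises KeyError for an unknown source
    return VulnerabilitySignal(
        app_id=raw[mapping["app_id"]],
        source=source,
        cve_id=raw[mapping["cve_id"]],
        severity=raw[mapping["severity"]].lower(),
    )


# Placeholder payload; the CVE identifier is deliberately not a real one.
example = normalize(
    "container",
    {"app": "payments", "vulnerability_id": "CVE-0000-0001", "severity": "HIGH"},
)
```

Keeping the per-source differences in a lookup table rather than in branching code means adding a new signal source is a data change, which is easier to review in a security-critical pipeline.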

Also redesigned AWS infrastructure (EC2, Lambda) to reduce cost and operational risk while maintaining security monitoring guarantees.

What I Learned

Working at the intersection of data engineering and security taught me that pipeline reliability isn’t optional when your data feeds regulatory reporting. Schema changes need to be backward-compatible, and data freshness SLAs are non-negotiable when security teams depend on your output.
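
The two guardrails mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the checks that ran in production: the schema representation (column name mapped to a type and a required flag), the function names, and the 6-hour SLA are all hypothetical. The allowed type promotions mirror the int-to-long and float-to-double widening that Apache Iceberg permits.

```python
from datetime import datetime, timedelta, timezone

# Widening changes treated as safe (assumption: a conservative subset
# of the type promotions Apache Iceberg allows).
SAFE_PROMOTIONS = {("int", "long"), ("float", "double")}


def breaking_changes(old: dict, new: dict) -> list:
    """List the reasons `new` would break consumers reading `old`.

    Each schema maps column name -> (type, required).
    An empty list means the change is backward-compatible under these rules.
    """
    problems = []
    for name, (old_type, _) in old.items():
        if name not in new:
            problems.append(f"column removed: {name}")
            continue
        new_type, _ = new[name]
        if new_type != old_type and (old_type, new_type) not in SAFE_PROMOTIONS:
            problems.append(f"type changed: {name} {old_type} -> {new_type}")
    for name, (_, required) in new.items():
        if name not in old and required:
            problems.append(f"new required column: {name}")
    return problems


def is_fresh(last_loaded_at: datetime, now: datetime, sla: timedelta) -> bool:
    """True if the most recent load is within the freshness SLA."""
    return now - last_loaded_at <= sla


# Hypothetical schemas: one widened type and one new optional column.
old_schema = {"app_id": ("string", True), "cvss_score": ("float", False)}
new_schema = {
    "app_id": ("string", True),
    "cvss_score": ("double", False),
    "source": ("string", False),
}
compatible = breaking_changes(old_schema, new_schema) == []  # True

# Hypothetical 6-hour SLA, checked against a load 7 hours old.
now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
stale = not is_fresh(now - timedelta(hours=7), now, timedelta(hours=6))  # True
```

Running a check like `breaking_changes` before deploying a model change, and a check like `is_fresh` on a schedule, turns both requirements into failures that surface before security teams see bad or stale data.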