Results-driven software engineer with over 15 years of professional experience in designing and implementing high-performance distributed systems and data platforms. Proven leadership in building and guiding technical teams to deliver scalable, robust solutions for complex systems.
Specialties: Backend Engineering, Distributed Processing, Technical Leadership, Database Systems & Query Optimization, System/Data Platform Design, Search Engines, Messaging Protocols, Cloud Computing (Kubernetes/Docker), Application Profiling & Performance Tuning, Site Reliability Engineering
Technical Leadership: Contributed to the Query Execution team (involved in Query Planning and Connectors) to deliver a unified query experience across dozens of use cases at Datadog.
Core Contributor: Enhanced the product across a wide range of areas, including grammar/parsers, cost-based optimization, and execution, significantly improving the platform’s robustness and correctness guarantees.
Reliability Focus: Managed on-call and operational efforts, standardizing deployments and establishing quality standards with end-to-end tests and golden queries. This improved operational agility and the team’s confidence to “move fast” while keeping the “break things” part limited.
(Key Technologies: Golang, Java, DataFusion, Trino, Apache Calcite, Substrait, Kubernetes, gRPC)
Data Analytics: Contributed to Google Cloud’s Dataflow platform, a data processing system focused on streaming, and the open-source project Apache Beam (unified programming model to define and execute data processing pipelines), maintaining hundreds of pipelines to move huge amounts of data between a large variety of systems.
Technical Leadership: Led the Dataflow Platform team of 5-6 engineers, providing mentorship and guidance while maintaining productivity as an individual contributor. Expanded Dataflow capabilities to handle dozens of new use cases and added flexibility through dynamic templates.
End-to-end Improvements: Implemented significant enhancements to increase security, service reliability and performance. Moved workloads to minimal Docker base images (Distroless), automated processes to reduce operational cost and participated in 24/7 on-call rotations, providing support in several critical incidents.
Automation and Tech Debt Reduction: Created automated playbooks to minimize support toil, considerably reduced technical debt, and increased the contribution velocity for the open-source Dataflow Templates project by over 3x.
(Key Technologies: Java, Python, C++, Apache Beam, Kafka, Docker, Dataflow, Pub/Sub, Bazel)
Big Data Platform Development: Developed a data platform and components for high-volume, scalable streaming pipelines. Utilized NoSQL databases, search engines, and streaming systems to power machine learning-driven consumption engines.
Architectural Leadership: Headed architectural decisions and execution while mentoring the Backend Engineering team, considerably increasing the team's output.
Performance Optimization: Enhanced application performance through advanced JVM and garbage collection tuning techniques and application profiling. Pushed JVM limits by creating an engine to compile dynamic user configurations into bytecode using ASM and ByteBuddy.
(Key Technologies: Java, Python, Kafka, Elasticsearch, Couchbase, Dropwizard, Docker, Data Lakes)
Enterprise Software Development: Contributed to large-scale enterprise software by developing core components and frameworks, including a data integration layer supporting major global manufacturing companies.
Project Design: Designed application integration pipelines utilizing various integration patterns, ensuring real-time performance and high availability.
Tool Creation: Developed internal tools that enhanced productivity, streamlined the development process, and improved software quality, including search engines for source code and database schemas and white-box code validators.
(Key Technologies: Java, Elasticsearch, Lucene, Spring, EJB, SQL, Oracle, Progress, Flex)
Java, Golang, Python, JavaScript/TypeScript, C++, Rust
SQL, Postgres, Trino, Data Lakehouse (Iceberg), Apache Calcite, Substrait, BigQuery, Spanner, Elasticsearch, Redis, Couchbase, Hive, Prometheus, Grafana
Apache Beam/Dataflow, DataFusion, Kafka, Spark, Flink, Pub/Sub, Camel, ActiveMQ
Kubernetes, Docker, Packer, Terraform, Jenkins, Kibana, Prometheus, Datadog, Grafana, Google Cloud Platform (Storage, Dataflow, BigQuery, Compute, Stackdriver)
Spring, Jersey, Dropwizard, JPA, JAX-RS, AspectJ, Nashorn, Jackson, Fastjson, JVM Tuning, GC Profiling, Velocity, Maven
deeplearning.ai (Coursera) • Jul, 2018
deeplearning.ai (Coursera) • Jun, 2018
Princeton University (Coursera) • Apr, 2017
Oracle Corporation • May, 2015
Oracle Corporation • Mar, 2014
Oracle Corporation • Apr, 2011