Bruno Volpato

Staff Software Engineer, Distributed Systems & AI Infrastructure

Staff Software Engineer with 15+ years building high-scale distributed systems, query engines, data platforms, and AI infrastructure. Strongest at turning fragmented data access into durable platform primitives: reliable query layers, heterogeneous connectors, performance work, incident-ready operations, and systems that let product and research teams move faster without weakening production guarantees.

Current focus: query planning, connector/federation layers, distributed execution, AI-assisted service-investigation data paths, evaluation loops, reliability engineering, and platform architecture for high-scale data products.

AI Data

Investigation Agents, Evaluation Loops, Retrieval Systems, RCA Workflows

Systems

Distributed Systems, Query Engines, Federation Layers, SRE

Languages

Golang, Java, Python, C++, SQL, Rust-aware

Platforms

Kubernetes, Docker, GCP, CI/CD, On-call

Data & Query

Trino, DataFusion, Substrait, Apache Calcite, Kafka, BigQuery

Experience

100M+ daily queries served through production observability query infrastructure.
AI Data: service-investigation retrieval paths, diagnostic eval loops, and RCA quality infrastructure.
200+ production pipelines led on a petabyte-scale cloud data platform.
80% CVE surface area reduction from production container hardening.

Datadog

Jan 2024 - Present

Staff Software Engineer

Query Platform Leadership: Lead Query Planning and Connectors work serving hundreds of millions of queries daily across observability use cases, spanning parsers, cost-based optimization, distributed execution engines, and connector reliability.

Federated Data Access: Designed and shipped a production connector platform that reduced engine coupling, expanded from 5 to 15 connectors and 5 to 38 catalog instances, and helped deliver 20+ new data sources in one quarter.

Open Query Systems: Became a committer on Substrait (cross-language serialization for relational algebra) and contributed across DataFusion, Trino, Iceberg, and Calcite to make plans and data access portable across engines.

Performance Engineering: Delivered 50x production query speedups and 14x large-timeframe query speedups, reduced parser p95 latency from roughly 3.2 ms to 720 µs, and cut parser memory by roughly 40x.

Production Reliability: Serve as Core Incident Commander for severe incidents; commanded 15+ incidents, responded to 120+ incidents in the last year, and coordinated follow-ups across monitoring, deployment gates, SLOs, and regression suites.

AI Data Infrastructure: Partnered with Applied AI researchers and scientists on service-investigation data paths, retrieval agents, simulation/evaluation environments, and root cause analysis quality loops.

Stack: Go, Java, Python, SQL, Trino, DataFusion, Apache Calcite, Kubernetes, LLMs, RL

Google

Aug 2022 - Jan 2024

Software Engineer, Tech Lead

Data Platform Leadership: Led the Dataflow Platform team of 5 engineers responsible for 200+ production pipelines processing petabytes of data daily across Google Cloud.

Platform Transformation: Moved the team from bespoke customer pipeline delivery toward reusable infrastructure for teams and partners; reduced new pipeline development time by 60% and enabled 3x more use cases with the same team capacity.

Operational Reliability: Participated in 24/7 on-call handling P0/P1 incidents, customer escalations, and turnarounds while improving tests, probers, rollouts, support automation, and release safety.

Security & Delivery Hygiene: Led migration to minimal Distroless container images, eliminating over 80% of CVE surface area, and drove multi-quarter migrations and partner launches across data platform teams.

Execution & Influence: Recognized by peers for customer focus, incident support, cross-team technical help, and ability to turn ambiguous platform work into shipped infrastructure.

Stack: Java, Python, C++, Apache Beam, Kafka, Docker, Google Cloud Dataflow, GCP

TOTVS Labs

Sep 2015 - Jul 2022

Senior Backend Engineer

Big Data Platform: Architected scalable streaming pipelines processing billions of events daily using Kafka, Elasticsearch, and Couchbase, powering ML-driven recommendation engines for enterprise customers.

Platform Architecture: Set technical direction for backend data processing systems and helped turn high-volume event streams into reusable platform capabilities.

Performance Engineering: Built a dynamic configuration-to-bytecode compilation engine, reducing rule evaluation latency by 80% and enabling real-time personalization at scale.

Stack: Java, Python, Kafka, Elasticsearch, Couchbase, Docker, JVM tuning

TOTVS

Mar 2009 - Aug 2015

Senior Software / R&D Engineer

Enterprise Integration: Designed and implemented a data integration layer serving 5,000+ global manufacturing companies, processing millions of transactions daily with sub-second latency requirements.

Developer Productivity: Built source search and white-box validation tooling, improving team productivity by 30% and reducing the bug escape rate.

System Architecture: Architected real-time event-driven integration pipelines for critical enterprise workflows across distributed deployments.

Stack: Java, Elasticsearch, Lucene, Spring, EJB, SQL, Oracle, Progress

Education

North Carolina State University

M.S. in Computer Science

North Carolina State University

B.S. in Computer Science