A peer-to-peer approach to managing data stacks live virtually on 4.19.2023

Sangeeta Krishnan

Sr. Analytics Lead

Vikas Ranjan

Senior Leader, Data Intelligence & Innovation

Jennifer Romero-Higgins

Principal Data Architect

Carlos Costa

Data & Analytics Hub Lead & Engineering Director

Sandeep Mehta

Engineering Lead, Data Platforms

Joseph Machado

Senior Data Engineer

Mark Mullins

Chief Data Officer

Dr. Rajkumar J. Bhojan

AI Researcher

Bill Inmon

Founder, Chairman, CEO and Author

Raj Joseph

Founder & CEO

Jess Ramos

Senior Data Analyst

Joel Hernandez

CTO

Olga Maydanchik

AVP, Metadata Architecture, Enterprise Data Management

Mike Fuller

CTO

Nicole Radziwill

SVP & Chief Data Scientist

Manimuthu Aayyannan

Senior Manager II, Feature Engineering

Sunny Zhu

ESG Data Analytics & Operations

Subramanya Mulgund

Sr. Software Engineer, Feature Engineering

Mike Mooney

Co-Founder

Mark Kidwell

Chief Data Architect

Carlos Rodríguez

Data and Analytics Manager

Monica Kay Royal

Founder & Chief Data Enthusiast

Matthew Norton

Product Owner

Dr. Alexander Mikhalev

Director

Ted Sfikas

Senior Director of Digital Strategy & Value Engineering, Americas

Christopher Chin

Public Speaking Coach

Andrew Gelinas

Co-Founder

Watch all of the DSS 2023 sessions on-demand now!

Opening notes

Andrew Gelinas, Co-Founder @ Solution Monday

Opening notes as we kick off Data Stack Summit 2023.

Thank you to all speakers, attendees, and sponsors who made Data Stack Summit 2023 possible today.

To keep up with announcements about future peer-to-peer data community events, visit solutionmonday.com.

To explore the peer-built StackWizard project, visit stackwizard.com.

Peer-to-Peer Panel: Managing cloud costs right now

Joseph Machado, Senior Data Engineer @ LinkedIn

Carlos Costa, Data & Analytics Hub Lead & Engineering Director @ Adidas

Vikas Ranjan, Senior Leader, Data Intelligence & Innovation @ T-Mobile

Mike Fuller, CTO @ FinOps Foundation

Mike Mooney, Co-Founder @ Solution Monday

Mike Fuller, Carlos, Vikas, and Joseph join Mike Mooney to address cloud cost management. Together they’ll walk through:

Challenges when it comes to controlling costs associated with managing data in the cloud
Who on the team should be responsible for cost management
Tools and processes they’ve implemented to ensure cost management coordination among those responsible for managing data

Of all the ways to control costs, this panel will focus on the most valuable ones for identifying cost-saving opportunities and achieving successful outcomes.

Modernizing the data stack - keeping it real!

Mark Mullins, Chief Data Officer @ United Community Bank

Raj Joseph, Founder & CEO @ DQ Labs

In a world of active metadata, semantic layers, data contracts, and modernized data quality, it’s sometimes easy to overlook the challenges of delivering business value and jump upstream towards a vision of a modern stack.

Hear from a true data leader who is currently transforming his entire banking data stack and team through careful planning, and making progress. This session is about keeping it real, and it is for other leaders who want to learn how to swim in a world of hype and buzzwords.

From Complex to Simplicity: Our DataOps Journey

Jennifer Romero-Higgins, Principal Data Architect @ American Airlines

In this presentation, Jennifer Romero-Higgins, Principal Data Architect, will take a deep dive into American Airlines' DataOps journey, from the challenges the team faced to the solutions they implemented.

She’ll cover how they created the “Easy” button for data and how it has helped reduce the time and effort required to onboard into the cloud and ingest data.

Jennifer will also share insight into how they have created a culture of continuous improvement and collaboration.

An ever-increasing need for data quality

Dr. Rajkumar J. Bhojan, AI Researcher @ Fidelity

As the big data explosion matures even further, we're seeing how closely data quality is linked to information quality, decision quality, and outcome quality. So, good data is more important than big data, but how fast can we make good data?

Rajkumar will walk through the new, different challenges teams are facing and explore a real-time use case for data quality.

Maintaining price and performance SLAs across engineering teams

Mark Kidwell, Chief Data Architect @ Autodesk

Getting a data stack up and running is just one step—making sure it’s optimized is the true challenge. This session will cover how Mark and his team have developed their strategy and built a self-service analytics platform.

From common pitfalls and approaches to how they worked with cross-functional teams, Mark will take you through the journey of truly optimizing the modern data stack for price and performance.

Turning your data lake into an asset

Bill Inmon, Founder, Chairman, CEO and Author @ ForestRim Technology

Data architecture is constantly evolving. First, there were applications. Then data warehouses. Today we have the data lake. People are discovering that the data lake quickly turns into a data swamp or data sewer. What do you need to do to turn your data lake into a productive, vibrant data lakehouse?

Self service metadata driven data loader framework

Manimuthu Aayyannan, Senior Manager II, Feature Engineering @ Walmart

Subramanya Mulgund, Sr. Software Engineer, Feature Engineering @ Walmart

Join Manimuthu and Subramanya as they share insights into personalization at Walmart, where thousands of data apps generate personalized recommendations for customers.

They'll walk through relevant challenges and solution approaches, high-level system architecture, metadata design, connectors, orchestration, schedule optimization, and telemetry.

YARN to Kubernetes: Modernizing big data workloads on a massive scale

Vikas Ranjan, Senior Leader, Data Intelligence & Innovation @ T-Mobile

Join Vikas as he shares his perspectives on how to optimize costs and meet cross-department SLAs by migrating a high-scale, high-volume distributed system from YARN to Kubernetes.

Peer-to-Peer Panel: Enabling the analytics end user

Nicole Radziwill, SVP & Chief Data Scientist @ Ultranauts

Sangeeta Krishnan, Sr. Analytics Lead @ Bayer

Jess Ramos, Senior Data Analyst @ Crunchbase

Sunny Zhu, ESG Data Analytics & Operations @ Indeed

An intuitive discussion about practical ways to enable the end user and insights on the evolution of data collaboration.

Building a business-critical data platform to process over £34bn in card transactions

Sandeep Mehta, Engineering Lead, Data Platforms @ Dojo

The UK payments infrastructure has remained unchanged for 20 years, resulting in a fragile and unpredictable system for processing card transactions. Sandeep will discuss their journey in building a PCI DSS-compliant data platform on Kubernetes, using cloud-native technologies to address the challenges of scaling one of Europe's largest fintechs in a highly regulated industry.

His talk will cover security, auto-scaling, data observability, data transformation, schema evolution, and data governance considerations. He aims to inspire the community to build a data stack that can handle millions of transactions per day with four-nines (99.99%) availability, likening it to "building a nuclear power station - it cannot fail."

NLP & ML data-driven decision making: Taming the curriculum beast

Joel Hernandez, CTO @ eLumen

Carlos Rodríguez, Data and Analytics Manager @ MentorMate

Recent advances in Natural Language Processing (NLP) and Machine Learning (ML) have created unprecedented opportunities for organizations to leverage heterogeneous data scenarios. We can now draw insights not only from structured data but from unstructured data (text) as well.

eLumen partnered with MentorMate to deliver Data Engineering, NLP, and ML tools that harvest and curate course data and learning outcomes for higher education curriculum improvement. Formerly unstructured data tracked by the eLumen Insights platform now has key metadata and graph relations and can be stored in institutional data lakes that drive insights.

Joel and Carlos will share how eLumen Insights leverages graph databases, data lakes, and ML/NLP technologies to drive data curation and insights in heterogeneous unstructured data scenarios.

Is synthetic data useful for data engineers?

Dr. Alexander Mikhalev, Director @ Applied Knowledge Systems

Matthew Norton, Product Owner @ Nationwide

There has been recent buzz around synthetic data, but is it useful for data engineers?

In this talk, we will cover synthetic data and how it differs from anonymized (masked) data and fuzzing, and give a short overview of current synthetic data vendors.

We will also discuss the benefits that vendor tooling brings to generating synthetic data. We will conclude the session with a demo of using open-source synthetic data generation to validate a real-time streaming pipeline.

The great debate of data quality vs. data observability

Olga Maydanchik, AVP, Metadata Architecture, Enterprise Data Management @ Voya Financial

Raj Joseph, Founder & CEO @ DQ Labs

You are often left with the question of whether to proactively observe the health of data stacks for anomalies before they can affect downstream applications, or to drive business decisions by identifying data that is fit for use. Is it one or the other, or are both needed? Join us to listen in on the debate between two experts in this field and see how organizations can use these capabilities for better business outcomes, through the eyes of the practitioner.

DataOps teams: Stop Sprinting!

Monica Kay Royal, Founder & Chief Data Enthusiast @ Nerd Nourishment

DevOps and DataOps have a few similarities in the processes and tooling required to achieve the goals of each. So why are data teams struggling with the implementation of DataOps?

Attendees will learn what it takes for data professionals to get things done, what done really means, and why it’s not a good idea to sprint through the data lifecycle.

Designing a Modern Customer Data Center of Excellence

Ted Sfikas, Senior Director of Digital Strategy & Value Engineering, Americas @ Tealium

Organizations globally have amassed an enormous amount of customer data that now requires careful oversight and timely activation in order to meet modern expectations. Doing so will bring brands as close as possible to the customer and automate some of the best revenue-generating processes available, increasing revenue streams and optimizing operational processes. You’re going to need some technology automation to help. So how can companies unlock this customer data? Who holds the key? Creating a Data Center of Excellence charts the path to successful data management in every business, large or small.

Takeaways: In this session, you'll learn which major stakeholders benefit from increased revenue and operational excellence, what a Data Center of Excellence (DCoE) is, and which business practices and topics are in scope for optimization by a DCoE.

How to Solve the Communication Gap between Data Teams and the Business: The 5 Step Framework

Christopher Chin, Public Speaking Coach @ The Hidden Speaker

Stakeholders and data teams are experts in their respective fields. But strained or ineffective communication often leads to failed initiatives, poor decision-making, and deliverables that don’t meet requirements. In this talk, I give an overview of a 5-step framework that addresses these issues by building relationships of trust, credibility, and transparency: Listen > Identify the Problem > Propose a Solution > Communicate > Deliver.

Where & when?

Data Stack Summit 2023 was held virtually on April 19, 2023.

What is the cost to attend the virtual sessions?

Data Stack Summit is always free and open for all to attend.

What is Data Stack Summit?

Conquering the modern data stack efficiently becomes far more achievable when we gather as a community to discuss the tools and capabilities desired by future-forward organizations.

Hear real-world perspectives from long-time data visionaries, data engineers, data and cloud architects, and DataOps and DevOps practitioners as they talk through topics like the building blocks of the modern data platform, open source considerations, best practices for impactful data operations, migrations, data observability, and tuning data pipelines for performance at scale.

Who comes to Data Stack Summit?

Data and cloud architects, data engineers, DevOps practitioners and managers, and data and ITOps leaders.

Join us for talks on topics like:

Building blocks of the modern data platform
Implementing the modern data platform using open source
Deploying the modern data platform using K8s
Best practices for data team operations
Migrations to modern data platforms
Optimizing high-performance big data for future-forward organizations

Interested in speaking or sponsoring the next Data Stack Summit?

Please reach out to astronaut@solutionmonday.com.