A peer-to-peer approach to managing data stacks, held live virtually on April 19, 2023
Sangeeta Krishnan
Sr. Analytics Lead
Vikas Ranjan
Senior Leader, Data Intelligence & Innovation
Jennifer Romero-Higgins
Principal Data Architect
Carlos Costa
Data & Analytics Hub Lead & Engineering Director
Sandeep Mehta
Engineering Lead, Data Platforms
Joseph Machado
Senior Data Engineer
Mark Mullins
Chief Data Officer
Dr. Rajkumar J. Bhojan
AI Researcher
Bill Inmon
Founder, Chairman, CEO and Author
Raj Joseph
Founder & CEO
Jess Ramos
Senior Data Analyst
Joel Hernandez
CTO
Olga Maydanchik
AVP, Metadata Architecture, Enterprise Data Management
Mike Fuller
CTO
Nicole Radziwill
SVP & Chief Data Scientist
Manimuthu Aayyannan
Senior Manager II, Feature Engineering
Sunny Zhu
ESG Data Analytics & Operations
Subramanya Mulgund
Sr. Software Engineer, Feature Engineering
Mike Mooney
Co-Founder
Mark Kidwell
Chief Data Architect
Carlos Rodríguez
Data and Analytics Manager
Monica Kay Royal
Founder & Chief Data Enthusiast
Matthew Norton
Product Owner
Dr. Alexander Mikhalev
Director
Ted Sfikas
Senior Director of Digital Strategy & Value Engineering, Americas
Christopher Chin
Public Speaking Coach
Andrew Gelinas
Co-Founder
Watch all of the DSS 2023 sessions on-demand now!
Opening notes
Andrew Gelinas, Co-Founder @ Solution Monday
Opening notes as we kick off Data Stack Summit 2023.
Thank you to all speakers, attendees, and sponsors who made Data Stack Summit 2023 possible today.
To keep up with announcements about future peer-to-peer data community events, visit solutionmonday.com.
To explore the peer-built StackWizard project, visit stackwizard.com.
Peer-to-Peer Panel: Managing cloud costs right now
Joseph Machado, Senior Data Engineer @ LinkedIn
Carlos Costa, Data & Analytics Hub Lead & Engineering Director @ Adidas
Vikas Ranjan, Senior Leader, Data Intelligence & Innovation @ T-Mobile
Mike Fuller, CTO @ FinOps Foundation
Mike Mooney, Co-Founder @ Solution Monday
Mike Fuller, Carlos, Vikas, and Joseph join Mike Mooney to address cloud cost management. Together they’ll walk through:
- Challenges when it comes to controlling costs associated with managing data in the cloud
- Who on the team should be responsible for cost management
- Tools and processes they’ve implemented to ensure cost management coordination among those responsible for managing data
This panel will discuss the most valuable ways to identify cost-saving opportunities and turn them into successful outcomes.
Modernizing the data stack - keeping it real!
Mark Mullins, Chief Data Officer @ United Community Bank
Raj Joseph, Founder & CEO @ DQ Labs
In a world of active metadata, semantic layers, data contracts, and modernized data quality, it’s easy to overlook the challenges of delivering business value and jump upstream towards a vision of a modern stack.
Hear from a data leader who is currently transforming his bank's entire data stack and team, planning carefully and making progress. This session is about keeping it real, for other leaders who want to learn how to swim in a world of hype and buzzwords.
From Complex to Simplicity: Our DataOps Journey
Jennifer Romero-Higgins, Principal Data Architect @ American Airlines
In this presentation, Jennifer Romero-Higgins, Principal Data Architect, will take a deep dive into American Airlines' DataOps journey, from the challenges the team faced to the solutions they implemented.
She’ll cover how they created the “Easy” button for data and how it has helped reduce the time and effort required to onboard into the cloud and ingest data.
Jennifer will also share insight into how they have created a culture of continuous improvement and collaboration.
An ever-increasing need for data quality
Dr. Rajkumar J. Bhojan, AI Researcher @ Fidelity
As the big data explosion matures even further, we're seeing how closely data quality is linked to information quality, decision quality, and outcome quality. Good data is more important than big data, but how fast can we make good data?
Rajkumar will walk through the new, different challenges teams are facing and explore a real-time use case for data quality.
Maintaining price and performance SLAs across engineering teams
Mark Kidwell, Chief Data Architect @ Autodesk
Getting a data stack up and running is just one step—making sure it’s optimized is the true challenge. This session will cover how Mark and his team have developed their strategy and built a self-service analytics platform.
Mark will take you through the journey of optimizing the modern data stack for price and performance, from common pitfalls and approaches to how his team worked with cross-functional teams.
Turning your data lake into an asset
Bill Inmon, Founder, Chairman, CEO and Author @ ForestRim Technology
Data architecture is constantly evolving. First, there were applications. Then data warehouses. Today we have the data lake. People are discovering that the data lake quickly turns into a data swamp or data sewer. What do you need to do to turn your data lake into a productive, vibrant data lakehouse?
Self-service, metadata-driven data loader framework
Manimuthu Aayyannan, Senior Manager II, Feature Engineering @ Walmart
Subramanya Mulgund, Sr. Software Engineer, Feature Engineering @ Walmart
Join Manimuthu and Subramanya as they share insights around personalization at Walmart via thousands of data apps that generate personalized recommendations to customers.
They'll walk through the challenges and their approaches to solving them, high-level system architecture, metadata design, connectors, orchestration, schedule optimization, and telemetry.
YARN to Kubernetes: Modernizing big data workloads on a massive scale
Vikas Ranjan, Senior Leader, Data Intelligence & Innovation @ T-Mobile
Join Vikas as he shares his perspectives on how to optimize costs and meet cross-department SLAs by migrating a high-scale, high-volume distributed system from YARN to Kubernetes.
Peer-to-Peer Panel: Enabling the analytics end user
Nicole Radziwill, SVP & Chief Data Scientist @ Ultranauts
Sangeeta Krishnan, Sr. Analytics Lead @ Bayer
Jess Ramos, Senior Data Analyst @ Crunchbase
Sunny Zhu, ESG Data Analytics & Operations @ Indeed
An intuitive discussion about practical ways to enable the end user and insights on the evolution of data collaboration.
Building a business-critical data platform to process over £34bn in card transactions
Sandeep Mehta, Engineering Lead, Data Platforms @ Dojo
The UK payments infrastructure has remained unchanged for 20 years, resulting in a fragile and unpredictable system for processing card transactions. Sandeep will discuss their journey in building a PCI DSS-compliant data platform on Kubernetes, using cloud-native technologies to address the challenges of scaling one of Europe's largest fintechs in a highly regulated industry.
His talk will cover security, auto-scaling, data observability, data transformation, schema evolution, and data governance considerations. He aims to inspire the community to build a data stack that can handle millions of transactions per day with four-nines availability, referring to it as "building a nuclear power station - it cannot fail."
NLP & ML data-driven decision making: Taming the curriculum beast
Joel Hernandez, CTO @ eLumen
Carlos Rodríguez, Data and Analytics Manager @ MentorMate
Recent advances in Natural Language Processing (NLP) and Machine Learning (ML) have created unprecedented opportunities for organizations to leverage heterogeneous data scenarios. We can now draw insights not only from structured data but from unstructured data (text) as well.
eLumen partnered with MentorMate to deliver Data Engineering, NLP, and ML tools that harvest and curate course data and learning outcomes for higher education curriculum improvement. Formerly unstructured data tracked by the eLumen Insights platform now has key metadata and graph relations and can be stored in institutional data lakes that drive insights.
Joel and Carlos will share how eLumen Insights leverages graph databases, data lakes, and ML/NLP technologies to drive data curation and insights in heterogeneous unstructured data scenarios.
Is synthetic data useful for data engineers?
Dr. Alexander Mikhalev, Director @ Applied Knowledge Systems
Matthew Norton, Product Owner @ Nationwide
There is a recent buzz around synthetic data, but is it useful for data engineers?
In this talk, we will cover what synthetic data is and how it differs from anonymized (masked) data and fuzzing, give a short overview of current synthetic data vendors, and discuss the benefits that vendor tooling brings to generating synthetic data.
We will conclude the session with a demo of using open-source synthetic data generation to validate a real-time streaming pipeline.
The great debate of data quality vs. data observability
Olga Maydanchik, AVP, Metadata Architecture, Enterprise Data Management @ Voya Financial
Raj Joseph, Founder & CEO @ DQ Labs
Teams are often left to choose between proactively observing the health of their data stacks for anomalies before they affect downstream applications, and driving business decisions by identifying which data is fit for use. Is it one or the other, or are both needed? Join us to listen in on the debate between two experts in this field and see, through the eyes of the practitioner, how organizations can use these capabilities for better business outcomes.
DataOps teams: Stop Sprinting!
Monica Kay Royal, Founder & Chief Data Enthusiast @ Nerd Nourishment
DevOps and DataOps have a few similarities in the processes and tooling required to achieve the goals of each. So why are data teams struggling with the implementation of DataOps?
Attendees will learn what it takes for data professionals to get things done, what done really means, and why it’s not a good idea to sprint through the data lifecycle.
Designing a Modern Customer Data Center of Excellence
Ted Sfikas, Senior Director of Digital Strategy & Value Engineering, Americas @ Tealium
Organizations globally have amassed an enormous amount of customer data that now requires careful oversight and timely activation in order to meet modern expectations. Doing so will bring brands as close as possible to the customer and automate some of the best revenue-generating processes available. You’re going to need some technology automation to help. You will increase revenue streams and optimize operational processes. So how can companies unlock this customer data? Who holds the key? Creating a Data Center of Excellence charts the path to successful data management in every business – large or small.
Takeaways: In this session, you'll learn who the major stakeholders are that benefit from increased revenue and operational excellence, what a Data Center of Excellence (DCoE) is, and which business practices and topics are in scope for optimization by a DCoE.
How to Solve the Communication Gap between Data Teams and the Business: The 5 Step Framework
Christopher Chin, Public Speaking Coach @ The Hidden Speaker
Stakeholders and data teams are experts in their respective fields. But strained or ineffective communication often leads to failed initiatives, poor decision-making, and deliverables that don’t meet requirements. In this talk, I give an overview of a 5-step framework that solves these issues by building relationships of trust, credibility, and transparency: Listen > Identify the Problem > Propose a Solution > Communicate > Deliver
Where & when?
Data Stack Summit 2023 was held virtually on April 19, 2023.
What is the cost to attend the virtual sessions?
Data Stack Summit is always free and open for all to attend.
What is Data Stack Summit?
Conquering the modern data stack becomes far more achievable when we gather as a community and discuss the tools and capabilities that future-forward organizations want.
Hear real-world perspectives from long-time data visionaries, data engineers, data and cloud architects, DataOps and DevOps practitioners as they talk through topics like the building blocks of the modern data platform, open source considerations, best practices for impactful data operations, migrations, data observability, and tuning data pipelines for performance at scale.
Who comes to Data Stack Summit?
Data and cloud architects, data engineers, DevOps practitioners and managers, data and ITOps leaders
Join us for talks on topics like:
- Building blocks of the modern data platform
- Implementing the modern data platform using open source
- Deploying the modern data platform using K8s
- Best practices for data team operations
- Migrations to modern data platforms
- Optimizing high-performance big data for future-forward organizations
Interested in speaking or sponsoring the next Data Stack Summit?
Please reach out to astronaut@solutionmonday.com.