I'm unsure what’s in the air, but I’ve had many conversations about data teams over the last two weeks. Typically, the themes are along the lines of:
Obviously, WTF is a data team?
How can my data team stay relevant? (Shoutout to )
Do I need a data team?
What if my team uses data to make a product? Is that a data team?
WTF is a data team? Good question. The answers are all over the place. It could be a team officially called “the data team.” Or it could be any group of people who use data to enable things within an organization, whether that’s process improvement, operations, or improving a customer-facing mobile app. In my career, I’ve seen the term “data team” used almost every day, but I don’t feel like there’s a single definition. Much of this is because organizations differ in their data uses, which I’ll get into. So, what is a data team? As the old saying goes, you know it when you see it.
From my experience, there are two main types of data teams. These often exist in separate universes.
Universe 1, which we’ll call Enterpriseland, is where data teams usually serve a back-office, internally facing IT function. These teams provide dashboards, “insights” (whatever those are), and reports to the business. Maybe a data scientist sprinkles some “AI” glitter now and then. These teams are also owned by IT (usually controlled by a CIO or sometimes a CDO), which mandates their work and maintains the data team’s budget. The core thing about Enterpriseland is that data isn’t what the company sells to the end customer. Data is a byproduct (sometimes called “business exhaust”) of the enterprise’s day-to-day operations. Data teams are often tasked with making sense of this data, which is usually tricky because data is frequently not the first consideration of the enterprise operators. As enterprises modernize, this might change. However, for the majority of enterprises in the world, data is not a first-class consideration.
In Enterpriseland, it’s often hard to judge the effectiveness and ROI of a data team. Does the team deliver “business value” (whatever the hell that is)? This uncertainty is widespread. My test of data team effectiveness boils down to asking, “What happens if the data team disappears tomorrow?” If you answer along the lines of “all hell breaks loose,” then keep your data team. If the answer is “nothing,” then why have a data team? If the answer is “I don’t know,” this is a grey area. Figure out who the domain experts on the team are and whether the data is in good shape. Could specific tasks of the team be automated, with the domain expert (treat them well) around to answer ad-hoc questions? Are there reports that don’t need to be sent out, so the data team can focus on value-added work? Could you embed the data team into business units to improve areas of your business? Again, this is the murky area I see many data teams in. Even the team might not be sure if their enterprise needs them around. Calculating the ROI of a data team is tricky, and I don’t have a universal answer except to make sure the data team helps drive KPIs for the enterprise (these will vary by company and department) in a positive direction. By the way, this is part of the broader problem of IT being viewed as a cost center. Enterprises think they need IT but are often unsure if they’re maximizing their investment. This has been the case for several decades, and IT still struggles with this question today.
Universe 2 is Productland. Things are different in Productland, which moves much faster than Enterpriseland. Productland “data teams” focus on delivering products. For example, a ride-sharing company does something very data-intensive - getting a ride to a customer promptly and safely. In Productland, data IS the product. There’s a flywheel where more data is used in better ways, which means the product performs better. Better products mean users and paying customers are happy. The data team - this could be the software engineers themselves, a team of data and ML engineers, data scientists, etc. - inherently knows precisely what it contributes to the business. If introducing a new feature improves a KPI, attribution to the data team should be very clear. Contrast this with Enterpriseland, where the value of sending out a dashboard is often unclear. In Productland, data enables better products whose ROI can be calculated.
Data teams will continue struggling to justify their value as long as data in Enterpriseland is a back-office function. But there’s hope. Enterprises are starting to recognize the importance of data. Some are modernizing. Today, this is mainly being driven by the AI hype cycle. Another driver is the popularity of data-product thinking, where data is used to deliver data-enabled products and functionality to end customers. The significant distinction is the focus on end customers. As long as data is internally focused, it will suffer from the same politics and shortsighted behavior that have plagued data teams for decades. When data is used to improve the experience and lives of end customers, the benefits of data (and data teams) are far more evident.
Hope you have a fun weekend!
Thanks,
Joe
P.S. If you haven’t done so, please sign up for Practical Data Modeling. There are lots of great discussions on data modeling, and I’ll also be releasing early drafts of chapters for my new data modeling book here. Thanks!
Cool Weekend Reads
Roadmap: AI Infrastructure (Janelle Teng)
Microsoft AI CEO Mustafa Suleyman audits OpenAI’s code (Semafor)
Will Cloud Software Be Ready for Its AI Moment? (WSJ)
Apple WWDC: AI Announcements Will Enable Home Robot, AR Glasses, Camera AirPods (Bloomberg)
Introducing Apple’s On-Device and Server Foundation Models (Apple Machine Learning Research)
Python at the Speed of Julia (Glass Notebook)
Study finds 268% higher failure rates for Agile software projects (The Register)
New Content, Events, and Upcoming Stuff
Monday Morning Data Chat
Note: Matt Housley and I are taking the Summer off from the Monday Morning Data Chat. We will be back in the Fall with an incredible new lineup.
In case you missed it…
Zhamak Dehghani + Summer Break Special (Spotify, YouTube)
Chris Tabb - Platform Gravity (YouTube)
Ghalib Suleiman - The Zero-Interest Hangover in Data and AI (Spotify, YouTube)
Bart Vandekerckhove - Data Security Deep Dive (Spotify, YouTube)
Yali Sassoon - Using LLMs to Support the Analytics Workflow (Spotify, YouTube)
David Yaffe & John Kutay - The State of Streaming and Change Data Capture (Spotify, YouTube)
The Joe Reis Show
This week…
Doug Needham - Architecture Deep Dive, The Hard Work of Generative AI in the Enterprise, and more (Spotify)
5 Minute Friday - WTF is a “Data Team”? (Spotify)
In case you missed it…
5 Minute Friday - Don’t Be A D*ck (Spotify)
Juha Korpela - Conceptual Data Modeling Deep Dive (Spotify)
5 Minute Friday - History Matters (Spotify)
5 Minute Friday - Career Progression Advice (Spotify)
Yulia Pavlova - AI and Disinformation/Misinformation in the Media (Spotify)
5 Minute Friday - Is Data Modeling a Waste of Time? (Spotify)
Bill Inmon - History Lessons of the Data Industry. This is a real treat and a very rare conversation with the godfather himself (Spotify) - PINNED HERE.
Events I’m At
AI Quality Conference (San Francisco) - June 25th. Register here. I’ll also be DJing there…
(Taking the Summer off, sort of…)
Big Data London - September 18-19. Register here
DataEngBytes (Australia) - Late September/Early October, TBA
Helsinki Data Week - Fall TBA
Lots of other stuff in Europe - Fall, TBA
Asia - Fall, TBA
Thanks! If you want to help out…
Thanks for supporting my content. If you aren’t a subscriber, please consider subscribing to this Substack.
Would you like me to speak at your event? Submit a speaking request here.
You can also find me here:
Monday Morning Data Chat (YouTube / Spotify and wherever you get your podcasts). Matt Housley and I interview the top people in the field. Live and unscripted.
My other show is The Joe Reis Show (Spotify and wherever you get your podcasts). I interview guests on it, and it’s unscripted and free of shilling.
Practical Data Modeling. Great discussions about data modeling with data practitioners. This is also where early drafts of my new data modeling book will be published.
Fundamentals of Data Engineering by Matt Housley and me, available at Amazon, O’Reilly, and wherever you get your books.
Be sure to leave a lovely review if you like the content.
Thanks!
Joe Reis
Data teams are to enterprises as maintenance teams are to facilities. I like how the before-and-after table linked below shows a team moving from being inwardly focused to business-focused, shifting toward collaboration with the business and proactivity rather than reactivity.
Table: https://media.noria.com/sites/Uploads/2019/6/5/70aaacf5-5824-40bc-a91c-f5ba5a203b8d_Capture-4_extra_large.jpeg
From article: https://www.reliableplant.com/Read/31536/high-performance-maintenance-team