Funny, I had penned a note earlier in the year, "DataVaultOps", but you might beat me to publishing something on a similar topic. I don't think data modelling ever died; it's everywhere, just not in the traditional form, because those forms can be time-consuming (key-value pairs have a data model, Cassandra has a data model). But for large enterprises, traditional data modelling is a must!
I think if you deploy in dbt it's easy to forget that the dbt model is still a data model. It's easy to stand up and easy to deploy, but when you need to explain it, dbt only gives you a lineage diagram and not how the content relates to each other.
As for those traditional data modelling steps for conceptual modelling and logical modelling, they still happen but are given different names: Domain Storytelling and Event Storming. In a microservices context they help software engineers build software using Domain-Driven Design; in a data mesh they help define a data model.
Looking forward to catching up in Australia! Lots to talk about
I couldn't agree more. Data modelling is definitely seeing a "fall and rise of" phenomenon - and so it should. It would surprise people how much of ML & AI is built on, and works because of, a sound underlying data model. Last year, I wrote about how data modelling is the key differentiator that makes a data warehouse. Here's an excerpt:
"With the appropriation of the term data warehousing to mean a whole lot of things it shouldn’t, the general understanding of what a data warehouse is has become a nomenclature problem more than anything else. Many people understand that the distinction between a data warehouse and a database is solely that of the underlying infrastructure.
Although infrastructure plays a significant role in building a data warehouse, data warehousing has one more aspect, which, I think, supersedes in importance — data modeling. Because of the failure to acknowledge data modeling as one of the core ideas behind data warehousing, the common understanding of data warehouses is flawed."
Data modelling gives data a structure that provides hints to the systems performing computation, transformation, and movement on that data, and those hints yield tremendous performance benefits. It's because of this that I think data modelling resembles the age-old wisdom that every generation rejects in its teenage years, only to return to it after a few years of making sense of the world.
very cool
Joe - thank you for your insight on these issues. As I told you a few months ago when we talked on Zoom, I find you are one of the few people hitting these issues on the nose for our industry. I feel as though you put into words what are only half-formed thoughts for many people.
Thanks Ricky. It’s an issue that is thankfully gaining momentum.
I just get nervous with the classic business opposition to things like this: ROI. Although it seems obvious that at some level, practices like data modeling (and data management in general) will lead to positive business outcomes, it can be hard to articulate exactly why/when/how much.
I just listened to Attia's interview on Huberman's show and loved it! One of the top episodes so far, alongside Huberman's interview with Andy Galpin.
Oh nice!
Good to see you Danny. Let’s catch up soon
Sounds great! I'm in an area without phone reception for another week (you can read about it on my Substack 😉) but once I'm back in SF would love to catch up!
This writing is glorious and conclusions are accurate!
Thanks Paul! Much appreciated
“Bring back and revamp the practice of conceptual and logical data modeling. Make these simple to do, iterative, and traverse the data lifecycle.”
Have you considered Domain Storytelling?
Not as a replacement but together with Domain Mapping I used this as an accelerator to Conceptual Modelling at a customer. The process was so intuitive that after I did one, the customer’s own data engineers did the rest!