History Matters
My “big brother” and mentor, Bill Inmon, lamented to me the other day that people have forgotten the greats in the computing industry. Heavyweights like Grace Hopper, Gene Amdahl, and Ed Yourdon, among many others. But unlike other professions that revere their contributors (think medicine - Hippocrates, Pasteur, etc.), tech and data throw out the legends as soon as it's convenient.
History is important. It helps set the context of the why and what of your profession. If you know the history of something, then you know where things started and morphed to where they are today. You also have a clue as to why that happened. Why did the relational data model come about? Or the data warehouse? What about the perceptron? All of these are building blocks for the field today.
Every innovation you’re currently working with (in this case, LLMs in the last few years) happened because people wanted to improve the status quo. Perceptrons led to neural networks, which led to deep learning. In the case of LLMs, the problem was improving language translation. The transformer was the result in 2017 and paved the way for generative AI. All of this is built on deep learning. If you understand the building blocks of machine learning, you’ll have a much easier time understanding generative AI. It didn’t spring out of nowhere. As for the people who came up with the perceptron, neural networks, deep learning, etc., I suggest looking them up. There’s a lineage of people, and the stories of how they came up with these ideas are fascinating.
Stay ahead of the curve, and don’t forget the history of your field.
Hope you have a fun weekend!
Thanks,
Joe
P.S. Sol Rashidi and I are doing a 3-day course about transitioning from practitioner to executive. If you’re stuck in a career rut and want to level up your situation, get more details and register here.
Next, my good friend Zhamak Dehghani and the NextData team are working on some cool things related to decentralized data and Data Mesh. If you’re in the Bay Area on June 11 or 12th, they’re hosting a private event to demo some of the stuff they’re building. Details are here.
Finally, if you haven’t done so, please sign up for Practical Data Modeling. There are lots of great discussions on data modeling, and I’ll also be releasing early drafts of chapters for my new data modeling book here. Thanks!
Cool Weekend Reads
Slop is the new name for unwanted AI-generated content (Simon Willison)
“Slop” (or “slom”) is the perfect description for the enshittification of the Internet due to unwanted AI-generated content. While some people are excited about the potential for AI to transform the world, who the hell wants a world full of AI-generated nonsense? If it gets bad enough, I’ll unplug and read my massive pile of books instead.
PwC Set to Become OpenAI’s Largest ChatGPT Enterprise Customer (WSJ)
“PwC also developed a chatbot called ChatPwC, built on OpenAI’s GPT-4 model, which more than 100,000 employees are using globally. Employees using tools like ChatPwC have reported a 20% to 40% increase in productivity, it said. Both ChatGPT Enterprise and ChatPwC will be available to employees, Atkinson said, though many will start using ChatGPT Enterprise as it rolls out.”
While I’m not privy to ChatPwC, if employees have this increase in productivity using OpenAI’s GPT-4, it makes me wonder how differentiated PwC’s offering is from any other firm with the same tools. Also, will these productivity gains be passed on to customers as lower fees?
Does AI have a gross margin problem? (Mostly Metrics)
“What will separate the winners from the losers in the quest for “good” gross margins are their relationships with suppliers, most notably AWS / GCP / Azure on the cloud side, and chip manufacturers, like Nvidia, AMD, and Qualcomm.”
An Engineering Manager Challenge (Neward Associates)
"You're the tech lead and your team is getting stretched thin. You decide to add resources (sic; not my choice of words here) but you can afford 1 senior full-stack developer or 2 junior full-stack devs. Which do you choose and why?"
So what would you do?
Other cool reads…
Doing is normally distributed, learning is log-normal (Andrew Quinn's TILs)
Sorry AI, Old-School Spreadsheets Are Still King (WSJ)
Hardly any of us are using AI tools like ChatGPT, study says – here’s why (Techradar)
Prefer Noun-Adjective Naming (Kyle Shevlin)
Unexpected Anti-Patterns for Engineering Leaders — Lessons From Stripe, Uber & Carta (Firstround)
Tiny number of ‘supersharers’ spread the vast majority of fake news (Science)
Don’t Worry About LLMs (Vicki Boykis)
New Content, Events, and Upcoming Stuff
Monday Morning Data Chat
Note - There might be one more episode from Matt and me. After that, we are taking the Summer off from the Monday Morning Data Chat. Back in the Fall, with an incredible new lineup.
In case you missed it…
Chris Tabb - Platform Gravity (YouTube)
Ghalib Suleiman - The Zero-Interest Hangover in Data and AI (Spotify, YouTube)
Bart Vandekerckhove - Data Security Deep Dive (Spotify, YouTube)
Yali Sassoon - Using LLMs to Support the Analytics Workflow (Spotify, YouTube)
David Yaffe & John Kutay - The State of Streaming and Change Data Capture (Spotify, YouTube)
The Joe Reis Show
Coming up - Juha Korpela, Doug Needham, Nick Freund, and many more.
This week…
5 Minute Friday - History Matters (Spotify)
In case you missed it…
5 Minute Friday - Career Progression Advice (Spotify)
Yulia Pavlova - AI and Disinformation/Misinformation in the Media (Spotify)
5 Minute Friday - Is Data Modeling a Waste of Time? (Spotify)
Safiyy Momen - The Good and Bad of the Modern Data Stack, Controlling Cloud Costs, and More (Spotify)
Gordon Wong - Why Most Data Teams Aren’t That Valuable (Spotify)
Roman Yampolskiy - AI Safety & The Dangers of General Super Intelligence (Spotify)
Bill Inmon - History Lessons of the Data Industry. This is a real treat and a very rare conversation with the godfather himself (Spotify) - PINNED HERE.
Events I’m At
Matillion Deep Dish (San Francisco) - June 3rd and 4th. Register here
AI Quality Conference (San Francisco) - June 25th. Register here. Rumor has it I’ll also be DJing there…
(Taking the Summer off, sort of…)
Big Data London - September 18-19. Register here
DataEngBytes (Australia) - Late September/Early October, TBA
Helsinki Data Week - Fall TBA
Lots of other stuff in Europe - Fall, TBA
Asia - Fall, TBA
Thanks! If you want to help out…
Thanks for supporting my content. If you aren’t a subscriber, please consider subscribing to this Substack.
Would you like me to speak at your event? Submit a speaking request here.
Want to sponsor this newsletter? Fill out this short form.
You can also find me here:
Monday Morning Data Chat (YouTube / Spotify and wherever you get your podcasts). Matt Housley and I interview the top people in the field. Live and unscripted.
My other show is The Joe Reis Show (Spotify and wherever you get your podcasts). I interview guests on it, and it’s unscripted and free of shilling.
Practical Data Modeling. Great discussions about data modeling with data practitioners. This is also where early drafts of my new data modeling book will be published.
Fundamentals of Data Engineering by Matt Housley and me, available at Amazon, O’Reilly, and wherever you get your books.
Be sure to leave a lovely review if you like the content.
Thanks!
Joe Reis
I’m trying to find the history of the “Star schema”. Back to 1994 is easy, with the internet. Before that, references are locked up in paper presentations or theses.
Good question. I’ve not personally seen anything before that date.