Why Data Modeling Must Start with Human Understanding

A Conversation on Information Modeling, AI, and the Real Challenge of Enterprise Data

In a recent conversation with a data professional exploring database development tools, Marco Wobben offered a perspective that cuts straight through the noise surrounding modern data projects. With over 25 years in information modeling, he's seen the same pattern repeat: technology advances rapidly, yet the biggest failures trace back to something very human.

The Communication Gap Technology Can't Fix

"Before you can model anything," Marco says, "you need to plow through a whole field of human communication."

The traditional workflow is familiar: interview experts, draw diagrams, interpret requirements, and produce a technical design. The catch? "Most of the hard work, towards real understanding," he says, "never leaves the data professional's head."

He sees the same scenario play out repeatedly:

"They think they heard what the business meant. They think they understood it. And then they implement a solution that fits the requirements as they perceived them. The gap in communication—and the lack of a feedback loop—is enormous."

When the technical team presents their solution, business stakeholders reply with a shrug: "Look, you're the expert. I don't know what this diagram means." And with that, meaningful feedback becomes impossible.

Why Glossaries Are Not the Whole Story

Organizations often respond by building business glossaries. But Marco has learned not to mistake definitions for understanding.

"Even with a glossary, you're still not cutting it," he says.

Take "inventory." The definition is easy: the amount of products you have. But each department quietly redefines it for its own use:

  • Sales subtracts everything already promised to clients.
  • Purchasing adds everything that's on order.
  • The warehouse looks at physical reality and sees something different again.
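To make the divergence concrete, here is a minimal sketch (names and numbers invented for illustration) of how three departments can compute "inventory" from the very same figures and get three different answers:

```python
# Hypothetical sketch: three departments computing "inventory"
# from the same underlying figures. All values are illustrative.

on_hand = 500    # units physically counted in the warehouse
promised = 120   # units already committed to client orders
on_order = 200   # units ordered from suppliers, not yet received

# Sales: what can still be promised to new clients
sales_inventory = on_hand - promised

# Purchasing: what the supply pipeline looks like
purchasing_inventory = on_hand + on_order

# Warehouse: what is physically on the shelves
warehouse_inventory = on_hand

print(sales_inventory, purchasing_inventory, warehouse_inventory)
# Three departments, one word, three different numbers.
```

Nothing here is wrong in isolation; each calculation is correct within its own context. The problem is that no system records how these contexts relate.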

"So everyone uses the same words," Marco notes, "but everyone thinks from their own context. And nobody explains how their context relates to someone else's."

It gets worse when teams collaborate:

"Imagine you and I work together for a while—we develop shortcuts. A number scribbled on a note is enough. But then a new colleague joins, or we have to work with another department, and they've developed their own shortcuts. Nobody talks clearly and explicitly about their work anymore. Suddenly we all need to communicate—and we realize that's practically impossible. Systems lack the original communication and therefore create this meaning debt for the organization as a whole."

The AI Reality Check

With generative AI entering the enterprise stack, many expect automation to close the gap. Marco is blunt about the limits.

"When ChatGPT came out, I thought: great—how can I use this without it making me obsolete? And very quickly I learned: it's about 80% right, but I'm not sure which 80%."

But the real revelation came when he asked the model about domains he did understand:

"As soon as I asked questions in my field of expertise, the answers were… well, they were crap. Confident crap."

This is exactly what worries him: "Feed it a technical catalog and people expect magic. But the model has no idea what your data means. And even if it gives an answer, how do you know it's correct? You don't."

He uses an analogy that lands hard:

"It's like throwing a shoebox full of receipts at an LLM and expecting a clean, reconciled accounting system."

Data Is a Program—Not a Project

One of Marco's biggest frustrations is how organizations treat data initiatives.

"Business says: 'We need this; let's start a project.' They don't realize that data is the distilled communication of how they work."

He compares it to other core disciplines:

"You don't run HR as a project. HR is a program—it's ongoing. Same with finance. Last year's accounting is not 'done.' Neither is data."

If organizations saw data as a continuous program, glossaries wouldn't be the end—they'd be the first brick in an ongoing professional discipline. It'd be part of the business as a whole, not just a technical artifact, a product, or an application.

The Hidden Cost of Moving Too Fast

Pressure to deliver fast has always existed, but AI accelerates it to unsafe extremes.

"I'm worried," Marco admits. "AI makes everything look easy and fast—and that leads to more technical debt, not less."

His story about privacy lawyers illustrates the real work required. "I had to peel back their statements for two hours just to understand one requirement," he recalls. At the end, the lawyer looked stunned: "I had no idea my job was that complex."

Throughout his career, he's seen this everywhere—from lawyers to surgeons to railway engineers:

"Expecting someone with years of specialized knowledge to explain their needs in a couple of hours? That's wildly naive."

AI will produce something that looks right. But Marco asks the essential question:

"It's 80% correct—but which 80%? And how do we verify the rest when we don't understand their domain?"

The Data Outlives Everything

Over the decades, Marco has rebuilt systems, and watched them get rebuilt—from desktop apps to web interfaces to mobile platforms.

"But the one thing that never changed," he says, "was the database. It survived every rewrite." And this is why he warns:

"If you structure your database as quickly as you build your software, you're shooting yourself in the foot. The software might be the end of the project. But it's never the end of the data."

Where Information Modeling Makes the Difference

This is where FCO-IM (Fully Communication-Oriented Information Modeling) comes in. "What I've done for 25 years," Marco explains, "is capture how people actually talk about their work, and more specifically their data."

This modeling discipline creates a semantic bridge between human understanding and technical implementation. When done right, the benefits become dramatic.

In a recent demo, he showed how information modeling enables AI to work safely:

"It can answer questions with full accuracy. It finds the right data. It verbalizes it properly. It explains it and may even flag data quality issues. Zero catastrophic hallucinations."

Why? "Because the semantics are modeled, the ontology is clear. The language is captured. And the access to the data is curated through a very narrow, controlled port."
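A core idea behind fact-oriented approaches like FCO-IM is that every fact type carries a verbalization: a sentence pattern in the business's own language. The toy sketch below (not FCO-IM itself; all names are invented) shows the principle of pairing stored facts with templates so any answer can be read back as a sentence a domain expert can verify:

```python
# Toy illustration of fact verbalization (not FCO-IM itself).
# Each fact type is paired with a sentence template in business
# language; every stored fact can then be read back verbatim.

fact_types = {
    "works_for": "Employee {employee} works for department {department}.",
    "manages":   "Employee {employee} manages project {project}.",
}

facts = [
    ("works_for", {"employee": "Ada", "department": "Signalling"}),
    ("manages",   {"employee": "Ada", "project": "Track Monitor"}),
]

def verbalize(fact):
    """Render a stored fact as a full business-language sentence."""
    kind, values = fact
    return fact_types[kind].format(**values)

sentences = [verbalize(f) for f in facts]
for s in sentences:
    print(s)
```

Because every answer is rendered through a fixed template over verified facts, there is nothing for a model to improvise: the "narrow, controlled port" Marco describes.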

A Real-World Success Story

One of Marco's favorite examples is from the Dutch national rail infrastructure organization.

"Three years ago, if a track malfunctioned, it took up to four hours to figure out which power lines needed to be switched off."

After years of consistent, disciplined fact-oriented information modeling across departments, they built something like Google Maps for their infrastructure. You zoom in, draw a circle, and click "turn it off." The whole chain—from switches to signals to passenger notifications—is fully automated.

The result?

From four hours to two seconds.

And then Marco adds:

"Naturally, not every organization needs to be able to respond in seconds, but this is clearly a massive improvement and business value increase. You cannot achieve that if you treat data as a byproduct, or just another project."

The Path Forward

Marco even tracks how AI assists with the modeling process itself.

"I log everything: who created an element, who approved it, who checked it, where it's used—and also whether I did it, someone else did it, or it was generated."

This makes quality measurable. "In the final output, you can determine: in development, domain-verified, in need of retirement, generated, ambiguous, etc. That tells you something about the certainty you're delivering."

His closing reflection brings the whole conversation back to the human core:

"Data work is really about the human condition. We're all trying to make sense of things—faster, clearer, better. AI can help, but it can also mislead us in ways we don't fully feel yet."

For organizations serious about data, the message is unmistakable: Technology isn't the starting line. Human understanding is. Human communication is.