Customer Data Integration: FAQs and Fiction Part 1

Originally published May 2, 2006

Herewith, my inaugural article for my new expert channel on the Business Intelligence Network, focusing on customer data integration (CDI) and customer data management (CDM). See my accompanying blog posting for a brief introduction to this channel and my thoughts about where we can take it together. Suffice it to say, I’m psyched to be weighing in on these topics, and I'm hoping to add a sonorous note to the chorus of voices already singing about master data management (MDM), business intelligence, data warehousing and other hot areas. We might not always be in harmony, but sometimes the off-key is far more entertaining.

I figured I'd launch the channel by establishing some clarity around customer data integration. There’s a lot of buzz about customer data integration – much of it the result of unclear messaging and the vendor spin endemic to any emerging technology. I've called this article Customer Data Integration: FAQs and Fiction because I’d like to break open the truth about CDI, talk about what it is, distinguish it from other technology trends, and dispel the already well-worn rumors and misguided assumptions. In this, Part 1, we’ll begin with some frequently asked questions about customer data integration and offer some answers. In Part 2 next month, we’ll take on some of the fiction around CDI and set the record straight.

So, here are a few FAQs that may help clarify some of the disparate messages around customer data integration.

Is CDI intended to replace data warehousing?

No. In fact, most of the CDI solutions on the market today have their roots in non-analytic technologies. Some CDI tools originated as identity recognition and matching solutions for specific sectors such as healthcare and government. Others have their roots in enterprise information integration (EII) technologies, giving companies the ability to query a range of systems on the fly. Still others are based on data cleansing algorithms, enabling data standardization and reconciliation in real time.

If CDI isn't a data warehouse replacement, what's it supposed to do? In other words, what's the difference?

First, our definition of customer data integration:

Customer data integration is the collection of processes, controls, automation and skills necessary to standardize and integrate customer data from different sources.

Customer data integration is intended to recognize customer data from heterogeneous data sources, reconcile that data by applying rules and matching it with existing customer records, and establish the authoritative version of a customer via a new or enhanced customer record. CDI combines data quality checking, match/merge functionality, data integration and sometimes even data storage technologies in a single solution.
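To make the recognize, reconcile and establish steps above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular CDI product works: the record fields, the cleansing rules and the matching rule (same phone number, or same non-empty email address) are all hypothetical, and real match/merge engines are considerably more sophisticated.

```python
import re
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    source: str          # system the record came from (hypothetical field)
    name: str
    phone: str
    email: str = ""


def standardize(record):
    """Apply simple, illustrative cleansing rules to one record."""
    return CustomerRecord(
        source=record.source,
        name=" ".join(record.name.split()).title(),  # collapse spaces, fix casing
        phone=re.sub(r"\D", "", record.phone),        # keep digits only
        email=record.email.strip().lower(),
    )


def is_match(a, b):
    """Assumed matching rule: same phone, or same non-empty email."""
    if a.phone and a.phone == b.phone:
        return True
    return bool(a.email) and a.email == b.email


def merge(master, incoming):
    """Enhance the master record: keep existing values, fill gaps from incoming."""
    return CustomerRecord(
        source="hub",
        name=master.name or incoming.name,
        phone=master.phone or incoming.phone,
        email=master.email or incoming.email,
    )


def integrate(hub, incoming):
    """Return a new hub list with the incoming record merged into a match, or added."""
    clean = standardize(incoming)
    for i, master in enumerate(hub):
        if is_match(master, clean):
            return hub[:i] + [merge(master, clean)] + hub[i + 1:]
    return hub + [CustomerRecord("hub", clean.name, clean.phone, clean.email)]


# Two source systems describe the same person in different formats.
hub = integrate([], CustomerRecord("billing", "  jim   smith ", "(555) 010-1234"))
hub = integrate(hub, CustomerRecord("web", "Jim Smith", "555.010.1234", " Jim@Example.com "))
```

After both calls the hub holds a single enhanced record for Jim rather than two conflicting ones, which is the "authoritative version" idea in miniature.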

At our company, we consider our data warehouse to be the "single version of the truth" about customers. But everyone seems to be applying the “single version of the truth” label to CDI. Why?


Your data warehouse might well be your company’s single version of the truth today. The natural evolution of the enterprise data warehouse is that it stores cross-functional information about different types of master data on behalf of the company. Bill Inmon's original definition of a data warehouse is apt here: subject-oriented, integrated, time-variant, non-volatile... Your data warehouse is likely to be a great platform for storing cross-functional master data from across your company. Indeed, it might well be the only system with data that crosses organizational boundaries.

Where a data warehouse is heterogeneous, non-volatile and historical, a CDI hub is customer-oriented, volatile and current. Customer data integration is more transactional than a data warehouse. Though performance and scalability are often significant issues in a data warehouse, a CDI hub is built for performance and scalability, and designed to support both read and update. Furthermore, although a CDI hub can serve an end-user community in need of up-to-the-minute customer data, its real constituency is in the range of applications needing different versions of the same customer. These applications usually need very current data very quickly. Customer data integration finds and links related customer records from across the enterprise and makes them available to all the applications that need them.

But our data warehouse contains all the information we need about customers!

Show me a data warehouse that is considered successful, and I’ll show you one that supports a company’s requirements for analytical processing. The data warehouse is designed and tuned to support queries and complex analytics. To that end, it shouldn’t store data that isn’t necessary to support actual, articulated business requirements. In fact, we’ve found that the phenomenon of “forklifting” data into the data warehouse en masse and letting it languish there is a high indicator of risk in the assessments we do.

A CDI hub supports the functional needs of a wide range of different applications and, as such, needs to be as thorough and complete as possible. Most CDI hubs follow the "party" model, meaning that data about customers as well as about business partners, resellers, distributors, suppliers and agents populates the hub. And this data is available to lots of different systems for many different purposes.

So is CDI an ODS?

No, there are some clear distinctions between customer data integration and an operational data store (ODS). The main one is that with an ODS, data cleansing hasn't been done with any rigor or intended permanence, so the data is closer to its "raw" form. There's typically latency associated with the ODS, which can be measured in hours – or longer. On the other hand, with CDI, data quality is "baked in." This means that the CDI hub doesn't record a customer until the version of that customer is matched with other versions, standardized, updated and completely accurate. CDI establishes an operational image of the customer, allowing other systems access to this image so that they don’t have to re-collect and redefine customer data. This capability is often referred to as "once and done," whereas an ODS can function as a point of evolution for data that will probably undergo further transformation before reaching its end state.

Moreover, customer data integration isn’t intended for analytic reporting, though it can support it. CDI can act as an ODS, but it’s more sophisticated. It updates and maintains data, and supports two-way transactions. You can think of CDI as an online master customer database available to a range of legacy systems, packaged applications and, yes, data warehouses and marts across the company. These systems need “one true view” of the customer, once and for all.

Can you provide an example?

Sure. Here's one based on a real-life client story.

A patient – we’ll call him Jim – is diagnosed with chronic obstructive pulmonary disease (COPD), and his doctor decides to begin with a course of drug therapy, prescribing theophylline. Jim goes to the pharmacy and submits his prescription with the insurance card from his new employer.

The pharmacy system can't locate Jim's new policy number, but recognizes him via his phone number and home address.

You're probably thinking, "So what? Big deal. An application could do that; it's just a matter of the right programming logic." But programming logic by itself can’t support a single version of the truth.

When Jim arrives to pick up his prescription, the pharmacist not only has his history, but also understands that another doctor covered by a different insurance carrier recently prescribed seizure medication for Jim. The pharmacist knows that theophylline can exacerbate seizures. He offers to call Jim's doctor to explain the potential drug interaction and suggest an alternative medicine.

The point here is that a CDI hub can uniquely identify a customer – in this case, a patient – across different divisions, service providers, geographies and technology platforms.
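The recognition step in Jim's story can be sketched in a few lines of Python. This is only an illustration: the `Patient` fields, the `recognize` function and the sample identifiers are hypothetical, and the fallback rule (phone number plus home address) is taken from the story rather than from any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    patient_id: str
    phone: str
    address: str
    policy_numbers: set = field(default_factory=set)
    prescriptions: list = field(default_factory=list)


def normalize(text):
    """Lowercase and keep only letters and digits so formatting differences don't block a match."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def recognize(hub, policy_number, phone, address):
    """Try the policy number first, then fall back to phone plus home address."""
    for patient in hub:
        if policy_number in patient.policy_numbers:
            return patient
    for patient in hub:
        if (normalize(patient.phone) == normalize(phone)
                and normalize(patient.address) == normalize(address)):
            # Link the new policy to the patient the hub already knows.
            patient.policy_numbers.add(policy_number)
            return patient
    return None


# Jim is already known under an old policy, with a prescription from another doctor.
jim = Patient("P-001", "555-010-1234", "12 Elm St.", {"OLD-777"}, ["seizure medication"])
hub = [jim]

# The pharmacy submits the new, unknown policy number along with phone and address.
found = recognize(hub, "NEW-123", "(555) 010 1234", "12 elm st")
```

Because the lookup resolves to the same patient record, the pharmacist sees the earlier seizure medication alongside the new prescription. The value comes from the shared record, not from the matching logic alone.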

CDI is big in healthcare, then?

Well, it’s no coincidence that customer data integration is causing a sea change in the healthcare and pharmaceutical industries. CDI solutions are aware of different data distributed across multiple operational systems, and the implicit data quality processing means that data will be highly accurate – a must when it comes to people’s health. Obviously, physicians get lots of value from knowing what medications their patients are taking at or before the time of care. They can look up patient history without relying on details provided by the patient. The trend toward the enterprise master patient index (EMPI) is huge in healthcare, and customer data integration is smack dab in the middle of it.

What about other industries?

Customer data integration is relevant – and being used – in consumer-oriented businesses across multiple industries. CDI fits wherever functionality requires the identification of a customer.

Seems no one talks about CDI without discussing SOA. Why is that?

Because customer data integration is an operational application, it requires an application programming interface. CDI isn’t just a database; it’s an application platform. Consequently, it needs an interface that’s much more robust than a standard database interface (e.g., ODBC or JDBC). Since vendors are reluctant to release any product that requires a proprietary interface, CDI providers have adopted service-oriented architectures (SOAs) to ensure market acceptance of their technologies. Having said that, the value of CDI via SOA is that it allows a company to deploy customer recognition, matching, accuracy checking and validation as a service to multiple applications and systems across the enterprise.
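As a rough illustration of "recognition, matching and validation as a service," the Python sketch below uses a single in-process class to stand in for a hub service that several applications would call. Everything here is an assumption made for illustration: the class name, the operations, the email-based matching rule and the response shape are hypothetical, and a real deployment would expose these operations over a network service interface rather than as direct method calls.

```python
class CustomerHubService:
    """In-process stand-in for a hub exposed as a shared service (hypothetical names)."""

    def __init__(self):
        self._customers = {}   # customer_id -> {"name": ..., "email": ...}
        self._next_id = 1

    def validate(self, payload):
        """Return a list of problems; an empty list means the payload is acceptable."""
        problems = []
        if not payload.get("name", "").strip():
            problems.append("name is required")
        if "@" not in payload.get("email", ""):
            problems.append("email looks invalid")
        return problems

    def recognize(self, payload):
        """Return the id of a known customer with the same email, or None."""
        email = payload.get("email", "").strip().lower()
        for customer_id, customer in self._customers.items():
            if customer["email"] == email:
                return customer_id
        return None

    def register(self, payload):
        """Single entry point for callers: validate, recognize, then create if new."""
        problems = self.validate(payload)
        if problems:
            return {"status": "rejected", "problems": problems}
        customer_id = self.recognize(payload)
        if customer_id is not None:
            return {"status": "existing", "customer_id": customer_id}
        customer_id = f"C{self._next_id}"
        self._next_id += 1
        self._customers[customer_id] = {
            "name": payload["name"].strip(),
            "email": payload["email"].strip().lower(),
        }
        return {"status": "created", "customer_id": customer_id}


# Two different applications call the same service and get a consistent answer.
service = CustomerHubService()
from_call_center = service.register({"name": "Jim Smith", "email": "Jim@Example.com"})
from_web_store = service.register({"name": "J. Smith", "email": "jim@example.com "})
```

The design point is that the validation and matching rules live in one place, so the call center and the web store cannot drift apart in how they identify the same customer.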

Who in our company should own CDI?

It depends. There is a range of issues to consider before determining customer data integration ownership, the main one being how customer data integration will be used. Does your company consider CDI to be an application to help with customer identity management, or more of an infrastructure technology that supports sustained and accurate deployment of customer data? Because CDI is an operational platform, it's usually owned by someone in IT. We've seen CDI owned by application development groups, database groups and data warehouse groups seeking to broaden their horizons. We've even seen it adopted by data management centers of excellence.

So data management is a big deal for CDI?

Huge! We'll cover that topic in next month's article.

  • Jill Dyché

    Jill is a partner with Baseline Consulting, a data integration and business intelligence (BI) services firm. She is an internationally recognized speaker and writer on the topic of the business value of technology, and has been featured in the Wall Street Journal, CIO Magazine, Intelligent Enterprise and Newsweek.com. Jill leads the Customer Data Integration, Master Data Management and Data Governance channel for the BeyeNETWORK, and blogs regularly on those and other IT-related topics. She is the author of two acclaimed books, e-Data, which introduced enterprise data to business executives, and The CRM Handbook, which was the best-selling book on the topic of customer relationship management. Her latest book, Customer Data Integration: Reaching a Single Version of the Truth – co-authored by Baseline Partner Evan Levy – was recently published by John Wiley & Sons.

