Customer data platforms (CDPs) have emerged as an integral part of the MarTech ecosystem, supporting the integration and activation of first-party customer data across the marketing stack. CDPs have, in large part, been monolithic. In the past couple of years, however, new players have entered the market with a novel approach: “composability”.
Companies have faced a number of challenges in pushing their CDP programs through. Many CDP programs have operated as shadow IT under marketing departments. Increased scrutiny of the origin and usage of PII, together with the growing volume and importance of data, has however made this approach less viable.
Companies now need to address a number of concerns and risks pertaining to the setup and usage of these customer data platforms.
Increased data regulation and security concerns: With the growing importance of data protection legislation such as GDPR and CCPA, and the rise of ransomware and data leaks, data security has become a hot topic. Using a traditional customer data platform usually requires the data to be copied outside the company’s premises, which increases the attack surface. This concern has been further accentuated by the move to SaaS models of previously self-hosted solutions such as Bloomreach CDP. These security concerns have become so prevalent at some of our clients that we have heard enterprise architects state: “you cannot have PII information in the CDP”.
Concerns about tooling limitations: All platforms have limitations due to their design, resource allocation, integration methods and more. With the increased focus on leveraging customer data to remain competitive, this is a growing concern. Traditional CDPs often need to be largely bypassed to work around some of these limitations. For instance, where deduplication capabilities are limited, master data management (MDM) systems are often introduced, frequently with latency limitations of their own. Introducing this new component pushes the data to be integrated into the CDP from a single source, severely reducing the attractiveness and usage of the CDP’s ingestion capabilities. Similarly, if a traditional CDP has to be bypassed for lack of proper real-time support, an identity deduplication module also needs to be set up outside the CDP, in addition to the direct integrations.
Increased data volume: The amount of information collected worldwide keeps increasing, with over 20% more data collected than in the previous year. With more data than ever being collected, storage and transfer costs are increasingly relevant, and businesses are paying closer attention to features that address these concerns, such as “bring your own lake” functionality.
Increased latency considerations: Most traditional CDP solutions were not built with real-time or just-in-time processing in mind. They were built to address the growing importance, scale, data silos and regulatory demands of customer data. For these traditional CDPs, real-time or just-in-time processing has often been an afterthought, and the shift in architecture needed to achieve real-time or just-in-time integration is not trivial.
Increased importance of data: Many companies have started treating data as an asset; they want to break down silos and provide self-service capabilities for their data. Many have invested massively in building data lakes and lakehouses and want to start reaping the benefits of these platforms. Having a separate data silo, or duplicated data with its own consistency challenges, appears less than desirable.
Risk of placing such a large central component at the mercy of a single vendor: A number of our clients have been “burned” by significant increases in their license fees after having implemented their customer data platform. A customer data platform acts as the central piece for collecting, storing, processing and activating customer data across a wide array of systems, and it can be quite difficult to replace one with another, which raises vendor lock-in concerns. Some of our clients have furthermore fully externalized their marketing activities to third-party agencies, which control and operate the customer data platform on their clients’ behalf. With the increasing importance of data and security concerns, we see a growing trend to bring CDPs back onto the companies’ premises.
Composable CDPs take a different approach from traditional CDPs, splitting responsibilities across the stack into a number of modules. Composability fosters a best-of-breed approach, while at the same time facilitating potential migrations and reducing vendor lock-in. Only one set of functionality needs to be replaced at a time, greatly reducing the impact of changing a vendor solution.
Audience segmentation is one of the key functionalities of any CDP. CDPs need to address different segmentation needs, ranging from static to dynamic segmentation, from event-triggered to scheduled evaluation, and from rule-based to machine-learning-based segmentation. A CDP needs to offer a user-friendly interface so that marketers can operate the tool themselves. The audience segmentation module also needs to be able to export audience results to different destinations, such as media platforms. Composable CDP vendors such as Census or Hightouch operate in this space.
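To make the rule-based case concrete, here is a minimal sketch of a segment expressed as a named rule over customer profiles. The `Customer` fields, the `Segment` class and the thresholds are all hypothetical and chosen only for illustration; they do not reflect how Census, Hightouch or any other vendor models audiences.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Customer:
    # Hypothetical profile attributes, for illustration only.
    customer_id: str
    country: str
    lifetime_spend: float
    days_since_last_order: int


@dataclass(frozen=True)
class Segment:
    # A segment is a name plus a predicate evaluated against each profile.
    name: str
    rule: Callable[[Customer], bool]


def evaluate(segment: Segment, customers: list[Customer]) -> list[str]:
    """Return the ids of customers matching the segment rule, in input order."""
    return [c.customer_id for c in customers if segment.rule(c)]


customers = [
    Customer("c1", "DE", 1200.0, 12),
    Customer("c2", "FR", 80.0, 200),
    Customer("c3", "DE", 640.0, 95),
]

# Rule-based segment: high spenders who have not ordered in over 90 days.
lapsed_high_value = Segment(
    name="lapsed_high_value",
    rule=lambda c: c.lifetime_spend >= 500 and c.days_since_last_order > 90,
)

audience = evaluate(lapsed_high_value, customers)
print(audience)  # ['c3']
```

Re-running `evaluate` on a schedule or on each incoming event is what would turn this static list into a dynamic segment; the resulting list of ids is what would be exported to a destination.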
CDPs facilitate the ingestion of data into a central platform. They offer collection APIs and SDKs to capture clickstream events and behavioural data from both websites and apps, as well as various out-of-the-box connectors to source data from data warehouses, streams, or external applications such as CRMs.
Different vendors support data collection activities. Some providers already focus on integrating data into existing data warehouses, such as Fivetran, or Snowplow for clickstream data acquisition, while others, such as RudderStack or Segment, are more focused on the CDP ecosystem.
Operating a customer data platform requires processing and storage capabilities. Traditional CDP vendors such as Salesforce or Treasure Data embed big data engines such as Spark or Hive in their solutions. In a composable CDP architecture, these engines are taken directly from the company’s own data lake or warehouse. Lakehouse vendors such as Databricks and Snowflake have been particularly supportive of this use case, with Databricks offering a CDP use case as part of its solution accelerators.
Part of the role of a customer data platform is to provide an accurate and actionable view of customers for marketing activation. For this purpose, CDPs create an identity graph and provide a consolidated “golden record” across the different touch points. More robust customer data platforms support more than a single identity graph, deduplicating data entities beyond just the customer entity and supporting both Single Source of Truth (SSOT) and Multiple Versions of Truth (MVOT) approaches. Some vendors can leverage third-party data enrichment to further complement their identity graph and increase the match rate both within their platform and outside it. Composable vendors such as ActionIQ or Hightouch have strong offerings in this regard, and AWS offers a generic solution.
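As an illustration of how an identity graph can lead to a golden record, the sketch below links records that share any identifier value using a union-find structure, then merges each cluster. This is a simplified deterministic-matching assumption, not the method of any vendor named above; the identifier keys (`email`, `phone`, `device_id`) and the "first non-empty value wins" merge rule are hypothetical.

```python
from collections import defaultdict

IDENTIFIER_KEYS = ("email", "phone", "device_id")  # hypothetical identifiers


def resolve_identities(records):
    """Cluster record indices that share at least one identifier value."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smallest index as the root so output is deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (key, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in IDENTIFIER_KEYS:
            value = rec.get(key)
            if value is None:
                continue
            ident = (key, value)
            if ident in first_seen:
                union(idx, first_seen[ident])
            else:
                first_seen[ident] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return list(clusters.values())


def golden_record(records, cluster):
    """Merge a cluster; the first non-empty value per field wins."""
    merged = {}
    for idx in cluster:
        for key, value in records[idx].items():
            if value is not None and key not in merged:
                merged[key] = value
    return merged


records = [
    {"email": "ana@example.com", "phone": None, "device_id": "d-1"},
    {"email": "ana@example.com", "phone": "+4911", "device_id": None},
    {"email": None, "phone": "+4911", "device_id": "d-2"},
    {"email": "bo@example.com", "phone": None, "device_id": "d-3"},
]

clusters = resolve_identities(records)
golden = [golden_record(records, c) for c in clusters]
print(clusters)  # [[0, 1, 2], [3]]
```

Note that the third record is attached to the first only transitively, through the phone number it shares with the second. That transitive linking is what the graph view adds over pairwise matching. A production setup would typically add probabilistic matching and survivorship rules per field.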
CDP vendors have traditionally offered limited dashboarding capabilities, providing visibility into operational metrics such as match rates, campaign measurement or an overview of the customer base. For deeper analysis, some vendors have started offering direct JDBC connections to the underlying datasets stored within the CDP. This enables the use of third-party dashboarding solutions such as Tableau, Power BI or Superset. Within a composable CDP landscape, this activity can be handled entirely by these dashboarding solutions on top of an existing data lake, rather than having overlapping responsibilities.
Data access within CDPs has been offered either through a portal providing visibility into a single customer’s data, or through an interface for querying the data. Epsilon PeopleCloud, for instance, includes analytical notebook functionality. Within a composable CDP, the query and notebook portion of that functionality is typically handled by the chosen data lake solution.
Most CDPs have invested over the last couple of years in developing machine learning functionality. It comes out of the box and can satisfy certain use cases, such as product recommendation or the calculation of lifetime value. We have found these capabilities to be often limited and something of a black box, with uncertain performance. Within the composable CDP space, this capability is typically provided by the data platform, which can offer full end-to-end MLOps functionality, including AutoML, automation and performance tracking.
Customer data platforms have incorporated reverse ETL functionality in order to push the necessary data, beyond just an audience list, to different target systems. These data points are needed to orchestrate different journeys or to provide personalization attributes.
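A reverse ETL sync can be sketched as selecting the attributes a target system needs and sending only the profiles whose values changed since the last run. The function below is a hypothetical illustration: `build_sync_batch`, the field names and the in-memory `last_synced` state are assumptions, and a real tool would persist that state and call the destination’s API.

```python
import json


def build_sync_batch(profiles, last_synced, fields):
    """Return payloads for profiles whose selected fields changed since the last sync."""
    batch = []
    for profile in profiles:
        key = profile["customer_id"]
        # Only the whitelisted attributes leave the data platform.
        payload = {field: profile.get(field) for field in fields}
        if last_synced.get(key) != payload:
            batch.append({"customer_id": key, "attributes": payload})
    return batch


profiles = [
    {"customer_id": "c1", "tier": "gold", "ltv": 1200.0, "internal_note": "vip"},
    {"customer_id": "c2", "tier": "silver", "ltv": 80.0, "internal_note": None},
]

# State remembered from the previous run (hypothetical; normally persisted).
last_synced = {"c1": {"tier": "gold", "ltv": 1200.0}}

batch = build_sync_batch(profiles, last_synced, fields=["tier", "ltv"])
print(json.dumps(batch))
```

Here only `c2` ends up in the batch, because the attributes selected for `c1` are unchanged, and `internal_note` is never sent because it is not in the field list.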
Within a composable architecture, with the customer data stored in the company’s data platform, integration can rely on a diverse set of tools, such as Azure Logic Apps, Fivetran, or composable CDP components, depending on the needs. With traditional CDPs, the maturity, reliability or coverage of a specific connector is often not apparent. Composable CDPs reduce the overall risk by making it easier to change the integration method should a particular one not fully meet the needs of a use case.
Traditional CDPs have made progress in addressing some data governance challenges, for example with “Data Plans” in mParticle for establishing and managing data contracts, or data spaces in Salesforce. They still fall short, however, of full-fledged data platforms complemented by data quality and cataloging tools such as Monte Carlo or OpenMetadata, which make it possible to detect and address varied data quality issues and provide visibility into all data assets and their lineage.
Composable CDPs represent a more modern and future-proof way to architect customer data platforms. They bring the solution to where the data is, rather than requiring data to be integrated into yet another external solution, and they address a number of the concerns and risks of traditional CDPs. They also provide greater flexibility, allowing the solution to be tailored to actual needs rather than to what happens to be on offer.