
How to Connect AI Models to Your Customer Data Platform (CDP)

Connecting AI models to unified customer profiles — not raw data lakes — is what separates genuine personalisation from expensive autocomplete.

Editorial illustration of a data architect stitching together customer data streams into a unified AI-ready profile
Illustrated by Mikael Venne

AI is only as smart as the customer data feeding it. Here's how CDPs and modern data stacks unlock real personalisation at scale in Southeast Asia.

The gap between brands that talk about AI-driven personalisation and those that actually deliver it comes down to one thing: the quality and structure of the customer data sitting underneath the model.

Tealium’s Nick Albertini put it plainly in a recent post — the novelty of LLM-generated content has evaporated. Every brand can spin up a chatbot or draft a campaign brief with AI. What most can’t do is feed that AI accurate, real-time, consented customer context. That’s the moat now. And for marketing teams in Southeast Asia, where a single customer might browse on a Shopee app in the morning, transact via GrabPay at lunch, and engage with a LINE OA campaign by evening, the stitching challenge is genuinely harder than in markets with more consolidated platform ecosystems.

Why Your AI Is Only as Smart as Your Data Architecture

Most AI personalisation projects quietly fail at the data layer, not the model layer. The model is rarely the problem. GPT-4, Gemini, Claude — they’re all capable enough. What breaks personalisation is when the model is handed a fragmented, stale, or unresolved customer profile: multiple user IDs that haven’t been identity-stitched, behavioural signals that are three sessions out of date, or declared preferences from an onboarding survey taken eighteen months ago.

Tealium’s framing is useful here: AI models need a persistent, unified customer profile as their input — one that merges behavioural data (clickstreams, session depth, content affinity), transactional data (purchase history, basket abandonment, lifetime value tier), and declared data (explicit preferences, survey responses, consent flags). Without that foundation, your AI is essentially a very expensive autocomplete running on incomplete sentences.

For Southeast Asian brands, this is compounded by multilingual data — a customer who writes reviews in Thai but browses in English creates identity resolution headaches that most off-the-shelf CDPs underestimate.
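To make the "persistent, unified profile" idea concrete, here is a minimal sketch of what identity stitching into a single profile can look like. The class names, field layout, and event shape are illustrative assumptions, not any vendor's schema; real CDPs handle this with far richer identity graphs and consent models.

```python
from dataclasses import dataclass, field

# Hypothetical profile shape -- real CDP schemas vary by vendor.
@dataclass
class UnifiedProfile:
    canonical_id: str                                 # output of identity resolution
    source_ids: set = field(default_factory=set)      # device/app/channel IDs stitched in
    behavioural: dict = field(default_factory=dict)   # clickstream, content affinity
    transactional: dict = field(default_factory=dict) # purchase history, LTV tier
    declared: dict = field(default_factory=dict)      # preferences, consent flags

def merge_event(profile: UnifiedProfile, event: dict) -> UnifiedProfile:
    """Fold one raw event into the persistent profile."""
    profile.source_ids.add(event["source_id"])
    # 'kind' is one of: behavioural | transactional | declared
    bucket = getattr(profile, event["kind"])
    bucket.update(event["attributes"])
    return profile

profile = UnifiedProfile(canonical_id="cust-001")
merge_event(profile, {"source_id": "app-7f3", "kind": "behavioural",
                      "attributes": {"content_affinity": "running-shoes"}})
merge_event(profile, {"source_id": "web-9a1", "kind": "declared",
                      "attributes": {"language": "th", "marketing_consent": True}})
```

The point of the sketch is the shape, not the code: the model should only ever see one profile per customer, with the Thai-language reviewer and the English-language browser resolved to the same `canonical_id` before anything reaches a prompt.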

The Modern Data Stack Is Finally Catching Up

The infrastructure story has genuinely improved. Google’s recently announced Antigravity — an agentic IDE built to operate within the dbt ecosystem — signals something important: the transformation layer of the modern data stack is becoming AI-native, not just AI-adjacent. As dbt’s Stephen Robb outlined, Antigravity can autonomously generate, test, and refine data models, which compresses the time between raw event data landing in your warehouse and a clean, activation-ready customer table being available downstream.

For a CDP practitioner, this matters because the bottleneck has historically been the data engineering queue. Marketing teams identify a new segmentation need — say, identifying high-intent browsers who’ve visited a product page three or more times without converting — and then wait weeks for a data model to be built, QA’d, and deployed. Agentic tooling on top of dbt starts to close that gap, letting analysts move faster without bypassing data governance.
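The segmentation need described above is simple enough to express in a few lines; in production it would live as a governed dbt model over warehouse tables, but the logic itself can be sketched in plain Python (the event-tuple format here is an illustrative stand-in for that table):

```python
from collections import Counter

def high_intent_browsers(events, min_visits=3):
    """Users with >= min_visits product-page views and no purchase.

    `events` is an iterable of (user_id, event_type) tuples -- a stand-in
    for the warehouse event table a dbt model would normally query.
    """
    visits = Counter()
    converted = set()
    for user_id, event_type in events:
        if event_type == "product_page_view":
            visits[user_id] += 1
        elif event_type == "purchase":
            converted.add(user_id)
    return {u for u, n in visits.items() if n >= min_visits and u not in converted}

events = (
    [("a", "product_page_view")] * 3          # high intent, never converted
    + [("b", "product_page_view"), ("b", "purchase")]
    + [("c", "product_page_view")] * 4 + [("c", "purchase")]
)
high_intent_browsers(events)  # -> {"a"}
```

Agentic tooling promises to generate exactly this kind of model from a plain-language request; the governance question is whether you trust the generated definition of "converted" as much as one a human analyst reviewed.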

The caveat: agentic data transformation is only trustworthy if your upstream data quality is solid. Garbage in, autonomously scaled garbage out.


Retrieval Architecture Is a Customer Data Problem Too

One development worth tracking from adjacent AI infrastructure work: Proxy-Pointer RAG, detailed by Partha Sarkar on Towards Data Science. The architecture separates what to retrieve (the proxy) from where to retrieve it (the pointer), achieving significantly higher retrieval accuracy on structured data than standard vector search approaches.

This isn’t just an academic footnote. As brands build AI assistants and personalisation engines on top of customer data, retrieval accuracy becomes a conversion-rate problem. A product recommendation engine that retrieves the wrong customer context — pulling last quarter’s purchase history instead of this session’s browsing signals — will underperform even a basic rule-based system. Proxy-Pointer RAG’s separation of concerns maps well onto how a well-architected CDP already structures data: profile attributes stored separately from behavioural event streams, with a resolution layer in between.

If your team is building internal AI tooling on top of customer data, the retrieval architecture deserves as much attention as the model selection.

Making AI Activation Actually Scalable Across Channels

Connecting the model to the data is step one. The harder problem is propagating AI-driven decisions across every touchpoint in real time — and doing it within the consent and regulatory constraints that vary by market across Southeast Asia (Thailand’s PDPA, Indonesia’s PDP Law, Singapore’s PDPA each have meaningful differences).

Libeary’s new headless CMS architecture, covered by CustomerThink, illustrates a useful pattern: centralise content and customer context, then deliver via API to any surface. The same principle applies to AI-driven personalisation. A centralised decision layer — typically your CDP or a composable personalisation engine sitting on top of it — makes the AI call once, then distributes the output to your web CMS, your push notification platform, your Shopee store front, and your LINE messaging API simultaneously.
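The decide-once, distribute-everywhere pattern can be sketched in a few lines. The channel adapters and the `choose_offer` stand-in below are assumptions for illustration; in practice the decision call would hit your model or personalisation engine, and each adapter would wrap a real channel API.

```python
def choose_offer(profile: dict) -> str:
    # Stand-in for the single AI/model call against the unified profile.
    return f"offer:{profile.get('affinity', 'default')}"

def decide_once_distribute(profile: dict, channels: dict) -> dict:
    """Make the personalisation decision once, then fan the same output
    out to every channel adapter. Adapter names are illustrative."""
    decision = choose_offer(profile)  # one call, not one per channel
    return {name: send(decision) for name, send in channels.items()}

channels = {
    "web_cms": lambda d: f"web rendered {d}",
    "push":    lambda d: f"push sent {d}",
    "line_oa": lambda d: f"LINE message {d}",
}
decide_once_distribute({"affinity": "running-shoes"}, channels)
```

The structural guarantee is the important part: because every channel receives the output of the same decision, there is exactly one version of who the customer is, which is the property the channel-by-channel failure mode below destroys.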

The failure mode to avoid: letting each channel team build its own AI integration independently. You end up with five different models, five different data connections, and five different versions of who your customer is — which is precisely the problem a CDP was supposed to solve.

Implementation-wise, prioritise getting one high-volume channel right first. Prove the lift. Then scale the architecture horizontally. A 15% improvement in email click-through from AI-matched subject lines is a more persuasive internal business case than a slide deck about unified profiles.


Key Takeaways

  • AI personalisation fails at the data layer, not the model layer — invest in identity resolution and profile unification before selecting a model.
  • Agentic tooling like Google’s Antigravity on dbt can compress the data engineering queue, but only if upstream data quality is already governed.
  • Build a centralised AI decision layer and distribute outputs to channels via API — siloed channel-by-channel AI integrations recreate the fragmentation a CDP is designed to eliminate.

The brands that will win the next phase of AI-driven customer engagement aren’t the ones with the most sophisticated models. They’re the ones who did the unglamorous work of building a trustworthy, consent-compliant, real-time customer profile underneath those models. The question worth sitting with: how confident are you, honestly, in the accuracy of the customer data your AI is currently reading?


At grzzly, we spend a lot of time in exactly this architecture layer — helping brands across Southeast Asia connect their CDPs to AI activation workflows that are actually production-ready, not just proof-of-concept. If your team is trying to close the gap between your customer data and your AI ambitions, we’d enjoy that conversation. Let’s talk.


Written by

Velvet Grizzly

Architecting the unified customer profile — stitching together behavioural, transactional, and declared data into platforms that actually earn their licence fee.
