Warehouse-first or CDP-first? The architecture decision that shapes every lifecycle decision for the next five years. Here is how to choose.
The choice between warehouse-first and CDP-first is the single most consequential MarTech decision a Series B PLG SaaS company makes. It locks in five years of tooling, hiring, and team structure. Most companies make it implicitly: they buy a CDP because everyone does, never define what data lives where, and discover three years later that the warehouse and the CDP have drifted into different sources of truth. The right choice depends on team composition and campaign sophistication, but for most PLG companies past Series A, warehouse-first is the architecture that scales.
CDP-first means the customer data platform is the canonical source of customer data. Events flow from the product into the CDP. The CDP enriches them, identifies users, and forwards them to downstream tools: Customer.io, the data warehouse, and the analytics tools. The warehouse exists, but it is a read-only mirror used for analysis, not the source of truth.
Warehouse-first inverts the flow. Events still get captured by the CDP, but the CDP forwards everything to the warehouse, where it is modeled, joined with CRM and finance data, and turned into clean, semantic tables. Customer.io reads from the warehouse via reverse ETL. The CDP becomes thin — mostly a collection layer with light routing.
The day-to-day difference shows up in where lifecycle teams go to define a segment. In CDP-first, the lifecycle team works in the CDP. In warehouse-first, the lifecycle team works with the data team to define a model in dbt, then activates it through reverse ETL into Customer.io.
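To make the warehouse-first flow concrete, here is a minimal sketch of its two halves: a modeled segment defined in SQL, and the step that turns its rows into per-user payloads for a reverse ETL sync. Everything in it is an assumption for illustration. An in-memory SQLite database stands in for the warehouse, and the table names, column names, the `active_free_users` segment, and the payload shape are hypothetical. In a real setup the query would more likely live in a dbt model, and a reverse ETL tool would deliver the rows to Customer.io.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_events (user_id TEXT, event TEXT, ts TEXT);   -- forwarded by the CDP
    CREATE TABLE crm_accounts (user_id TEXT, plan TEXT);           -- loaded from the CRM
    INSERT INTO raw_events VALUES
        ('u1', 'project_created', '2024-05-01'),
        ('u1', 'project_created', '2024-05-03'),
        ('u2', 'project_created', '2024-05-02'),
        ('u2', 'project_created', '2024-05-04'),
        ('u3', 'login',           '2024-05-02');
    INSERT INTO crm_accounts VALUES ('u1', 'free'), ('u2', 'paid'), ('u3', 'free');
""")

# The "model": behavioral events joined with CRM data to define a segment.
# Hypothetical definition: free-plan users who created at least two projects.
SEGMENT_SQL = """
    SELECT e.user_id, a.plan, COUNT(*) AS projects_created
    FROM raw_events AS e
    JOIN crm_accounts AS a ON a.user_id = e.user_id
    WHERE e.event = 'project_created' AND a.plan = 'free'
    GROUP BY e.user_id, a.plan
    HAVING COUNT(*) >= 2
    ORDER BY e.user_id
"""

def build_sync_payloads(connection):
    """Turn modeled rows into per-user payloads a reverse ETL job could send."""
    rows = connection.execute(SEGMENT_SQL).fetchall()
    return [
        {
            "id": user_id,
            "attributes": {
                "plan": plan,
                "projects_created": projects_created,
                "segment": "active_free_users",  # hypothetical segment name
            },
        }
        for user_id, plan, projects_created in rows
    ]

payloads = build_sync_payloads(conn)
print(payloads)
```

The segment definition lives in one query against warehouse tables, so the lifecycle tool only ever receives the result. That is the sense in which the CDP becomes a thin collection layer.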
CDP-first is the right architecture when:
This describes a lot of pre-Series-A companies and most B2C companies with simpler data needs. CDP-first lets the marketing team move fast without engineering bottlenecks.
Warehouse-first is the right architecture when:
This describes most Series B+ PLG SaaS companies. The data team has built models. The campaign team needs them. The choice is whether to acknowledge that the warehouse is the source of truth or pretend it is not.
The reason this decision matters so much is that migrating between the two is genuinely hard. Each architecture creates organizational habits, tooling investments, and data flows that are expensive to undo.
CDP-first to warehouse-first migration: the data team has to model years of behavioral data. The lifecycle team has to learn a new workflow. Customer.io has to be rewired through reverse ETL. The CDP becomes thin and the team has to confront the fact that they are paying for capabilities they no longer use.
Warehouse-first to CDP-first migration: less common, because companies rarely move backward. But it happens when a company decides the data team cannot keep up with marketing's segmentation needs. The migration involves replicating warehouse logic in the CDP, which usually loses fidelity.
In both cases, the migration takes 6 to 12 months and creates a transition period in which two sources of truth coexist. It is better to choose correctly the first time.
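While two sources of truth coexist, one way to keep the drift visible is to compare the membership of the same segment as computed by each system. The sketch below is a hypothetical check, not a feature of any particular CDP or reverse ETL tool. The function name, the input lists of user IDs, and the agreement metric (shared members divided by all members seen in either system) are assumptions for illustration.

```python
def segment_drift(cdp_members, warehouse_members):
    """Compare one segment's membership across the CDP and the warehouse."""
    cdp = set(cdp_members)
    warehouse = set(warehouse_members)
    union = cdp | warehouse
    return {
        "only_in_cdp": sorted(cdp - warehouse),
        "only_in_warehouse": sorted(warehouse - cdp),
        # Share of users both systems agree on; 1.0 means no drift.
        "agreement": len(cdp & warehouse) / len(union) if union else 1.0,
    }

# Hypothetical exports of the same segment from each system.
report = segment_drift(["u1", "u2", "u3"], ["u2", "u3", "u4"])
print(report)
```

Running a check like this on a schedule during the migration gives the team a number to watch, so mismatches surface before they show up in a campaign.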
Three questions:
For most Series B PLG companies, the answers point to warehouse-first. The data team exists, the campaign logic is becoming sophisticated, and other teams already depend on the warehouse.
If you are pre-Series-A: CDP-first is fine. Buy Segment and ship campaigns through Customer.io's standard event-triggered flows.
If you are Series A: this is the decision point. Pick warehouse-first if you are growing the data team and campaign sophistication is on the roadmap.
If you are Series B+, already CDP-first, and feeling the pain: start the migration. The longer you wait, the more painful it gets.