Commerce Intelligence
What a Stoa store is, why it's different, and how the pieces compose into an integrated commerce intelligence system.
The Core Idea
A Stoa store is a commerce intelligence platform — not a storefront with analytics bolted on. It encodes an opinionated e-commerce operations playbook into deployable, customizable infrastructure. The playbook comes from real operational experience: Article.com (D2C, 8-figure e-commerce, full-stack data product), Game Data Pros/Warner Brothers (enterprise-scale segmentation and targeted treatment systems), and independent consulting (outdoor retailer discovery audits, review systems, data infrastructure).
The differentiator is not any single capability. It's the virtuous experimentation loop — infrastructure for the full cycle of systematically discovering customer structure through experimentation, refining understanding through qualitative feedback, and compounding improvements over time.
Not "we have A/B testing" (every Shopify app does that). Not "we have RFM segments" (that's a dbt tutorial). The differentiator is the full measure → segment → hypothesize → test → learn → act cycle, with agentic assistance at each step.
The Virtuous Loop
This is the core operating principle. Everything else exists to make this loop turn.
┌─────────────────────────────────────────────────┐
│ │
v │
SEGMENT ----> EXPERIMENT ----> TYCHE ----> DISCOVERY │
(rough cuts, (vary something (Bayesian ("variant │
evolving) for these analysis, B works │
segments) HTE) for X │
^ not Y") │
│ ┌─────────────────┤ │
│ v v │
│ REFINE SEGMENTS ASK WHY │
│ (new boundary (survey │
│ discovered) the │
│ │ group) │
│ └───────┬───────┘ │
│ v │
└──────────────── RICHER MODEL ──────────────────┘
of customer behavior
Each node in the loop is a real system component:
| Loop node | System component | What it does |
|---|---|---|
| Segment | dbt models + assignable_attributes cache | Groups visitors by behavioral/transactional/stated signals |
| Experiment | Experiment assignment framework | Delivers targeted variants to segments via storefront |
| Tyche | Python/PyMC analysis engine | Bayesian inference, heterogeneous treatment effect discovery |
| Discovery | Tyche HTE output | Surfaces segment boundaries you didn't hypothesize |
| Refine Segments | Updated dbt models, new segment definitions | Incorporates discovered boundaries into the segmentation model |
| Ask Why | Survey/VoC triggers | Targets qualitative questions at the surprising group |
| Richer Model | The segmentation model itself | Evolves through loop iterations, not built once |
The loop compounds learning. Iteration 1 uses rough segments (new vs. returning). Iteration 3 adds "took a course" because Tyche's HTE analysis showed this matters. Iteration 5 adds "price-sensitive" because a post-experiment survey explained why a segment bounced. Each pass through the loop produces better segments, which produce better experiments, which produce richer discoveries.
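One pass through the discovery step of the loop can be sketched in miniature. The row shape, the segment names (`took_course`, `no_course`), and the raw difference-in-rates estimator below are illustrative assumptions; Tyche itself fits a Bayesian model rather than comparing raw rates:

```python
from collections import defaultdict

def per_segment_lift(rows):
    """Estimate conversion lift of variant B over A within each segment.

    rows: iterable of (segment, variant, converted) tuples, a stand-in for
    Tyche's input. The real engine does full Bayesian HTE analysis; this
    sketch just computes per-segment rate differences.
    """
    # segment -> variant -> [conversions, visitors]
    counts = defaultdict(lambda: {"A": [0, 0], "B": [0, 0]})
    for segment, variant, converted in rows:
        cell = counts[segment][variant]
        cell[0] += int(converted)
        cell[1] += 1
    lifts = {}
    for segment, by_variant in counts.items():
        rate = {v: conv / n for v, (conv, n) in by_variant.items() if n}
        if "A" in rate and "B" in rate:
            lifts[segment] = rate["B"] - rate["A"]
    return lifts

# A surprising boundary: variant B helps "took_course" visitors but not others,
# which would prompt adding that boundary to the segmentation model.
rows = (
    [("took_course", "A", False)] * 80 + [("took_course", "A", True)] * 20
    + [("took_course", "B", False)] * 70 + [("took_course", "B", True)] * 30
    + [("no_course", "A", False)] * 90 + [("no_course", "A", True)] * 10
    + [("no_course", "B", False)] * 90 + [("no_course", "B", True)] * 10
)
lifts = per_segment_lift(rows)
```

A lift concentrated in one segment is exactly the "variant B works for X not Y" discovery that feeds the Refine Segments and Ask Why nodes.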
Four Capability Layers
Every Stoa store capability belongs to one of four layers. The layers are ordered by the customer journey but interconnected — measurement feeds understanding, understanding shapes acquisition, retention informs the next acquisition experiment.
Understand (know your customer)
The foundation. You can't optimize what you don't understand.
- Journey tracking — first touch → conversion, multi-touch attribution, time-to-purchase by source
- Customer segmentation — RFM, lifecycle stage, category affinity, behavioral cohorts (dbt models + storefront awareness)
- Voice of Customer — survey triggers (post-purchase, NPS, discovery), structured collection, feeds into segmentation
- Review content as structured data — skill level, use patterns, filterable
Acquire & Convert (guide the journey)
Every capability here is both a feature and an experiment surface. The dual-metric principle means we measure both revenue and satisfaction impact.
- Guided selling — finder quiz, persona-based entry points ("I'm new to packrafting")
- Search & discovery — sort, filter, interaction-level tracking, bounce analysis
- Cart & checkout optimization — continuity, abandonment detection, segment-aware recovery flows
- Pricing & promotion — discount display psychology, promotion real estate, bundling
- Cross-sell & recommendations — manual + data-driven, composite cart items
Retain & Grow (keep customers, increase LTV)
- Email lifecycle engine — segmented campaigns, behavioral triggers, win-back, post-purchase sequences
- Post-purchase experience — review solicitation, next-step suggestions, cross-sell based on purchase, reorder prompts
- Customer progression — "you've done X, here's Y" — general pattern that verticals instantiate
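The trigger logic behind these sequences can be sketched as plain rules. The field names (`last_order`, `reviewed_last_order`) and the thresholds (7-day review window, 90-day win-back) are hypothetical, not the product's actual schema:

```python
from datetime import date

def lifecycle_triggers(customer, today):
    """Pick email-lifecycle triggers from simple behavioral signals.

    customer: dict with 'last_order' (date or None) and
    'reviewed_last_order' (bool). Names and thresholds are illustrative.
    """
    if customer["last_order"] is None:
        return ["welcome_series"]
    triggers = []
    days = (today - customer["last_order"]).days
    if days <= 7 and not customer["reviewed_last_order"]:
        triggers.append("review_solicitation")  # post-purchase review ask
    if 7 < days <= 30:
        triggers.append("cross_sell")           # suggest next-step purchase
    if days > 90:
        triggers.append("win_back")             # lapsed-customer campaign
    return triggers
```

In the real system these rules would be segment-aware, so a "price-sensitive" segment might get a different win-back treatment than a "high-engagement" one.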
Measure & Learn (close the loop)
This layer is what makes the other three layers improve over time rather than stagnate.
- Tyche analysis engine — Bayesian inference, sequential testing, HTE discovery
- Funnel & attribution reporting — "where do people drop off, by source?", "what drove last month's sales?"
- Dashboards — weekly cadence, segment-aware, actually used
- The virtuous loop — measure → segment → hypothesize → test → learn → act, agentic assistance at each step
The Dual-Metric Principle
Every experiment measures against both revenue and satisfaction. Not one or the other — both, always.
Revenue-only optimization leads to dark patterns. Satisfaction-only optimization leaves money on the table. The interesting decisions happen when the two metrics diverge: a change that lifts revenue 5% but drops satisfaction signals a dark pattern worth investigating, not a winner to ship.
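A minimal sketch of a dual-metric decision rule, assuming both metrics are binary rates (conversion, "satisfied" survey responses) with Beta(1, 1) priors; Tyche's actual models and thresholds are richer than this:

```python
import random

random.seed(0)

def prob_improvement(success_a, n_a, success_b, n_b, draws=20000):
    """Monte Carlo P(variant B's rate > variant A's rate), Beta(1,1) priors."""
    wins = sum(
        random.betavariate(1 + success_b, 1 + n_b - success_b)
        > random.betavariate(1 + success_a, 1 + n_a - success_a)
        for _ in range(draws)
    )
    return wins / draws

def dual_metric_verdict(revenue, satisfaction, threshold=0.8):
    """revenue / satisfaction: (success_a, n_a, success_b, n_b) tuples."""
    p_rev = prob_improvement(*revenue)
    p_sat = prob_improvement(*satisfaction)
    if p_rev >= threshold and p_sat <= 1 - threshold:
        # Revenue likely up, satisfaction likely down: the divergent case.
        return "investigate: possible dark pattern"
    if p_rev >= threshold and p_sat >= 0.5:
        return "ship"
    return "keep collecting data"

# Revenue conversion rises (20% -> 26%) while satisfaction falls (90% -> 76%).
verdict = dual_metric_verdict(
    revenue=(200, 1000, 260, 1000),
    satisfaction=(450, 500, 380, 500),
)
```

The divergent branch is the point: the rule refuses to declare a winner on revenue alone.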
Segmentation as Practice
Segmentation is a practice, not a model. The model is a snapshot: a particular set of segment boundaries applied to visitors at a point in time. The practice is the ongoing process by which that model improves through the virtuous loop.
This distinction matters for implementation:
- The model is a dbt transformation that computes segment assignments from behavioral/transactional/stated signals and pushes them to a cache
- The practice is the organizational process: run experiments → Tyche discovers meaningful boundaries → incorporate boundaries into the model → run better experiments → repeat
- Infrastructure supports the practice by making model updates low-friction: add a new signal to dbt, recompute assignments, new experiments can target the new segment immediately
Starting signals for a v1 model are deliberately simple — new vs. returning, purchase history, basic engagement level. Sophistication comes from loop iterations, not from over-engineering the initial model.
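A v1 model in this spirit is little more than a lookup. The field names (`orders`, `sessions_30d`) and thresholds are illustrative only; the production version is a dbt transformation over warehouse tables, not application code:

```python
def v1_segments(visitor):
    """Deliberately simple v1 assignments; sophistication comes from
    loop iterations, not from this function growing more clever."""
    orders = visitor.get("orders", 0)
    sessions = visitor.get("sessions_30d", 0)
    return {
        "lifecycle": "returning" if orders > 0
                     else ("engaged" if sessions > 1 else "new"),
        "purchaser": orders > 0,
        "engagement": "high" if sessions >= 4 else "low",
    }
```

A later iteration might add a `took_course` key here once Tyche shows that boundary matters, without changing anything else in the pipeline.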
Enriched Assignment Architecture
The system has two sides with different performance characteristics, connected by a cache contract.
HEAVY SIDE (batch, Python/SQL) FAST SIDE (per-request, TS)
────────────────────────────── ────────────────────────────
┌──────────┐ ┌──────────┐ ┌──────────────────────┐
│ Umami │───►│ dbt │ │ Storefront │
│ events │ │ segment │ │ (RR7 loader) │
└──────────┘ │ models │ │ │
└────┬─────┘ │ visitor_id │
┌──────────┐ │ │ │ │
│ Tyche │ ▼ │ ▼ │
│ HTE disc.│ ┌──────────┐ │ ┌──────────────┐ │
│ → new │───►│ segment │──── push ───►│ │ Redis cache │ │
│ segment │ │ assign- │ (Redis) │ │ assignable_ │ │
│ boundaries │ ments │ │ │ attributes │ │
└──────────┘ │ table │ │ └──────┬───────┘ │
└──────────┘ │ │ │
│ ▼ │
│ experiment │
│ assignment │
│ (segment-aware) │
└──────────────────────┘
The heavy side (Python, dbt, PyMC) does the computationally expensive work in batch. The fast side (TypeScript, Redis) serves per-request experiment assignments in microseconds. The contract between them is assignable_attributes — a key-value store keyed by visitor ID containing segment memberships and other targeting attributes.
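The cache contract can be sketched with an in-memory dict standing in for Redis. The `assignable_attributes:` key prefix, the JSON value encoding, and the fallback defaults are assumptions for illustration, not the actual schema:

```python
import json

# In-memory stand-in for the Redis cache; the real system would use a
# Redis client with the same key/value shape on both sides.
_cache = {}

def push_assignments(visitor_id, attributes):
    """Heavy side: publish recomputed segment assignments for one visitor."""
    _cache[f"assignable_attributes:{visitor_id}"] = json.dumps(attributes)

def read_assignments(visitor_id):
    """Fast side: per-request lookup; unknown visitors get safe defaults
    so experiment assignment never blocks on the heavy side."""
    raw = _cache.get(f"assignable_attributes:{visitor_id}")
    return json.loads(raw) if raw else {"segments": [], "lifecycle": "new"}

push_assignments("v_123", {"segments": ["returning", "took_course"],
                           "lifecycle": "active"})
attrs = read_assignments("v_123")
```

The important property is the one-way flow: the fast side only ever reads, so a batch recompute on the heavy side can never slow down a storefront request.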
Why This Matters for Non-Web-Scale Stores
Most A/B testing tooling assumes millions of visitors. Single stores (30K-300K monthly visits) have limited statistical power, low conversion rates, and high order value variance. Standard frequentist A/B tests often say "inconclusive" after weeks of waiting.
The commerce intelligence approach addresses this through:
- Bayesian inference (Tyche) — produces probability distributions, not p-values. "82% chance variant B is better" is actionable even without classical significance
- Sequential testing / optional stopping — don't wait for a fixed sample size; check continuously with proper statistical controls
- Constrained HTEs — extract meaningful signal by finding stable, targetable segments large enough to act on, even with limited total traffic
- The full loop — even when individual experiment results are noisy, the loop compounds understanding over time. Each iteration refines the model, which produces better-targeted experiments, which produce cleaner signal
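The Bayesian and sequential points above can be illustrated together with a toy simulation, using stdlib Monte Carlo draws in place of PyMC and a simple posterior threshold in place of Tyche's actual stopping controls:

```python
import random

random.seed(42)

def posterior_prob_b_better(conv_a, n_a, conv_b, n_b, draws=4000):
    """Monte Carlo P(rate_B > rate_A) under independent Beta(1, 1) priors."""
    wins = sum(
        random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        > random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        for _ in range(draws)
    )
    return wins / draws

def sequential_test(rate_a, rate_b, batch=200, max_batches=10, stop_at=0.95):
    """Check the posterior after every batch and stop as soon as confident,
    instead of committing to a fixed sample size up front."""
    conv_a = n_a = conv_b = n_b = 0
    for _ in range(max_batches):
        conv_a += sum(random.random() < rate_a for _ in range(batch))
        conv_b += sum(random.random() < rate_b for _ in range(batch))
        n_a += batch
        n_b += batch
        p = posterior_prob_b_better(conv_a, n_a, conv_b, n_b)
        if p >= stop_at:
            return "B", n_a + n_b
        if p <= 1 - stop_at:
            return "A", n_a + n_b
    return "inconclusive", n_a + n_b

# Simulated 5% vs 12% conversion: a gap this size resolves on far less
# traffic than a fixed-horizon frequentist test would demand.
winner, visitors_used = sequential_test(rate_a=0.05, rate_b=0.12)
```

The output is a probability statement ("P(B better) crossed 95%") rather than a p-value, which is what makes small-store results actionable.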
The infrastructure is designed for stores where every visitor matters and every experiment needs to earn its traffic allocation.