Stage 5: Marketing Data

by | May 14, 2026 | Digital Roadmap

Stage 5 of the Product Data Refinery: Where Product Becomes Visible

Product search used to mean Google plus the search bar at the top of a distributor’s website. That definition no longer holds. Search now includes AI agents shopping for buyers, LLMs interpreting conversational queries, and MCP-connected procurement tools acting as the buyer’s hands. Customers are reaching for these tools because they have to. The time pressure on them is real. Procurement teams are being asked to take on more work and to do it more accurately. That demands efficiency, and it has turned into a race to AI enablement.

However, there is a real gap. AI solutions do not match the accuracy a determined human can achieve by sitting through a bad search experience. That is not because the AI is incapable. It is because the data it needs is missing. Attributes are blank or inconsistent. Descriptions don’t reinforce what the attributes say. Categories are off. Images are wrong or absent. Enter Stage 5.

Marketing data (our Stage 5) is more than findability. It is also branding and marketing, which creates perception, and perception is reality. A company with average operations and disciplined Stage 5 data can look perfect to the market. A strong supplier with weak Stage 5 data looks invisible. That has always been true in marketing. AI amplifies it. AI judges data quality, not supplier quality. The seasoned buyer used to apply judgment beyond the catalog, recognizing which supplier was actually trustworthy regardless of how the page looked. AI flattens that judgment. The distributor with the cleanest data wins the agent’s pick, deserved or not.

A buyer opens a chat with an AI agent and asks for a 4-inch flap disc, 80 grit, for 300 series stainless steel. The agent shops across the catalogs it can reach. Whether your products show up depends entirely on whether the data at this station was governed. The buyer never visits your search bar. The agent searched for them, and the purchasing decision was made before the buyer ever saw a product page. This may not be a next-year problem, but it certainly will be one within the next decade. The distributors and manufacturers that begin building Stage 5 discipline now will own the catalog presence procurement is moving toward.

Last issue, we built the fulfillment station, where price, lead time, and fulfillment model became commercially honest. In this issue, we make the product findable and compelling to whoever, or whatever, is searching for it. Everything Stage 5 produces serves search, in every form search now takes.

Business framing: the first station the customer sees

The first four stations did their work out of the customer’s view. SKU Definitions stabilized identity. Regulatory Data confirmed the product could legally and safely move. Operational Data captured its physical reality. Fulfillment Data made its commercial truth honest. None of that work was ever displayed to the buyer. It was the refinery, running.

Stage 5 is where the refining becomes visible. Descriptions, titles, images, faceted attributes, taxonomy, brand and manufacturer recognition, supporting digital assets. The data that decides whether a buyer, or an agent acting on their behalf, can find the product, recognize it, and trust what they see. The distributor’s brand lives at this station. So does the digital shelf, which is the only catalog most buyers will ever experience. Both are direct outputs of how disciplined the work is here.

Back when we introduced this series, we named the pattern. This is the station every organization wants to start with. It’s the shiny object that everyone (internally and externally) sees. It shows up in demos. Leadership can point at a product page and feel like progress is happening. The earlier stages produce no such artifact. A clean UOM hierarchy does not photograph well.

This issue is the reckoning with that instinct. Marketing & Merchandising is not the starting point. It is the payoff station for everything upstream. The work at this station only produces a clean output when the refined fuel flowing in is already clean. Inconsistent SKUs, blank operational fields, unreliable lead times, and missing regulatory profiles all surface here as merchandising defects that no creative effort can compensate for. The product page looks polished, and underneath it, everyone is quietly absorbing the gaps.

Very simply, digitizing your product creates transparency, and this stage is the final line to cross before going to the customer. In an analog world, sales, customer service, and operations teams were the company’s intermediary layer, absorbing gaps before the buyer ever saw them. E-commerce removed most of that buffer. The agentic future removes the rest. The product page is now the entire conversation.

What breaks when this stage is weak

The symptoms are visible to anyone who has shopped a distributor catalog and noticed it feels wrong without knowing exactly why.

Product titles within a single category read differently depending on which manufacturer supplied them. One title starts with the diameter. The next starts with the brand. The next starts with the grit. The buyer scrolls a result page and has to re-parse each line. They do it once. They do not do it three times. They go somewhere the data is consistent.

Faceted navigation surfaces only the products that happened to have attributes populated. The rest exist in the catalog but are functionally invisible. A buyer filters for “316 stainless” and sees twelve results. The actual count is forty. The other twenty-eight had the alloy specified in the description, not the attribute, and the facet has no way to read prose.
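The facet gap described above can be measured. Here is a minimal sketch, assuming a hypothetical record shape (`sku`, `alloy`, `description`) and an assumed regular expression for how alloys appear in prose. It lists SKUs whose alloy is named in the description but blank in the attribute, which are the products a facet filter cannot see.

```python
import re

# Hypothetical catalog rows: "alloy" is the facet attribute, "description" is free text.
skus = [
    {"sku": "A-100", "alloy": "316", "description": "Hex bolt, 316 stainless steel."},
    {"sku": "A-101", "alloy": None, "description": "Hex bolt in 316 stainless, M8 thread."},
    {"sku": "A-102", "alloy": None, "description": "Hex bolt, zinc-plated carbon steel."},
]

# Assumed pattern: a 3xx alloy number followed by "stainless" or "SS".
ALLOY_IN_PROSE = re.compile(r"\b(3\d{2}L?)\s+(?:stainless|ss)\b", re.IGNORECASE)

def facet_invisible(rows):
    """Return (sku, alloy) pairs where the attribute is blank but the prose names an alloy."""
    hits = []
    for row in rows:
        if row["alloy"]:
            continue  # the facet can already see this SKU
        match = ALLOY_IN_PROSE.search(row["description"])
        if match:
            hits.append((row["sku"], match.group(1).upper()))
    return hits

print(facet_invisible(skus))
```

A real catalog would need a pattern per attribute and a human review of the hits before anything is written back to the attribute field.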

Customers land on the wrong category and conclude the distributor does not carry the product. Sometimes the product is in a category that does not match the search term the buyer used. Sometimes it is in two categories and gets duplicated. Sometimes the taxonomy is deep enough that the product is buried six levels down where no buyer will navigate to it. Stage 5 was the gate against all three, and the gate was open.

Search burying the better product is more common than most distributors realize. Modern search scores partially on fill rate and attribute-to-description redundancy. A product with a complete attribute set and a description that reinforces those attributes outranks an otherwise identical product where the attribute set is incomplete. The better product, by any operational measure, ends up below the more thoroughly described one. The distributor’s instinct is to blame the search engine. The data was the problem.

In my experience, product images are arguably the most critical element of marketing data. Missing images create buyer hesitation that no amount of detailed copy can repair. The image is the single most reliable trust signal a buyer has at first look (it can also be the largest detriment when incorrect). A product without one suggests the catalog is not maintained. A product with a wrong image suggests the catalog is wrong about other things, too. In a B2B context where the buyer often cannot tolerate the wrong part arriving, the hesitation is rational. Trust is broken, and they go elsewhere.

Brand and manufacturer recognition collapses when logos, manufacturer names, and brand fields are entered inconsistently across SKUs. 3M might be entered as “3M,” “3M Company,” “Three-M,” and “3-M” across the same catalog, depending on who created the SKU. Faceted filters and brand pages fragment as a result. A buyer searching for 3M abrasives sees four versions of 3M in the brand filter and trusts none of them.
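One way to collapse those variants is a governed alias table. The sketch below is illustrative only: the `BRAND_ALIASES` table and the `normalize_brand` helper are hypothetical names, and the table would normally live in the PIM as reference data rather than in code.

```python
# Hypothetical alias table mapping raw entries (lowercased) to one canonical brand.
BRAND_ALIASES = {
    "3m": "3M",
    "3m company": "3M",
    "three-m": "3M",
    "3-m": "3M",
}

def normalize_brand(raw):
    """Return the canonical brand for a raw entry, or None if the entry is unknown."""
    key = " ".join(raw.strip().lower().split())  # trim and collapse whitespace
    return BRAND_ALIASES.get(key)

raw_entries = ["3M", "3M Company", "Three-M", " 3-m ", "Acme Abrasives"]
canonical = {entry: normalize_brand(entry) for entry in raw_entries}

# Unknown entries are routed to a person instead of being guessed at.
unmapped = [entry for entry, brand in canonical.items() if brand is None]
print(canonical)
print(unmapped)
```

Returning `None` for an unknown brand is a deliberate choice in this sketch: a new value becomes a review item, not a fifth spelling in the brand filter.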

And then there is the symptom merchandising teams know best, because they live it. The human cost. The hand-curation that takes the place of governance. Someone in merchandising is rewriting titles by category. Someone else is hunting down missing images. A third person is normalizing the attribute values that should have been normalized at intake. None of that work scales. It also does not survive turnover. When the person leaves, the institutional memory of which categories were cleaned up leaves with them, and the catalog drifts back to entropy within a year.

In the agentic context, the failures change shape. The agent does not call a buyer to ask whether “coarse” means 40-grit or 60-grit. It picks whichever product the data unambiguously supports. If that is not your product, you lost the order silently.

What “good” looks like at this stage

A healthy Stage 5 catalog reads consistently across brands within a category. Titles follow a defined sequence. Sanding discs always lead with diameter, then attachment type, then grit, then brand, or whatever order the category team decided. The decision is documented. New SKUs are titled to the standard before they publish. The standard is typically designed by a category manager or category expert. It is critical that a subject matter expert (SME) is embedded within the marketing data strategy. Most, if not all, of what good looks like below comes from subject matter expertise.
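A documented title sequence can be applied mechanically. This is a rough sketch, assuming the sanding-disc order from the example above; the `TITLE_SEQUENCE` table, the `build_title` function, and the attribute names are all hypothetical.

```python
# Hypothetical category standard: field order follows the sanding-disc example.
TITLE_SEQUENCE = {
    "sanding_discs": ["diameter", "attachment", "grit", "brand"],
}

def build_title(category, attributes, product_noun):
    """Assemble a title in the category's documented order.

    Raises ValueError when a field the standard requires is missing,
    so an incomplete SKU cannot be titled silently.
    """
    sequence = TITLE_SEQUENCE[category]
    missing = [name for name in sequence if not attributes.get(name)]
    if missing:
        raise ValueError(f"cannot title SKU, missing: {missing}")
    parts = [attributes[name] for name in sequence]
    return " ".join(parts + [product_noun])

# The supplier's field order does not matter; the standard decides.
supplier_row = {"brand": "3M", "grit": "80 Grit", "diameter": "5 in.",
                "attachment": "Hook-and-Loop"}
print(build_title("sanding_discs", supplier_row, "Sanding Disc"))
```

Because the order comes from the category table and not from the supplier's row, two manufacturers feeding the same category produce titles that read the same way.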

Faceted attributes are populated to a defined minimum fill rate before a product can go live. Where the standard is, for example, ninety percent of category-defining attributes populated, the product gates at that threshold. Below it, the SKU is held in PIM until the gap is closed. The merchandising team is not making this decision SKU by SKU. The gate is automated against the standard.
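The fill-rate gate reduces to a small calculation. The sketch below assumes ten hypothetical category-defining attributes and the ninety percent example threshold mentioned above; none of the names come from a specific PIM.

```python
# Hypothetical category-defining attributes for one category.
REQUIRED_ATTRIBUTES = [
    "diameter", "attachment", "grit", "abrasive_material", "max_rpm",
    "backing", "pack_qty", "brand", "arbor_size", "application",
]
FILL_THRESHOLD = 0.90  # the example standard: ninety percent populated

def is_populated(value):
    """Treat None and blank strings as unpopulated."""
    return value is not None and str(value).strip() != ""

def fill_rate(attributes, required=REQUIRED_ATTRIBUTES):
    populated = sum(1 for name in required if is_populated(attributes.get(name)))
    return populated / len(required)

def gate_status(attributes):
    """Apply the standard the same way to every SKU."""
    return "publish" if fill_rate(attributes) >= FILL_THRESHOLD else "hold in PIM"

sku = {name: "value" for name in REQUIRED_ATTRIBUTES}
sku["arbor_size"] = ""          # 9 of 10 populated
print(fill_rate(sku), gate_status(sku))

sku["max_rpm"] = None           # 8 of 10 populated
print(fill_rate(sku), gate_status(sku))
```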

The attribute dictionary is documented. Where ambiguous values appear in the wild (grit and grade, coarse and fine, heavy-duty and standard-duty), the distributor publishes the definition the buyer is going to encounter. McMaster-Carr is a useful reference here, particularly in fastener, pipe fitting, and component hardware categories. Their faceted attributes carry definitions, ranges, and disambiguation notes that let a buyer understand what a value means before they filter on it. The discipline is visible because the data discipline behind it was made first. Worth noting: this approach fits component-heavy categories well. The same distributor handling power tools or capital equipment would build the experience differently, and that is not a failure. Stage 5 maturity is built category by category. No distributor is best at everything, and the ones who claim to be tend to be best at nothing.

Images exist for every active SKU and meet the defined standard. White background or whichever background the brand standard dictates. Defined minimum resolution. Defined angle. Where supporting assets exist (spec sheets, install guides, video), they are linked from the SKU record and accessible from the product page. The buyer never has to leave the page to verify what the product is.

Descriptions reinforce the attribute set rather than replacing it. The same attribute value (3/4-inch diameter, 80 grit, hook-and-loop attachment) appears in both the attribute table and the description prose. That redundancy is not laziness. It is search fuel. Modern search scores higher when key facts appear in multiple fields, and AI agents parsing the catalog confirm interpretations the same way. A description that exists independently of the attribute set is worth less to search than one that confirms it.
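Whether a description reinforces its attributes can be checked, at least crudely. The following sketch uses a plain case-insensitive substring match, which is an assumption made for brevity; the `unreinforced` helper and the sample values are hypothetical, and a production check would need to handle unit and spelling variants.

```python
def unreinforced(attributes, description):
    """Return the attribute names whose value never appears in the description."""
    text = description.lower()
    return [name for name, value in attributes.items()
            if str(value).lower() not in text]

attrs = {"diameter": "5 in.", "grit": "80 grit", "attachment": "hook-and-loop"}

good = "5 in. sanding disc, 80 grit aluminum oxide, with hook-and-loop attachment."
weak = "Premium sanding disc for a smooth finish."

print(unreinforced(attrs, good))  # nothing missing
print(unreinforced(attrs, weak))  # every attribute is missing from the prose
```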

Taxonomy is owned by the distributor, not inherited from a manufacturer feed. Taxonomy management is often an underappreciated task, but it is arguably the most differentiating factor of a company’s brand. Categories are deep enough to support meaningful navigation and shallow enough to avoid pigeonholing products into nodes no buyer will ever reach. The 35,000-leaf-node taxonomy is not the goal. A few hundred well-governed top categories with deliberate depth in the high-volume areas is closer to it. This is also foundational for customer experience, and it feeds directly into an effective price strategy.

Sales, customer service, and the digital channel all surface the same data when they talk about the same product. There is no internal version that is correct and an external version that is “the customer-facing one.” The customer-facing data is the data, because every team that needed it consumed from the same governed source. If the internal teams are using the website as a product resource, then you can be sure the customers will too.

And the team itself is doing the right work. Merchandising is focused on growth, on launching new categories, on launching new programs, on improving conversion. They are not hand-curating defects that should have been gated at intake. That is the truest test of whether Stage 5 is healthy, more than any one metric. The team is building, not patching.

Building the filter: distributor-standardized content normalization

The first four stations each had a filter pattern of their own. SKU Definitions used sequential governance layers. Regulatory Data used a category-driven rulebook with conditional toggles. Operational Data used a structural filter where every product needed every layer. Fulfillment Data was bilateral, distributor-defined and supplier-confirmed.

Stage 5 has its own pattern: distributor-standardized, brand-agnostic content normalization. The distributor sets the content standard. Manufacturers provide the raw content. The distributor transforms what they provide into a normalized, buyer-ready form. Manufacturers do not get to define how their products are presented inside the distributor’s catalog. That is the distributor’s brand decision, and it is also the distributor’s data discipline. This can be a point of contention. Admittedly, it takes teamwork between the manufacturer and the distributor to properly represent and leverage key information that the brand or manufacturer may want to get in front of their end-user customers. Not all distributors’ websites function the same, so the brand or manufacturer has to work within the constraints of their customer’s platform.

Most distributors get this polarity wrong. They take manufacturer-provided content at face value and load it into the PIM. The catalog inherits the manufacturer’s variance, brand by brand, and the merchandising team spends its time fighting symptoms. The distributor’s content standard is what closes the gap. The filter is what enforces it.

DISTRIBUTOR-DEFINED CONTENT STANDARD → MANUFACTURER-PROVIDED RAW CONTENT → NORMALIZED, BUYER-READY PRODUCT DATA

Layer 1: Taxonomy ownership. The distributor owns the taxonomy. The search engine consumes whatever the distributor feeds it, and the navigation experience is the taxonomy made visible. Categories define what a product can be. Subcategories define how deep the navigation gets. Manufacturer-supplied categories are inputs, not authorities. Where two manufacturers categorize the same product differently, the distributor decides, and the decision is documented. Taxonomies that are too deep bury products. Taxonomies that are too shallow do not support faceted filtering at the level customers actually shop. Both are fixable. Neither fixes itself.

Owned by: Category / Taxonomy team
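Two of the taxonomy failures named earlier, products buried too deep and products placed in more than one category, are straightforward to detect. This is a sketch under stated assumptions: placements are hypothetical `(sku, path)` pairs with `/` as the level separator, and the `MAX_DEPTH` of four is an illustrative limit, not a recommendation.

```python
from collections import defaultdict

MAX_DEPTH = 4  # hypothetical limit set by the taxonomy owner

placements = [
    ("SKU-1", "Abrasives/Discs/Flap Discs"),
    ("SKU-2", "Abrasives/Discs/Sanding/Hook-and-Loop/5 in./80 Grit"),
    ("SKU-3", "Fasteners/Bolts/Hex"),
    ("SKU-3", "Hardware/Bolts"),
]

def audit_taxonomy(rows, max_depth=MAX_DEPTH):
    """Flag SKUs placed deeper than max_depth or placed in more than one node."""
    paths_by_sku = defaultdict(list)
    for sku, path in rows:
        paths_by_sku[sku].append(path)
    too_deep = sorted(
        sku for sku, paths in paths_by_sku.items()
        if any(len(path.split("/")) > max_depth for path in paths)
    )
    duplicated = sorted(sku for sku, paths in paths_by_sku.items() if len(paths) > 1)
    return {"too_deep": too_deep, "duplicated": duplicated}

print(audit_taxonomy(placements))
```

Some distributors place a product in two nodes on purpose. In that case the duplicate list is a review queue, not a defect list.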

Layer 2: Attribute normalization and the attribute dictionary. Manufacturers create attribute data. Distributors normalize it. The attributes themselves (grit, alloy, thread pitch, voltage, capacity) are defined at the category level, with allowed values and clear definitions. Manufacturer-supplied values get mapped to the normalized set. Where ambiguity exists (coarse versus 40-grit, medium-duty versus standard-duty), the distributor publishes the definition the buyer will see. The attribute dictionary is the document that makes that definition consistent across categories and consistent across years. Without it, every new merchandiser invents their own interpretation, and the catalog drifts again.

Owned by: Product Data / Item Enrichment team
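An attribute dictionary entry can carry the definition, the allowed values, and the mapping from supplier values in one place. The structure below is a hypothetical illustration; in particular the grit ranges attached to each grade are placeholder assumptions, since the distributor's category expert is the one who sets them.

```python
# Hypothetical dictionary entry. The ranges are placeholders, not an industry standard.
ATTRIBUTE_DICTIONARY = {
    "grit_grade": {
        "definition": "Grade label the buyer sees, defined as an inclusive grit range.",
        "allowed": {"Coarse": (40, 60), "Medium": (80, 120), "Fine": (150, 220)},
        "supplier_values": {"crs": "Coarse", "coarse": "Coarse",
                            "med": "Medium", "medium": "Medium", "fine": "Fine"},
    }
}

def normalize_value(attribute, raw):
    """Map a supplier-provided value to the normalized set, or None if unmapped."""
    entry = ATTRIBUTE_DICTIONARY[attribute]
    return entry["supplier_values"].get(raw.strip().lower())

def grade_for_grit(grit):
    """Return the published grade for a numeric grit, or None if out of range."""
    for grade, (low, high) in ATTRIBUTE_DICTIONARY["grit_grade"]["allowed"].items():
        if low <= grit <= high:
            return grade
    return None

print(normalize_value("grit_grade", " CRS "))
print(grade_for_grit(80))
print(grade_for_grit(36))
```

Values that fall outside the dictionary come back as `None`, which keeps the interpretation with the dictionary owner rather than with whoever is loading content.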

Layer 3: Title and description discipline. Product titles within a category follow a defined sequence. The sequence is decided once, at the category level, and applied to every SKU in that category regardless of which manufacturer supplied it. Long descriptions reinforce the attribute set, in the distributor’s voice, with consistent length and structure. AI is useful here, not for generating descriptions from nothing but for transforming inconsistent manufacturer content into the distributor’s standard form. The distributor still owns the standard. AI is the tool that scales the application.

Owned by: Content team, governed by category standards

Layer 4: Digital asset standards. Images, video, spec sheets, install guides. The standard defines what is required at the SKU level (typically at least one primary image meeting the defined visual standard) and what is optional but valuable. Image background, resolution, angle, and file naming follow rules the asset team can enforce. Supporting assets are linked, named, and tagged so they surface where the buyer can find them. Where manufacturer-supplied assets do not meet the standard, the distributor’s options are to source them, build them, or hold the SKU. The choice gets made deliberately. It does not get absorbed by publishing the product anyway and hoping the buyer does not notice.

Owned by: Digital Asset / Content team

Layer 5: Fill-rate and web-load gating. This is the equivalent of the regulatory gate at Stage 2, applied to merchandising. Products do not publish to the customer-facing site until they meet the defined content standard. Title in the right structure. Required attributes populated to threshold. Primary image present. Category correctly assigned. Description reinforcing the attributes. Below the threshold, the SKU is held in PIM. The gate is enforced by the system, not by the merchandiser’s judgment. The merchandiser’s judgment is what set the threshold in the first place, but the application is automated.

Owned by: Merchandising / PIM governance
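The web-load gate can be sketched as a list of checks that either all pass or return the reasons for a hold. The `SkuRecord` fields below are hypothetical and assume each upstream check has already been computed; the ninety percent threshold is the example figure used earlier, not a fixed rule.

```python
from dataclasses import dataclass

FILL_THRESHOLD = 0.90  # example threshold, set by the merchandising standard

@dataclass
class SkuRecord:
    """Hypothetical summary of one SKU's content checks."""
    sku: str
    title_in_structure: bool
    fill_rate: float
    has_primary_image: bool
    category_assigned: bool
    description_reinforces: bool

def hold_reasons(record):
    """Return the reasons a SKU must be held in PIM; an empty list means publish."""
    checks = [
        (record.title_in_structure, "title not in category structure"),
        (record.fill_rate >= FILL_THRESHOLD, "required attributes below threshold"),
        (record.has_primary_image, "primary image missing"),
        (record.category_assigned, "category not assigned"),
        (record.description_reinforces, "description does not reinforce attributes"),
    ]
    return [reason for passed, reason in checks if not passed]

ready = SkuRecord("D-200", True, 0.95, True, True, True)
held = SkuRecord("D-201", True, 0.70, False, True, True)

print(hold_reasons(ready))
print(hold_reasons(held))
```

Returning reasons rather than a bare yes or no gives the merchandiser a work list for each held SKU, while the publish decision itself stays with the system.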

Five layers. The distributor defines the standard. The system enforces it. The manufacturer’s content flows through it, normalized into something the catalog can actually use, and something an AI agent can actually interpret. That is the filter.

The leadership mindset for this stage

The most common dysfunction at Stage 5 is leadership treating merchandising as a creative function rather than a data discipline. It is easy for those not doing the work to lack empathy around what it takes to scale this data domain.

The product page is judged on how it looks. The catalog is judged on how it photographs. Whether the data underneath the page is governed is not a question executives are usually asking, because the page itself looks fine. It almost always looks fine. The merchandising team is good at making a single page look fine. The question they are not being asked is whether the next thousand pages will look fine without that team manually intervening on every one.

This is the gap. Stage 5 is the data discipline that determines whether your digital shelf scales. The merchandising team is not the standard. They are the team that applies the standard. The standard is leadership’s responsibility, and it is the work of Stage 5.

There is a related dysfunction worth naming. Leadership often outsources the standard to manufacturers, by default, by accepting whatever attribute data and category structure the manufacturer provides. That is not partnership. That is abdication, and the distributor’s brand pays for it. Manufacturers will accommodate the distributor’s standard when it exists and is consistently enforced. They will not invent it. The distributor knows what its catalog requires. The distributor knows what its customers search for. The distributor is the one whose brand sits on top of the product page. The standard is theirs to set.

The question for a leadership team at this station is direct. Is the distributor’s brand on the digital shelf something the business is governing, or is it something being assembled by whoever happens to be loading content this week? Both answers exist in practice. Only one of them scales.

And there is a forward-looking version of the same question. The agentic future is going to consume your catalog the way your sales team used to: completely, structurally, and unforgivingly. The agent does not call a buyer to clarify what coarse means. It picks the product where the data is unambiguous. The leadership decision today is whether the catalog will be ready for that consumption pattern, or whether the business will discover the gap one missed opportunity at a time, with no clear way to trace it.

Where to start

The diagnostic at this station is category-bounded, not catalog-wide. Stage 5 maturity is built category by category. The starting audit follows that.

Pick one product category. Choose one with enough SKUs across multiple manufacturers to make inconsistency visible. Fasteners. Abrasives. Electrical fittings. Whatever category in your business carries enough brand variance that the differences will show.

Audit at the SKU level against five questions:

  • Is the product title consistent in sequence and structure across all brands in the category, or does the order shift depending on who supplied the SKU?
  • Are the faceted attributes populated to your defined fill rate, and do the attribute values reconcile to a normalized dictionary, or are manufacturer values flowing through as-is?
  • Does every active SKU have at least one primary image meeting the distributor’s defined standard, or are there missing images, wrong images, or inconsistent treatments inside the same category?
  • Does the description reinforce the same facts the attribute table carries, or does the description exist independently of the attributes (or contradict them)?
  • Is the category placement correct for every SKU, and does the taxonomy node have enough depth to be navigable and enough restraint to be findable?

One category. Five questions. The result is a diagnostic, not a comprehensive audit. It will tell the leadership team whether the distributor’s brand on the digital shelf is governed or improvised inside this category. Most distributors are surprised by how much is improvised inside categories they consider mature.

The follow-on work then sequences naturally. Categories where the audit results are worst become the first targets for standard-setting and normalization. Categories that are already mature become reference models for how the rest of the catalog can be built. The Stage 5 program is not one project. It is a category-by-category discipline, applied in the order the business decides matters most.
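Sequencing the follow-on work can start from the audit numbers themselves. In this sketch the pass rates are made-up illustrative values, and averaging the five questions equally is an assumption; a business could just as reasonably weight the questions or rank by revenue at risk.

```python
# Illustrative pass rates per audit question (share of SKUs passing). Not real data.
audit_results = {
    "Abrasives": {"title": 0.62, "attributes": 0.55, "image": 0.90,
                  "description": 0.48, "category": 0.95},
    "Fasteners": {"title": 0.90, "attributes": 0.85, "image": 0.95,
                  "description": 0.80, "category": 1.00},
    "Electrical Fittings": {"title": 0.40, "attributes": 0.50, "image": 0.70,
                            "description": 0.30, "category": 0.60},
}

def prioritize(results):
    """Order categories from worst to best average pass rate."""
    averages = {
        category: sum(questions.values()) / len(questions)
        for category, questions in results.items()
    }
    return sorted(averages, key=averages.get)

print(prioritize(audit_results))
```

The categories at the front of the list are the first candidates for standard-setting, and the ones at the end are the candidate reference models.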

Next station

Stage 6: Product Relationships. With the underlying data normalized, the business can finally make intelligent connections between products. Cross-references, substitutions, complementary items, kits, upsells. The capability every organization wanted to start with becomes reliable only when the products it connects are fully defined, commercially honest, and consistently presented. Get Stage 5 right, and the final station of the refinery has something worth building on. Get it wrong, and no relationship logic, however clever, can connect products whose underlying identity, presentation, and recognizability are still in flux.

Lost in digital? Let us know how we can help (info@b2b-squared.com).

 
