Creating Cultures of Product-Led Experimentation

Jul 3

Building and maintaining successful products requires a sustained commitment to innovation. As PMs, we’re tasked with delivering exceptional value to customers while ensuring business growth.

One of the best ways to achieve these goals is to foster a culture of product-led experimentation within your organization.

That is, our goal is to use our products to drive experiments for sales, marketing, customer success, and other customer-facing functions.

That’s why we’ve combined the hard-won product experimentation insights we’ve gathered at Product Teacher with the perspectives shared in Amplitude’s PLG guide to create this resource on building experimentation cultures.

Below, we’ll explore the various aspects of building cultures of experimentation, from designing effective experiments to expanding beyond traditional A/B testing, and why it's crucial to act on directional results even before achieving statistical significance.

What Good Experimentation Design Looks Like

Before diving into the nuances of experimentation, we need to align on what constitutes good experimentation design. Many organizations mistakenly believe that experimentation involves throwing random ideas at the wall to see what sticks. This approach is highly inefficient and rarely yields valuable insights.

Effective experimentation design involves the following key principles:

Hypothesis-Driven: Every experiment should begin with a clear hypothesis that states the value the change is expected to deliver to users, as well as the value we expect to capture as a business. This hypothesis serves as the foundation for the experiment and helps in setting measurable goals.
Randomization: To ensure the validity of results, users should be randomly assigned to control and experimental groups. This minimizes bias and ensures that the observed differences can be attributed to the changes introduced by the experiment.
Measurement and Analytics: Define relevant key performance indicators (KPIs) that align with your hypothesis. Accurate and timely data collection and analysis are essential for evaluating the success of the experiment.
Sample Size Determination: Ensure that your experiment has a sufficiently large sample size to detect meaningful differences. Inadequate sample sizes can lead to inconclusive or misleading results.
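
To make the sample size principle concrete, here is a minimal sketch of the standard normal-approximation formula for comparing two conversion rates. The function name, the 5% significance level, the 80% power target, and the example rates are all illustrative assumptions rather than recommendations from any particular platform.

```python
import math
from statistics import NormalDist


def sample_size_per_cell(p_control: float, p_test: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate units needed in EACH cell to tell p_control from p_test."""
    if p_control == p_test:
        raise ValueError("The two rates must differ to size a test.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = p_test - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical inputs: 15% baseline adoption, hoping to detect a move to 25%.
n = sample_size_per_cell(0.15, 0.25)
print(n)
```

Under these assumptions the answer lands at roughly 250 units per cell. Smaller expected effects push the requirement up quickly, because the effect size is squared in the denominator.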

Buying vs. Building Experimentation Platforms

When it comes to setting up experimentation platforms, organizations often face the decision of whether to build in-house solutions or buy external platforms.

In most cases, buying is the better choice for several reasons:

Expertise and Specialization: External experimentation platforms are developed and maintained by experts in the field. They continuously improve their offerings based on industry best practices and user feedback.
Time and Cost Efficiency: Building an in-house platform can be time-consuming and costly. Purchasing an existing platform can provide immediate access to advanced features and functionalities.
Scalability and Reliability: Established experimentation platforms are built to scale and are often more reliable than homegrown solutions. They can handle increased user loads without performance issues.
Compliance and Security: External platforms often come with built-in compliance and security features, reducing the risk of data breaches and legal issues.

Challenges in B2B Product Experiments

B2B product experiments present unique challenges compared to their B2C counterparts. In B2B, the "atomic unit" for experimentation is an entire customer account (often consisting of hundreds or thousands of users) as opposed to individual users in B2C settings.

This difference introduces complexities in experiment design and execution:

Inability to randomize within a customer: for B2B use cases, user-level randomization is typically unacceptable because businesses need consistent experiences for their employees and their consumers. In other words, all users within Customer A should have the same experience (e.g. the test cell), and all users within Customer B should have the same experience (e.g. the control cell).
Non-independence of user behavior: in B2C, we can assume that most users don’t drastically influence each other’s “natural” behavior. But, in B2B, users within the same account will influence each other; for example, one manager at Customer XYZ may require 1,000 employees to adopt an unintuitive or painful workflow that these employees would otherwise never elect to adopt on their own.
Longer test resolution cycles: because we’re testing with only hundreds of entities (i.e. customer accounts) rather than millions of users, our tests will naturally have to run for a longer period of time to gain confidence. Keep in mind that B2C users often adopt new features within hours of exposure, whereas B2B users may sometimes require months to learn how to use a new productivity workflow.
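
One way to honor the "same experience for everyone in an account" constraint is to randomize on the account ID rather than the user ID. The sketch below is a hypothetical illustration: the function names, the `account_id` field, and the experiment name are assumptions, and a commercial experimentation platform would typically handle this assignment for you.

```python
import hashlib


def assign_cell(account_id: str, experiment: str, test_share: float = 0.5) -> str:
    """Deterministically map a whole account to 'test' or 'control'."""
    key = f"{experiment}:{account_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64
    return "test" if bucket < test_share else "control"


def cell_for_user(user: dict, experiment: str) -> str:
    """Users inherit their account's cell; the user ID is deliberately ignored."""
    return assign_cell(user["account_id"], experiment)


# Hypothetical users: u1 and u2 share an account, so they share a cell.
users = [
    {"user_id": "u1", "account_id": "customer-a"},
    {"user_id": "u2", "account_id": "customer-a"},
    {"user_id": "u3", "account_id": "customer-b"},
]
for user in users:
    print(user["user_id"], cell_for_user(user, "announcement-module"))
```

Hashing the experiment name together with the account ID keeps each account's assignment stable across sessions, while letting the same account land in different cells for different experiments.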

Expanding Experimentation Beyond A/B Testing

While A/B testing is a fundamental component of experimentation, successful product managers recognize that they can expand their experimentation toolkit outside of A/B testing.

Here are a couple of valuable approaches to broaden your experimentation horizons:

Sketching Wireframes: Experimentation can start even before writing a single line of code. Sketching wireframes and mockups and gathering user feedback can help refine ideas and ensure you're building the right product.
Product Pilots: Launch small-scale product pilots to test new features or concepts with a select group of users. These initial users provide in-depth feedback on usability and performance, which lets you validate the idea before a full-scale rollout.

Acting on Directional Results

One common misconception in experimentation is the belief that statistical significance must be achieved before taking action.

While statistical significance helps create confidence in test results, it's not always necessary to wait until significance is achieved, especially when time is a critical factor. Directional results, indicating a clear trend or pattern, can be immensely valuable.

Here's why acting on directional results can make sense:

Iteration speed: Waiting for statistical significance delays decision-making. If you see a powerful result within 2 weeks, then waiting another 6 weeks for the test to resolve is dangerous inertia; during those additional 6 weeks of inaction, your competitors could surpass you. In fast-moving markets, agility is essential.
Statistical barriers: In some experimentation setups, it’s effectively impossible to reach statistical significance because the number of available customers is fixed. For example, imagine that you’ve implemented a “new feature announcement module.” For the 100 customers in your test cell, you see that 25% adopt the new feature. For the 100 customers in your control cell, you see that only 15% adopt the new feature. The announcement appears to be working, since adoption is 10 percentage points higher than in the control (roughly 67% better in relative terms), but with only 100 customers per cell the result is not statistically significant at the conventional 5% level on a two-sided test.
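
The announcement module example can be checked with a pooled two-proportion z-test. This is a minimal sketch using only the Python standard library; the function name is our own, and it assumes each customer account behaves as an independent observation, which is itself a simplification.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 25 of 100 test customers adopted vs. 15 of 100 control customers.
z, p_value = two_proportion_z_test(25, 100, 15, 100)
relative_lift = (0.25 - 0.15) / 0.15
print(f"z = {z:.2f}, p = {p_value:.3f}, relative lift = {relative_lift:.0%}")
```

With these inputs the p-value comes out a little under 0.08, above the usual 0.05 cutoff, even though the relative lift works out to about 67%. That gap between a large observed lift and a non-significant p-value is exactly the situation where a directional read is often all a B2B team will get.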

Closing Thoughts

Building a culture of product-led experimentation is paramount for modern product managers and their customer-facing counterparts. A well-rounded product experimentation culture encompasses effective experimentation design, establishes clarity on choosing between building and buying experimentation platforms, embraces techniques beyond A/B testing, and leverages directional results.

By embracing experimentation as a fundamental practice, product managers can make informed decisions, drive innovation, and ultimately deliver products that delight customers and drive business growth. Experimentation isn't about throwing spaghetti at the wall; it's about using a systematic and strategic approach to uncover valuable insights that guide your product development journey.

Thank you to Pauli Bielewicz, Markus Seebauer, Juliet Chuang, and Kendra Ritterhern for making this guide possible.

Clement Kao