Question 1 of 6
FlowScale data shows: "Users who create a workflow within 24h have 3x higher 30-day retention." This is best described as:
A Causal evidence: creating workflows early causes better retention
B A correlation that cannot be used to justify forcing workflow creation without experimental validation
C Behavioral data that automatically qualifies as strong evidence
D Insufficient data to draw any conclusion
Correct. This is a correlation. A third factor may be driving both outcomes: highly engaged users are both more likely to create workflows early and more likely to retain. Designing a feature to force early workflow creation based on this correlation alone could be ineffective or even harmful.
Question 2 of 6
What is the correct experiment hypothesis format?
A "We believe that improving step 3 will increase retention."
B "Step 3 drop-off is caused by technical knowledge gaps."
C "If we simplify data source connection to require only a URL, then step 3 completion will increase from 41% to 55% because the primary barrier (API key knowledge) is removed."
D "We will test whether a simplified connection step improves activation."
Correct. A good hypothesis is falsifiable: If X, then Y will change by Z because of mechanism M. The mechanism is important: it tells you why you expect the change, which helps you diagnose when the hypothesis is wrong.
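As a rough illustration, the "If X, then Y will change by Z because of M" format can be captured as a small record that refuses to be built without a measurable change and a mechanism. The class name `Hypothesis` and its fields are hypothetical, not part of any FlowScale tooling; the values come from option C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for: If X, then Y moves from A to B because M."""
    change: str       # X: what we will do
    metric: str       # Y: what we expect to move
    baseline: float   # current value of the metric (as a fraction)
    target: float     # predicted value of the metric (as a fraction)
    mechanism: str    # M: why we expect the move

    def __post_init__(self):
        # Without a predicted size of change, no result could contradict the claim.
        if self.target == self.baseline:
            raise ValueError("target must differ from baseline")
        # Without a mechanism, a failed test gives no hint about what to revisit.
        if not self.mechanism.strip():
            raise ValueError("mechanism is required")

    def statement(self) -> str:
        return (
            f"If we {self.change}, then {self.metric} will move "
            f"from {self.baseline:.0%} to {self.target:.0%} "
            f"because {self.mechanism}."
        )


h = Hypothesis(
    change="simplify data source connection to require only a URL",
    metric="step 3 completion",
    baseline=0.41,
    target=0.55,
    mechanism="the primary barrier (API key knowledge) is removed",
)
print(h.statement())
```

Options A, B, and D could not be expressed in this form: each lacks a baseline and target, a mechanism, or both.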
Question 3 of 6
A confounding variable in the 24h workflow creation / retention correlation could be:
A The time of day users sign up
B Whether users have a company email
C The country the user is in
D User engagement level: highly engaged users are both more likely to create workflows early and more likely to retain, so engagement is the real driver
Correct. Engagement level is the classic confounding variable here. If you design a feature to force workflow creation early for low-engagement users, you are treating a symptom (late workflow creation) rather than the root cause (low engagement). The experiment may show no retention lift.
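A small simulation can make the confounding concrete. In the sketch below, every probability is invented for illustration (none are FlowScale figures), and early workflow creation is given no effect on retention at all. Engagement alone drives both behaviours, yet the naive comparison still shows early creators retaining far better.

```python
import random


def simulate(n_users=100_000, seed=0):
    """Simulate users whose engagement drives both early creation and retention."""
    rng = random.Random(seed)
    users = {}     # (engaged, created_early) -> number of users
    retained = {}  # (engaged, created_early) -> number retained at day 30
    for _ in range(n_users):
        engaged = rng.random() < 0.3                      # assumed share of engaged users
        early = rng.random() < (0.7 if engaged else 0.1)  # engagement -> early workflow
        kept = rng.random() < (0.6 if engaged else 0.1)   # engagement -> retention
        # Note: `kept` never looks at `early`, so there is no causal effect.
        key = (engaged, early)
        users[key] = users.get(key, 0) + 1
        retained[key] = retained.get(key, 0) + kept
    return users, retained


def rate(users, retained, early, engaged=None):
    """Retention rate for a group, optionally restricted to one engagement level."""
    keys = [k for k in users
            if k[1] == early and (engaged is None or k[0] == engaged)]
    return sum(retained[k] for k in keys) / sum(users[k] for k in keys)


users, retained = simulate()
naive_ratio = rate(users, retained, True) / rate(users, retained, False)
engaged_ratio = (rate(users, retained, True, engaged=True)
                 / rate(users, retained, False, engaged=True))
print(f"all users: early creators retain {naive_ratio:.1f}x better")
print(f"engaged users only: {engaged_ratio:.1f}x")
```

With these made-up parameters the naive ratio comes out close to 3x, while within the engaged group the ratio is close to 1x. Holding engagement fixed makes the apparent effect of early creation disappear, which is why forcing early creation on low-engagement users may show no lift.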
Question 4 of 6
What is the "time to value" activation metric measuring for FlowScale?
A Time from signup to first payment
B Time from first login to customer success check-in
C Minutes from signup to first workflow creation, the moment a user experiences the product's core value
D Time from sales contract signing to onboarding completion
Correct. Time to value measures how quickly users reach the "aha moment": the first time they experience the product's core promise. For FlowScale, that is creating a working workflow. Current baseline is 47 minutes; target is 15 minutes.
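One way the metric might be computed from event timestamps is sketched below. The sample records are invented, and using the median as the summary statistic is an assumption: the quiz does not say whether the 47-minute baseline is a mean or a median.

```python
from datetime import datetime
from statistics import median


def time_to_value_minutes(signup_at, first_workflow_at):
    """Minutes between signup and the user's first workflow creation."""
    return (first_workflow_at - signup_at).total_seconds() / 60


# Hypothetical records: (signup time, first workflow time or None if never created).
records = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 12)),
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 47)),
    (datetime(2024, 1, 1, 11, 0), datetime(2024, 1, 1, 12, 35)),
    (datetime(2024, 1, 1, 12, 0), None),  # never reached first value
]

# Users who never created a workflow have no time to value; they are excluded
# here and should be tracked separately as an activation rate.
durations = [time_to_value_minutes(s, w) for s, w in records if w is not None]
median_ttv = median(durations)
activation_rate = len(durations) / len(records)

print(f"median time to value: {median_ttv:.0f} min")
print(f"activated: {activation_rate:.0%}")
```

Reporting the activation rate next to the median guards against a misleading improvement where time to value drops only because slower users stopped activating altogether.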
Question 5 of 6
The AI analysis says: "Step 3 drop-off causes lower retention." What is the correct response?
A Accept it: step 3 clearly causes problems
B Flag it as a causal claim from observational data: mark it as [INFERENCE] and require an experiment to validate causation before committing engineering resources to fix it
C Dismiss it: AI cannot analyse funnels reliably
D Ask the AI to re-run the analysis with more data
Correct. From observational data alone, you can say "step 3 drop-off correlates with lower retention." Causation requires an experiment. Marking this as [INFERENCE] keeps the team honest about the evidence level and prevents over-investing in a fix for something that might not actually drive retention.
Question 6 of 6
Why must an experiment brief specify minimum sample size, not just a run duration?
A Because experiment duration is irrelevant to validity
B Because sample size is required by GDPR for A/B tests
C Because AI tools cannot calculate experiment duration
D Because statistical significance depends on sample size, not time: "2 weeks" might be 100 users or 10,000, producing completely different confidence levels
Correct. A 2-week experiment with 100 users has almost no statistical power to detect a modest effect. The same 2 weeks with 5,000 users can detect much smaller effects with high confidence. Power calculations determine the minimum sample size needed to detect the effect size you care about. Duration is a derived constraint, not the primary one.
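A minimal sketch of such a power calculation follows, using the standard normal-approximation formula for comparing two proportions. The defaults (two-sided alpha of 0.05, power of 0.8) are conventional assumptions, the 41% and 55% rates are borrowed from the hypothesis in Question 2, and the daily traffic figure is made up.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_sample_per_arm(p_control, p_variant, alpha=0.05, power=0.8):
    """Users needed in EACH arm to detect p_control -> p_variant (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_variant) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_control) ** 2)


def days_needed(n_per_arm, daily_eligible_users):
    """Duration is derived: total sample across both arms divided by daily traffic."""
    return ceil(2 * n_per_arm / daily_eligible_users)


n_large_effect = min_sample_per_arm(0.41, 0.55)  # 14-point lift
n_small_effect = min_sample_per_arm(0.41, 0.43)  # 2-point lift
print(f"41% -> 55%: {n_large_effect} users per arm")
print(f"41% -> 43%: {n_small_effect} users per arm")
print(f"days at 50 eligible users/day: {days_needed(n_large_effect, 50)}")
```

Under these assumptions the 14-point lift needs roughly 200 users per arm, while a 2-point lift needs thousands. The same "2 weeks" could therefore be ample for one experiment and hopeless for another, depending on traffic and effect size.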