Our Methodology

How We Evaluate & Score Products

Most review sites rely on crowdsourced sentiment, gamified reviews, or surface-level summaries. We use a standardized audit framework designed to expose what marketing pages hide.

Albert Richer
Founder & Lead SaaS Analyst · Last updated April 2026

Why Most Reviews Are Broken

Aggregators like G2 and Capterra rely on user reviews that are often gamified, incentivized, or outright fake. General tech blogs skim the surface without understanding the difference between what a roofing contractor needs and what an enterprise CTO needs. The result is rankings that serve vendors, not buyers.

WhatAreTheBest.com takes a different approach. We use a Standardized SaaS Audit—a data-first evaluation framework designed to surface hidden costs, integration limitations, and workflow bottlenecks that pricing pages don't show you. Our goal is comparison power, not endorsement.

Our 2-Tier Testing Protocol

With over 9,000 tools in our database, we apply different levels of scrutiny based on a tool's market impact. We believe buyers need deep stress testing for market leaders and rigorous forensic analysis for emerging tools. Nobody can credibly hands-on test 9,000 products, so we're explicit about which tools get which treatment.

Tier 1: The "Deep Lab" Review
Applies to: category leaders, high-velocity tools, and "Best Of" finalists
  • Live account testing with real workflow simulations
  • Support response & competence verification
  • Native vs. middleware integration checks
  • Proprietary "Workflow Friction" scoring
  • Original screenshots & walkthrough documentation

Tier 2: The "Forensic Spec" Audit
Applies to: specialized verticals, new entrants, and long-tail solutions
  • True monthly cost calculation (seat price + required add-ons; a worked example follows below)
  • Security & compliance verification from legal docs
  • Hidden limit analysis (API caps, storage, contacts)
  • Standardized specification matrix for side-by-side comparison
A note on honesty: Tier 1 Deep Lab reviews are applied selectively to major category leaders where hands-on testing provides the most value. The majority of our 9,000+ product evaluations use the Tier 2 Forensic Spec methodology. We clearly distinguish between hands-on lab tests and specification audits in our analysis.
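
To make the true-monthly-cost line item above concrete, here is a minimal sketch of the arithmetic. The seat count, add-on names, and prices are hypothetical illustrations, not figures from any audit.

    # Minimal sketch of a "true monthly cost" calculation: the advertised
    # seat price for the whole team, plus the add-ons the team actually
    # needs to run the tool. All names and prices below are hypothetical.
    def true_monthly_cost(seat_price: float, seats: int,
                          required_addons: dict[str, float]) -> float:
        """Seat cost for the full team plus flat-priced required add-ons."""
        return seat_price * seats + sum(required_addons.values())

    # A "$19/seat" tool costs far more once mandatory add-ons are included:
    total = true_monthly_cost(
        seat_price=19.0,
        seats=5,
        required_addons={"API access": 49.0, "SSO": 30.0},
    )
    print(total)  # 19 * 5 + 49 + 30 = 174.0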

Proprietary Metrics We Calculate

"Starting at $19/mo" is a meaningless number. To give you real comparison power, we calculate metrics you won't find on any vendor's site:

  • Price-to-Seat Ratio: Vendors blur the line between "per user" and "flat fee." We standardize to show the exact cost efficiency for a 5-person team (a worked sketch follows this list).
  • Integration Weight: 500 integrations isn't always better than 50. We score on quality, prioritizing deep, native 2-way syncs over shallow webhook connections.
  • "Skip If" Logic: Our most critical metric. We identify the specific user profile who should avoid a given tool (e.g., "Skip if you need visual Kanban boards").
  • Support-to-Price Value: We analyze whether the support tier (email vs. chat vs. phone) justifies the price point relative to category norms.
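
Here is the worked sketch promised above for the Price-to-Seat Ratio. The goal is to collapse "per user" and "flat fee" plans into one comparable number; the function and figures are illustrative, not our production scoring code.

    # Minimal sketch of the Price-to-Seat Ratio: standardize "per user" and
    # "flat fee" plans to the effective monthly cost per seat for a fixed
    # reference team. All figures below are illustrative.
    TEAM_SIZE = 5  # the reference team used for standardization

    def cost_per_seat(price: float, per_user: bool,
                      team_size: int = TEAM_SIZE) -> float:
        """Effective monthly cost per seat for the reference team."""
        total = price * team_size if per_user else price
        return total / team_size

    # The same "$19/mo" sticker means very different things per billing model:
    print(cost_per_seat(19.0, per_user=True))   # per-user plan: 19.0 per seat
    print(cost_per_seat(19.0, per_user=False))  # flat-fee plan:  3.8 per seat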

How We Score: The Six Pillars

Every product is assessed across six evaluation categories. Four are applied universally; two are selected per category based on what matters most for that product type.

01 · Feature Coverage & Depth
Breadth and depth of capabilities, measured against the core needs of the category.
02 · Market Credibility & Trust
User base, industry recognition, certifications, and adoption by credible organizations.
03 · Usability & Experience
Onboarding complexity, learning curve, documentation quality, and time to value.
04 · Pricing & Transparency
Pricing structure clarity, scalability, hidden costs, and alignment with category norms.
05–06 · Category-Specific Criteria
Two additional pillars selected per category: integrations, scalability, security, compliance, support quality, onboarding, or ecosystem strength.

Each pillar is scored 1–10 using documented evidence drawn from official product documentation, third-party reviews, certifications, and credible market signals. Scores are normalized within categories—a score of 8.5 in CRM does not mean the same thing as 8.5 in cybersecurity. The full breakdown is displayed on every product page so you can see exactly how each score was earned.
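
To illustrate what "normalized within categories" means, here is a simplified sketch that assumes plain min-max rescaling computed per category. Read it as the shape of the idea, not our exact formula; the product names and raw scores are made up.

    # Simplified sketch of within-category normalization, assuming min-max
    # rescaling onto the 1-10 band. Because it runs per category, a score
    # is only meaningful relative to its own category's distribution.
    def normalize_within_category(raw: dict[str, float]) -> dict[str, float]:
        """Rescale one category's raw scores onto a 1-10 band."""
        lo, hi = min(raw.values()), max(raw.values())
        span = (hi - lo) or 1.0  # guard against a single-product category
        return {name: 1.0 + 9.0 * (score - lo) / span
                for name, score in raw.items()}

    # CRM and cybersecurity get normalized against different distributions,
    # which is why an 8.5 in one is not the same as an 8.5 in the other:
    crm = normalize_within_category({"ToolA": 62.0, "ToolB": 48.0, "ToolC": 55.0})
    print(crm)  # {'ToolA': 10.0, 'ToolB': 1.0, 'ToolC': 5.5}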

The "Workflow Fit" Model

We don't score products in a vacuum. A 9/10 CRM for a 5-person agency might be a 4/10 for an enterprise with 500 reps. That's why we grade on a Curve of Intent—adjusting what we reward and penalize based on who the tool is actually for:

SMB / Contractor (Speed Focus)
Reward: Time to value · Penalize: Unnecessary complexity
Mid-Market / Scale (Flexibility Focus)
Reward: Customizability · Penalize: Weak reporting
Enterprise (Governance Focus)
Reward: API depth & security · Penalize: Lack of governance

A tool is only "Best" if it fits your specific growth stage. This is why the same product may rank differently across different SaaS category pages depending on the audience profile of that category.
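
In code terms, the Curve of Intent behaves like audience-specific weighting: the same raw pillar scores produce different grades per profile. The pillar names and weights below are hypothetical stand-ins chosen to show the mechanism, not the weights we actually use.

    # Sketch of "Curve of Intent" scoring with hypothetical weights: each
    # audience profile emphasizes different pillars of the same product.
    PROFILE_WEIGHTS = {
        "smb":        {"time_to_value": 0.5, "simplicity": 0.3, "api_depth": 0.2},
        "enterprise": {"api_depth": 0.4, "governance": 0.4, "time_to_value": 0.2},
    }

    def workflow_fit(pillars: dict[str, float], profile: str) -> float:
        """Weighted average of pillar scores for one audience profile."""
        return sum(pillars.get(p, 0.0) * w
                   for p, w in PROFILE_WEIGHTS[profile].items())

    # A fast, simple tool with a thin API and weak governance: great for an
    # agency, poor for a 500-rep enterprise.
    scores = {"time_to_value": 9, "simplicity": 9, "api_depth": 3, "governance": 2}
    print(workflow_fit(scores, "smb"))         # 7.8
    print(workflow_fit(scores, "enterprise"))  # 3.8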

Independence & Ethics

Not Pay-to-Play
You cannot buy a higher score or a spot in our "Top 3." Rankings are set by the audit framework, not by commercial relationships.
Affiliate Disclosure
We're reader-supported. Clicking a link may earn us a commission. This funds our testing—but is invisible to our editorial process during scoring. If a high-commission tool breaks during testing, we'll tell you to skip it.
Human-Validated
We use AI to help organize datasets, but every final review, every "Skip If" recommendation, and every ranking decision is validated by a human analyst. We don't publish raw AI-generated content.

Limitations & Transparency

No evaluation framework is perfect. We think it's important to be upfront about where ours has limits:

  • Evaluations rely on publicly available information—we cannot assess private features or internal processes that vendors don't disclose.
  • Market signals and adoption indicators may lag behind reality, especially for fast-evolving categories.
  • Category-specific criteria may not capture every nuance that matters to your individual use case.
  • Deep Lab (Tier 1) reviews are applied selectively; most evaluations use the Forensic Spec methodology.

Our evaluations are a starting point for comparison, not a final verdict. We encourage you to conduct your own research, trial the tools yourself, and consider multiple sources before making a decision.

What Our Evaluation Is Not

To be clear about what you're reading on our product pages: our scores represent relative capability and fit within a category. A higher score means stronger alignment with our audit criteria—not a universal recommendation. Specifically, our evaluations are not paid placements, not sponsored rankings, not influenced by affiliate relationships, and not endorsements. Evaluation logic operates independently from monetization.

See Our Evaluations in Action

Our methodology is applied across 1,000+ SaaS categories. Explore our category hub pages to see the scoring in practice:

Browse all 1,000+ SaaS categories →

Albert Richer
Founder & Lead SaaS Analyst, WhatAreTheBest.com

Albert oversees the SaaS Audit Framework at WhatAreTheBest.com. With 25+ years in software systems analysis and product evaluation, he developed the Forensic Spec methodology to help buyers cut through marketing hype. He personally conducts Tier 1 Deep Lab reviews for major category leaders and audits the data integrity of all Tier 2 comparisons. Connect on LinkedIn →