WCAG Coverage
WCAG 2.2 Level AA includes 56 success criteria. AllyProof's multi-engine scanner can automatically test approximately 57-70% of them. This page explains what is covered, what is not, and why the gap exists.
WCAG 2.2 — New in October 2023
WCAG 2.2 (W3C Recommendation, Oct 2023; ISO/IEC 40500:2025) added nine success criteria on top of WCAG 2.1. Findings that map to one of these nine are tagged with a "New in 2.2" pill on the issue detail page. Customers scoping against 2.2 conformance can then see at a glance which of their issues would not have been failures under the earlier standard.
- 2.4.11 Focus Not Obscured (Minimum): Level AA
- 2.4.12 Focus Not Obscured (Enhanced): Level AAA
- 2.4.13 Focus Appearance: Level AAA
- 2.5.7 Dragging Movements: Level AA
- 2.5.8 Target Size (Minimum): Level AA
- 3.2.6 Consistent Help: Level A
- 3.3.7 Redundant Entry: Level A
- 3.3.8 Accessible Authentication (Minimum): Level AA
- 3.3.9 Accessible Authentication (Enhanced): Level AAA
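As a rough illustration of the tagging described above, the sketch below flags findings whose success criterion is one of the nine added in WCAG 2.2. The dict-based finding shape, the `new_in_22` key, and the function names are assumptions made for this example, not AllyProof's actual data model.

```python
# Success criteria added in WCAG 2.2, mapped to their conformance level.
NEW_IN_WCAG_22 = {
    "2.4.11": "AA",
    "2.4.12": "AAA",
    "2.4.13": "AAA",
    "2.5.7": "AA",
    "2.5.8": "AA",
    "3.2.6": "A",
    "3.3.7": "A",
    "3.3.8": "AA",
    "3.3.9": "AAA",
}


def is_new_in_22(criterion_id: str) -> bool:
    """Return True if the criterion was added in WCAG 2.2."""
    return criterion_id.strip() in NEW_IN_WCAG_22


def tag_findings(findings):
    """Return copies of the findings with a hypothetical 'new_in_22' flag."""
    return [{**f, "new_in_22": is_new_in_22(f["criterion"])} for f in findings]


if __name__ == "__main__":
    sample = [{"criterion": "2.5.8"}, {"criterion": "1.4.3"}]
    for finding in tag_findings(sample):
        print(finding)
```

A lookup table keeps the check independent of any engine's rule IDs: only the criterion number a finding maps to matters.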
Rule catalog — difficulty & verification
Each issue detail page uses a static rule catalog that, where we've seen the rule often enough to calibrate, tells you:
- Difficulty band: easy (mechanical markup, under 5 min), medium (code plus a minor design decision, 5–20 min), or hard (design review, content rewrite, or architectural change, over 20 min).
- Time estimate: rough median minutes to fix one occurrence, visible in the Effort row of the metadata strip.
- AT verification recipe: the tool to use (NVDA, VoiceOver, JAWS, keyboard, or DevTools) and numbered steps for confirming the fix. It is rendered as a section such as "Verify with NVDA" beneath the AI fix suggestion, so QA staff who aren't full-time a11y specialists can complete the manual check.
The issue list also has a Quick wins chip that filters the current site's open issues down to the easy difficulty band. It is useful for answering "what can a junior pick off this afternoon?" without writing JQL.
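The banding and the Quick wins filter can be sketched as follows. This is a minimal illustration under stated assumptions: the `CatalogEntry` fields, the `rule-a`-style identifiers, and the issue dict shape are hypothetical, and only the band thresholds (under 5, 5–20, over 20 minutes) come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    rule_id: str                      # hypothetical rule identifier
    median_minutes: Optional[float]   # None when the rule is not yet calibrated


def difficulty_band(median_minutes: Optional[float]) -> Optional[str]:
    """Map a median fix time to a band; uncalibrated rules get no band."""
    if median_minutes is None:
        return None
    if median_minutes < 5:
        return "easy"
    if median_minutes <= 20:
        return "medium"
    return "hard"


def quick_wins(issues, catalog):
    """Keep open issues whose rule falls in the easy band."""
    wins = []
    for issue in issues:
        entry = catalog.get(issue["rule_id"])
        if issue["status"] != "open" or entry is None:
            continue
        if difficulty_band(entry.median_minutes) == "easy":
            wins.append(issue)
    return wins


if __name__ == "__main__":
    catalog = {
        "rule-a": CatalogEntry("rule-a", 2),
        "rule-b": CatalogEntry("rule-b", 12),
    }
    issues = [
        {"id": 1, "rule_id": "rule-a", "status": "open"},
        {"id": 2, "rule_id": "rule-b", "status": "open"},
    ]
    print(quick_wins(issues, catalog))
```

Returning no band for uncalibrated rules mirrors the "where we've seen the rule often enough to calibrate" caveat: such issues are simply left out of the filter rather than guessed at.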
The 57% Baseline
The widely cited "57%" figure comes from research on axe-core's rule coverage against WCAG 2.x success criteria. With axe-core alone, approximately 57% of WCAG 2.2 AA criteria have at least one automated rule that can detect failures.
By adding HTML_CodeSniffer as a second engine, AllyProof extends coverage to approximately 67-70%. The additional rules catch issues that axe-core's strict zero-false-positive policy causes it to skip.
Coverage by WCAG Principle
| Principle | Total AA criteria | Fully automatable | Partially automatable | Manual only |
|---|---|---|---|---|
| 1. Perceivable | 16 | 7 | 4 | 5 |
| 2. Operable | 20 | 6 | 5 | 9 |
| 3. Understandable | 11 | 4 | 3 | 4 |
| 4. Robust | 9 | 3 | 3 | 3 |
| Total | 56 | ~20 | ~15 | ~21 |
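To relate the table to the headline percentage, the short calculation below sums the per-principle rows and expresses each column as a share of the 56 criteria. The counts are copied from the table above and are approximate; the variable and function names are illustrative only.

```python
# (total, fully automatable, partially automatable, manual only) per principle,
# copied from the table above. All figures are approximate.
ROWS = {
    "Perceivable": (16, 7, 4, 5),
    "Operable": (20, 6, 5, 9),
    "Understandable": (11, 4, 3, 4),
    "Robust": (9, 3, 3, 3),
}


def column_totals(rows):
    """Sum each column across all principles."""
    return tuple(sum(col) for col in zip(*rows.values()))


def share(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    total, fully, partial, manual = column_totals(ROWS)
    print("fully automatable:", share(fully, total))
    print("fully or partially automatable:", share(fully + partial, total))
    print("manual only:", share(manual, total))
```

Under these counts, fully automatable criteria make up about 35.7% of the total, and fully plus partially automatable criteria about 62.5%, which falls inside the 57-70% range quoted on this page.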
Fully Automatable Criteria
These criteria can be tested with high confidence using automated rules. A passing result from the scanner is a strong indicator of conformance:
- 1.1.1 Non-text Content: Detects images without alt attributes, empty alt on non-decorative images
- 1.3.1 Info and Relationships: Checks for proper heading structure, form labels, table markup
- 1.4.3 Contrast (Minimum): Computes foreground/background color contrast ratios
- 1.4.11 Non-text Contrast: Checks UI component and graphical object contrast
- 2.1.1 Keyboard: Detects interactive elements not reachable via keyboard
- 2.4.1 Bypass Blocks: Checks for skip navigation links or landmarks
- 2.4.2 Page Titled: Verifies pages have descriptive titles
- 2.4.4 Link Purpose (In Context): Detects empty links and links with non-descriptive text
- 3.1.1 Language of Page: Checks for the lang attribute on the HTML element
- 3.1.2 Language of Parts: Detects content in different languages missing lang attributes
- 4.1.1 Parsing: Detects duplicate IDs and other markup errors
- 4.1.2 Name, Role, Value: Checks ARIA roles, states, and properties
And approximately 8 more with strong automated coverage.
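Contrast (1.4.3) is a good example of why some criteria automate well: the check reduces to arithmetic. The sketch below follows the relative luminance and contrast ratio definitions from the WCAG specification. It is not AllyProof's or any engine's implementation, and it assumes opaque six-digit hex colors; a real scanner also has to resolve computed styles, opacity, and background images, which is omitted here.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an opaque '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, always >= 1."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_normal_text(foreground: str, background: str) -> bool:
    """1.4.3 requires at least 4.5:1 for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    print(round(contrast_ratio("#000000", "#ffffff"), 2))
    print(passes_aa_normal_text("#767676", "#ffffff"))
    print(passes_aa_normal_text("#777777", "#ffffff"))
```

Because the threshold is numeric, a grey of #767676 on white passes while the barely lighter #777777 fails, with no human judgment involved.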
Partially Automatable Criteria
These criteria have automated rules that catch some failures but not all. A clean scan result does not guarantee conformance — manual review is still recommended:
- 1.3.2 Meaningful Sequence: Can detect some CSS reordering issues, but cannot assess whether reading order is meaningful
- 1.3.5 Identify Input Purpose: Can check for autocomplete attributes on forms, but cannot verify they are correct
- 1.4.4 Resize Text: Can detect fixed font sizes, but cannot verify layout at 200% zoom
- 2.4.6 Headings and Labels: Can detect empty headings and labels, but cannot assess whether they are descriptive
- 2.4.7 Focus Visible: Can detect outline: none without replacement, but cannot assess focus indicator visibility in all states
- 3.3.2 Labels or Instructions: Can detect missing labels, but cannot assess whether instructions are adequate
And approximately 9 more with partial automated coverage.
Manual-Only Criteria
These criteria cannot be meaningfully tested by automated tools. They require human judgment, assistive technology testing, or evaluation of content meaning:
- 1.2.1-1.2.5 Time-based Media: Require human evaluation of captions, audio descriptions, and transcripts
- 1.3.3 Sensory Characteristics: Instructions that rely solely on shape, color, size, or location
- 2.1.2 No Keyboard Trap: Requires interactive testing of all focusable elements
- 2.2.1 Timing Adjustable: Requires testing session timeouts and timed interactions
- 2.4.5 Multiple Ways: Requires assessment of site navigation alternatives
- 3.2.3 Consistent Navigation: Requires cross-page comparison of navigation patterns
- 3.3.3 Error Suggestion: Requires testing form validation messages for helpfulness
- 3.3.4 Error Prevention (Legal, Financial, Data): Requires testing of confirmation and review steps
And approximately 13 more that require purely manual assessment.
What "Not Evaluated" Means in VPATs
When AllyProof generates a draft VPAT, criteria that fall into the "manual only" category are marked as Not Evaluated. This conformance level means:
- The criterion was not tested because no automated rule covers it
- This is not the same as "Does Not Support" — it means the result is unknown
- A human accessibility tester should evaluate these criteria and update the VPAT to reflect actual conformance
The VPAT remarks column for Not Evaluated criteria includes a note explaining which type of manual testing is recommended (e.g. "Screen reader testing required" or "Keyboard-only navigation testing required").
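The rule described in this section can be sketched as a small mapping from a criterion's automation category to a draft VPAT row. Everything named here is hypothetical: the `manual_only` category label, the `manual_test_type` keys, the row dict shape, and the generic fallback remark are assumptions for illustration. Only the "Not Evaluated" level and the two example remarks come from the text above.

```python
# Example remarks quoted in the text above, keyed by a hypothetical test type.
REMARKS = {
    "screen_reader": "Screen reader testing required",
    "keyboard": "Keyboard-only navigation testing required",
}

# Assumed fallback wording for test types without a specific remark.
DEFAULT_REMARK = "Manual testing required"


def draft_vpat_row(criterion, automation, manual_test_type=None):
    """Build a draft row; manual-only criteria are never given a verdict."""
    if automation == "manual_only":
        return {
            "criterion": criterion,
            "conformance": "Not Evaluated",
            "remarks": REMARKS.get(manual_test_type, DEFAULT_REMARK),
        }
    # For automatable criteria the level would come from scan results,
    # which is outside the scope of this sketch.
    return {"criterion": criterion, "conformance": None, "remarks": ""}


if __name__ == "__main__":
    print(draft_vpat_row("2.1.2", "manual_only", "keyboard"))
    print(draft_vpat_row("1.4.3", "fully_automatable"))
```

The point of the design is that a missing automated rule produces an explicit "unknown" rather than silently defaulting to a pass or a failure.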
Why Not 100%?
Many WCAG criteria require understanding intent and meaning, which automated tools fundamentally cannot assess:
- Is this alt text accurate? (Tools can verify it exists, not that it is correct.)
- Are these instructions clear? (Requires human comprehension.)
- Does the reading order make sense? (Requires understanding content structure.)
- Are captions synchronized and accurate? (Requires watching the video.)
- Can a user complete a task using only a keyboard? (Requires interactive testing.)
This is an inherent limitation of all automated accessibility testing tools, not specific to AllyProof. The 57-70% figure is consistent across the industry. AllyProof maximizes automated coverage and clearly identifies where manual testing is needed.
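The alt text question in the list above shows the limit concretely. The sketch below uses Python's standard `html.parser` to find images with no alt attribute. It is a toy check written for this page, not one of the scanner's rules, and the sample markup is invented.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_missing_alt(html: str):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = (
        '<img src="chart.png">'
        '<img src="team.jpg" alt="image">'
        '<img src="logo.svg" alt="Company logo">'
    )
    print(images_missing_alt(page))
```

Only chart.png is reported. The second image passes because an alt attribute exists, even though "image" tells a screen reader user nothing, and deciding that requires a person who can compare the text with the picture.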