Architecture
AllyProof is a Next.js application backed by Supabase, with Playwright-based scanning workers, AI-powered fix suggestions via Claude Haiku, and pg_cron for scheduled automation. This page describes the system architecture, directory structure, scan pipeline, and security model.
Tech Stack
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Next.js 15 + React 19 | App Router, Server Components, Turbopack |
| Styling | Tailwind CSS v4 + shadcn/ui | Utility-first CSS, accessible component library |
| Auth + Database | Supabase | PostgreSQL, Auth, Row Level Security, Realtime |
| Scanning | Playwright + axe-core + HTML_CodeSniffer + APCA | Multi-engine accessibility analysis |
| AI | Claude Haiku 4.5 | Fix suggestions, VPAT executive summaries |
| Email | Resend + React Email | Transactional notifications |
| Payments | Paddle | Subscription billing (Merchant of Record) |
| Storage | Cloudflare R2 | Report PDFs, VPAT documents |
| Scheduling | pg_cron | Automated scan triggers |
| Monitoring | Sentry | Error tracking and performance monitoring |
Directory Structure
src/
├── app/ # Next.js App Router
│ ├── (auth)/ # Auth pages (login, signup, reset)
│ ├── (dashboard)/ # Authenticated dashboard pages
│ │ ├── sites/ # Site management
│ │ ├── issues/ # Violation browser
│ │ ├── reports/ # VPAT and report generation
│ │ └── settings/ # Org settings, team, billing
│ ├── api/ # API routes
│ │ ├── v1/ # Public API (CI/CD, external)
│ │ └── internal/ # Internal endpoints (cron triggers)
│ ├── docs/ # Documentation site
│ └── statement/[slug]/ # Public accessibility statements
├── components/ # Shared React components
│ ├── ui/ # shadcn/ui primitives
│ ├── dashboard/ # Dashboard-specific components
│ └── landing/ # Marketing site components
├── lib/ # Shared utilities
│ ├── supabase/ # Supabase client (browser + server)
│ ├── scanning/ # Scan engine logic
│ │ ├── axe.ts # axe-core integration
│ │ ├── htmlcs.ts # HTML_CodeSniffer integration
│ │ ├── apca.ts # APCA contrast analysis
│ │ └── dedup.ts # Cross-engine deduplication
│ ├── ai/ # Claude Haiku integration
│ ├── email/ # Resend email sending
│ └── scoring/ # Score calculation
├── emails/ # React Email templates
├── types/ # Shared TypeScript types
└── supabase/
    └── migrations/ # Database migration files
Scan Pipeline
When a scan is triggered (manually, via API, or by pg_cron), it follows this pipeline:
1. TRIGGER
├── Manual: User clicks "Scan Now"
├── API: POST /api/v1/scan (CI/CD)
└── Scheduled: pg_cron → POST /api/internal/trigger-scheduled-scans
2. PAGE DISCOVERY
├── Fetch sitemap.xml (if available)
├── Crawl homepage for internal links (fallback)
└── Apply page limit based on plan tier
3. PER-PAGE SCANNING
├── Launch Playwright Chromium
├── Navigate to page, wait for network idle
├── Run axe-core (primary engine)
├── Run HTML_CodeSniffer (concurrent with APCA)
├── Run APCA contrast analysis (concurrent with HTMLCS)
└── Deduplicate cross-engine results
4. POST-PROCESSING
├── Store violations in database
├── Calculate accessibility score
├── Calculate litigation risk score
├── Generate AI fix suggestions (async, via Claude Haiku)
└── Update site.latest_score cache
5. NOTIFICATION
├── Send scan-complete email to subscribed members
├── Send critical-violation alert (if applicable)
└── Send score-drop alert (if the score dropped 10+ points)
Multi-Tenancy Model
AllyProof uses a shared database with Row Level Security (RLS) for tenant isolation:
- Organization-scoped — All data (sites, scans, violations) belongs to an organization. Users access data through their org membership.
- RLS enforcement — Every table has RLS policies that filter rows by the authenticated user's org membership. Even if application code has a bug, the database prevents cross-tenant data access.
- Role-based access — Three roles control permissions:
  - owner — Full access, can manage billing and delete the organization
  - admin — Can manage sites, scans, team members, and settings
  - member — Read-only access to scan results and reports
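The three-role model above can be sketched as a simple capability lookup. This is an illustrative sketch, not AllyProof's actual authorization code: the role names come from the docs, but the `can` helper and the capability strings are assumptions.

```typescript
// Illustrative sketch of the three-role permission model.
// Role names mirror the docs; the capability names are hypothetical.
type Role = "owner" | "admin" | "member";

// Each role's capabilities, from most to least privileged.
const CAPABILITIES: Record<Role, Set<string>> = {
  owner: new Set(["manage_billing", "delete_org", "manage_sites", "manage_team", "read"]),
  admin: new Set(["manage_sites", "manage_team", "read"]),
  member: new Set(["read"]),
};

function can(role: Role, action: string): boolean {
  return CAPABILITIES[role].has(action);
}
```

For example, `can("admin", "manage_sites")` is true, while `can("member", "manage_sites")` is false. In practice the same checks are also enforced at the database layer by the RLS policies, so the helper is a UX convenience, not the security boundary.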
Authentication Layers
AllyProof has three authentication mechanisms for different contexts:
| Layer | Mechanism | Used for |
|---|---|---|
| Browser sessions | Supabase Auth (JWT cookies) | Dashboard access. Supports email/password and Google OAuth. Sessions are HttpOnly cookies with PKCE flow. |
| API access | Bearer token (API key) | CI/CD integration, external API calls. Keys are SHA-256 hashed in the database, validated on each request. |
| Internal endpoints | Shared secret (CRON_SECRET) | pg_cron scheduled scan triggers. The secret is validated in the Authorization header. |
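The API-key layer above stores only a SHA-256 digest, as the table notes. A minimal sketch of what that validation could look like (function names and the key prefix are hypothetical, not AllyProof's real code):

```typescript
import { createHash } from "node:crypto";

// Keys are never stored in plaintext: only the SHA-256 digest is kept,
// so a database leak does not expose usable credentials.
function hashApiKey(key: string): string {
  return createHash("sha256").update(key).digest("hex");
}

// Illustrative check: hash the presented bearer token and compare digests.
// In practice the digest itself would be the indexed lookup key in the
// api_keys table, so validation is a single equality query.
function apiKeyMatches(presentedKey: string, storedDigest: string): boolean {
  return hashApiKey(presentedKey) === storedDigest;
}
```

Hashing before lookup also means the comparison happens against a fixed-length digest rather than the raw secret, which keeps the hot path to one indexed query per request.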
Data Flow
Browser ──→ Next.js Server Components ──→ Supabase (RLS)
↑ JWT from cookie
API Client ──→ Next.js API Routes ──→ Supabase (service role)
↑ API key validated ↑ Org scoped manually
pg_cron ──→ Internal API Route ──→ Supabase (service role)
              ↑ CRON_SECRET validated
Server Components use the user's JWT, so RLS policies apply automatically. API routes and internal endpoints use the service role key and manually scope queries to the authenticated organization.
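The CRON_SECRET check in the pg_cron path above can be sketched as a constant-time header comparison. This is an assumption about the implementation, not the actual handler; the function name is hypothetical, and only `CRON_SECRET` comes from the docs.

```typescript
import { timingSafeEqual } from "node:crypto";

// Validate the Authorization header on an internal endpoint.
// A constant-time comparison avoids leaking the secret's prefix via timing.
function isAuthorizedCron(authHeader: string | null, secret: string): boolean {
  if (!authHeader) return false;
  const presented = authHeader.replace(/^Bearer\s+/i, "");
  const a = Buffer.from(presented);
  const b = Buffer.from(secret);
  // timingSafeEqual throws on length mismatch, so guard first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

A Next.js route handler would call this with `request.headers.get("authorization")` and `process.env.CRON_SECRET`, returning a 401 response when it fails.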
Key Architectural Decisions
- Server Components by default — Most pages are server-rendered. Client components ("use client") are used only for interactive elements (forms, modals, real-time updates) and pushed to leaf components.
- Supabase over custom auth — Supabase provides auth, database, and RLS in one package, eliminating the need for a separate auth service and custom session management.
- pg_cron over external scheduler — Using PostgreSQL's built-in cron eliminates the need for a separate job scheduler service. The cron job runs inside Supabase and calls the application via HTTP.
- Multi-engine scanning — axe-core provides the zero-false-positive baseline, HTML_CodeSniffer adds coverage breadth, and APCA adds forward-looking WCAG 3.0 contrast analysis. Results are deduplicated to avoid double-counting.
- AI as enhancement, not dependency — The scanning pipeline works without the Anthropic API key. AI features (fix suggestions, VPAT summaries) are generated asynchronously after scan results are stored.
- Paddle for payments — As a Merchant of Record, Paddle handles tax collection and remittance globally, simplifying compliance for a solo developer operating from Ukraine.
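The cross-engine deduplication mentioned in the multi-engine bullet (and implemented in lib/scanning/dedup.ts) could look roughly like the sketch below. The violation shape and field names are assumptions for illustration; the engine names and the axe-first priority come from the docs.

```typescript
// Hypothetical normalized violation shape; field names are illustrative.
interface Violation {
  engine: "axe" | "htmlcs" | "apca";
  ruleId: string;        // engine-native rule id
  wcagCriterion: string; // e.g. "1.4.3"
  selector: string;      // CSS selector of the offending element
}

// When the same element + criterion is flagged by multiple engines,
// keep the axe-core finding (the zero-false-positive baseline).
const ENGINE_PRIORITY: Record<Violation["engine"], number> = { axe: 0, htmlcs: 1, apca: 2 };

function dedupe(violations: Violation[]): Violation[] {
  const best = new Map<string, Violation>();
  for (const v of violations) {
    const key = `${v.selector}::${v.wcagCriterion}`;
    const existing = best.get(key);
    if (!existing || ENGINE_PRIORITY[v.engine] < ENGINE_PRIORITY[existing.engine]) {
      best.set(key, v);
    }
  }
  return [...best.values()];
}
```

Keying on (selector, WCAG criterion) rather than engine-native rule IDs is what makes deduplication possible at all: each engine reports the same underlying failure under a different rule name.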
Performance Considerations
- Scan parallelism — Pages within a single scan are processed sequentially to avoid overwhelming target servers; scans for different sites can run in parallel.
- Staggered scheduling — When pg_cron triggers scans for multiple sites, jobs are queued with 30-second intervals to spread load.
- Cursor-based pagination — All list endpoints use cursor-based pagination (never OFFSET) for consistent performance regardless of dataset size.
- Score caching — The latest score is cached on the sites table to avoid recalculation on every dashboard load.
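The cursor-based pagination bullet above can be sketched with an opaque keyset cursor. This is a generic illustration, not AllyProof's actual cursor format; the (created_at, id) pair and the encoding are assumptions.

```typescript
// Hypothetical cursor: the (created_at, id) of the last row returned,
// base64url-encoded so clients treat it as an opaque token.
interface Cursor {
  createdAt: string; // ISO timestamp of the last row on the previous page
  id: string;        // tiebreaker for rows sharing a timestamp
}

function encodeCursor(c: Cursor): string {
  return Buffer.from(JSON.stringify(c)).toString("base64url");
}

function decodeCursor(s: string): Cursor {
  return JSON.parse(Buffer.from(s, "base64url").toString("utf8"));
}
```

The next page's query then filters on the keyset, e.g. `WHERE (created_at, id) < (cursor.createdAt, cursor.id) ORDER BY created_at DESC, id DESC LIMIT n`. Unlike OFFSET, which must scan and discard every skipped row, a keyset query stays proportional to the page size no matter how deep the client paginates.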