System Architecture
Stack Overview
- Frontend: React 19 + Vite
- Backend: Hono (Cloudflare Worker)
- Database: Neon PostgreSQL (serverless)
- Auth: Microsoft Azure AD
- AI: LM Studio (optional, local)
Infrastructure
- Frontend: Cloudflare Pages (global CDN)
- API: Cloudflare Workers (edge computing)
- Database: Neon (serverless PostgreSQL)
- Auth: Microsoft Azure AD
The frontend and API are hosted on Cloudflare; the database (Neon) and authentication (Azure AD) are external services.
Data Flow
User → Frontend (Cloudflare Pages)
→ API Worker (Cloudflare)
→ Neon PostgreSQL
Step by step:
- User logs in via Azure AD
- Azure AD issues a JWT
- Frontend stores token
- All API requests include JWT in Authorization header
- Worker validates JWT
- Worker sets RLS session context
- PostgreSQL enforces clinic isolation
- Worker returns data
- All PHI access logged to audit_logs
Security Layers
Layer 1: Transport
- HTTPS everywhere (TLS 1.3)
- No unencrypted connections
Layer 2: Authentication
- JWT tokens from Azure AD
- RS256 signing
- Short-lived tokens (15 min)
- Refresh token rotation
Layer 3: Authorization
- Role-based access control (admin/staff/referrer)
- Enforced in API layer
- Checked before database queries
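A minimal sketch of the role check described in Layer 3. The three role names come from the list above; the action names (`referral:read` and so on) and the permission map are hypothetical and only illustrate the shape such a check could take.

```typescript
// Roles named in the document; the actions below are hypothetical examples.
type Role = "admin" | "staff" | "referrer";
type Action = "referral:create" | "referral:read" | "user:manage";

// Assumed permission map: which roles may perform which action.
const PERMISSIONS: Record<Action, readonly Role[]> = {
  "referral:create": ["admin", "staff", "referrer"],
  "referral:read": ["admin", "staff"],
  "user:manage": ["admin"],
};

// Returns true when the role is allowed to perform the action.
// Intended to be called in the API layer before any database query runs.
function isAllowed(role: Role, action: Action): boolean {
  return PERMISSIONS[action].includes(role);
}
```

A route handler could call `isAllowed(role, "referral:read")` and return a 403 response before touching the database when it yields `false`.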
Layer 4: Database
- Row-level security (RLS)
- Clinic isolation
- Session context from JWT
- Automatic filtering
Layer 5: Encryption
- PHI fields encrypted in JSONB
- Database encryption at rest
- Encrypted backups
Layer 6: Audit
- Every PHI access logged
- Immutable logs
- 7-year retention
- Exportable for compliance
Multi-Clinic Support
Each clinic is isolated:
- Users belong to one clinic
- RLS enforces clinic_id on all queries
- Superadmin can switch clinics
- Cross-clinic data access prevented
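The clinic-switching rule above could be sketched as a small guard that picks the clinic a request runs under. The `requestedClinicId` parameter is an assumption (for example, a value taken from a request header); the document does not say how a superadmin selects a clinic.

```typescript
// "superadmin" appears in the clinic-switching rule; the other roles come from the RBAC list.
type Role = "superadmin" | "admin" | "staff" | "referrer";

interface AuthenticatedUser {
  userId: string;
  clinicId: string; // the single clinic the user belongs to
  role: Role;
}

// Decides which clinic_id the request should run under.
// requestedClinicId is a hypothetical, optional override supplied by the caller.
function resolveClinicId(user: AuthenticatedUser, requestedClinicId?: string): string {
  if (requestedClinicId === undefined || requestedClinicId === user.clinicId) {
    return user.clinicId;
  }
  if (user.role === "superadmin") {
    return requestedClinicId; // only superadmins may switch clinics
  }
  throw new Error("Cross-clinic access denied");
}
```

The returned value is what would be handed to the RLS session context, so a non-superadmin can never cause a foreign `clinic_id` to be set.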
Authentication Flow
1. User visits intakepilot.com
2. Redirected to Azure AD login
3. User enters credentials
4. Azure AD validates and issues JWT
5. Frontend receives JWT
6. Frontend stores in sessionStorage
7. All API calls include JWT
8. Worker validates JWT signature
9. Worker extracts user_id, clinic_id, role
10. Worker sets PostgreSQL session context
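Steps 9 and 10 could look roughly like the sketch below. It assumes the JWT signature has already been verified (step 8) and that the payload carries claims literally named `user_id`, `clinic_id`, and `role`; the PostgreSQL setting names (`app.user_id` and so on) are also assumptions, not names taken from the actual schema.

```typescript
interface SessionContext {
  userId: string;
  clinicId: string;
  role: string;
}

// Narrows an already-verified JWT payload into the three values the Worker needs.
// Signature verification (step 8) is assumed to have happened before this call.
function toSessionContext(claims: Record<string, unknown>): SessionContext {
  const { user_id, clinic_id, role } = claims;
  if (typeof user_id !== "string" || typeof clinic_id !== "string" || typeof role !== "string") {
    throw new Error("JWT payload is missing user_id, clinic_id, or role");
  }
  return { userId: user_id, clinicId: clinic_id, role };
}

// Builds a parameterized statement that sets the session context.
// The setting names are assumptions; set_config's third argument `true`
// scopes the values to the current transaction.
function rlsContextQuery(ctx: SessionContext): { text: string; values: string[] } {
  return {
    text:
      "SELECT set_config('app.user_id', $1, true), " +
      "set_config('app.clinic_id', $2, true), " +
      "set_config('app.role', $3, true)",
    values: [ctx.userId, ctx.clinicId, ctx.role],
  };
}
```

Because the values are transaction-local in this sketch, the statement and the data queries that rely on it would need to run inside the same transaction. An RLS policy could then compare each row's `clinic_id` with `current_setting('app.clinic_id')`.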
API Architecture
Framework: Hono (fast, lightweight)
Middleware stack:
- CORS handling
- Rate limiting (KV store)
- JWT validation
- RLS context setting
- Request logging
- Error handling
Response format:
{
"data": [...],
"pagination": {...},
"metadata": {...}
}
Or errors:
{
"error": "message",
"details": "...",
"code": "ERROR_CODE"
}
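On the client side, the two response shapes above could be modeled as a union type with a type guard. This is a sketch: the `pagination` and `metadata` contents are not specified in this document, so they are left as open records, and the helper names `isApiError` and `unwrap` are hypothetical.

```typescript
// Success envelope; pagination and metadata shapes are not specified in the
// document, so they are left as open records here.
interface ApiSuccess<T> {
  data: T[];
  pagination: Record<string, unknown>;
  metadata: Record<string, unknown>;
}

interface ApiError {
  error: string;
  details: string;
  code: string;
}

type ApiResponse<T> = ApiSuccess<T> | ApiError;

// Type guard: an error body is recognized by its "code" field.
function isApiError<T>(body: ApiResponse<T>): body is ApiError {
  return "code" in body;
}

// Returns the data array or throws with the server-provided code and message.
function unwrap<T>(body: ApiResponse<T>): T[] {
  if (isApiError(body)) {
    throw new Error(`${body.code}: ${body.error}`);
  }
  return body.data;
}
```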
Database Architecture
Provider: Neon (serverless PostgreSQL)
Connection: WebSocket (no pooling needed)
Features used:
- JSONB for flexible data
- RLS for clinic isolation
- Triggers for audit logging
- Indexes for performance
- Full-text search (coming)
Migrations: V3 consolidated schema
Frontend Architecture
Framework: React 19
Build: Vite
Key libraries:
- React Router (navigation)
- Keycloak adapter (auth)
- Fetch API (no Axios)
State management: React hooks (no Redux)
Styling: Custom CSS + Tailwind
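Since the frontend uses the Fetch API directly and keeps the JWT in sessionStorage, a small header helper is one way to attach the token to every call. This is a sketch under assumptions: the storage key `access_token` is hypothetical (the document does not name it), and the conventional `Bearer` scheme is assumed.

```typescript
// Minimal subset of the Web Storage API, so the helper also works with a fake in tests.
interface TokenStore {
  getItem(key: string): string | null;
}

// Hypothetical storage key; the document does not name it.
const TOKEN_KEY = "access_token";

// Builds the headers for an API call, adding the JWT as a Bearer token when present.
function apiHeaders(store: TokenStore): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const token = store.getItem(TOKEN_KEY);
  if (token !== null) {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return headers;
}
```

In the browser this would be used as `fetch(url, { headers: apiHeaders(sessionStorage) })`.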
AI Integration
Optional feature; the system works without it.
How it works:
- User clicks “Generate AI Summaries”
- Frontend sends form data to LM Studio (localhost:1234)
- LM Studio generates 3 summaries in parallel
- Frontend saves summaries via API
- Summaries stored in processing_results.summaries
Privacy: All AI processing runs locally; nothing is sent to the cloud.
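The parallel generation step could be sketched as follows, assuming the frontend talks to LM Studio's OpenAI-compatible chat endpoint on the port given above. The model identifier and the per-summary instructions are hypothetical, since the document does not say which model is loaded or which three summaries are produced. The fetch function is passed in so the sketch can be exercised without a running LM Studio instance.

```typescript
// Minimal fetch shape so the sketch can be exercised with a fake; the global `fetch` satisfies it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumes LM Studio's OpenAI-compatible chat endpoint on the port named in the document.
const LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions";

// Requests one summary for the given instruction.
async function summarize(fetchFn: FetchLike, instruction: string, formData: unknown): Promise<string> {
  const res = await fetchFn(LM_STUDIO_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // hypothetical model identifier
      messages: [
        { role: "system", content: instruction },
        { role: "user", content: JSON.stringify(formData) },
      ],
    }),
  });
  if (!res.ok) throw new Error(`LM Studio returned HTTP ${res.status}`);
  const body = (await res.json()) as { choices: { message: { content: string } }[] };
  return body.choices[0].message.content;
}

// Runs one request per instruction concurrently; results keep the input order.
function generateSummaries(fetchFn: FetchLike, formData: unknown, instructions: string[]): Promise<string[]> {
  return Promise.all(instructions.map((i) => summarize(fetchFn, i, formData)));
}
```

In the browser, `generateSummaries(fetch, formData, [...three instructions...])` would issue the three requests at once; because `Promise.all` rejects on the first failure, a caller wanting partial results might prefer `Promise.allSettled`.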
Deployment
Frontend
cd frontend
npm run build
wrangler pages deploy dist --project-name intakepilot
Backend
cd worker
wrangler deploy
Database
Migrations via Neon SQL editor or scripts.
Monitoring
Frontend: Cloudflare Pages analytics
API: Cloudflare Workers analytics
- Request count
- Errors
- Response times
- Geographic distribution
Database: Neon dashboard
- Query performance
- Connection count
- Storage usage
Errors: Console logging (future: Sentry)
Rate Limiting
100 requests/min per user.
Tracked in Cloudflare KV:
- Key: rate_limit:{user_id}
- Value: Request count
- TTL: 60 seconds
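A minimal sketch of the counter described above, written against a small interface that mirrors the `get` and `put` methods of a Workers KV binding. The function and interface names are hypothetical; only the key format, the limit, and the TTL come from this document.

```typescript
// Subset of the Workers KV binding used here (get / put with expirationTtl).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

const LIMIT = 100; // requests per window, per user
const WINDOW_SECONDS = 60; // TTL of the KV entry

// Increments the user's counter and reports whether the request is within the limit.
async function checkRateLimit(kv: KVLike, userId: string): Promise<{ allowed: boolean; count: number }> {
  const key = `rate_limit:${userId}`;
  const current = Number((await kv.get(key)) ?? "0");
  if (current >= LIMIT) {
    return { allowed: false, count: current };
  }
  const count = current + 1;
  await kv.put(key, String(count), { expirationTtl: WINDOW_SECONDS });
  return { allowed: true, count };
}
```

Two caveats apply to this sketch: each `put` renews the TTL, so the window is measured from the most recent allowed request rather than the first, and KV has no atomic increment and is eventually consistent, so the count is approximate under concurrent requests.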
Scalability
Frontend: Globally distributed (Cloudflare CDN)
API: Auto-scales (Cloudflare Workers)
Database: Auto-scales (Neon serverless)
No manual scaling needed.
Performance
Frontend:
- Lazy loading routes
- Code splitting
- Compressed assets
- CDN caching
API:
- Edge computing (low latency)
- Efficient queries
- JSONB indexing
- Response caching (future)
Database:
- Indexed queries
- Connection pooling not needed (serverless)
- Query optimization
Future Architecture
Planned additions:
- Redis cache layer
- Full-text search (PostgreSQL FTS)
- Webhook system
- Real-time updates (WebSockets)
- Mobile app
- EHR integrations