CI/CD Pipelines
Loop Health uses GitHub Actions for continuous integration and deployment across multiple platforms (Vercel, Fly.io, and custom infrastructure).
Overview
Our CI/CD strategy focuses on:
- Fast feedback - PRs get checks within 5 minutes
- Safe deployments - Automated tests, type safety, and health checks
- Zero-downtime - Rolling deployments with automatic rollback
- Security - Secrets rotation, dependency scanning, and SBOM generation
- Observability - Sentry releases, deployment tracking, and performance monitoring
Workflows
Patient Graph CI/CD
File: .github/workflows/patient-graph-ci-cd.yml
Triggers:
- Push to main affecting apps/patient-graph/**
- Push to main affecting dependencies: packages/{shared,core,hono,patient-graph}/** or pnpm-lock.yaml
- Manual trigger via workflow_dispatch
Full Pipeline:
name: Patient Graph CI/CD
on:
  push:
    branches: [main]
    paths:
      - 'apps/patient-graph/**'
      - 'packages/shared/**'
      - 'packages/core/**'
      - 'packages/hono/**'
      - 'packages/patient-graph/**'
      - 'pnpm-lock.yaml'
  workflow_dispatch:
jobs:
  ci:
    name: Continuous Integration
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      # pnpm must be available before setup-node can resolve the pnpm cache
      - name: Enable Corepack
        run: corepack enable
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Type-check
        run: pnpm typecheck --filter=@loop/patient-graph-api
      - name: Lint
        run: pnpm lint --filter=@loop/patient-graph-api
      - name: Run unit tests
        run: pnpm test --filter=@loop/patient-graph-api
        env:
          DATABASE_URL: postgresql://test:test@localhost:5432/test
      - name: Build
        run: pnpm build --filter=@loop/patient-graph-api
      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: patient-graph-build
          path: apps/patient-graph/dist
          retention-days: 7
  deploy:
    name: Deploy to Fly.io
    needs: ci
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for Sentry releases
      - name: Setup Fly CLI
        uses: superfly/flyctl-actions/setup-flyctl@master
      - name: Deploy to Fly.io
        run: flyctl deploy --remote-only --app patient-graph-api
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
        working-directory: apps/patient-graph
      - name: Wait for deployment
        run: |
          for i in {1..30}; do
            if curl -f https://patient-graph.loop.health/health/ready; then
              echo "Deployment successful"
              exit 0
            fi
            echo "Waiting for deployment... ($i/30)"
            sleep 10
          done
          echo "Deployment health check failed"
          exit 1
      - name: Create Sentry release
        uses: getsentry/action-release@v1
        env:
          SENTRY_AUTH_TOKEN: ${{ secrets.SENTRY_AUTH_TOKEN }}
          SENTRY_ORG: loop-health
          SENTRY_PROJECT: patient-graph-api
        with:
          environment: production
          version: ${{ github.sha }}
      # Runs on success and failure; the status field carries the outcome
      - name: Notify deployment
        if: always()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "Patient Graph production deployment finished",
              "status": "${{ job.status }}",
              "commit": "${{ github.sha }}",
              "url": "https://patient-graph.loop.health"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_DEPLOYMENTS_WEBHOOK }}
Vercel Deployments (Consumer Apps)
File: .github/workflows/vercel-deploy.yml
Apps:
- my.loop.health - Patient portal
- admin.loop.health - Admin dashboard
- luna.loop.health - Luna AI assistant
- docs.loop.health - Developer documentation
Pipeline:
name: Vercel Deploy
on:
  push:
    branches: [main]
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Deploy to Vercel
        id: vercel
        uses: amondnet/vercel-action@v25
        with:
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
          vercel-args: ${{ github.ref == 'refs/heads/main' && '--prod' || '' }}
          scope: loop-health
      - name: Comment PR with preview URL
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        env:
          # The deploy step exposes the deployment URL as its preview-url output
          PREVIEW_URL: ${{ steps.vercel.outputs.preview-url }}
        with:
          script: |
            const url = process.env.PREVIEW_URL
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `🚀 Preview deployed: ${url}`
            })
Database Migrations
File: .github/workflows/migrations.yml
Triggers: Manual only (via workflow_dispatch) for safety
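Because the workflow has no push or PR trigger, someone has to start it from the Actions tab or the GitHub CLI. A minimal sketch of a wrapper around `gh workflow run`, assuming `gh` is installed and authenticated. The function name `run_migrations` and the `DRY_RUN` variable are hypothetical and not part of the repository:

```shell
# Hypothetical helper: validates the target environment, then dispatches
# the migrations workflow through the GitHub CLI.
run_migrations() {
  local environment="$1"
  case "$environment" in
    staging|production) ;;
    *)
      echo "unknown environment: ${environment:-<empty>}" >&2
      return 2
      ;;
  esac
  # DRY_RUN=1 prints the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gh workflow run migrations.yml -f environment=${environment}"
  else
    gh workflow run migrations.yml -f "environment=${environment}"
  fi
}
```

Running `DRY_RUN=1 run_migrations staging` prints the command that would be dispatched, which is a cheap way to confirm the input before targeting production.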
Pipeline:
name: Database Migrations
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        type: choice
        options:
          - staging
          - production
jobs:
  migrate:
    name: Run migrations
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Enable Corepack
        run: corepack enable
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Generate migration SQL
        run: pnpm db:generate
        working-directory: packages/shared
      - name: Dry run migration
        run: pnpm db:migrate --dry-run
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        working-directory: packages/shared
      - name: Apply migration
        run: pnpm db:migrate
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        working-directory: packages/shared
      - name: Verify migration
        run: pnpm db:push --dry-run
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
        working-directory: packages/shared
API Docs Validation
File: .github/workflows/api-docs-validation.yml
Triggers: PRs that modify any .yaml or .yml file under openapi/
Pipeline:
name: API Docs Validation
on:
  pull_request:
    paths:
      - 'openapi/**/*.yaml'
      - 'openapi/**/*.yml'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Need base branch for diff
      - name: Validate OpenAPI syntax
        uses: char0n/swagger-editor-validate@v1
        with:
          definition-file: openapi/patient-graph.yaml
      - name: Check for breaking changes
        uses: oasdiff/oasdiff-action@main
        with:
          base: openapi/patient-graph.yaml@main
          revision: openapi/patient-graph.yaml
          format: text
          fail-on: ERR
      - name: Generate API diff
        run: |
          npx @redocly/cli diff \
            openapi/patient-graph.yaml@main \
            openapi/patient-graph.yaml \
            --format markdown > api-diff.md
      - name: Comment PR with diff
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs')
            const diff = fs.readFileSync('api-diff.md', 'utf8')
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## API Changes\n\n${diff}`
            })
Lighthouse CI (Performance)
File: .github/workflows/lighthouse-ci.yml
Triggers: PRs to main
Pipeline:
name: Lighthouse CI
on:
  pull_request:
    branches: [main]
    paths:
      - 'apps/my-loop-health/**'
      - 'packages/**'
jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Enable Corepack
        run: corepack enable
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Build app
        run: pnpm build
        working-directory: apps/my-loop-health
        env:
          NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: ${{ secrets.NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY }}
      - name: Run Lighthouse CI
        uses: treosh/lighthouse-ci-action@v10
        with:
          urls: |
            http://localhost:3000
            http://localhost:3000/dashboard
            http://localhost:3000/protocols
          uploadArtifacts: true
          temporaryPublicStorage: true
          configPath: ./.lighthouserc.json
      - name: Check performance budgets
        run: |
          # jq does the comparison so fractional scores do not break the shell test
          if ! jq -e '.performance.score >= 90' lighthouse-report.json > /dev/null; then
            echo "Performance score below 90"
            exit 1
          fi
Security Scanning
File: .github/workflows/security.yml
Triggers: Daily cron + PRs touching dependencies
Pipeline:
name: Security Scanning
on:
  schedule:
    - cron: '0 6 * * *' # Daily at 6am UTC
  pull_request:
    paths:
      - 'pnpm-lock.yaml'
      - 'package.json'
      - '**/package.json'
jobs:
  dependency-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Run Snyk security scan
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          command: test
          args: --severity-threshold=high
      - name: Enable Corepack
        run: corepack enable
      - name: Run pnpm audit
        run: pnpm audit --audit-level=high
      - name: Generate SBOM
        uses: anchore/sbom-action@v0
        with:
          format: cyclonedx-json
          output-file: sbom.json
      - name: Upload SBOM
        uses: actions/upload-artifact@v4
        with:
          name: sbom
          path: sbom.json
  codeql:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write # Required to upload CodeQL results
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript, typescript
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
Branch Strategy
| Branch | Purpose | CI Checks | Auto-deploy | Deployment Target |
|---|---|---|---|---|
| main | Production | Full suite | Yes | Production (Vercel + Fly.io) |
| develop | Staging | Full suite | Yes | Staging environment |
| Feature branches | Development | Basic checks | Preview only | Preview deployments |
| Hotfix branches | Emergency fixes | Full suite | Manual approval | Production (fast-tracked) |
Branch Protection Rules
main branch:
- Require pull request before merging
- Require 1 approval from CODEOWNERS
- Require status checks to pass:
  - CI pipeline
  - Type-check
  - Lint
  - Tests
  - Lighthouse (performance)
- Require branches to be up to date
- Require linear history
- Require signed commits
develop branch:
- Require pull request before merging
- Require status checks to pass
- Allow force pushes (for rebasing)
Deployment Strategies
Zero-Downtime Deployments
Fly.io (Patient Graph):
- Health checks run every 10 seconds
- New machines start alongside old ones
- Traffic gradually shifts to new machines
- Old machines drain and shut down
- Rollback automatically if health checks fail
# Manual deployment with zero downtime
flyctl deploy --strategy rolling --wait-timeout 300
# Force immediate rollback
flyctl releases rollback --force
Vercel (Consumer Apps):
- New deployment builds in isolation
- Deployment goes live atomically
- Old deployment remains accessible at versioned URL
- Instant rollback via Vercel dashboard
Blue-Green Deployments
For major schema changes or risky deployments:
# Deploy to "green" environment
flyctl deploy --app patient-graph-api-green
# Test green environment
curl https://patient-graph-green.loop.health/health/ready
# Switch traffic to green
flyctl proxy 443:3000 -a patient-graph-api-green
# Monitor for 15 minutes, then promote
flyctl apps rename patient-graph-api patient-graph-api-blue
flyctl apps rename patient-graph-api-green patient-graph-api
Canary Deployments
For gradual rollouts:
# Deploy canary (10% traffic)
flyctl deploy --strategy canary --canary-weight 10
# Monitor error rates in Sentry
# If OK, gradually increase traffic
flyctl scale count 5 --region iad
# Full rollout
flyctl deploy --strategy immediate
Secrets Management
GitHub Secrets
Repository secrets:
- FLY_API_TOKEN - Fly.io deployment token
- VERCEL_TOKEN - Vercel deployment token
- VERCEL_ORG_ID - Vercel organization ID
- VERCEL_PROJECT_ID - Vercel project ID (per app)
- SENTRY_AUTH_TOKEN - Sentry release creation token
- SLACK_DEPLOYMENTS_WEBHOOK - Slack notifications webhook
- SNYK_TOKEN - Snyk security scanning token
Environment secrets (per environment: staging, production):
- DATABASE_URL - PostgreSQL connection string
- CLERK_SECRET_KEY - Clerk authentication secret
- OPENAI_API_KEY - OpenAI API key
- All other runtime secrets
Rotating Secrets
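# 1. Generate a new password (hex avoids characters that need URL-encoding)
NEW_PASSWORD=$(openssl rand -hex 32)
# 2. Apply the password to the database user, then build the new connection string
NEW_DATABASE_URL="postgresql://<user>:${NEW_PASSWORD}@<host>:5432/<database>"
# 3. Set in GitHub
gh secret set DATABASE_URL --body "$NEW_DATABASE_URL" --env production
# 4. Update Fly.io
flyctl secrets set DATABASE_URL="$NEW_DATABASE_URL" -a patient-graph-api
# 5. Trigger redeployment
flyctl deploy -a patient-graph-api
# 6. Verify deployment
curl https://patient-graph.loop.health/health/ready
Secret Rotation Schedule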
| Secret | Rotation Frequency | Owner |
|---|---|---|
| Database passwords | 90 days | DevOps |
| API keys (external) | 180 days | Engineering |
| JWT secrets | 365 days | Security |
| Webhook secrets | On breach | Security |
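The schedule above lends itself to a simple age check. A minimal sketch, assuming the age of each secret in days is tracked somewhere outside this script. The function name `rotation_due` and the class labels (`database`, `api-key`, `jwt`, `webhook`) are hypothetical:

```shell
# Hypothetical helper: reports whether a secret is due for rotation.
# Arguments: secret class, age in whole days.
rotation_due() {
  local class="$1" age_days="$2" max_days
  case "$class" in
    database) max_days=90 ;;
    api-key)  max_days=180 ;;
    jwt)      max_days=365 ;;
    webhook)  echo "rotate on breach only"; return 0 ;;
    *)        echo "unknown secret class: $class" >&2; return 2 ;;
  esac
  if [ "$age_days" -ge "$max_days" ]; then
    echo "due"
  else
    echo "ok ($((max_days - age_days)) days left)"
  fi
}
```

For example, `rotation_due api-key 100` reports 80 days remaining, while `rotation_due database 90` reports that rotation is due.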
Rollback Procedures
Fly.io Rollback
# List recent releases
flyctl releases -a patient-graph-api
# Rollback to previous release
flyctl releases rollback -a patient-graph-api
# Rollback to specific version
flyctl releases rollback v42 -a patient-graph-api
# Verify rollback
curl https://patient-graph.loop.health/health/ready
Vercel Rollback
# List deployments
vercel ls <project-name>
# Promote specific deployment to production
vercel promote <deployment-url> --yes
# Or via dashboard: Deployments → Click deployment → Promote to Production
Database Migration Rollback
# Migrations are versioned - rollback by running down migration
pnpm db:rollback --step 1
# For emergency: restore from a plain-SQL backup (use pg_restore -d for custom-format dumps)
psql "$DATABASE_URL" -f backup-2024-03-20.sql
# Verify schema state
pnpm db:push --dry-run
Monitoring & Alerts
Deployment Alerts
Slack notifications:
- ✅ Successful deployments → #deployments
- ❌ Failed deployments → #incidents
- ⚠️ Slow deployments (>10min) → #engineering
Sentry alerts:
- New error after deployment → Notify on-call
- Error rate spike (>5x baseline) → Page on-call
- Performance regression (>20% slower) → Notify team
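The routing rules above can be sketched as two small decision functions. This is an illustration only: the names `slack_channel_for` and `sentry_action_for` are hypothetical, and the precedence is an assumption (a failure always goes to #incidents, and a slow but successful deployment goes to #engineering instead of #deployments). Error counts are assumed to be whole numbers.

```shell
# Hypothetical helper: picks the Slack channel for a deployment result.
# Arguments: status (success or anything else), duration in seconds.
slack_channel_for() {
  local status="$1" duration_s="$2"
  if [ "$status" != "success" ]; then
    echo "#incidents"
  elif [ "$duration_s" -gt 600 ]; then
    echo "#engineering"
  else
    echo "#deployments"
  fi
}

# Hypothetical helper: decides whether an error-rate spike pages on-call.
# Arguments: current error count, baseline error count (same time window).
sentry_action_for() {
  local current="$1" baseline="$2"
  if [ "$current" -gt $((baseline * 5)) ]; then
    echo "page on-call"
  else
    echo "none"
  fi
}
```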
Health Check Monitoring
# Fly.io health checks
flyctl checks -a patient-graph-api
# Manual health verification
curl https://patient-graph.loop.health/health/ready -v
# Database connectivity check
psql $DATABASE_URL -c "SELECT 1;"
Post-Deployment Verification
Automated checks (run in CI):
- Health endpoint returns 200
- Database migrations applied
- Sentry release created
- No new errors in Sentry (5min window)
- Performance metrics within SLO
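Several of these checks only pass once the new release has finished starting, so they are usually wrapped in a retry loop. A minimal sketch of a generic retry helper in portable shell. The function name `retry` and its argument order are illustrative choices, not something defined in the repository:

```shell
# Hypothetical helper: retries a command until it succeeds or attempts run out.
# Arguments: max attempts, delay in seconds, then the command and its arguments.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

With this in place, the health check could be written as `retry 30 10 curl -fsS -o /dev/null https://patient-graph.loop.health/health/ready`, which succeeds as soon as the endpoint stops returning an HTTP error.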
Manual verification checklist:
- Check Sentry dashboard for new errors
- Verify key user flows (login, dashboard, protocols)
- Check database connection pool stats
- Review Fly.io metrics (CPU, memory, request latency)
- Verify external integrations (Rimo, Oura, etc.)
Troubleshooting
CI Pipeline Failures
Type-check failures:
# Reproduce locally
pnpm typecheck --filter=@loop/patient-graph-api
# Common fix: regenerate types from schema
pnpm db:generate
Build failures:
# Check for missing dependencies
pnpm install --frozen-lockfile
# Clear build cache
rm -rf apps/patient-graph/dist
pnpm build --filter=@loop/patient-graph-api
Test failures:
# Run tests locally with same environment
DATABASE_URL=postgresql://test:test@localhost:5432/test pnpm test
# Debug specific test
pnpm test -- --watch --testNamePattern="treatment workflow"
Deployment Failures
Fly.io deployment timeout:
# Check build logs
flyctl logs -a patient-graph-api
# Common issue: missing environment variable
flyctl secrets list -a patient-graph-api
# Increase timeout
flyctl deploy --wait-timeout 600
Health check failures:
# SSH into running machine
flyctl ssh console -a patient-graph-api
# Check logs
flyctl logs -a patient-graph-api --tail 100
# Test health endpoint locally
curl http://localhost:3000/health/ready
Vercel build failures:
# Check build logs in Vercel dashboard
vercel logs <deployment-url>
# Reproduce locally
pnpm build
# Common issue: missing environment variables
vercel env pull .env.local
Performance Optimization
Build Optimization
Turborepo caching:
{
  "pipeline": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"],
      "cache": true
    }
  }
}
Docker layer caching:
# Cache dependencies separately from code: pnpm fetch needs only the lockfile
COPY pnpm-lock.yaml pnpm-workspace.yaml ./
RUN pnpm fetch
# Code changes don't invalidate the fetched dependency layer
COPY . .
RUN pnpm install --frozen-lockfile --offline
RUN pnpm build
Deployment Speed
Parallel deployments:
# Deploy multiple services concurrently
jobs:
  deploy-api:
    # ...
  deploy-consumer:
    # ...
# Both run in parallel
Skip unnecessary steps:
# Only run migrations if the last commit changed the schema
# (the checkout step needs fetch-depth: 2 or more so HEAD~1 exists)
- name: Run migrations
  run: |
    if ! git diff --quiet HEAD~1 HEAD -- packages/shared/src/db/; then
      pnpm db:migrate
    fi
Best Practices
DO ✅
- Always run migrations before deploying code
- Use semantic versioning for releases
- Tag production deployments in Git
- Monitor deployments for 15 minutes post-deploy
- Keep deployments small (<5 files changed)
- Test migrations on staging first
- Use feature flags for risky changes
- Document breaking changes in PR description
DON’T ❌
- Deploy on Fridays (unless hotfix)
- Skip CI checks (“just this once”)
- Deploy with failing tests
- Make schema changes without migrations
- Deploy multiple major changes at once
- Ignore deployment alerts
- Force-push to main
- Deploy without reviewing diff
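Two of these rules are mechanical enough to check before a deploy starts. A minimal sketch of such a guard, assuming the caller supplies the weekday and the number of changed files. The function name `deploy_allowed` and the `hotfix` argument are hypothetical:

```shell
# Hypothetical pre-deploy guard based on the rules above.
# Arguments: day of week (1 = Monday .. 7 = Sunday, as printed by `date +%u`),
#            number of changed files, optional "hotfix" marker.
deploy_allowed() {
  local weekday="$1" changed_files="$2" kind="${3:-regular}"
  if [ "$weekday" -eq 5 ] && [ "$kind" != "hotfix" ]; then
    echo "blocked: no Friday deploys unless hotfix"
    return 1
  fi
  if [ "$changed_files" -ge 5 ]; then
    echo "warning: $changed_files files changed, consider splitting"
  fi
  echo "ok"
}
```

A caller might pass `"$(date +%u)"` as the first argument and a count derived from `git diff --name-only` as the second. The size rule only warns, on the assumption that it is a guideline and not a hard gate.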