Upgrading from Multi-Stack
Earlier versions of the platform split infrastructure across nine separate
CloudFormation stacks. The current architecture consolidates all of it into one
— PlatformStack. If you’re running an older multi-stack deployment, this page
walks the migration: back up, tear down, redeploy, and restore.
Before you start
- Admin access to the GitHub repository (to trigger workflows).
- The latest code on main (or the branch carrying the single-stack architecture). Confirm infrastructure/lib/platform-stack.ts exists and infrastructure/bin/infrastructure.ts references only PlatformStack.
- Your environment variables and secrets configured (see Platform (CDK)).
- Your CDK_PROJECT_PREFIX and the target GitHub environment in hand.
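The repository check in the list above can be scripted. The following is a minimal sketch, not part of the repository: check_single_stack is a hypothetical helper, and the "references only PlatformStack" test is a heuristic that assumes stacks are instantiated as `new <Name>Stack(` in the entry point.

```bash
# Sketch: sanity-check that a checkout carries the single-stack layout.
# check_single_stack is a hypothetical helper name.
check_single_stack() {
  local root="${1:-.}"
  local stack_file="$root/infrastructure/lib/platform-stack.ts"
  local entry_file="$root/infrastructure/bin/infrastructure.ts"
  local others

  [ -f "$stack_file" ] || { echo "missing: $stack_file" >&2; return 1; }
  [ -f "$entry_file" ] || { echo "missing: $entry_file" >&2; return 1; }

  # Heuristic (assumption): collect every "new <Name>Stack(" in the entry
  # point and drop the PlatformStack ones; anything left is another stack.
  others="$(grep -oE 'new[[:space:]]+[A-Za-z0-9_]*Stack[[:space:]]*\(' "$entry_file" \
    | grep -v 'new[[:space:]]*PlatformStack' || true)"
  if [ -n "$others" ]; then
    echo "entry point still instantiates other stacks:" >&2
    echo "$others" >&2
    return 1
  fi

  grep -q 'PlatformStack' "$entry_file" \
    || { echo "PlatformStack not referenced in $entry_file" >&2; return 1; }
  echo "ok: single-stack layout"
}
```

Run it from the repository root as `check_single_stack .` before triggering any workflow. A non-zero exit means the checkout probably still carries the multi-stack layout.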
1. Back up
This is the most critical step: it captures all application data so it can be restored after the rebuild.
Run Actions → Backup Data (Pre-Migration) with:
| Input | Value |
|---|---|
| project_prefix | Your CDK_PROJECT_PREFIX |
| aws_region | Your AWS region |
| aws_environment | The GitHub environment (e.g. production) |
| include_ephemeral | false (session tables aren’t worth preserving) |
Note the backup bucket name from the output ({prefix}-backup-{timestamp})
and confirm the logs show a non-zero summary.ok count and zero
summary.failed.
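If you prefer a terminal to the Actions UI, the same run can probably be started with the GitHub CLI. This is a sketch under assumptions: gh is installed and authenticated for the repository, the workflow's name matches the title shown in Actions, and the input names are those in the table above. trigger_backup is a hypothetical helper.

```bash
# Sketch: trigger the backup workflow from a terminal.
# Assumes the GitHub CLI (gh) is installed and authenticated, and that the
# workflow is named exactly as it appears in the Actions tab.
# trigger_backup is a hypothetical helper name.
trigger_backup() {
  local prefix="${1:-}" region="${2:-}" environment="${3:-}"
  if [ -z "$prefix" ] || [ -z "$region" ] || [ -z "$environment" ]; then
    echo "usage: trigger_backup <project_prefix> <aws_region> <aws_environment>" >&2
    return 2
  fi
  # -f passes each workflow input as a key=value string.
  gh workflow run "Backup Data (Pre-Migration)" \
    -f "project_prefix=$prefix" \
    -f "aws_region=$region" \
    -f "aws_environment=$environment" \
    -f "include_ephemeral=false"
}
```

For example, `trigger_backup "$CDK_PROJECT_PREFIX" us-east-1 production`. You still need to open the run's logs to read the bucket name and the summary counts.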
2. Tear down
Run Actions → Teardown All Infrastructure with:
| Input | Value |
|---|---|
| environment | The environment to tear down |
| confirm | DESTROY (exactly, all caps) |
This destroys the existing CloudFormation stacks in that environment.
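To confirm the teardown finished, you can list whatever stacks remain. The sketch below assumes the old stack names start with your project prefix, which this page does not state; adjust the filter if yours are named differently. remaining_stacks is a hypothetical helper.

```bash
# Sketch: list CloudFormation stacks that still exist after the teardown.
# Assumption: old stack names start with the project prefix.
# remaining_stacks is a hypothetical helper name.
remaining_stacks() {
  local prefix="${1:-}" names
  [ -n "$prefix" ] || { echo "usage: remaining_stacks <project_prefix>" >&2; return 2; }
  # describe-stacks without --stack-name returns all stacks that are not deleted.
  names="$(aws cloudformation describe-stacks \
    --query "Stacks[?starts_with(StackName, '${prefix}')].StackName" \
    --output text)" || return 1
  # One name per line; drop blank lines and the CLI's "None" placeholder.
  printf '%s\n' "$names" | tr '\t' '\n' | grep -v -e '^$' -e '^None$' || true
}
```

Empty output from `remaining_stacks "$CDK_PROJECT_PREFIX"` suggests the teardown completed. Any name printed is a stack that is still present, possibly in a failed-delete state.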
3. Clean up retained resources
If the environment had CDK_RETAIN_DATA_ON_DELETE=true (the old production
default), CloudFormation retained stateful resources instead of deleting
them. They must be removed before the new stack can recreate replacements with
the same names.
Typically retained: the ~24 DynamoDB tables, the data S3 buckets, the Cognito
User Pool, Secrets Manager secrets, KMS keys, and the SSM parameters under
/${CDK_PROJECT_PREFIX}/.
Identify and delete the retained resources:
```bash
# Identify
aws dynamodb list-tables --query "TableNames[?starts_with(@, '${CDK_PROJECT_PREFIX}')]"
aws s3 ls | grep "${CDK_PROJECT_PREFIX}"
aws ssm get-parameters-by-path --path "/${CDK_PROJECT_PREFIX}/" --recursive \
  --query 'Parameters[].Name' --output text | tr '\t' '\n'

# Delete the image-tag params (minimum required for an in-place migration)
aws ssm delete-parameter --name "/${CDK_PROJECT_PREFIX}/app-api/image-tag" 2>/dev/null || true
aws ssm delete-parameter --name "/${CDK_PROJECT_PREFIX}/inference-api/image-tag" 2>/dev/null || true

# ...or sweep every parameter under your prefix (safe: all are re-published on the next deploy)
aws ssm get-parameters-by-path --path "/${CDK_PROJECT_PREFIX}/" --recursive \
  --query 'Parameters[].Name' --output text \
  | tr '\t' '\n' | xargs -r -n10 aws ssm delete-parameters --names
```

Delete DynamoDB tables, empty and remove S3 buckets (aws s3 rb … --force), the
Cognito pool, and the secrets the same way.
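For the DynamoDB tables, one possible approach is a small wrapper that defaults to a dry run, since deletion is irreversible. This is a sketch, not the project's tooling: delete_prefixed_tables and its --execute argument are hypothetical, and it assumes every table starting with your prefix belongs to this deployment.

```bash
# Sketch: delete every DynamoDB table whose name starts with the prefix.
# Dry run by default; pass --execute as the second argument to really delete.
# delete_prefixed_tables is a hypothetical helper, not part of the repository.
delete_prefixed_tables() {
  local prefix="${1:-}" mode="${2:-}" tables table
  [ -n "$prefix" ] || { echo "usage: delete_prefixed_tables <prefix> [--execute]" >&2; return 2; }
  tables="$(aws dynamodb list-tables \
    --query "TableNames[?starts_with(@, '${prefix}')]" \
    --output text)" || return 1
  printf '%s\n' "$tables" | tr '\t' '\n' | while IFS= read -r table; do
    # Skip blank lines and the CLI's "None" placeholder for empty results.
    case "$table" in ''|None) continue ;; esac
    if [ "$mode" = "--execute" ]; then
      aws dynamodb delete-table --table-name "$table" >/dev/null </dev/null \
        && echo "deleted: $table"
    else
      echo "would delete: $table"
    fi
  done
}
```

Review the output of `delete_prefixed_tables "$CDK_PROJECT_PREFIX"` first, and only then re-run with `--execute`. Make sure the backup from step 1 succeeded before deleting anything.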
4. Deploy the new architecture
With the old stacks gone and retained resources cleared, deploy the single stack, then ship code and data:
- Platform Stack: cdk deploy provisions all infrastructure (~15 min).
- Backend Deploy: builds images, pushes to ECR, updates ECS / Lambda / Runtime.
- Frontend Deploy: publishes the Angular SPA.
- Bootstrap Data Seeding: seeds default configuration.
See Deployment Overview for the full first-deploy sequence.
5. Restore
Run Actions → Restore Data with:
| Input | Value |
|---|---|
| backup_bucket | The bucket from step 1 |
| manifest_key | {prefix}/{timestamp}/manifest.json |
| target_prefix | Your CDK_PROJECT_PREFIX (unchanged) |
| region | Your AWS region |
| dry_run | true first, then false |
| skip_cognito_users | false (unless you want users to re-register) |
Run with dry_run=true first and review the output to confirm it found your
data, then run again with dry_run=false to write it into the new tables and
buckets.
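The manifest_key input can likely be derived from the bucket name noted in step 1. The sketch below assumes that {prefix} and {timestamp} in the manifest key are the same values as in the bucket name {prefix}-backup-{timestamp}; this page does not say so explicitly, so check the backup logs if the restore dry run cannot find the manifest. manifest_key_for is a hypothetical helper.

```bash
# Sketch: derive the manifest_key input from the backup bucket name.
# Assumption: {prefix} and {timestamp} in "{prefix}/{timestamp}/manifest.json"
# are the same values as in the bucket name "{prefix}-backup-{timestamp}".
# manifest_key_for is a hypothetical helper name.
manifest_key_for() {
  local prefix="${1:-}" bucket="${2:-}" timestamp
  [ -n "$prefix" ] || { echo "usage: manifest_key_for <prefix> <bucket>" >&2; return 2; }
  case "$bucket" in
    "$prefix"-backup-?*) ;;
    *) echo "bucket '$bucket' does not match ${prefix}-backup-{timestamp}" >&2
       return 1 ;;
  esac
  # Strip the "<prefix>-backup-" head; what remains is the timestamp.
  timestamp="${bucket#"$prefix"-backup-}"
  printf '%s/%s/manifest.json\n' "$prefix" "$timestamp"
}
```

Call it as `manifest_key_for "$CDK_PROJECT_PREFIX" "<bucket from step 1>"` and paste the result into the workflow form.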
6. Verify
Visit the application and confirm login works, chat history is present, file uploads are accessible, the admin dashboard shows users / costs / models, and RAG assistants work. Confirm every workflow on the Actions dashboard is green.
Rollback
If the migration fails and you need to revert:
- Run Teardown to destroy the new stack.
- Switch the repository back to the old multi-stack branch.
- Redeploy the old architecture via its workflows.
- Restore data from the same backup bucket.
The backup bucket is immutable and survives all teardown operations.
Timeline
| Step | Duration |
|---|---|
| Backup | 5–15 min |
| Teardown | 5–10 min |
| Clean up retained resources | 5–15 min (skipped if retainDataOnDelete=false) |
| Platform deploy | 10–15 min |
| Backend deploy | 3–5 min |
| Frontend deploy | 2 min |
| Bootstrap seeding | 1 min |
| Restore | 5–15 min |
| Total | ~45–75 min |
Migration gotchas
- Cognito passwords don’t transfer. AWS doesn’t export password hashes, so native-password users must use “Forgot Password” on first login. Federated (OIDC/SAML) users are unaffected.
- “Resource already exists” during the platform deploy means a retained resource wasn’t cleaned up. Find it in the CloudFormation error event, delete it, and re-run.
- “Table not found” during restore means the platform deploy didn’t finish or a table name changed. Confirm npx cdk list shows the stack and that the SSM parameters under your prefix are published.
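The SSM half of that last check can be scripted with the same get-parameters-by-path call used in step 3. A minimal sketch follows; published_param_count is a hypothetical helper, and a non-zero count only shows that some parameters exist, not that every expected one was published.

```bash
# Sketch: count the SSM parameters currently published under the prefix.
# published_param_count is a hypothetical helper name.
published_param_count() {
  local prefix="${1:-}" names
  [ -n "$prefix" ] || { echo "usage: published_param_count <project_prefix>" >&2; return 2; }
  names="$(aws ssm get-parameters-by-path --path "/${prefix}/" --recursive \
    --query 'Parameters[].Name' --output text)" || return 1
  # Count non-empty names; grep -c exits 1 on a zero count, hence "|| true".
  printf '%s\n' "$names" | tr '\t' '\n' | grep -c -v -e '^$' -e '^None$' || true
}
```

A result of 0 from `published_param_count "$CDK_PROJECT_PREFIX"` points at an unfinished platform deploy, so re-run that workflow before retrying the restore.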