Version Management
Table of Contents
- Overview
- Architecture
- Version Config Schema
- Shell App Bootstrap
- Deployment Pipeline
- Rollback
- KV/D1 Write Atomicity Gap
- Canary / Gradual Rollout
- Cache Invalidation Strategy
- Admin UI Features
- D1 Schema
- References
Overview
Runtime version management allows platform operators to control exactly which version of each micro frontend (MFE) is loaded in production, without redeploying the shell application or any other part of the infrastructure. The Admin UI provides a centralized interface for pinning, promoting, and rolling back MFE versions across environments.
Core Capabilities
- No-redeploy version changes: Update a configuration value and every subsequent page load picks up the new MFE version. The shell application reads the active version config at bootstrap time and instructs Module Federation to load remotes from the corresponding CDN URLs.
- Instant rollback: Previous build artifacts remain on R2/CDN (subject to the bundle retention policy), so rolling back means pointing the version config to an older URL, with no rebuild and no redeployment.
- Canary deployments: Route a percentage of users to a new MFE version using consistent hashing on user identity. Gradually increase the percentage as confidence grows.
- Environment-specific pinning: Maintain independent version configs for dev, staging, and production. Promote versions through environments with explicit approval gates.
- Full audit trail: Every version change is recorded in D1 with the identity of the actor, timestamp, and event type, enabling compliance and post-incident analysis.
Architecture
Data Flow
The version management system follows a write-through caching pattern where D1 serves as the persistent source of truth and KV provides low-latency edge reads.
1. Admin pins MFE version via UI
The operator selects an MFE, chooses a registered version from the dropdown, and clicks "Activate". The Admin UI sends an authenticated POST request to the Version Config Service.
2. Config Service writes to D1 (audit) + KV (fast reads)
The Version Config Service Worker performs the write in two stages. The D1 statements run as a single batch, and the KV write follows once that batch commits:
- Inserts a deployment_events row in D1 recording the activation.
- Updates the version_configs table: sets is_active = false on the previously active row for that MFE/environment, and is_active = true on the newly activated row.
- Writes the aggregated version config JSON to KV under the key version-config:{environment}.
Because the D1 write and KV write happen in the same Worker invocation, the two stores normally agree shortly after an activation. However, D1 and KV are separate storage systems with no cross-system transactional guarantee, and KV is eventually consistent across edge locations. See KV/D1 Write Atomicity Gap for failure modes and mitigations, and Cache Invalidation Strategy for propagation delays.
3. Shell app fetches version config from KV on page load
When a user navigates to the application, the shell app's bootstrap logic issues a GET request to the Version Config Service endpoint (e.g., GET /api/v1/version-config?env=production). This endpoint reads directly from KV, which serves frequently read keys from the edge location closest to the user, so responses are typically fast worldwide. Reads of a key that is not yet cached at a given location take longer.
4. Module Federation runtime loads remotes from versioned CDN URLs
The shell passes the entry URLs from the version config into Module Federation's init() call. The runtime fetches each MFE's mf-manifest.json from the versioned CDN path, then loads the necessary chunks on demand as the user navigates to each route.
Version Config Schema
TypeScript Interface
interface MfeVersionEntry {
/** Semver version string, e.g. "2.3.1" */
version: string;
/** Full URL to the mf-manifest.json for this version */
entry: string;
/** Subresource Integrity hash for the manifest (optional but recommended) */
integrity?: string;
/** ISO 8601 timestamp of when this version was activated */
updatedAt: string;
/** Email or identity of the user who activated this version */
updatedBy: string;
/** Canary configuration (present only during gradual rollouts) */
canary?: {
/** Version being rolled out */
version: string;
/** Entry URL for the canary version */
entry: string;
/** Percentage of traffic routed to canary (0-100) */
percentage: number;
/** Integrity hash for canary manifest */
integrity?: string;
};
}
interface VersionConfig {
[mfeName: string]: MfeVersionEntry;
}
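Because the shell receives this JSON over the network, it can be useful to check its shape at runtime before handing URLs to Module Federation. The following is a minimal sketch under that assumption; the guard names (isMfeVersionEntry, isVersionConfig) are hypothetical and are not part of the service described here. The interfaces are repeated only so the block is self-contained.

```typescript
// Hypothetical runtime guards for the VersionConfig shape defined above.
interface CanaryEntry {
  version: string;
  entry: string;
  percentage: number;
  integrity?: string;
}

interface MfeVersionEntry {
  version: string;
  entry: string;
  integrity?: string;
  updatedAt: string;
  updatedBy: string;
  canary?: CanaryEntry;
}

type VersionConfig = Record<string, MfeVersionEntry>;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function isOptionalString(value: unknown): boolean {
  return value === undefined || typeof value === 'string';
}

function isCanaryEntry(value: unknown): value is CanaryEntry {
  if (!isRecord(value)) return false;
  const percentage = value.percentage;
  return (
    typeof value.version === 'string' &&
    typeof value.entry === 'string' &&
    typeof percentage === 'number' &&
    percentage >= 0 &&
    percentage <= 100 &&
    isOptionalString(value.integrity)
  );
}

function isMfeVersionEntry(value: unknown): value is MfeVersionEntry {
  if (!isRecord(value)) return false;
  return (
    typeof value.version === 'string' &&
    typeof value.entry === 'string' &&
    typeof value.updatedAt === 'string' &&
    typeof value.updatedBy === 'string' &&
    isOptionalString(value.integrity) &&
    (value.canary === undefined || isCanaryEntry(value.canary))
  );
}

function isVersionConfig(value: unknown): value is VersionConfig {
  return isRecord(value) && Object.values(value).every(isMfeVersionEntry);
}
```

A caller could run isVersionConfig on the parsed response and fall back to the error UI when it returns false, rather than letting a malformed entry surface later as a failed remote load.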
Example KV Value
The following JSON is stored in KV under the key version-config:production:
{
"mfe_dashboard": {
"version": "2.3.1",
"entry": "https://cdn.example.com/mfe-dashboard/v2.3.1/mf-manifest.json",
"integrity": "sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8w",
"updatedAt": "2025-03-15T10:30:00Z",
"updatedBy": "admin@example.com"
},
"mfe_settings": {
"version": "1.8.0",
"entry": "https://cdn.example.com/mfe-settings/v1.8.0/mf-manifest.json",
"integrity": "sha384-Li9vy3DqF8tnTXuiaAJuML3ky+er10rcgNR/VqsVpcw+ThHmYcwiB1pbOxEb2V2I",
"updatedAt": "2025-03-14T16:45:00Z",
"updatedBy": "admin@example.com"
},
"mfe_analytics": {
"version": "3.1.0",
"entry": "https://cdn.example.com/mfe-analytics/v3.1.0/mf-manifest.json",
"updatedAt": "2025-03-13T09:00:00Z",
"updatedBy": "deploy-bot@example.com",
"canary": {
"version": "3.2.0-rc.1",
"entry": "https://cdn.example.com/mfe-analytics/v3.2.0-rc.1/mf-manifest.json",
"percentage": 10
}
}
}
KV Key Naming Convention
| Key Pattern | Description |
|---|---|
| version-config:production | Active version config for production |
| version-config:staging | Active version config for staging |
| version-config:dev | Active version config for dev |
| version-history:{env}:{mfe_name} | Last 50 versions for quick history lookup |
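Keeping key construction in one place helps the writer and the reader agree on the naming convention. The sketch below is illustrative only: the helper names and the newest-first ordering of the history list are assumptions, while the key patterns and the 50-entry cap come from the table above.

```typescript
type Environment = 'dev' | 'staging' | 'production';

// Key for the aggregated, active version config of one environment.
function versionConfigKey(environment: Environment): string {
  return `version-config:${environment}`;
}

// Key for the per-MFE version history list.
function versionHistoryKey(environment: Environment, mfeName: string): string {
  return `version-history:${environment}:${mfeName}`;
}

// Prepends a version to a history list and caps it at `limit` entries
// (newest first is an assumption; the source only states "last 50").
function appendToHistory(history: string[], version: string, limit = 50): string[] {
  return [version, ...history].slice(0, limit);
}
```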
Shell App Bootstrap
The shell application reads the version config at startup and dynamically configures Module Federation remotes. This means the shell never hardcodes remote URLs --- it discovers them at runtime.
Bootstrap Sequence
- Fetch version config from the Version Config Service (backed by KV at the edge).
- Resolve canary assignments --- if any MFE has a canary config, determine whether the current user should receive the canary version based on consistent hashing.
- Initialize Module Federation runtime with dynamic remotes derived from the config.
- Register routes that lazy-load each MFE's exposed components.
- Render the application.
Implementation
// src/bootstrap.tsx
import { init, loadRemote } from '@module-federation/enhanced/runtime';
import { createRoot } from 'react-dom/client';
import { App } from './App';
import type { VersionConfig, MfeVersionEntry } from './types/version-config';
const VERSION_CONFIG_URL = 'https://config.example.com/api/v1/version-config';
/**
* Fetches the active version config from the Version Config Service.
* The service reads from KV, so this is fast at any edge location.
*/
async function fetchVersionConfig(): Promise<VersionConfig> {
const environment = import.meta.env.VITE_ENVIRONMENT ?? 'production';
const response = await fetch(`${VERSION_CONFIG_URL}?env=${environment}`, {
headers: { 'Accept': 'application/json' },
});
if (!response.ok) {
throw new Error(
`Failed to fetch version config: ${response.status} ${response.statusText}`
);
}
return response.json();
}
/**
* Determines which entry URL to use for an MFE, accounting for canary config.
* Uses consistent hashing on the user ID so the same user always gets the
* same version within a canary window.
*/
function resolveEntry(
mfeName: string,
config: MfeVersionEntry,
userId: string | null
): string {
if (!config.canary || !userId) {
return config.entry;
}
const hash = simpleHash(`${userId}:${mfeName}`);
const bucket = hash % 100;
if (bucket < config.canary.percentage) {
return config.canary.entry;
}
return config.entry;
}
/**
* Simple deterministic hash for canary bucketing.
* Not cryptographic --- just needs to be consistent and well-distributed.
*/
function simpleHash(input: string): number {
let hash = 0;
for (let i = 0; i < input.length; i++) {
const char = input.charCodeAt(i);
hash = ((hash << 5) - hash + char) | 0;
}
return Math.abs(hash);
}
/**
* Main bootstrap function.
*/
const bootstrap = async (): Promise<void> => {
try {
const config = await fetchVersionConfig();
const userId = localStorage.getItem('user_id');
// Build the list of remotes for Module Federation init().
// Each resolved remote carries its SRI integrity hash so the check below can
// refuse to start when a hash is missing. Note that init() itself is only
// given the name and entry URL.
const resolvedRemotes = Object.entries(config).map(([name, mfeEntry]) => {
const isCanary =
mfeEntry.canary && userId
? simpleHash(`${userId}:${name}`) % 100 < mfeEntry.canary.percentage
: false;
return {
name,
entry: isCanary ? mfeEntry.canary!.entry : mfeEntry.entry,
integrity: isCanary
? mfeEntry.canary!.integrity
: mfeEntry.integrity,
};
});
// Enforce SRI: reject any remote that was registered without an integrity hash.
// This prevents loading MFE bundles that cannot be verified against tampering.
const remotesWithoutIntegrity = resolvedRemotes.filter((r) => !r.integrity);
if (remotesWithoutIntegrity.length > 0) {
console.error(
'[Shell] SRI integrity hash missing for remotes:',
remotesWithoutIntegrity.map((r) => r.name)
);
throw new Error(
`SRI integrity hash is required for all MFE remotes. ` +
`Missing: ${remotesWithoutIntegrity.map((r) => r.name).join(', ')}`
);
}
init({
name: 'shell',
// init() takes remotes as an array of { name, entry } objects, not a keyed map.
remotes: resolvedRemotes.map(({ name, entry }) => ({ name, entry })),
shared: {
react: {
version: '19.2.4',
scope: 'default',
lib: () => import('react'),
shareConfig: { singleton: true, requiredVersion: '^19.2.4' },
},
'react-dom': {
version: '19.2.4',
scope: 'default',
lib: () => import('react-dom'),
shareConfig: { singleton: true, requiredVersion: '^19.2.4' },
},
},
});
const root = createRoot(document.getElementById('root')!);
root.render(<App versionConfig={config} />);
} catch (error) {
console.error('[Shell] Bootstrap failed:', error);
// Render a fallback error UI so the user is not left with a blank screen
const root = createRoot(document.getElementById('root')!);
root.render(
<div role="alert">
<h1>Application failed to load</h1>
<p>Please refresh the page or contact support.</p>
</div>
);
}
};
bootstrap();
Route Registration with Lazy-Loaded MFEs
// src/routes.tsx
import { lazy, Suspense } from 'react';
import { loadRemote } from '@module-federation/enhanced/runtime';
import type { RouteObject } from 'react-router-dom';
import { LoadingSpinner } from './components/LoadingSpinner';
import { MfeErrorBoundary } from './components/MfeErrorBoundary';
/**
* Creates a lazy React component backed by a Module Federation remote.
*/
function createRemoteComponent(remoteName: string, exposedModule: string) {
return lazy(async () => {
const module = await loadRemote<{ default: React.ComponentType }>(
`${remoteName}/${exposedModule}`
);
if (!module) {
throw new Error(`Failed to load remote module: ${remoteName}/${exposedModule}`);
}
return module;
});
}
const Dashboard = createRemoteComponent('mfe_dashboard', 'DashboardPage');
const Settings = createRemoteComponent('mfe_settings', 'SettingsPage');
const Analytics = createRemoteComponent('mfe_analytics', 'AnalyticsPage');
export const routes: RouteObject[] = [
{
path: '/dashboard',
element: (
<MfeErrorBoundary mfeName="mfe_dashboard">
<Suspense fallback={<LoadingSpinner />}>
<Dashboard />
</Suspense>
</MfeErrorBoundary>
),
},
{
path: '/settings/*',
element: (
<MfeErrorBoundary mfeName="mfe_settings">
<Suspense fallback={<LoadingSpinner />}>
<Settings />
</Suspense>
</MfeErrorBoundary>
),
},
{
path: '/analytics/*',
element: (
<MfeErrorBoundary mfeName="mfe_analytics">
<Suspense fallback={<LoadingSpinner />}>
<Analytics />
</Suspense>
</MfeErrorBoundary>
),
},
];
Deployment Pipeline
When a developer merges a change to an MFE, the CI pipeline builds the MFE, uploads artifacts to R2, and registers the new version with the Version Config Service. Crucially, registration does not mean activation --- the new version sits in the registry until an admin (or an automated promotion rule) explicitly activates it.
End-to-End Flow
CI Script
The following GitHub Actions workflow handles the build, upload, and registration steps, then auto-activates the new version in dev.
# .github/workflows/deploy-mfe.yml
name: Deploy MFE
on:
  push:
    branches: [main]
env:
  MFE_NAME: mfe-dashboard
  R2_BUCKET: mfe-artifacts
  CONFIG_SERVICE_URL: https://config.example.com/api/v1
jobs:
  build-and-register:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v6
      # pnpm must be installed before setup-node can use cache: 'pnpm'.
      # Assumes the pnpm version is pinned via "packageManager" in package.json.
      - name: Setup pnpm
        uses: pnpm/action-setup@v4
      - name: Setup Node.js
        uses: actions/setup-node@v6
        with:
          node-version: '22'
          cache: 'pnpm'
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Determine version
        id: version
        run: |
          VERSION=$(node -p "require('./package.json').version")
          SHORT_SHA=$(git rev-parse --short HEAD)
          FULL_VERSION="${VERSION}+${SHORT_SHA}"
          echo "version=${FULL_VERSION}" >> "$GITHUB_OUTPUT"
      - name: Build MFE with Rsbuild
        run: pnpm rsbuild build
        env:
          MFE_VERSION: ${{ steps.version.outputs.version }}
          PUBLIC_PATH: https://cdn.example.com/${{ env.MFE_NAME }}/${{ steps.version.outputs.version }}/
      - name: Compute integrity hash
        id: integrity
        run: |
          HASH=$(shasum -b -a 384 dist/mf-manifest.json | awk '{ print $1 }' | xxd -r -p | base64)
          echo "hash=sha384-${HASH}" >> "$GITHUB_OUTPUT"
      - name: Upload to R2
        uses: cloudflare/wrangler-action@v4
        with:
          command: r2 object put "${{ env.R2_BUCKET }}/${{ env.MFE_NAME }}/${{ steps.version.outputs.version }}/" --file=dist/ --recursive
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
      - name: Register version with Config Service
        run: |
          curl -sf -X POST "${{ env.CONFIG_SERVICE_URL }}/versions" \
            -H "Authorization: Bearer ${{ secrets.CONFIG_SERVICE_TOKEN }}" \
            -H "Content-Type: application/json" \
            -d '{
              "mfeName": "${{ env.MFE_NAME }}",
              "version": "${{ steps.version.outputs.version }}",
              "entryUrl": "https://cdn.example.com/${{ env.MFE_NAME }}/${{ steps.version.outputs.version }}/mf-manifest.json",
              "integrityHash": "${{ steps.integrity.outputs.hash }}",
              "environment": "dev",
              "createdBy": "ci-bot@example.com"
            }'
      - name: Auto-activate in dev environment
        if: success()
        run: |
          curl -sf -X POST "${{ env.CONFIG_SERVICE_URL }}/versions/activate" \
            -H "Authorization: Bearer ${{ secrets.CONFIG_SERVICE_TOKEN }}" \
            -H "Content-Type: application/json" \
            -d '{
              "mfeName": "${{ env.MFE_NAME }}",
              "version": "${{ steps.version.outputs.version }}",
              "environment": "dev",
              "activatedBy": "ci-bot@example.com"
            }'
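The integrity value produced by the shell pipeline in the workflow is a standard SRI string: the prefix sha384- followed by the base64-encoded SHA-384 digest of the manifest bytes. As a cross-check, the same value can be computed in a Node script. This is a sketch under the assumption that such a check runs in Node (for example as a CI step); computeSri and matchesSri are hypothetical helper names.

```typescript
import { createHash } from 'node:crypto';

// Computes an SRI string ("sha384-<base64 digest>") for the given content.
function computeSri(content: string | Uint8Array): string {
  const digest = createHash('sha384').update(content).digest('base64');
  return `sha384-${digest}`;
}

// Returns true when the content hashes to the expected SRI string.
function matchesSri(content: string | Uint8Array, expected: string): boolean {
  return computeSri(content) === expected;
}
```

A registration step could, for instance, read dist/mf-manifest.json and compare computeSri(bytes) against the hash it is about to send, failing the build on a mismatch.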
Version Config Service --- Registration Endpoint
// workers/version-config-service/src/handlers/register-version.ts
import type { D1Database, KVNamespace } from '@cloudflare/workers-types';
interface RegisterVersionRequest {
mfeName: string;
version: string;
entryUrl: string;
integrityHash?: string;
environment: string;
createdBy: string;
}
export async function handleRegisterVersion(
request: Request,
env: { DB: D1Database; VERSION_KV: KVNamespace }
): Promise<Response> {
const body: RegisterVersionRequest = await request.json();
// Validate that the manifest is actually accessible before registering
const manifestCheck = await fetch(body.entryUrl, { method: 'HEAD' });
if (!manifestCheck.ok) {
return Response.json(
{ error: `Manifest not accessible at ${body.entryUrl}: ${manifestCheck.status}` },
{ status: 400 }
);
}
// Check for duplicate registration
const existing = await env.DB.prepare(
'SELECT id FROM version_configs WHERE environment = ? AND mfe_name = ? AND version = ?'
)
.bind(body.environment, body.mfeName, body.version)
.first();
if (existing) {
return Response.json(
{ error: 'Version already registered', existingId: existing.id },
{ status: 409 }
);
}
// Insert the version record (is_active defaults to false)
const result = await env.DB.prepare(
`INSERT INTO version_configs (environment, mfe_name, version, entry_url, integrity_hash, created_by)
VALUES (?, ?, ?, ?, ?, ?)`
)
.bind(
body.environment,
body.mfeName,
body.version,
body.entryUrl,
body.integrityHash ?? null,
body.createdBy
)
.run();
// Record the deployment event
await env.DB.prepare(
`INSERT INTO deployment_events (environment, mfe_name, version, event_type, created_by)
VALUES (?, ?, ?, 'registered', ?)`
)
.bind(body.environment, body.mfeName, body.version, body.createdBy)
.run();
return Response.json(
{ id: result.meta.last_row_id, status: 'registered' },
{ status: 201 }
);
}
Rollback
Rollback is one of the most critical operational capabilities. Because all previously deployed MFE bundles remain on R2/CDN, a rollback is simply a configuration change that points the version config back to a prior version's URL. No rebuild or redeployment is involved.
Rollback Flow
Rollback Handler
// workers/version-config-service/src/handlers/activate-version.ts
interface ActivateVersionRequest {
mfeName: string;
version: string;
environment: string;
activatedBy: string;
isRollback?: boolean;
}
export async function handleActivateVersion(
request: Request,
env: { DB: D1Database; VERSION_KV: KVNamespace }
): Promise<Response> {
const body: ActivateVersionRequest = await request.json();
// 1. Verify the target version exists and its bundle is accessible
const targetVersion = await env.DB.prepare(
'SELECT * FROM version_configs WHERE environment = ? AND mfe_name = ? AND version = ?'
)
.bind(body.environment, body.mfeName, body.version)
.first<VersionConfigRow>();
if (!targetVersion) {
return Response.json({ error: 'Version not found' }, { status: 404 });
}
const manifestCheck = await fetch(targetVersion.entry_url, { method: 'HEAD' });
if (!manifestCheck.ok) {
return Response.json(
{ error: `Bundle no longer accessible at ${targetVersion.entry_url}` },
{ status: 400 }
);
}
// 2-4. Wrap deactivation, activation, and audit event in a D1 batch
// transaction to prevent race conditions (e.g., two simultaneous
// activations leaving multiple rows with is_active = true).
const eventType = body.isRollback ? 'rollback' : 'activated';
const stmtDeactivate = env.DB.prepare(
`UPDATE version_configs SET is_active = false
WHERE environment = ? AND mfe_name = ? AND is_active = true`
).bind(body.environment, body.mfeName);
const stmtActivate = env.DB.prepare(
`UPDATE version_configs
SET is_active = true, activated_at = datetime('now'), activated_by = ?
WHERE id = ?`
).bind(body.activatedBy, targetVersion.id);
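// Look up the version that is active right now so the audit event records
// what is being replaced, not the version being activated.
const previousActive = await env.DB.prepare(
`SELECT version FROM version_configs
WHERE environment = ? AND mfe_name = ? AND is_active = true`
)
.bind(body.environment, body.mfeName)
.first<{ version: string }>();
const stmtEvent = env.DB.prepare(
`INSERT INTO deployment_events (environment, mfe_name, version, event_type, metadata, created_by)
VALUES (?, ?, ?, ?, ?, ?)`
).bind(
body.environment,
body.mfeName,
body.version,
eventType,
JSON.stringify({ previousVersion: previousActive?.version ?? null }),
body.activatedBy
);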
// D1 batch() executes all statements in a single transaction.
// If any statement fails, the entire batch is rolled back.
await env.DB.batch([stmtDeactivate, stmtActivate, stmtEvent]);
// 5. Rebuild and write the aggregated version config to KV
await syncVersionConfigToKV(env, body.environment);
return Response.json({ status: eventType, version: body.version });
}
/**
* Reads all active versions for an environment from D1 and writes the
* aggregated config to KV.
*/
async function syncVersionConfigToKV(
env: { DB: D1Database; VERSION_KV: KVNamespace },
environment: string
): Promise<void> {
const activeVersions = await env.DB.prepare(
'SELECT * FROM version_configs WHERE environment = ? AND is_active = true'
)
.bind(environment)
.all<VersionConfigRow>();
const config: Record<string, MfeVersionEntry> = {};
for (const row of activeVersions.results) {
config[row.mfe_name] = {
version: row.version,
entry: row.entry_url,
integrity: row.integrity_hash ?? undefined,
updatedAt: row.activated_at ?? row.created_at,
updatedBy: row.activated_by ?? row.created_by,
};
}
await env.VERSION_KV.put(
`version-config:${environment}`,
JSON.stringify(config),
{ metadata: { updatedAt: new Date().toISOString() } }
);
}
Rollback Considerations
| Concern | Mitigation |
|---|---|
| KV propagation delay | For urgent rollbacks, also purge the Cloudflare cache on the version config endpoint using the Cache API. See Cache Invalidation Strategy. |
| Session continuity | Users with an active session will continue running the old MFE code until they refresh or navigate. The shell can detect version mismatches and show a soft prompt: "A new version is available. Click to refresh." |
| Shared state compatibility | If the new version changed the shape of persisted state (e.g., localStorage, IndexedDB), rolling back to the old version may encounter unexpected data. MFEs should use versioned storage keys or schema migrations. |
| CDN cache on MFE bundles | MFE bundles are served from versioned paths (/v2.3.1/), so they are effectively immutable. Rolling back does not require purging bundle caches --- the old bundles are already cached under their own paths. |
| Bundle retention | Do not delete bundles that may still be needed as rollback targets. If storage costs require pruning, apply a retention policy (e.g., keep the last 20 versions per MFE, and never prune a version that is currently active) so rollback capability is preserved. |
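The retention concern in the table can be made concrete with a small selection function. The sketch below is an assumption about how pruning candidates might be chosen, not a description of an existing job: it keeps the newest N versions and never selects a version flagged as active. The RegisteredVersion shape and the function name are hypothetical.

```typescript
interface RegisteredVersion {
  version: string;
  /** ISO 8601 UTC timestamp; same-format values sort lexicographically. */
  createdAt: string;
  /** True when the version is active (or in canary) in any environment. */
  isActive: boolean;
}

// Returns the versions that are safe to delete from R2 under a
// "keep the newest `keep` versions, never delete an active one" policy.
function selectVersionsToPrune(versions: RegisteredVersion[], keep = 20): string[] {
  const newestFirst = [...versions].sort((a, b) =>
    a.createdAt < b.createdAt ? 1 : a.createdAt > b.createdAt ? -1 : 0
  );
  return newestFirst
    .filter((v, index) => index >= keep && !v.isActive)
    .map((v) => v.version);
}
```

Deleting only what this function returns keeps every version that a rollback could reasonably target while bounding storage growth.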
KV/D1 Write Atomicity Gap
D1 (SQLite) and KV are independent storage systems. There is no distributed transaction spanning both, which means a write can succeed in one system and fail in the other. This section documents the failure modes and recommended mitigations.
Failure Scenarios
| Scenario | Symptom | Impact |
|---|---|---|
| D1 write succeeds, KV write fails | The audit trail and version_configs table reflect the new active version, but the shell continues loading the old version because KV still holds the previous config. | Users see stale MFE versions. The Admin UI shows the version as "active" even though it is not being served. |
| D1 write fails, KV write succeeds | Unlikely in practice because the handler writes to D1 first and derives the KV value from D1 only after the batch commits. It would take a code path that writes KV without a matching D1 commit, such as a retry that skips the D1 step (due to a UNIQUE constraint) but still writes KV. | KV serves a config that does not match the D1 source of truth. A subsequent full sync from D1 to KV overwrites the divergent KV value. |
| D1 batch commits, KV write fails | The D1 batch transaction (deactivate + activate + event) succeeds atomically, but the subsequent KV write fails due to a transient KV error or Worker timeout. | Same as the first scenario: D1 is correct, KV is stale. |
Mitigation Strategies
1. Retry with Idempotency
The KV write is inherently idempotent (a PUT with the same key and value is safe to repeat). If the KV write fails, the Config Service should retry it a bounded number of times before returning an error to the caller.
async function syncVersionConfigToKVWithRetry(
env: { DB: D1Database; VERSION_KV: KVNamespace },
environment: string,
maxRetries = 3
): Promise<void> {
const config = await buildVersionConfigFromD1(env, environment);
const payload = JSON.stringify(config);
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
await env.VERSION_KV.put(
`version-config:${environment}`,
payload,
{ metadata: { updatedAt: new Date().toISOString() } }
);
return; // Success
} catch (error) {
console.error(
`[syncKV] Attempt ${attempt}/${maxRetries} failed for env=${environment}:`,
error
);
if (attempt === maxRetries) throw error;
// Brief delay before retry
await new Promise((resolve) => setTimeout(resolve, 100 * attempt));
}
}
}
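The retry helper calls buildVersionConfigFromD1, which is not shown here. One plausible shape, assuming it mirrors the row mapping in syncVersionConfigToKV, is a D1 query followed by a pure mapping step. Only the mapping step is sketched below so it can be tested without a database; the VersionConfigRow columns are inferred from the SQL in the handlers, and the function name is hypothetical.

```typescript
// Column names inferred from the handlers' SQL; the real row type may differ.
interface VersionConfigRow {
  mfe_name: string;
  version: string;
  entry_url: string;
  integrity_hash: string | null;
  created_at: string;
  created_by: string;
  activated_at: string | null;
  activated_by: string | null;
}

interface MfeVersionEntry {
  version: string;
  entry: string;
  integrity?: string;
  updatedAt: string;
  updatedBy: string;
}

// Maps active rows to the aggregated config. Rows are sorted by MFE name so
// that serializing the result yields the same string for the same data.
function buildVersionConfigFromRows(rows: VersionConfigRow[]): Record<string, MfeVersionEntry> {
  const sorted = [...rows].sort((a, b) =>
    a.mfe_name < b.mfe_name ? -1 : a.mfe_name > b.mfe_name ? 1 : 0
  );
  const config: Record<string, MfeVersionEntry> = {};
  for (const row of sorted) {
    config[row.mfe_name] = {
      version: row.version,
      entry: row.entry_url,
      ...(row.integrity_hash ? { integrity: row.integrity_hash } : {}),
      updatedAt: row.activated_at ?? row.created_at,
      updatedBy: row.activated_by ?? row.created_by,
    };
  }
  return config;
}
```

Under this assumption, buildVersionConfigFromD1 would run the same SELECT as syncVersionConfigToKV and pass the resulting rows to this function.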
2. Periodic Reconciliation Job
A scheduled Worker (Cron Trigger) runs every few minutes and reconciles KV with D1. For each environment, it reads the active versions from D1, builds the expected KV value, and overwrites KV if the values differ.
// workers/version-config-service/src/scheduled.ts
export default {
async scheduled(
_event: ScheduledEvent,
env: { DB: D1Database; VERSION_KV: KVNamespace }
): Promise<void> {
for (const environment of ['dev', 'staging', 'production']) {
const expected = await buildVersionConfigFromD1(env, environment);
const current = await env.VERSION_KV.get(`version-config:${environment}`);
if (JSON.stringify(expected) !== current) {
console.warn(
`[Reconciliation] KV drift detected for env=${environment}. Resyncing.`
);
await env.VERSION_KV.put(
`version-config:${environment}`,
JSON.stringify(expected),
{ metadata: { updatedAt: new Date().toISOString(), reconciledBy: 'cron' } }
);
}
}
},
};
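The drift check in the reconciliation job compares JSON.stringify output with the stored string, which also reports drift when only the key order differs. A key-order-independent comparison avoids unnecessary rewrites. This is a minimal sketch; stableStringify and hasDrift are hypothetical helpers, not part of the service shown here.

```typescript
// Serializes a JSON-compatible value with object keys in sorted order.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (typeof value === 'object' && value !== null) {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .filter((key) => record[key] !== undefined) // match JSON.stringify, which drops undefined
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${parts.join(',')}}`;
  }
  // JSON.stringify(undefined) yields undefined; treat it as null like array slots.
  return JSON.stringify(value) ?? 'null';
}

// True when the stored KV string is missing, unparsable, or differs in content.
function hasDrift(expected: unknown, currentKvValue: string | null): boolean {
  if (currentKvValue === null) return true;
  let parsed: unknown;
  try {
    parsed = JSON.parse(currentKvValue);
  } catch {
    return true;
  }
  return stableStringify(expected) !== stableStringify(parsed);
}
```

In the scheduled handler, the condition could then read hasDrift(expected, current) in place of the raw string comparison.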
3. Response Indicates Partial Failure
If D1 succeeds but KV fails (even after retries), the activation endpoint should return a response that clearly indicates partial success so the caller can take corrective action.
// In handleActivateVersion, after the D1 batch succeeds:
try {
await syncVersionConfigToKVWithRetry(env, body.environment);
} catch (kvError) {
console.error('[Activate] KV sync failed after D1 commit:', kvError);
return Response.json(
{
status: eventType,
version: body.version,
warning: 'D1 updated successfully but KV sync failed. '
+ 'The reconciliation job will correct this within minutes. '
+ 'You may also retry the activation.',
},
{ status: 207 } // 207 Multi-Status
);
}
Design Principle
D1 is the source of truth. KV is a derived, eventually-consistent cache. Any divergence should be treated as a KV staleness issue and resolved by re-deriving KV from D1. The reconciliation job provides the safety net that ensures divergence is always temporary and bounded.
Canary / Gradual Rollout
Percentage-Based Rollout
Canary deployments allow a new MFE version to be tested with a fraction of real production traffic before full activation. The version config supports an optional canary field on any MFE entry.
Canary Config Schema
interface CanaryConfig {
/** The canary version string */
version: string;
/** Entry URL for the canary mf-manifest.json */
entry: string;
/** Integrity hash for the canary manifest */
integrity?: string;
/** Percentage of users who should receive the canary (0-100) */
percentage: number;
/** When the canary was started */
startedAt: string;
/** Who initiated the canary */
startedBy: string;
}
Admin UI Canary Workflow
- Admin navigates to the MFE's version list and selects a registered version.
- Instead of "Activate", clicks "Start Canary".
- Sets the initial traffic percentage (e.g., 5%).
- Config Service writes the canary config nested inside the MFE's entry in KV.
- Admin monitors error rates and performance in the dashboard.
- Admin increases percentage incrementally (5% -> 25% -> 50% -> 100%).
- At 100%, admin clicks "Promote" to fully activate and remove the canary config.
Shell-Side Canary Logic
// src/canary.ts
/**
* Resolves the effective entry URL for an MFE, accounting for canary routing.
*
* Uses consistent hashing so the same user always lands in the same bucket
* for a given MFE. This prevents users from flipping between versions on
* successive page loads.
*/
export function resolveCanaryEntry(
mfeName: string,
config: MfeVersionEntry,
userId: string | null
): { entry: string; version: string; isCanary: boolean } {
// No canary config or no user ID: always serve the stable version.
// Anonymous users never get canary to avoid inconsistent experiences.
if (!config.canary || !userId) {
return { entry: config.entry, version: config.version, isCanary: false };
}
const bucket = consistentBucket(userId, mfeName);
if (bucket < config.canary.percentage) {
return {
entry: config.canary.entry,
version: config.canary.version,
isCanary: true,
};
}
return { entry: config.entry, version: config.version, isCanary: false };
}
/**
* Produces a stable integer in [0, 100) for a given user + MFE pair.
* Uses FNV-1a for speed and good distribution.
*/
function consistentBucket(userId: string, mfeName: string): number {
const input = `${userId}:${mfeName}`;
let hash = 0x811c9dc5; // FNV offset basis
for (let i = 0; i < input.length; i++) {
hash ^= input.charCodeAt(i);
hash = Math.imul(hash, 0x01000193); // FNV prime
}
return ((hash >>> 0) % 100);
}
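One property of this bucketing that matters for gradual rollouts is that it is monotonic: because a user is in the canary exactly when bucket < percentage, every user included at 5% is still included at 25%. The sketch below repeats consistentBucket so it is self-contained and adds a hypothetical helper, canaryUsers, to illustrate the property with made-up user IDs.

```typescript
// Same FNV-1a bucketing as above, repeated so this block stands alone.
function consistentBucket(userId: string, mfeName: string): number {
  const input = `${userId}:${mfeName}`;
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return (hash >>> 0) % 100;
}

// Hypothetical helper: the subset of users routed to the canary at a percentage.
function canaryUsers(userIds: string[], mfeName: string, percentage: number): Set<string> {
  return new Set(userIds.filter((id) => consistentBucket(id, mfeName) < percentage));
}

// Illustrative user IDs; real IDs come from the session.
const sampleUsers = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
const atFive = canaryUsers(sampleUsers, 'mfe_analytics', 5);
const atTwentyFive = canaryUsers(sampleUsers, 'mfe_analytics', 25);

// Every user in the 5% cohort is also in the 25% cohort.
const cohortIsStable = Array.from(atFive).every((id) => atTwentyFive.has(id));
console.log(`cohort stable while ramping: ${cohortIsStable}`);
```

Since the MFE name is part of the hash input, a user's bucket for one MFE is independent of their bucket for another, so the same users are not always the first to receive every canary.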
Canary Observability
The shell reports the active version of each MFE to the observability stack so that error rates and performance can be segmented by version.
// src/telemetry.ts
export function reportMfeVersions(
resolvedVersions: Record<string, { version: string; isCanary: boolean }>
): void {
// Tag all subsequent telemetry with MFE versions
for (const [mfeName, { version, isCanary }] of Object.entries(resolvedVersions)) {
analytics.setGlobalTag(`mfe.${mfeName}.version`, version);
analytics.setGlobalTag(`mfe.${mfeName}.canary`, String(isCanary));
}
}
Environment-Based Promotion
In addition to canary rollouts within a single environment, the platform supports promoting version configs across environments. This provides a structured path from development to production.
Promotion Flow
| Stage | Activation Policy |
|---|---|
| dev | Auto-activated by CI on every push to main. |
| staging | Manually promoted from dev via Admin UI or API. |
| production | Manually activated or canary-deployed from staging. |
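The table implies an ordered path of dev, then staging, then production. If the service should reject promotions that skip a stage or move backwards, a small guard is enough. The one-step-forward rule in this sketch is an assumption drawn from the table, and isValidPromotion is a hypothetical name.

```typescript
const PROMOTION_ORDER: readonly string[] = ['dev', 'staging', 'production'];

// Allows only a single step forward along dev -> staging -> production.
function isValidPromotion(fromEnvironment: string, toEnvironment: string): boolean {
  const from = PROMOTION_ORDER.indexOf(fromEnvironment);
  const to = PROMOTION_ORDER.indexOf(toEnvironment);
  return from !== -1 && to !== -1 && to - from === 1;
}
```

A promotion handler could call this before touching D1 and return a 400 response when it yields false.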
Promotion Endpoint
// workers/version-config-service/src/handlers/promote-version.ts
interface PromoteRequest {
mfeName: string;
version: string;
fromEnvironment: string; // e.g., "staging"
toEnvironment: string; // e.g., "production"
promotedBy: string;
}
export async function handlePromoteVersion(
request: Request,
env: { DB: D1Database; VERSION_KV: KVNamespace }
): Promise<Response> {
const body: PromoteRequest = await request.json();
// Verify the version is active in the source environment
const sourceVersion = await env.DB.prepare(
`SELECT * FROM version_configs
WHERE environment = ? AND mfe_name = ? AND version = ? AND is_active = true`
)
.bind(body.fromEnvironment, body.mfeName, body.version)
.first<VersionConfigRow>();
if (!sourceVersion) {
return Response.json(
{ error: `Version ${body.version} is not active in ${body.fromEnvironment}` },
{ status: 400 }
);
}
// Check if this version is already registered in the target environment
const existingInTarget = await env.DB.prepare(
`SELECT id FROM version_configs
WHERE environment = ? AND mfe_name = ? AND version = ?`
)
.bind(body.toEnvironment, body.mfeName, body.version)
.first();
if (!existingInTarget) {
// Register the version in the target environment
await env.DB.prepare(
`INSERT INTO version_configs (environment, mfe_name, version, entry_url, integrity_hash, created_by)
VALUES (?, ?, ?, ?, ?, ?)`
)
.bind(
body.toEnvironment,
body.mfeName,
body.version,
sourceVersion.entry_url,
sourceVersion.integrity_hash,
body.promotedBy
)
.run();
}
// Activate it in the target environment (reuses the activation handler logic)
const activateRequest = new Request(request.url, {
method: 'POST',
headers: request.headers,
body: JSON.stringify({
mfeName: body.mfeName,
version: body.version,
environment: body.toEnvironment,
activatedBy: body.promotedBy,
}),
});
return handleActivateVersion(activateRequest, env);
}
Cache Invalidation Strategy
The version config sits behind multiple caching layers. Understanding propagation delays is essential for predictable operational behavior.
Caching Layers
Admin writes to Config Service
│
▼
D1 (immediate consistency within the same colo)
│
▼
KV write (propagation: ~60 seconds to all edge locations)
│
▼
Cloudflare CDN cache on /api/v1/version-config endpoint
│
▼
Browser HTTP cache (if Cache-Control headers allow)
Propagation Timeline
| Layer | Typical Delay | Worst Case |
|---|---|---|
| D1 write | Immediate | < 100ms |
| KV global propagation | Usually within ~60 seconds | 60 seconds or more |
| CDN edge cache (if enabled) | Depends on TTL | Up to TTL duration |
| Browser cache | Depends on headers | Until expiration or manual refresh |
Mitigation Strategies
1. Short TTL on the Version Config Endpoint
The Version Config Service endpoint sets conservative cache headers to ensure freshness.
// workers/version-config-service/src/handlers/get-config.ts
export async function handleGetConfig(
request: Request,
env: { VERSION_KV: KVNamespace }
): Promise<Response> {
const url = new URL(request.url);
const environment = url.searchParams.get('env') ?? 'production';
const config = await env.VERSION_KV.get(`version-config:${environment}`);
if (!config) {
return Response.json({}, { status: 200 });
}
return new Response(config, {
headers: {
'Content-Type': 'application/json',
// Short TTL: browsers and CDN edge will re-validate frequently
'Cache-Control': 'public, max-age=30, s-maxage=15, stale-while-revalidate=60',
// ETag for conditional requests
'ETag': `"${await hashContent(config)}"`,
},
});
}
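The handler above attaches an ETag, but a client only benefits from it if the service compares the incoming If-None-Match header and answers with 304 Not Modified. A minimal sketch of that comparison follows; `etagMatches` is a hypothetical helper name that does not appear in the original service, and weak comparison is an assumption about what the endpoint needs.

```typescript
// Hypothetical helper: decides whether a conditional GET can be answered with 304.
// Uses weak comparison, so a leading "W/" on either tag is ignored.
export function etagMatches(ifNoneMatch: string | null, etag: string): boolean {
  if (!ifNoneMatch) return false;
  const normalize = (tag: string): string => tag.trim().replace(/^W\//, '');
  const current = normalize(etag);
  return ifNoneMatch
    .split(',')
    .map((candidate) => candidate.trim())
    // "*" matches any current representation.
    .some((candidate) => candidate === '*' || normalize(candidate) === current);
}
```

Inside handleGetConfig, the result of `etagMatches(request.headers.get('If-None-Match'), etag)` could be checked before building the full response, returning an empty 304 response that carries the same ETag header when it is true.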
2. Cache API Purge for Urgent Rollbacks
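When an urgent rollback is performed, the Config Service proactively deletes the cached response through the Workers Cache API. Note that cache.delete() only removes the entry from the data center in which the Worker runs; other edge locations keep their copy until its TTL expires.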
// workers/version-config-service/src/cache.ts
export async function purgeVersionConfigCache(environment: string): Promise<void> {
const cache = caches.default;
const cacheKey = new Request(
`https://config.example.com/api/v1/version-config?env=${environment}`
);
await cache.delete(cacheKey);
}
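For a purge that reaches every edge location, one option is Cloudflare's zone-level purge_cache REST endpoint. The sketch below only builds the request. `buildPurgeRequest` is a hypothetical helper, and the zone ID and API token are assumed to be provided as Worker secrets.

```typescript
interface PurgeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a purge-by-URL request for the Cloudflare API.
// zoneId and apiToken are assumptions: supply them from your own secrets.
export function buildPurgeRequest(
  zoneId: string,
  apiToken: string,
  environments: string[]
): PurgeRequest {
  const files = environments.map(
    (env) =>
      `https://config.example.com/api/v1/version-config?env=${encodeURIComponent(env)}`
  );
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ files }),
    },
  };
}
```

A rollback handler could then call `fetch(purge.url, purge.init)` after the KV write and treat a non-2xx response as a warning rather than a failure, since the short TTL still bounds staleness.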
3. Shell-Side Polling
For long-lived sessions, the shell polls for config updates on a timer so users eventually receive the latest versions without a full page reload.
// src/version-poller.ts
const POLL_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes
export function startVersionPoller(
currentConfig: VersionConfig,
onVersionChange: (newConfig: VersionConfig) => void
): () => void {
const intervalId = setInterval(async () => {
try {
const latestConfig = await fetchVersionConfig();
const hasChanges = Object.entries(latestConfig).some(
([name, entry]) => currentConfig[name]?.version !== entry.version
);
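if (hasChanges) {
// Remember the new config so the same change is not reported again on every poll.
currentConfig = latestConfig;
onVersionChange(latestConfig);
}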
} catch (error) {
// Log and continue: a failed poll is harmless because the user is still running a valid version.
console.warn('[VersionPoller] Failed to check for updates:', error);
}
}, POLL_INTERVAL_MS);
return () => clearInterval(intervalId);
}
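The poller reports that something changed, but a notification usually needs to know which MFEs changed. A minimal sketch of such a diff follows. `diffVersionConfigs` is a hypothetical helper, and the `VersionConfig` shape shown here is an assumption, since the real type is defined elsewhere in the shell.

```typescript
// Assumed shape for illustration; the shell's real VersionConfig may carry more fields.
type VersionConfig = Record<string, { version: string }>;

// Returns the sorted names of MFEs whose version was added, removed, or changed.
export function diffVersionConfigs(
  previous: VersionConfig,
  latest: VersionConfig
): string[] {
  const names = Array.from(
    new Set([...Object.keys(previous), ...Object.keys(latest)])
  );
  return names
    .filter((name) => previous[name]?.version !== latest[name]?.version)
    .sort();
}
```

Unlike the check inside the poller, this also reports MFEs that were removed from the config, which may or may not be desirable depending on how removals are handled.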
4. WebSocket Push for Immediate Notification
For scenarios where even a 5-minute delay is unacceptable (e.g., critical security patches), the shell can maintain a WebSocket connection to a Durable Object that broadcasts version change events.
// src/version-websocket.ts
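export function connectVersionWebSocket(
onVersionChange: (newConfig: VersionConfig) => void,
attempt = 0
): WebSocket {
const ws = new WebSocket('wss://config.example.com/ws/version-updates');
let nextAttempt = attempt;
ws.addEventListener('open', () => {
// A successful connection resets the backoff.
nextAttempt = 0;
});
ws.addEventListener('message', (event) => {
try {
const message = JSON.parse(event.data);
if (message.type === 'version-changed') {
onVersionChange(message.config);
}
} catch {
// Ignore malformed messages
}
});
ws.addEventListener('close', () => {
// Reconnect with exponential backoff: 1s, 2s, 4s, ... capped at 30s.
const delayMs = Math.min(1000 * 2 ** nextAttempt, 30_000);
setTimeout(() => connectVersionWebSocket(onVersionChange, nextAttempt + 1), delayMs);
});
// Only the initial socket is returned; reconnects create new sockets internally.
return ws;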
}
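The message handler above forwards `message.config` without checking its shape. A small validation step, sketched below, can keep a malformed broadcast from reaching the shell. The message shape (`type` plus a `config` map of entries with a `version` string) is an assumption inferred from the handler, and `parseVersionMessage` is a hypothetical helper.

```typescript
// Assumed wire format for illustration only.
interface VersionChangedMessage {
  type: 'version-changed';
  config: Record<string, { version: string }>;
}

// Returns the parsed message when it is a well-formed version-changed event, else null.
export function parseVersionMessage(raw: unknown): VersionChangedMessage | null {
  if (typeof raw !== 'string') return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const message = parsed as { type?: unknown; config?: unknown };
  if (message.type !== 'version-changed') return null;
  if (typeof message.config !== 'object' || message.config === null) return null;
  // Every entry must be an object carrying a string version.
  const entriesValid = Object.values(message.config as Record<string, unknown>).every(
    (entry) =>
      typeof entry === 'object' &&
      entry !== null &&
      typeof (entry as { version?: unknown }).version === 'string'
  );
  return entriesValid ? (parsed as VersionChangedMessage) : null;
}
```

The listener could call `parseVersionMessage(event.data)` and invoke `onVersionChange` only when the result is not null.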
5. Soft Reload Prompt
When the shell detects a version change (via polling or WebSocket), it does not force a reload. Instead, it shows a non-intrusive notification.
// src/components/VersionUpdateBanner.tsx
import { useState } from 'react';
interface VersionUpdateBannerProps {
updatedMfes: string[];
}
export function VersionUpdateBanner({ updatedMfes }: VersionUpdateBannerProps) {
const [dismissed, setDismissed] = useState(false);
if (dismissed) return null;
return (
<div role="status" className="version-update-banner">
<p>
Updated versions available for: {updatedMfes.join(', ')}.
</p>
<button onClick={() => window.location.reload()}>
Refresh now
</button>
<button onClick={() => setDismissed(true)}>
Dismiss
</button>
</div>
);
}
Admin UI Features
The Admin UI is a dedicated MFE (or a standalone application) that provides operational control over MFE version management. It communicates with the Version Config Service via authenticated API calls.
Dashboard
- Displays the currently active version of each MFE, grouped by environment.
- Shows a health indicator (green/yellow/red) based on the last health check result.
- Highlights MFEs with active canary deployments and their current rollout percentage.
Version History with Audit Trail
- Full chronological log of every version event: registration, activation, deactivation, rollback.
- Each entry shows: timestamp, version, event type, actor (who), and optional metadata.
- Filterable by MFE name, environment, event type, and date range.
- Data sourced from the deployment_events table in D1.
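The filters listed above map naturally onto a parameterized query against deployment_events. A minimal sketch of a query builder follows. `buildAuditQuery` and the `AuditFilter` shape are hypothetical, and date bounds are assumed to use the same UTC 'YYYY-MM-DD HH:MM:SS' text format that CURRENT_TIMESTAMP produces.

```typescript
interface AuditFilter {
  mfeName?: string;
  environment?: string;
  eventType?: string;
  from?: string; // inclusive lower bound on created_at
  to?: string; // exclusive upper bound on created_at
}

// Builds a parameterized SELECT; values are always bound, never interpolated.
export function buildAuditQuery(
  filter: AuditFilter,
  limit = 50
): { sql: string; params: (string | number)[] } {
  const clauses: string[] = [];
  const params: (string | number)[] = [];
  const add = (clause: string, value: string | undefined): void => {
    if (value !== undefined) {
      clauses.push(clause);
      params.push(value);
    }
  };
  add('mfe_name = ?', filter.mfeName);
  add('environment = ?', filter.environment);
  add('event_type = ?', filter.eventType);
  add('created_at >= ?', filter.from);
  add('created_at < ?', filter.to);
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(' AND ')}` : '';
  params.push(limit);
  return {
    sql:
      'SELECT environment, mfe_name, version, event_type, metadata, created_at, created_by' +
      ` FROM deployment_events${where} ORDER BY created_at DESC LIMIT ?`,
    params,
  };
}
```

With a D1 binding, the result could be executed as `env.DB.prepare(sql).bind(...params).all()`, assuming the binding is named DB.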
One-Click Rollback
- From the version history view, each previously active version has a "Rollback to this version" button.
- Before executing the rollback, the system performs a pre-flight health check to confirm the bundle is still accessible on CDN.
- A confirmation dialog shows the current version and the target version side by side.
- After rollback, the event is logged and appears immediately in the audit trail.
Canary Configuration
- "Start Canary" button on any registered (inactive) version.
- Slider or input for setting the traffic percentage.
- Real-time metrics panel showing error rates for the canary vs. stable version.
- "Promote" button to graduate the canary to full activation.
- "Abort Canary" button to immediately revert all traffic to the stable version.
Health Checks
Before activating any version, the system performs automated health checks.
// workers/version-config-service/src/health-check.ts
interface HealthCheckResult {
mfeName: string;
version: string;
manifestAccessible: boolean;
manifestValid: boolean;
exposedModulesAccessible: boolean;
responseTimeMs: number;
checkedAt: string;
}
export async function performHealthCheck(
entryUrl: string,
mfeName: string,
version: string
): Promise<HealthCheckResult> {
const start = Date.now();
const result: HealthCheckResult = {
mfeName,
version,
manifestAccessible: false,
manifestValid: false,
exposedModulesAccessible: false,
responseTimeMs: 0,
checkedAt: new Date().toISOString(),
};
try {
// Check that the manifest is accessible
const manifestResponse = await fetch(entryUrl);
result.manifestAccessible = manifestResponse.ok;
if (!manifestResponse.ok) return result;
// Validate manifest structure
const manifest = await manifestResponse.json();
result.manifestValid =
manifest != null &&
typeof manifest === 'object' &&
'exposes' in manifest;
if (!result.manifestValid) return result;
// Spot-check that at least one exposed module's entry chunk is accessible
const firstExpose = Object.values(manifest.exposes ?? {})[0] as
| { path: string }
| undefined;
if (firstExpose?.path) {
const baseUrl = entryUrl.replace(/\/[^/]+$/, '/');
const chunkResponse = await fetch(`${baseUrl}${firstExpose.path}`, {
method: 'HEAD',
});
result.exposedModulesAccessible = chunkResponse.ok;
}
} catch {
// Leave all checks as false
} finally {
result.responseTimeMs = Date.now() - start;
}
return result;
}
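An activation or rollback handler still has to turn a HealthCheckResult into a go/no-go decision. One way to do that, sketched under the assumption that all three boolean checks must pass, is to list the failed checks so the Admin UI can show why a version was refused. `failedChecks` is a hypothetical helper.

```typescript
// Mirrors the interface defined in health-check.ts above.
interface HealthCheckResult {
  mfeName: string;
  version: string;
  manifestAccessible: boolean;
  manifestValid: boolean;
  exposedModulesAccessible: boolean;
  responseTimeMs: number;
  checkedAt: string;
}

// Returns the names of the checks that did not pass; an empty array means healthy.
export function failedChecks(result: HealthCheckResult): string[] {
  const checks: [string, boolean][] = [
    ['manifestAccessible', result.manifestAccessible],
    ['manifestValid', result.manifestValid],
    ['exposedModulesAccessible', result.exposedModulesAccessible],
  ];
  return checks.filter(([, passed]) => !passed).map(([name]) => name);
}
```

A handler could refuse activation whenever `failedChecks(result).length > 0` and include the returned names in the error response.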
Role-Based Access Control (RBAC)
Access to version management operations is governed by roles managed through WorkOS.
| Role | Permissions |
|---|---|
| viewer | View dashboard, version history, and health status |
| developer | All viewer permissions + activate versions in dev environment |
| release-manager | All developer permissions + activate/rollback in staging and production, configure canary deployments |
| admin | All permissions + manage RBAC roles, configure retention policies |
// workers/version-config-service/src/middleware/auth.ts
import { WorkOS } from '@workos-inc/node';
const workos = new WorkOS(process.env.WORKOS_API_KEY);
type Permission = 'version:read' | 'version:activate:dev' | 'version:activate:staging' | 'version:activate:production' | 'canary:manage';
const ROLE_PERMISSIONS: Record<string, Permission[]> = {
viewer: ['version:read'],
developer: ['version:read', 'version:activate:dev'],
'release-manager': [
'version:read',
'version:activate:dev',
'version:activate:staging',
'version:activate:production',
'canary:manage',
],
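// Role management and retention policies (see the table above) are not modeled as Permission values in this snippet.
admin: [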
'version:read',
'version:activate:dev',
'version:activate:staging',
'version:activate:production',
'canary:manage',
],
};
export function requirePermission(permission: Permission) {
return async (request: Request): Promise<Response | null> => {
const sessionToken = request.headers.get('Authorization')?.replace('Bearer ', '');
if (!sessionToken) {
return Response.json({ error: 'Unauthorized' }, { status: 401 });
}
try {
const session = await workos.userManagement.authenticateWithSessionToken({
sessionToken,
});
const userRoles: string[] = session.organizationMemberships?.map(
(m) => m.role?.slug ?? 'viewer'
) ?? [];
const hasPermission = userRoles.some((role) =>
ROLE_PERMISSIONS[role]?.includes(permission)
);
if (!hasPermission) {
return Response.json({ error: 'Forbidden' }, { status: 403 });
}
return null; // null means "authorized, continue"
} catch {
return Response.json({ error: 'Invalid session' }, { status: 401 });
}
};
}
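Because activation permissions are scoped per environment, a handler needs to pick the right permission before calling requirePermission. A minimal sketch of that lookup follows. `activationPermissionFor` is a hypothetical helper, and unknown environments are assumed to be rejected rather than defaulted.

```typescript
type Permission =
  | 'version:read'
  | 'version:activate:dev'
  | 'version:activate:staging'
  | 'version:activate:production'
  | 'canary:manage';

type Environment = 'dev' | 'staging' | 'production';

const ACTIVATION_PERMISSION: Record<Environment, Permission> = {
  dev: 'version:activate:dev',
  staging: 'version:activate:staging',
  production: 'version:activate:production',
};

// Returns the permission needed to activate in the given environment,
// or null when the environment name is not recognized.
export function activationPermissionFor(environment: string): Permission | null {
  return Object.prototype.hasOwnProperty.call(ACTIVATION_PERMISSION, environment)
    ? ACTIVATION_PERMISSION[environment as Environment]
    : null;
}
```

An activation handler could return a 400 response when the lookup yields null, and otherwise run `requirePermission(permission)(request)`, returning its response if it is not null.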
D1 Schema
The D1 database provides persistent storage for version registrations and a complete audit trail of all deployment events.
Tables
-- Stores every version that has been registered for each MFE/environment pair.
-- Only one row per (environment, mfe_name) should have is_active = true at any time.
CREATE TABLE version_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
environment TEXT NOT NULL, -- 'dev', 'staging', 'production'
mfe_name TEXT NOT NULL, -- e.g., 'mfe_dashboard'
version TEXT NOT NULL, -- semver, e.g., '2.3.1' or '2.3.1+abc1234'
entry_url TEXT NOT NULL, -- full URL to mf-manifest.json
integrity_hash TEXT, -- SRI hash (e.g., 'sha384-...')
is_active BOOLEAN DEFAULT false, -- only one active version per (env, mfe_name)
activated_at DATETIME, -- when this version was activated
activated_by TEXT, -- who activated it
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
created_by TEXT NOT NULL -- who registered it (usually CI)
);
-- Prevent duplicate version registrations at the database level.
-- The application also checks for duplicates before inserting, but this
-- constraint provides a hard guarantee against race conditions.
CREATE UNIQUE INDEX uq_version_configs_env_mfe_version
ON version_configs (environment, mfe_name, version);
-- Indexes for common query patterns
CREATE INDEX idx_version_configs_active
ON version_configs (environment, mfe_name, is_active)
WHERE is_active = true;
-- Lookups by (environment, mfe_name, version) are served by the unique index
-- uq_version_configs_env_mfe_version above, so no separate lookup index is needed.
CREATE INDEX idx_version_configs_history
ON version_configs (environment, mfe_name, created_at DESC);
-- Records every state transition for full audit trail.
CREATE TABLE deployment_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
environment TEXT NOT NULL, -- 'dev', 'staging', 'production'
mfe_name TEXT NOT NULL, -- e.g., 'mfe_dashboard'
version TEXT NOT NULL, -- the version this event pertains to
event_type TEXT NOT NULL, -- 'registered', 'activated', 'deactivated', 'rollback'
metadata TEXT, -- JSON blob for additional context
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
created_by TEXT NOT NULL -- who triggered the event
);
-- Index for querying events by MFE and environment
CREATE INDEX idx_deployment_events_lookup
ON deployment_events (environment, mfe_name, created_at DESC);
-- Index for filtering by event type (useful for audit queries)
CREATE INDEX idx_deployment_events_type
ON deployment_events (event_type, created_at DESC);
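The schema relies on the application to keep at most one active row per (environment, mfe_name). One way to uphold that, sketched here as an assumption rather than a description of the actual activation handler, is to issue the deactivate, activate, and audit statements together. `buildActivationStatements` is a hypothetical helper that only produces the SQL and its bound parameters.

```typescript
interface Statement {
  sql: string;
  params: string[];
}

// Produces the three statements for switching the active version, in execution order.
export function buildActivationStatements(
  environment: string,
  mfeName: string,
  version: string,
  actor: string
): Statement[] {
  return [
    {
      // 1. Deactivate whatever is currently active for this MFE/environment.
      sql: 'UPDATE version_configs SET is_active = false WHERE environment = ? AND mfe_name = ? AND is_active = true',
      params: [environment, mfeName],
    },
    {
      // 2. Activate the requested version.
      sql: 'UPDATE version_configs SET is_active = true, activated_at = CURRENT_TIMESTAMP, activated_by = ? WHERE environment = ? AND mfe_name = ? AND version = ?',
      params: [actor, environment, mfeName, version],
    },
    {
      // 3. Record the transition in the audit trail.
      sql: "INSERT INTO deployment_events (environment, mfe_name, version, event_type, created_by) VALUES (?, ?, ?, 'activated', ?)",
      params: [environment, mfeName, version, actor],
    },
  ];
}
```

With a D1 binding (assumed here to be named DB), the statements could be run as a single batch via `env.DB.batch(statements.map((s) => env.DB.prepare(s.sql).bind(...s.params)))`, so the three writes succeed or fail together.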
Query Examples
-- Get the currently active version for all MFEs in production
SELECT mfe_name, version, entry_url, activated_at, activated_by
FROM version_configs
WHERE environment = 'production' AND is_active = true;
-- Get version history for a specific MFE (most recent first)
SELECT version, is_active, activated_at, activated_by, created_at, created_by
FROM version_configs
WHERE environment = 'production' AND mfe_name = 'mfe_dashboard'
ORDER BY created_at DESC
LIMIT 20;
-- Get recent deployment events for audit
SELECT de.environment, de.mfe_name, de.version, de.event_type,
de.metadata, de.created_at, de.created_by
FROM deployment_events de
WHERE de.environment = 'production'
ORDER BY de.created_at DESC
LIMIT 50;
-- Count deployments per MFE in the last 30 days
SELECT mfe_name, COUNT(*) as deploy_count
FROM deployment_events
WHERE environment = 'production'
AND event_type = 'activated'
AND created_at >= datetime('now', '-30 days')
GROUP BY mfe_name
ORDER BY deploy_count DESC;
References
- Cloudflare Workers KV --- Global, low-latency key-value storage used as the edge cache for version configs.
- Cloudflare D1 --- SQLite-based relational database at the edge, used as the persistent store and audit trail.
- Cloudflare R2 --- Object storage for MFE build artifacts (bundles, manifests).
- Module Federation Runtime API --- The init() and loadRemote() APIs used by the shell to dynamically configure and load MFE remotes.
- Module Federation v2 Manifest --- Documentation for mf-manifest.json, the manifest file generated by Rsbuild's Module Federation plugin.
- Rsbuild Module Federation Plugin --- Build-time configuration for generating Module Federation outputs with Rsbuild.
- WorkOS User Management --- Authentication and RBAC used to secure the Admin UI and Version Config Service.
- Subresource Integrity (SRI) --- Integrity verification mechanism referenced in the version config schema.