Platform Engineering Is Migration Work

June 6, 2026

#platform #migration #automation #gitops #developer-tooling

Platform engineering is often described as building paved roads, internal products, and self-service workflows.

That is true, but it misses a large part of the work.

In practice, platform engineering also means moving people and systems from one operating model to another. A new CI standard, a new GitHub rule, a new runtime, a new deployment flow, a new package manager, a new Kubernetes convention, or a new developer workflow all create the same basic problem: existing repositories and teams already have habits, configuration, and history.

The platform is not only the target state. It is also the migration path.

That is where the work becomes interesting. A good platform migration is not just a big announcement followed by a deadline. It needs levers that help teams move safely, make the diff visible, and keep enough determinism that people can trust what is happening.

Migration Is A Platform Capability

Most platform changes fail in the space between “we know the better standard” and “all teams actually use it.”

That space is where migration capability matters.

The platform team usually has a few options:

provide a deterministic migration tool
expose a declared policy model
run centralized changes across repositories
let teams adopt a decentralized pattern
provide scripts that handle repetitive work
use coding assistants for the parts that are hard to express as pure scripts

None of these is universally better. The useful question is where the migration should sit.

flowchart LR
    Change["Platform change"] --> Surface["Migration surface"]
    Surface --> Deterministic["Deterministic migration"]
    Surface --> Declarative["Declared desired state"]
    Surface --> Scripted["Scripted workflow"]
    Surface --> Assisted["Assisted workflow"]

    Deterministic --> Review["Reviewable diff"]
    Declarative --> Review
    Scripted --> Review
    Assisted --> Review

    Review --> Control["Plan, diff, tests, approvals"]
    Control --> Adoption["Safer adoption"]

The pattern I prefer is simple: make the boring parts deterministic, make the desired state reviewable, and use assistants only where the repository shape is too uneven for a clean script.

Nx Migrations: Deterministic Change Inside The Repository

Nx is a good example of a migration mechanism that lives close to the code.

When the workspace moves from one version to another, nx migrate does more than bump a package version. It can generate a migrations.json file and run version-specific migrations that update configuration and source files. That matters because tooling upgrades often include subtle shape changes: project configuration, executor options, lint setup, testing setup, cache behavior, and package alignment.

The platform lesson is not specific to Nx. The lesson is that a migration embedded in the tool can make a class of change more reliable than a wiki page ever will.

For a platform team, this creates a useful lever:

package the expected change as a migration
let the migration update the repository
keep the resulting diff in the PR
run the normal checks
let maintainers review the concrete output

This becomes even more useful when Renovate is already opening dependency PRs. For Nx upgrades, Renovate can be configured to run a post-upgrade task that executes the migration command before the PR is committed.

The shape is roughly:

{
  "postUpgradeTasks": {
    "commands": ["pnpm nx migrate --run-migrations --if-exists"],
    "fileFilters": [
      "package.json",
      "pnpm-lock.yaml",
      "nx.json",
      "migrations.json",
      "apps/**",
      "libs/**"
    ],
    "executionMode": "branch"
  }
}

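That example is intentionally incomplete. nx migrate --run-migrations only applies a migrations.json that already exists, and --if-exists turns the command into a no-op when the file is missing, so how that file gets generated depends on how the update branch is produced. The important part is the platform pattern.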

Renovate creates the update PR. Nx performs the deterministic repository migration. CI validates the result. The reviewer sees the actual diff instead of receiving an instruction to run a command locally.

There is a risk, though. Renovate post-upgrade tasks are powerful. They run commands inside a repository after dependency updates, which is why Renovate requires those commands to be explicitly allowed by the Renovate administrator. That is the right constraint. A migration command should be narrow, predictable, and limited to the files it is expected to touch.

The goal is not “let automation edit everything.” The goal is “let a known migration produce a reviewable patch.”
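One way to enforce "limited to the files it is expected to touch" is to check the changed paths against an allowlist before the patch is committed. The sketch below is a minimal illustration, not part of Renovate or Nx: ALLOWED_PATTERNS, globToRegExp, and findUnexpectedChanges are hypothetical names, and the glob support is deliberately tiny.

```typescript
// Hypothetical allowlist, mirroring the fileFilters idea from the Renovate example.
const ALLOWED_PATTERNS: string[] = [
  "package.json",
  "pnpm-lock.yaml",
  "nx.json",
  "migrations.json",
  "apps/**",
  "libs/**",
];

// Convert a simple glob to a RegExp. Only "**" (any depth) and "*" (within one
// path segment) are supported; this is not a full glob implementation.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so the next step skips "**"
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Return the changed files that fall outside the allowlist.
function findUnexpectedChanges(changedFiles: string[], allowed: string[]): string[] {
  const matchers = allowed.map(globToRegExp);
  return changedFiles.filter((file) => !matchers.some((re) => re.test(file)));
}
```

A migration job could feed this the output of git diff --name-only and fail when the returned list is not empty, so an unexpectedly broad patch never reaches the PR.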

Another pattern is to keep Renovate focused on dependency updates and let a dedicated GitHub Actions workflow run the migration on Renovate pull requests.

For example, a repository can run Nx migrations only when the pull request comes from Renovate:

name: Run Nx migrations

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: write
  pull-requests: write

jobs:
  nx-migrate:
    if: github.actor == 'renovate[bot]' && startsWith(github.head_ref, 'renovate/')
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Renovate branch
        uses: actions/checkout@v6
        with:
          ref: ${{ github.head_ref }}
          token: ${{ secrets.RENOVATE_MIGRATION_TOKEN }}

      - name: Setup pnpm
        uses: pnpm/action-setup@v4

      - name: Setup Node.js
        uses: actions/setup-node@v6
        with:
          node-version: 24.13.1
          cache: pnpm

      - name: Install dependencies
        run: pnpm install --frozen-lockfile

      - name: Run generated Nx migrations
        run: pnpm nx migrate --run-migrations --if-exists

      - name: Commit migration output
        uses: stefanzweifel/git-auto-commit-action@v7
        with:
          commit_message: "fix(deps): run nx migrations"

This is not the only possible workflow, but it shows the control points I care about. The trigger is narrow. The actor is checked. With --if-exists, the migration step is a no-op when migrations.json is absent. The migration output is committed back to the Renovate branch, so the PR remains the review surface.

There are details to handle carefully. The push token must be scoped for this job, not a broad personal token reused everywhere. Forked pull requests should not receive write credentials. If the workflow can run in many repositories, it should be packaged as a reusable workflow with the same guardrails everywhere.

The point is not whether the migration runs inside Renovate or inside GitHub Actions. The point is that the migration should produce a patch that people can review before it lands.

Terraform And GitHub: Centralized Versus Decentralized Policy

Some migrations do not belong inside one repository.

GitHub repository settings, branch protections, rulesets, teams, environments, and permissions are good examples. They affect repositories, but they are not always best owned by each repository independently.

Terraform with the GitHub provider gives a platform team another lever: declare the desired state of the GitHub organization and let Terraform show the plan before applying it.

This is where the centralized versus decentralized decision matters.

A centralized model can work well for cross-cutting controls:

organization rulesets
default branch protections
required checks
team-to-repository permissions
repository creation defaults
shared security settings
standardized repository files and folders

The advantage is consistency. The platform team can update one module or one map and roll a policy across many repositories.

The risk is distance. If the model is too centralized, teams may not understand why a rule changed, how to request an exception, or which repository-specific constraints were missed.

A decentralized model can work better when teams need ownership:

repository-local configuration
explicit module inputs
team-owned policy declarations
pull requests in the repository that owns the workflow

The advantage is proximity. The people affected by the change see it where they already work.

The risk is drift. If every team owns its own policy shape without strong defaults, the organization slowly loses the platform benefit.

Terraform is useful here because it gives the migration a deterministic checkpoint: terraform plan.

Before applying a change, the plan shows what Terraform intends to create, update, or delete. That makes the policy migration reviewable. It does not remove the need for judgment, but it gives the discussion a concrete artifact.
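To make the plan easier to discuss, a script can reduce it to the addresses that will be created, updated, deleted, or replaced. The sketch below assumes the plan has been exported with terraform show -json and uses only the resource_changes[].change.actions field of that format. The type and function names (PlanJson, summarizePlan, needsCarefulReview) are illustrative, not part of Terraform.

```typescript
// Minimal subset of the `terraform show -json <planfile>` output used here.
interface ResourceChange {
  address: string;
  change: { actions: string[] };
}

interface PlanJson {
  resource_changes?: ResourceChange[];
}

interface PlanSummary {
  create: string[];
  update: string[];
  delete: string[];
  replace: string[];
}

// Group resource addresses by the kind of change Terraform intends to make.
function summarizePlan(plan: PlanJson): PlanSummary {
  const summary: PlanSummary = { create: [], update: [], delete: [], replace: [] };
  for (const rc of plan.resource_changes ?? []) {
    const actions = rc.change.actions;
    if (actions.includes("create") && actions.includes("delete")) {
      summary.replace.push(rc.address); // destroy and recreate, in either order
    } else if (actions.includes("create")) {
      summary.create.push(rc.address);
    } else if (actions.includes("update")) {
      summary.update.push(rc.address);
    } else if (actions.includes("delete")) {
      summary.delete.push(rc.address);
    }
    // "no-op" and "read" entries are skipped because they change nothing.
  }
  return summary;
}

// Hypothetical review rule: destructive changes always need a closer look.
function needsCarefulReview(summary: PlanSummary): boolean {
  return summary.delete.length > 0 || summary.replace.length > 0;
}
```

The summary does not replace reading the plan. It gives reviewers a short list to start from, and it lets a pipeline require an extra approval whenever something is about to be removed.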

The same provider can also manage standardized files directly with github_repository_file.

That can be useful when the platform needs every repository to carry the same small contract:

a .github/dependabot.yml or Renovate helper config
a standard workflow file
a policy marker file
a default folder such as .github/ISSUE_TEMPLATE
an AGENTS.md file that documents repository rules for coding tools

The Terraform shape is straightforward:

resource "github_repository_file" "agent_instructions" {
  for_each = local.repositories

  repository          = each.key
  branch              = "main"
  file                = "AGENTS.md"
  content             = file("${path.module}/templates/AGENTS.md")
  commit_message      = "chore(platform): standardize coding instructions"
  commit_author       = "Platform Engineering"
  commit_email        = "platform@example.com"
  overwrite_on_create = true
}

This is a stronger lever than asking every team to copy a file, but it has a sharper edge too. If the platform owns the file completely, teams may lose the ability to express local context. If teams own it completely, the organization may lose the standard. A better model is often to standardize the minimum contract and leave repository-local extension points.

For example, Terraform can manage .github/platform.yml or .github/workflows/platform-checks.yml, while the repository keeps its own application-specific workflows. Or Terraform can manage a top-level AGENTS.md that points to a team-owned docs/engineering-rules.md. The centralized file becomes the stable contract, not the whole operating model.
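The "minimum contract plus local extension points" idea can be sketched as a merge in which a few keys are owned by the platform and everything else can be overridden by the repository. This is an illustration only: the Policy shape, the key names, and mergePolicy are hypothetical and do not correspond to a GitHub or Terraform API.

```typescript
type Policy = Record<string, string | number | boolean>;

interface MergeResult {
  policy: Policy;
  rejected: string[]; // local overrides that touched platform-owned keys
}

// Merge repository-local settings over platform defaults, but never let a
// repository override a key that the platform has locked.
function mergePolicy(base: Policy, local: Policy, lockedKeys: string[]): MergeResult {
  const policy: Policy = { ...base };
  const rejected: string[] = [];
  for (const [key, value] of Object.entries(local)) {
    if (lockedKeys.includes(key)) {
      rejected.push(key);
      continue;
    }
    policy[key] = value;
  }
  return { policy, rejected };
}
```

Reporting the rejected keys, rather than silently dropping them, gives a team a visible starting point for an exception request.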

The plan artifact matters. Without it, a GitHub policy migration can become a vague conversation about standards. With it, the conversation becomes specific:

which repositories will change
which rules will be added
which teams will gain or lose access
which exceptions still exist
which resources are outside Terraform state

That is the difference between migration as coordination and migration as guesswork.

Argo CD: Diff Before Sync

Kubernetes migrations have the same shape.

GitOps gives us a desired state, but the important part during migration is not only that Git is the source of truth. It is that the platform can show the difference between the desired state and the live state before applying the change.

Argo CD makes that visible through application diffs. A diff is not a perfect guarantee, but it is a critical control surface. It lets people inspect what will change in the cluster instead of treating sync as a black box.

That matters during platform migrations because Kubernetes changes often have hidden coupling:

a Helm value changes generated manifests
a CRD update changes accepted fields
an ingress migration changes routing behavior
a secret operator changes runtime dependencies
a sync wave changes ordering

The platform team should make those changes visible before they become runtime surprises.

This is the same principle as Terraform plan. Determinism does not mean there is no risk. It means the expected change is inspectable before execution.
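A pipeline can turn "diff before sync" into an explicit gate. The sketch below assumes the documented exit codes of argocd app diff (0 when no difference is found, 1 when a diff is found, 2 on general errors); check them against the CLI version in use. The function names and the approval flag are hypothetical, and actually running the CLI is left out.

```typescript
type DiffOutcome = "in-sync" | "changes-pending" | "error";

// Map the exit code of a diff command to an outcome (assumed convention above).
function interpretDiffExitCode(code: number): DiffOutcome {
  if (code === 0) return "in-sync";
  if (code === 1) return "changes-pending";
  return "error";
}

interface SyncDecision {
  sync: boolean;
  reason: string;
}

// Only sync when a diff exists and a human has approved it.
function decideSync(exitCode: number, approved: boolean): SyncDecision {
  const outcome = interpretDiffExitCode(exitCode);
  if (outcome === "error") {
    return { sync: false, reason: "diff failed; investigate before syncing" };
  }
  if (outcome === "in-sync") {
    return { sync: false, reason: "nothing to apply" };
  }
  if (!approved) {
    return { sync: false, reason: "diff found but not yet approved" };
  }
  return { sync: true, reason: "diff reviewed and approved" };
}
```

The useful property is that a failed diff never falls through to a sync: an error is treated as "stop", not as "no changes".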

Scripts Still Matter

Not every migration deserves a full productized tool.

Sometimes the right lever is a script. A script can enumerate repositories, inspect configuration, normalize metadata, open pull requests, or report which teams are still outside the target state.

Scripts are underrated because they are less glamorous than platforms and coding tools. But a good migration script has strong properties:

it can be run repeatedly
it can produce the same output for the same input
it can support a dry run
it can write a report
it can stop before irreversible actions
it can be reviewed and versioned

That is often enough.
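Two of those properties, repeatability and dry run, are cheap to build in from the start. The sketch below is a minimal example under simple assumptions: files are held in an in-memory Map instead of on disk, and ensureLine and applyEdit are hypothetical helper names.

```typescript
interface EditResult {
  changed: boolean;
  content: string;
}

// Idempotent edit: add the line only if it is not already present.
function ensureLine(content: string, line: string): EditResult {
  if (content.split("\n").includes(line)) {
    return { changed: false, content };
  }
  const base = content === "" || content.endsWith("\n") ? content : content + "\n";
  return { changed: true, content: base + line + "\n" };
}

// Apply the edit to an in-memory file map. In dry-run mode nothing is written;
// the returned message reports what would have happened.
function applyEdit(
  files: Map<string, string>,
  path: string,
  line: string,
  dryRun: boolean
): string {
  const result = ensureLine(files.get(path) ?? "", line);
  if (!result.changed) {
    return `unchanged ${path}`;
  }
  if (dryRun) {
    return `would update ${path}`;
  }
  files.set(path, result.content);
  return `updated ${path}`;
}
```

Running the script a second time reports "unchanged" instead of appending the line again, which is what makes it safe to re-run after a partial failure.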

For platform migrations, I like scripts that separate inspection from mutation:

inventory -> classify -> plan -> apply -> verify

The first three steps should usually be deterministic and read-only. They should produce an artifact that humans can inspect: a CSV, a JSON file, a Markdown report, a Terraform plan, an Argo CD diff, or a list of pull requests to open.

Only then should the migration mutate anything.
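The separation can be sketched as pure functions for the read-only steps and a single gated function for the write. Everything here is an assumption made for illustration: the Repo shape, the "move to pnpm" target state, the classification rules, and the function names.

```typescript
interface Repo {
  name: string;
  packageManager: "npm" | "pnpm" | "yarn";
}

type Action = "migrate" | "already-done" | "manual";

interface PlanEntry {
  repo: string;
  action: Action;
  reason: string;
}

// classify: pure and deterministic, no network or file access.
function classify(repo: Repo): PlanEntry {
  switch (repo.packageManager) {
    case "pnpm":
      return { repo: repo.name, action: "already-done", reason: "already on pnpm" };
    case "npm":
      return { repo: repo.name, action: "migrate", reason: "npm setup can be converted by script" };
    case "yarn":
      return { repo: repo.name, action: "manual", reason: "yarn setup needs a human decision" };
  }
}

// plan: sorted by name so the same inventory always yields the same artifact.
function buildPlan(inventory: Repo[]): PlanEntry[] {
  return [...inventory]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map(classify);
}

// The reviewable artifact: one tab-separated line per repository.
function renderPlan(plan: PlanEntry[]): string {
  return plan.map((e) => `${e.action}\t${e.repo}\t${e.reason}`).join("\n");
}

// apply: the only step that mutates, and only after explicit approval.
function applyPlan(
  plan: PlanEntry[],
  approved: boolean,
  mutate: (repo: string) => void
): string[] {
  if (!approved) return [];
  const touched: string[] = [];
  for (const entry of plan) {
    if (entry.action === "migrate") {
      mutate(entry.repo);
      touched.push(entry.repo);
    }
  }
  return touched;
}
```

Because the plan is plain data, it can be committed, diffed between runs, or attached to a change request before anyone calls applyPlan.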

Where Codex And opencode Fit

Some migration work is too uneven for a plain script.

That does not make it magic. It just means the migration has a judgment step.

A repository may have an old workflow name, a slightly different package manager setup, a custom folder layout, or a README section that needs to be updated without losing local context. A deterministic script can find those cases and prepare the work, but it may be too rigid to make the final edit cleanly everywhere.

That is where Codex and the opencode SDK can be useful.

The mistake is to make them own the migration.

The better pattern is to keep the migration owned by code:

a script creates the inventory
deterministic checks classify repositories
Codex or opencode handles the non-uniform edit
tests, diffs, or plans validate the result
a human reviews the PR

Codex is useful when the migration needs a checkpointed code change: inspect the repository, make a scoped edit, run the expected command, and stop when validation fails. That fits code migrations better than a broad instruction like “modernize this repository.”

The opencode SDK is useful in a different way. Because it is a JavaScript and TypeScript client, it can sit inside an existing migration script. The interesting part is structured output: the script can ask opencode for a decision, but require the answer to match a schema before the next step runs.

That makes the assistant output behave more like typed data and less like prose.

For example, a migration script can ask opencode to classify what kind of repository it is looking at, then use that structured answer to decide which deterministic path to run:

import { createOpencode } from "@opencode-ai/sdk";
import * as z from "zod";

const migrationDecisionSchema = z.object({
  migrationPath: z
    .enum(["nx-migrate", "terraform-file", "manual-pr", "skip"])
    .describe("Safest migration path for this repository."),
  confidence: z.number().min(0).max(1),
  reason: z.string(),
  filesToReview: z.array(z.string()),
});

type MigrationDecision = z.infer<typeof migrationDecisionSchema>;

const { client, server } = await createOpencode();

try {
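  const session = await client.session.create({
    body: { title: "Classify repository migration" },
  });
  // Assumption: the SDK wraps responses as { data, error }, matching result.data below.
  if (!session.data) {
    throw new Error("Failed to create opencode session");
  }

  const result = await client.session.prompt({
    path: { id: session.data.id },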
    body: {
      parts: [
        {
          type: "text",
          text: [
            "Inspect this repository and classify the safest migration path.",
            "Return only the structured decision.",
          ].join("\n"),
        },
      ],
      format: {
        type: "json_schema",
        retryCount: 2,
        schema: z.toJSONSchema(migrationDecisionSchema),
      },
    },
  });

  // parse() throws when the structured output is missing or does not match the schema.
  const decision: MigrationDecision = migrationDecisionSchema.parse(
    result.data?.info.structured_output
  );

  if (decision.migrationPath === "nx-migrate") {
    // Run the normal deterministic migration path here.
  }
} finally {
  server.close();
}

That is the shape I care about. The model can help with a fuzzy step, but the script still owns the flow. opencode receives JSON Schema because that is what its structured output API expects; Zod stays as the source of truth in the code. If the output does not parse, the script fails. If the confidence is too low, the script can open a manual issue instead of a PR. If the selected path is nx-migrate, the next command is still the normal Nx migration command.

The same idea works for edits. A script can use opencode to propose a repository-local patch, but the patch still has to pass formatting, tests, and review. The output does not earn trust by coming from a model. It earns trust only after it survives the normal engineering checks.
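The routing step after a structured decision can itself be plain, testable code. The sketch below has no SDK or Zod dependency: it validates an unknown value by hand and applies a confidence threshold. The 0.8 threshold, the NextStep shape, and the function names are illustrative assumptions, not part of opencode.

```typescript
type MigrationPath = "nx-migrate" | "terraform-file" | "manual-pr" | "skip";

const MIGRATION_PATHS: readonly string[] = ["nx-migrate", "terraform-file", "manual-pr", "skip"];

interface Decision {
  migrationPath: MigrationPath;
  confidence: number;
  reason: string;
}

type NextStep =
  | { kind: "run"; path: MigrationPath }
  | { kind: "manual-issue"; reason: string }
  | { kind: "skip" };

// Validate untrusted assistant output; throw instead of guessing.
function parseDecision(raw: unknown): Decision {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("decision must be an object");
  }
  const { migrationPath, confidence, reason } = raw as Record<string, unknown>;
  if (typeof migrationPath !== "string" || !MIGRATION_PATHS.includes(migrationPath)) {
    throw new Error("unknown migrationPath");
  }
  if (typeof confidence !== "number" || Number.isNaN(confidence) || confidence < 0 || confidence > 1) {
    throw new Error("confidence must be a number between 0 and 1");
  }
  if (typeof reason !== "string") {
    throw new Error("reason must be a string");
  }
  return { migrationPath: migrationPath as MigrationPath, confidence, reason };
}

// Low confidence always goes to a human, whatever path was suggested.
function nextStep(decision: Decision, minConfidence = 0.8): NextStep {
  if (decision.confidence < minConfidence) {
    return { kind: "manual-issue", reason: decision.reason };
  }
  if (decision.migrationPath === "skip") {
    return { kind: "skip" };
  }
  return { kind: "run", path: decision.migrationPath };
}
```

The assistant can only choose among paths the script already knows how to run, and anything unexpected fails loudly before a command executes.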

I see the split this way:

use scripts for inventory, classification, dry runs, and reports
use Nx migrations when the tool already knows how to update its workspace
use Terraform when the desired state belongs to the organization or repository control plane
use Codex when the work needs a scoped code migration with validation
use the opencode SDK when a script needs a structured decision or repository-specific edit
use plans, diffs, tests, and pull requests as the shared control surface

That mixed model is where I think the value is:

scripted flow + structured assistant output + deterministic verification

The assistant helps with variation. The script and verification keep the migration bounded.

The Risk Model

Migration work is risky because it changes systems that already work well enough to resist change.

The main risks are predictable:

silent behavior change: the diff looks small, but runtime behavior changes
over-centralization: the platform imposes policy without enough local context
under-centralization: every team solves the same migration differently
automation blast radius: one bad command touches too many repositories
assistant overreach: a coding assistant changes more than the approved scope
review fatigue: too many migration PRs create shallow review
state mismatch: declared state, live state, and ownership reality diverge

The answer is not to avoid automation. Manual migrations have their own failure modes: missed repositories, inconsistent steps, stale instructions, and changes that are impossible to audit later.

The answer is to make automation reviewable.

That usually means:

dry runs before writes
plans before applies
diffs before syncs
generated reports before batch changes
narrow file filters for automated PRs
explicit allowlists for commands
small migration batches
rollback or revert paths
clear ownership of exceptions
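"Small migration batches" can be made concrete with a rollout loop that stops scheduling new batches once something fails. This is a minimal synchronous sketch: toBatches, rollOut, and the migrate callback are hypothetical, and a real rollout would likely be asynchronous and open pull requests instead of returning booleans.

```typescript
// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

interface RolloutResult {
  completed: string[];
  failed: string[];
  notStarted: string[];
}

// Run the migration batch by batch. The current batch always finishes, but
// no further batch starts once any repository has failed.
function rollOut(
  repos: string[],
  batchSize: number,
  migrate: (repo: string) => boolean
): RolloutResult {
  const result: RolloutResult = { completed: [], failed: [], notStarted: [] };
  let halted = false;
  for (const batch of toBatches(repos, batchSize)) {
    if (halted) {
      result.notStarted.push(...batch);
      continue;
    }
    for (const repo of batch) {
      (migrate(repo) ? result.completed : result.failed).push(repo);
    }
    if (result.failed.length > 0) {
      halted = true;
    }
  }
  return result;
}
```

The batch size bounds the blast radius: with a size of two, one bad command affects at most two repositories before the rollout stops and someone looks at the failure.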

The deeper point is that determinism is not only a technical preference. It is a trust mechanism.

If developers can see what will happen, they are more likely to accept the migration. If platform teams can prove what changed, they are more likely to operate the migration safely.

A Practical Migration Ladder

When I think about platform migration levers, I usually see a ladder.

At the bottom are documentation and examples. They are useful, but they are weak migration mechanisms because they rely on every team interpreting the change correctly.

Above that are scripts. They make the work repeatable.

Above that are tool-native migrations like Nx migrations. They encode the change close to the system that understands it.

Above that are declarative control planes like Terraform and Argo CD. They make desired state and planned change visible.

Coding assistants sit across the ladder rather than at the top. They are useful when the migration needs judgment, adaptation, or repository-specific edits. But they should still be surrounded by deterministic checkpoints.

flowchart TB
    Docs["Docs and examples"] --> Scripts["Scripts and reports"]
    Scripts --> Native["Tool-native migrations"]
    Native --> Declarative["Declarative control planes"]

    Assistant["Assisted adaptation"] -.-> Scripts
    Assistant -.-> Native
    Assistant -.-> Declarative

    Declarative --> Evidence["Plan, diff, tests, audit trail"]

That is the platform shape I want: not one giant migration mechanism, but a set of levers with clear boundaries.

The Platform Is The Path

Platform teams often focus on the target architecture.

That is necessary, but it is not enough. If the migration path is unclear, risky, or too manual, the target architecture remains mostly theoretical.

A useful platform makes adoption easier by turning migration into a controlled workflow:

deterministic where possible
declarative where useful
scripted where practical
assisted where variation is real
always reviewable before execution

That is why tools like Nx migrations, Renovate, Terraform, Argo CD, the opencode SDK, and Codex belong in the same conversation. They solve different parts of the same platform problem.

The platform is not only what teams move to.

The platform is also how they get there.