Catch Version Inconsistencies Before They Break Your Bicep Deployments

Using GitHub Actions automation to control versioning across a Bicep repository

If you're not familiar with Bicep: It's Microsoft's domain-specific language for deploying Azure infrastructure as code. Think of it like Terraform, but specifically designed for Azure. You write .bicep files that declare what infrastructure you want (virtual networks, storage accounts, databases), and Azure creates it. Bicep modules are reusable templates - like "standard web app setup" or "secure landing zone" - that teams can deploy consistently across environments.

The key thing to understand: Just like application code, infrastructure code has versions. When you improve a Bicep module (add features, fix bugs, update security), you bump the version. These versioned modules get published to an Azure Container Registry (ACR), and other teams reference them in their deployments: "Give me the web app module, version 1.3.1."

The problem? Unlike application code where version management is built into package managers (npm, pip, Maven), Bicep modules require manual version tracking across multiple files. And that's where things fall apart.

Version control is critical. Version validation is tedious. And in the chaos of a PR review, it's incredibly easy to miss.

The problem isn't that people don't care about versioning. It's that manual validation during PR reviews is tedious, error-prone, and easy to overlook when you're focused on reviewing the actual infrastructure code.

Here's how I automated the entire version validation process - so version inconsistencies get caught automatically before they hit the main branch.

Effort for Automation vs. Pain

As with every automation, we have to ask ourselves how big the benefit is.

Time to Build

Total investment: ~60 minutes

  • 30 minutes writing the validation script
  • 20 minutes setting up the GitHub workflow
  • 10 minutes testing

Writing this blog post took longer than building the automation.

Time Saved

Before automation:

  • 5-10 minutes per PR for manual version checks
  • ~20 PRs per month with version changes
  • = 100-200 minutes of manual work monthly

Plus hidden costs:

  • Back-and-forth comments on missing updates
  • PR delays waiting for fixes
  • Production incidents from version inconsistencies

Conservative estimate once the hidden costs are included: 4-5 hours saved per month.

The Math

  • Week 1: Break even
  • Month 12: 48+ hours saved
  • Forever: Zero version-related incidents
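
If you want to plug in your own numbers, the arithmetic fits in a few lines of shell. This is only a sketch using the estimates quoted above; the function name monthly_minutes is made up for illustration.

```bash
# Rough savings arithmetic, using the estimates from this post.

# monthly_minutes PRS MINUTES_PER_PR -> minutes of manual checking per month
monthly_minutes() {
  echo $(( $1 * $2 ))
}

prs_per_month=20
low=$(monthly_minutes "$prs_per_month" 5)     # lower bound: 5 minutes per PR
high=$(monthly_minutes "$prs_per_month" 10)   # upper bound: 10 minutes per PR
echo "Manual checks: ${low}-${high} minutes per month"

# Yearly total at the conservative 4 hours per month
hours_per_month=4
echo "After 12 months: $(( hours_per_month * 12 )) hours saved"
```

With the figures from this post it prints 100-200 minutes per month and 48 hours after 12 months.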

One hour of work. Years of benefit.

If you're manually validating versions on more than a few PRs per month, you're already losing time you could have saved.


Our Version Control Challenge

In our Bicep repository, every module maintains version information across four different files. Each serves a specific purpose, and all four must stay in sync:

1. metadata.json - The Source of Truth

json

{
  "version": {
    "major": 0,
    "minor": 3,
    "patch": 0
  },
  "level": 2
}

Purpose: Determines which version gets published to our Azure Container Registry (ACR) when the module is released. This drives our artifact versioning.
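
To make the mapping concrete, here is a minimal sketch of how the three numeric fields could be turned into a registry tag. The v prefix follows the module reference shown later in this post; the function names are hypothetical, and the actual release pipeline may build the tag differently.

```bash
# version_string MAJOR MINOR PATCH -> "MAJOR.MINOR.PATCH"
version_string() {
  local part
  [ "$#" -eq 3 ] || { echo "expected 3 parts, got $#" >&2; return 1; }
  for part in "$@"; do
    # Reject anything that is not a plain non-negative integer
    [[ $part =~ ^[0-9]+$ ]] || { echo "invalid version part: $part" >&2; return 1; }
  done
  echo "$1.$2.$3"
}

# registry_tag MAJOR MINOR PATCH -> "vMAJOR.MINOR.PATCH"
registry_tag() {
  local version
  version=$(version_string "$@") || return 1
  echo "v${version}"
}

registry_tag 0 3 0   # prints v0.3.0
```

Rejecting non-numeric parts early means a typo in metadata.json surfaces as a clear error instead of a strange tag.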

2. Bicep File Metadata - Quick Reference

bicep

metadata version = '0.3.0'

Purpose: Lets developers instantly see which version they're looking at when opening the Bicep file. No need to jump to metadata.json - the version is right there at the top.
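
Reading that value back out for a comparison can be sketched like this. It assumes the line is written exactly as above, with single quotes around the version; bicep_metadata_version is a hypothetical helper name.

```bash
# bicep_metadata_version FILE -> version from the first "metadata version = '...'" line
bicep_metadata_version() {
  sed -n "s/^metadata version[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$1" | head -n 1
}

# Demo with a throwaway file standing in for main.bicep
demo=$(mktemp)
printf "metadata version = '0.3.0'\nparam location string\n" > "$demo"
bicep_metadata_version "$demo"   # prints 0.3.0
rm -f "$demo"
```

If the file has no such line, the function prints nothing, which a caller can treat as a mismatch.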

3. CHANGELOG.md - Change Documentation

markdown

## 0.3.0 - 28.10.2025

Features(MFL): 
- All the changes I did

Purpose: Human-readable history of what changed between versions. Critical for understanding impact and troubleshooting issues.
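
Because the newest entry goes at the top, a check only needs to look at the first "## " heading. A minimal sketch, assuming the heading format shown above; the helper name and the older 0.2.0 entry in the demo are invented for illustration.

```bash
# changelog_top_version FILE -> version from the first "## X.Y.Z - date" heading
changelog_top_version() {
  grep -m 1 '^## ' "$1" | sed -n 's/^## \([0-9][0-9.]*\) - .*/\1/p'
}

# Demo with a throwaway file standing in for CHANGELOG.md
demo=$(mktemp)
printf '%s\n' '# Changelog' '' '## 0.3.0 - 28.10.2025' '- new entry' '' \
  '## 0.2.0 - (older entry)' > "$demo"
changelog_top_version "$demo"   # prints 0.3.0
rm -f "$demo"
```

An entry for the right version that sits further down the file would not count, which matches the "newest on top" convention.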

4. Configuration Files (.bicepparam) - Deployment References

bicep

using 'br:myacrname.azurecr.io/bicep-solution/landing-zones/alz:v0.3.0'

Purpose: References to specific module versions for deployments. Here's the tricky part: .bicepparam files could technically use older versions (backward compatibility), but when we bump the version in a PR, we typically update these to point to the new version we're about to create.

This is where things get overlooked. You're creating v0.3.0, but your .bicepparam files are still pointing to v0.2.0 - which means they won't test the new version you're about to publish.
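
Spotting such a stale reference can be sketched as follows. It assumes every .bicepparam file has a single using line in the format shown above; both function names are hypothetical.

```bash
# param_module_version FILE -> version after ":v" in the "using 'br:...'" line
param_module_version() {
  sed -n "s/^using 'br:.*:v\([0-9][0-9.]*\)'.*/\1/p" "$1" | head -n 1
}

# is_stale FILE EXPECTED_VERSION -> succeeds when the file points at another version
is_stale() {
  [ "$(param_module_version "$1")" != "$2" ]
}

# Demo: a parameter file that still references v0.2.0
demo=$(mktemp)
echo "using 'br:myacrname.azurecr.io/bicep-solution/landing-zones/alz:v0.2.0'" > "$demo"
if is_stale "$demo" 0.3.0; then
  echo "⚠️ WARNING: file still references v$(param_module_version "$demo")"
fi
rm -f "$demo"
```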


The Manual Review Problem

During a PR review, you're focused on:

  • Does the Bicep logic make sense?
  • Are the parameters correct?
  • Is the infrastructure secure?
  • Will this break existing deployments?

Version validation becomes an afterthought. You might catch it... or you might not.


The Solution: Automated Validation

I built a two-part automation that validates version consistency on every PR:

The Architecture

PR Created
    ↓
validate-versioning.yml (GitHub Action)
    ├─ Triggers when version files change
    ├─ Checks out code
    └─ Calls validate-versioning.sh
        ↓
validate-versioning.sh (Bash Script)
    ├─ Extracts version from metadata.json
    ├─ Checks CHANGELOG.md
    ├─ Checks Bicep metadata
    ├─ Checks .bicepparam files
    └─ Returns: ✅ Pass or ❌ Fail
        ↓
validate-versioning.yml
    └─ Posts results as PR comment
        ↓
GitHub blocks merge if validation fails

What it validates:

✅ CHANGELOG.md has an entry at the top matching metadata.json
✅ Bicep metadata version matches metadata.json
⚠️ .bicepparam files reference the new version (warning, not blocking)

Why warnings for .bicepparam?

  • Old versions still work (backward compatibility)
  • You might intentionally keep older references for certain environments
  • But you probably want to update them - so we warn, not block

How the Two Files Work Together

Part 1: validate-versioning.yml (The Orchestrator)

This GitHub Actions workflow is the control center. It decides when to run and what to do with the results.

Triggers when these files change:

yaml

on:
  pull_request:
    branches: [ "main" ]
    paths:
      - '**/metadata.json'
      - '**/CHANGELOG.md'
      - '**/*.bicep'
      - '**/*.bicepparam'

Why this is smart: Only runs when necessary. Typo fix in a README? Workflow doesn't trigger.
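
To get a feel for which changes would start the workflow, the four patterns can be approximated locally. The sketch below is not how GitHub evaluates paths; it only mimics the patterns with shell case globs, and touches_version_files is a made-up name.

```bash
# touches_version_files: reads file paths on stdin, succeeds if any would match
touches_version_files() {
  local path
  while IFS= read -r path; do
    case "$path" in
      metadata.json|*/metadata.json) return 0 ;;
      CHANGELOG.md|*/CHANGELOG.md)   return 0 ;;
      *.bicep|*.bicepparam)          return 0 ;;
    esac
  done
  return 1
}

printf '%s\n' 'docs/README.md' | touches_version_files || echo "would not trigger"
printf '%s\n' 'modules/alz/main.bicep' | touches_version_files && echo "would trigger"
```

Locally you could feed it real data with git diff --name-only origin/main...HEAD | touches_version_files, assuming main is your target branch.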

The workflow does three things:

1. Sets up the environment:

yaml

- name: Checkout code
  uses: actions/checkout@v3
  with:
    fetch-depth: 0  # Need full history to compare versions

2. Runs the validation script:

yaml

- name: Run versioning validation
  run: |
    ./.github/scripts/validate-versioning.sh
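
For the comment step to have something to post, the script's output has to be kept somewhere while its exit code still decides whether the job fails. One way to do that, sketched here as an assumption about how the run step could be written (the helper name and the output file are placeholders):

```bash
# run_and_capture OUT_FILE COMMAND [ARGS...]
# Shows the command's output, saves a copy, and returns the command's exit code.
run_and_capture() {
  local out_file=$1
  shift
  # tee would normally hide the command's exit code, so read it from PIPESTATUS
  "$@" 2>&1 | tee "$out_file"
  return "${PIPESTATUS[0]}"
}

# Demo: a stand-in command that prints a message and fails
out_file=$(mktemp)
run_and_capture "$out_file" bash -c 'echo "❌ ERROR: demo failure"; exit 1' \
  || echo "validation failed, output saved to $out_file"
```

In the workflow, the command would be ./.github/scripts/validate-versioning.sh and the output file a fixed name that the comment step reads later.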

3. Posts results to the PR:

yaml

- name: Comment on PR with validation results
  if: always()
  uses: actions/github-script@v6
  with:
    script: |
      // Posts formatted results as PR comment
      // Updates existing comment if present

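Key point: The workflow handles GitHub integration (triggering, commenting, blocking merges), but doesn't do any validation logic itself. Note that GitHub only blocks the merge if this check is marked as required in the branch protection rules for main.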
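
The comment text itself is simple to assemble. Here is a rough sketch in shell of what the body could look like; it is not the script used in the workflow above, the failure headline is invented wording, and comment_body is a hypothetical helper.

```bash
# comment_body EXIT_CODE OUTPUT_FILE -> Markdown text for the PR comment
comment_body() {
  local exit_code=$1 output_file=$2
  if [ "$exit_code" -eq 0 ]; then
    echo "✅ All validation checks passed!"
  else
    echo "❌ Version validation failed"   # hypothetical wording
  fi
  echo
  # Indent by four spaces so Markdown renders the output as a code block
  sed 's/^/    /' "$output_file"
}

# Demo with a stand-in output file
demo=$(mktemp)
echo "❌ ERROR: CHANGELOG.md missing version 0.3.0" > "$demo"
comment_body 1 "$demo"
rm -f "$demo"
```

If you prefer the GitHub CLI over actions/github-script, the text could be saved to a file and posted with gh pr comment "$PR_NUMBER" --body-file comment.md, where PR_NUMBER and comment.md are placeholders.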


Part 2: validate-versioning.sh (The Validator)

This bash script does the actual work - parsing files, comparing versions, and generating clear error messages.

The validation logic:

bash

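# Assumes CHANGELOG.md, main.bicep and configurations/ sit next to each metadata.json
errors=0

# 1. Find changed metadata.json files (on a PR checkout, HEAD is the merge
#    commit and HEAD~1 is the tip of the target branch)
changed_files=$(git diff --name-only HEAD~1 HEAD | grep -E '(^|/)metadata\.json$' || true)

# 2. For each changed file
for metadata_file in $changed_files; do
  module_dir=$(dirname "$metadata_file")

  # Extract version from metadata.json
  version=$(jq -r '.version | "\(.major).\(.minor).\(.patch)"' "$metadata_file")

  # Check 1: the first "## " heading in CHANGELOG.md is the new version
  top_entry=$(grep -m 1 '^## ' "$module_dir/CHANGELOG.md")
  if [[ "$top_entry" != "## ${version} - "* ]]; then
    echo "❌ ERROR: $module_dir/CHANGELOG.md missing version $version at the top"
    errors=$((errors + 1))
  fi

  # Check 2: Bicep metadata matches (capture the text between the quotes)
  bicep_version=$(sed -n "s/^metadata version = '\([^']*\)'.*/\1/p" "$module_dir/main.bicep")
  if [ "$bicep_version" != "$version" ]; then
    echo "❌ ERROR: Bicep metadata mismatch in $module_dir/main.bicep"
    errors=$((errors + 1))
  fi

  # Check 3: .bicepparam files reference new version (warning only)
  for param_file in "$module_dir"/configurations/*.bicepparam; do
    [ -e "$param_file" ] || continue   # glob matched nothing
    if ! grep -qF ":v${version}'" "$param_file"; then
      echo "⚠️ WARNING: $param_file may need updating"
    fi
  done

done

# Only errors fail the run; warnings never block the merge
if [ "$errors" -gt 0 ]; then
  exit 1
fi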

Key point: The script focuses purely on validation logic. It doesn't know about GitHub PRs or comments - it just returns success/failure.


What Happens in Practice

Scenario 1: Everything's Correct ✅

You bump metadata.json to 0.3.0 and update the other three files to match:

1. PR created
2. Workflow detects metadata.json change
3. Script validates:
   ✅ metadata.json: 0.3.0
   ✅ CHANGELOG.md: ## 0.3.0 - 28.10.2025
   ✅ Bicep metadata: 0.3.0
   ✅ .bicepparam files: v0.3.0
4. Workflow posts: "✅ All validation checks passed!"
5. PR can merge

Scenario 2: Something's Missing ❌

You update metadata.json to v0.3.0 but forget the CHANGELOG:

1. PR created
2. Workflow detects metadata.json change
3. Script validates:
   ✅ metadata.json: 0.3.0
   ❌ CHANGELOG.md: Missing section for 0.3.0
   ✅ Bicep metadata: 0.3.0
   ⚠️ .bicepparam files: Still reference v0.2.0
4. Workflow posts error comment with details
5. PR merge blocked until fixed

The PR comment shows:

[Screenshots: the validation results as posted in the PR comment]

No ambiguity. The developer knows exactly what to fix.


The Impact

The Real Win: Teaching Through Automation

The biggest benefit isn't just catching errors - it's teaching developers how to version properly without them even realizing it.

Before automation:

  • New team members didn't know all four files needed updating
  • Even experienced developers forgot steps during rushed PRs
  • Version inconsistencies slipped through when reviewers weren't familiar with the process
  • "Can you update the CHANGELOG?" became a recurring comment on every PR

After automation:

  • The workflow teaches developers the versioning requirements immediately
  • Clear error messages show exactly what's needed and where
  • New contributors learn the pattern after their first PR
  • Everyone follows the same standard - no knowledge gaps

What we measure:

  • Zero version inconsistencies in merged PRs (down from ~30%)
  • 30-second validation vs. 5-10 minutes of manual checking
  • Complete audit trail - CHANGELOG always matches actual changes

What we feel:

  • Reviewers focus on infrastructure logic, not version bookkeeping
  • No more back-and-forth about missing documentation
  • Confidence that version history is accurate
  • New team members onboard faster - the automation guides them

The bottom line: Automation doesn't just save time - it removes the need to remember complex rules. The workflow becomes the teacher, and everyone benefits.


Why This Design Works

1. Separation of Concerns

  • validate-versioning.yml: Handles GitHub integration
  • validate-versioning.sh: Handles validation logic

Why this matters:

  • Script can be run locally for testing: ./.github/scripts/validate-versioning.sh
  • Easy to update validation rules without touching GitHub Actions syntax
  • Clear debugging - script problems vs. workflow problems are separate

2. Errors vs. Warnings

Errors (block merge):

  • Missing CHANGELOG entry
  • Mismatched Bicep metadata
  • Invalid version format

Warnings (don't block):

  • .bicepparam files with old version references
  • Missing configurations/ folder

Rationale:

  • Core version consistency is non-negotiable
  • Configuration references need flexibility
  • Warnings nudge developers without blocking urgent fixes
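
In script form, the split comes down to two counters where only one of them affects the exit code. A minimal sketch, with helper names and the example file name chosen for illustration rather than taken from the actual script:

```bash
errors=0
warnings=0

# fail MESSAGE: record a blocking problem
fail() { echo "❌ ERROR: $*"; errors=$((errors + 1)); }

# warn MESSAGE: record a non-blocking problem
warn() { echo "⚠️ WARNING: $*"; warnings=$((warnings + 1)); }

# validation_passed: succeeds unless at least one error was recorded
validation_passed() { [ "$errors" -eq 0 ]; }

# Demo: a warning alone does not block, an error does
warn "configurations/dev.bicepparam may need updating"
validation_passed && echo "merge allowed despite $warnings warning(s)"
fail "CHANGELOG.md missing version 0.3.0"
validation_passed || echo "merge blocked by $errors error(s)"
```

Call fail and warn directly rather than inside $(...), since a subshell would not update the counters.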

3. Clear Expectations

The PR comment includes a "Version Consistency Requirements" section that explains:

  • What format CHANGELOG needs
  • Where Bicep metadata goes
  • Why .bicepparam references matter

New contributors don't have to guess - the automation teaches them.

4. Fast Feedback Loop

Validation runs in ~30 seconds. Developers know immediately if something's wrong, not after a reviewer spots it hours later.


The Bottom Line

Version control is critical. But manual validation doesn't scale.

This two-part automation - a GitHub Actions workflow that calls a validation script - transformed our Bicep repository from "hopefully consistent" to "guaranteed consistent."

The result:

  • 100% version consistency across all modules
  • Faster PR reviews - no more version check comments
  • Complete audit trail - CHANGELOG always accurate
  • Zero version-related production incidents

Your infrastructure code deserves the same version rigor as your application code. Now you have the automation to enforce it.


Have ideas for improvement? I'd love to hear how you're managing versions in your IaC repositories. Drop a comment below.