@gh-app-edge-ai
Release 2.4.0

Source: dev branch
Target: main branch

What's Changed

This release was created from the dev branch and includes all merged features and fixes since the last release.

Pre-merge Checklist

  • All CI/CD checks pass
  • Release notes reviewed
  • Documentation updated
  • Breaking changes documented (if any)

Validation

  • ✓ Gap 1: No conflicting release PRs exist
  • ✓ Gap 3: Version collision check passed
  • ✓ Branch created in both AzDO and GitHub

GitHub Actions Status

Monitor workflow status: https://github.com/microsoft/edge-ai/actions

Azure Pipelines and others added 6 commits January 15, 2026 04:46
…Terraform deployments to hang indefinitely

Terraform deployments of the `full-single-node-cluster` blueprint occasionally hang forever during the IoT Operations initialization phase. The deployment gets stuck waiting for the kubeconfig file after starting `az connectedk8s proxy`, with kubectl perpetually retrying connection attempts without success.

**Root Cause:**
The script `init-scripts.sh` has a race condition:
1. `mktemp` creates an empty kubeconfig file
2. `az connectedk8s proxy` starts in the background and asynchronously writes to the file
3. `kubectl` attempts to use the kubeconfig before it's fully written

When kubectl encounters an empty or partially written kubeconfig, it enters an infinite retry loop without re-reading the file, causing the deployment to hang indefinitely.

## Solution

Implemented an atomic file creation pattern that eliminates the race condition:

1. **Temporary file isolation**: `az connectedk8s proxy` now writes to a separate temporary file
2. **Completion monitoring**: A wrapper process monitors the temp file until it has content
3. **Write completion delay**: A 2-second delay gives the writer time to finish (best effort, not a guarantee)
4. **Atomic move**: The temp file is atomically moved to the final location using `mv`
5. **Guaranteed completeness**: The kubeconfig file only appears when fully written and ready

This ensures kubectl never sees a partial or empty kubeconfig file, assuming the 2-second delay is long enough for the file to be written completely.
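
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code in `init-scripts.sh`: the function name `publish_when_ready` and its arguments are hypothetical. It assumes the temp file and the final path are on the same filesystem, since that is what makes `mv` an atomic rename.

```bash
# Sketch only: names here are hypothetical, not taken from init-scripts.sh.

# publish_when_ready TMP FINAL [SETTLE_SECONDS]
# Waits until TMP is non-empty, pauses so the writer can finish, then
# renames TMP to FINAL. A rename within one filesystem is atomic, so a
# reader sees either no file at FINAL or the complete file.
publish_when_ready() {
  local tmp="$1" final="$2" settle="${3:-2}"

  # -s is true once the file exists and has a size greater than zero.
  until [ -s "$tmp" ]; do
    sleep 1
  done

  # Best effort: assumes the writer finishes within this window.
  sleep "$settle"

  mv -f "$tmp" "$final"
}
```

In a real script the temp file would typically be created next to the final path (for example with `mktemp "${final}.XXXXXX"`) so that both live on the same filesystem, and the background writer would be pointed at the temp file rather than at the final kubeconfig path.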

## Changes Made

**Script Improvements:**
- Added comprehensive file header documenting purpose, features, and environment variables
- Implemented atomic file creation wrapper around `az connectedk8s proxy`
- Added validation for all required environment variables
- Added validation ensuring optional token variables are used together
- Enhanced inline documentation explaining the race condition fix
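
As an illustration of the two validation rules listed above, a sketch along these lines would cover both "required" and "used together" checks. The helper names `require_vars` and `require_together` are hypothetical, and the actual variable names checked by `init-scripts.sh` are not shown in this description, so none are assumed here.

```bash
# Sketch only: helper names are hypothetical, not taken from init-scripts.sh.

# require_vars NAME...
# Returns 1 if any named variable is unset or empty.
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Indirect lookup; names are trusted literals supplied by the script.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "ERROR: required environment variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# require_together NAME...
# Returns 1 if some, but not all, of the named variables are set.
require_together() {
  local name value set_count=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      set_count=$((set_count + 1))
    fi
  done
  if [ "$set_count" -ne 0 ] && [ "$set_count" -ne "$#" ]; then
    echo "ERROR: $* must be set together or not at all" >&2
    return 1
  fi
}
```

Reporting every missing variable before returning, rather than stopping at the first, tends to save a round trip when several are unset.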

**Technical Details:**
- Wrapper process monitors `az connectedk8s proxy` lifecycle
- Fails fast if proxy exits unexpectedly (with proper error messaging)
- 30-second timeout for temp file creation with clear error messages
- Process group management ensures proper cleanup of all spawned processes
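
The fail-fast and timeout behavior described above might look something like the following. This is a sketch under assumptions, not the wrapper from `init-scripts.sh`: the function name `wait_for_file` and its return codes are hypothetical, and process group cleanup is left out.

```bash
# Sketch only: names and return codes are hypothetical.

# wait_for_file FILE WRITER_PID [TIMEOUT_SECONDS]
# Returns 0 once FILE is non-empty, 1 if the writer exits before that,
# and 2 if TIMEOUT_SECONDS (default 30) elapse first.
wait_for_file() {
  local file="$1" pid="$2" timeout="${3:-30}" waited=0

  while [ ! -s "$file" ]; do
    # kill -0 sends no signal; it only reports whether the process exists.
    if ! kill -0 "$pid" 2>/dev/null; then
      # Re-check: the writer may have written the file just before exiting.
      if [ -s "$file" ]; then
        return 0
      fi
      echo "ERROR: process $pid exited before $file was written" >&2
      return 1
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "ERROR: timed out after ${timeout}s waiting for $file" >&2
      return 2
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

Distinct return codes let the caller tell "the proxy died" apart from "the proxy is alive but slow", which usually call for different error messages.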

## Benefits

- ✅ Eliminates indefinite hangs during Terraform deployments
- ✅ Ensures reliable Arc-enabled Kubernetes connectivity
- ✅ Provides better error messages and validation
- ✅ Maintains backward compatibility with existing deployments

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix that resolves a race condition in the initialization script.

#### PR Summary
This PR fixes an issue in `init-scripts.sh` where an asynchronously written kubeconfig file could be read before being fully populated, causing Terraform deployments to hang.
- `src/100-edge/110-iot-ops/scripts/init-scripts.sh`: Implements a temporary file strategy with an atomic move after verifying the file is fully written.
- `src/100-edge/110-iot-ops/scripts/init-scripts.sh`: Introduces robust environment variable validations for required a...
…c-ontology)

# Fabric Ontology Deployment Component (`033-fabric-ontology`)

## Summary

Introduces a **schema-driven Microsoft Fabric Ontology deployment component** that provisions complete ontology solutions from portable YAML definitions. Enables teams to deploy Fabric Ontologies declaratively without writing custom code for each deployment.

## What This PR Enables

### One-Command Deployment for Custom Ontologies

```bash
./scripts/deploy.sh \
  --definition ./my-ontology.yaml \
  --workspace-id <GUID> \
  --data-dir ./my-data/
```

Deploys everything from scratch: Lakehouse, data tables, semantic model, and ontology.

### One-Command IEEE 1872 Robotics Deployment

```bash
./scripts/deploy-cora-corax-dim.sh --workspace-id <GUID> --with-seed-data
```

Deploys a complete IEEE 1872 CORA/CORAX robotics ontology with sample robots, environments, and position measurements (19 Delta tables, 12 entities, 7 relationships).

### Declarative YAML Definitions

- Define entity types, properties, relationships, and data bindings in YAML
- Portable definitions work across environments
- JSON Schema validation for definition files

### Flexible Deployment Modes

- **Generic deployment**: One command deploys Lakehouse, data, semantic model, and ontology
- **Bind to existing data**: Deploy ontology against pre-existing Lakehouse tables
- **Full deployment**: Step-by-step control over each component

### Multi-Data-Source Support

- **Lakehouse**: Static/dimension data (Delta tables)
- **Eventhouse**: Time-series/telemetry data (KQL database)
- **Semantic Model**: Auto-generated Direct Lake TMDL for Power BI

## Component Structure

| Directory | Purpose |
|-----------|---------|
| `definitions/` | YAML schema and example ontology definitions |
| `fabric-ontology-dim/` | IEEE 1872 CORA/CORAX starter kit with seed data |
| `scripts/` | Bash deployment scripts with shared libraries |
| `templates/` | KQL, TMDL, and ontology JSON templates |
| `terraform/` | Terraform module for IaC integration |

## Deployment Scripts

| Script | Purpose |
|--------|---------|
| `deploy.sh` | **Generic one-command deployment (recommended)** |
| `deploy-cora-corax-dim.sh` | IEEE 1872 robotics ontology deployment |
| `validate-definition.sh` | Schema validation |
| `deploy-data-sources.sh` | Create Lakehouse/Eventhouse, load data |
| `deploy-semantic-model.sh` | Generate Direct Lake TMDL |
| `deploy-ontology.sh` | Deploy ontology with bindings |

## Example Definitions

| Definition | Description |
|------------|-------------|
| `cora-corax-dim.yaml` | IEEE 1872 CORA/CORAX robotics (12 entities, 7 relationships) |
| `tutorial-full.yaml` | Lakeshore Retail reference (4 entities, 3 relationships) |

## Terraform Integration

```hcl
module "fabric_ontology" {
  source = "../../../src/000-cloud/033-fabric-ontology/terraform"

  definition_file  = "${path.module}/my-ontology.yaml"
  fabric_workspace = { id = var.worksp...
…ponent

### Summary: Create `109-arc-extensions` Component
The Azure Container Storage extension is currently deployed as part of the IoT Operations setup in our codebase, but it is not included in the official enablement manifest for version 2512.
This is a known discrepancy that needs to be refactored in the repo.

**Rationale**:
- cert-manager/trust-manager is a foundational extension required by ACSA and secret-store
- ACSA has independent use cases beyond IoT Operations (edge volumes, cloud ingest, media connector)
- Number `109` positions it after cluster setup (100) but before IoT Operations (110)
- Extensible pattern supports future extensions (Flux GitOps, etc.)

**Initial Scope**:
- **cert-manager** (microsoft.certmanagement) - foundational extension, dependency for ACSA and secret-store
- **Azure Container Storage (ACSA)** (microsoft.arc.containerstorage) - extracted from 110-iot-ops
- **Extensible structure** for future extensions

**Remains in 110-iot-ops**:
- secret-store (microsoft.azure.secretstore) - tightly coupled to IoT Operations instance deployment
- IoT Operations extension and instance resources

### Proposed Component Structure

```
src/100-edge/109-arc-extensions/
├── README.md
├── terraform/
│   ├── main.tf                      # Orchestrates extension modules
│   ├── variables.tf                 # Extension configurations
│   ├── variables.core.tf            # Standard core variables
│   ├── variables.deps.tf            # Arc cluster dependencies
│   ├── outputs.tf                   # Extension IDs/names for downstream
│   ├── versions.tf
│   └── modules/
│       ├── cert-manager/            # cert-manager extension module
│       │   ├── main.tf
│       │   ├── variables.tf
│       │   └── outputs.tf
│       └── container-storage/       # ACSA extension module
│           ├── main.tf
│           ├── variables.tf
│           └── outputs.tf
├── bicep/
│   ├── main.bicep
│   ├── types.bicep
│   ├── types.core.bicep
│   └── modules/
│       ├── cert-manager.bicep
│       └── container-storage.bicep
└── ci/
    └── terraform/
        ├── main.tf
        ├── variables.tf
        └── versions.tf
```

### Testing
The ACSA and cert-manager changes were tested by deploying the **minimum-single-node-cluster** blueprint.
![image.png](https://dev.azure.com/ai-at-the-edge-flagship-accelerator/3bef5a01-44ac-4d6c-8c8d-f4b7d374def6/_apis/git/repositories/a2834f5e-bda4-4acf-94d6-be1f4139ee96/pullRequests/579/attachments/image.png)

`cert-manager` and `azure-arc-containerstorage` were verified to be successfully deployed in the cluster.
![image (3).png](https://dev.azure.com/ai-at-the-edge-flagship-accelerator/3bef5a01-44ac-4d6c-8c8d-f4b7d374def6/_apis/git/repositories/a2834f5e-bda4-4acf-94d6-be1f4139ee96/pullRequests/579/attachments/image%20%283%29.png)
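
A check like the one in the screenshots could also be scripted. The sketch below is an assumption about how one might do it, not the verification used in this PR: the helper `missing_extensions` is hypothetical, and the extension type names are the ones listed under Initial Scope. Matching is case-insensitive as a precaution, in case the reported casing differs.

```bash
# Sketch only: the helper name is hypothetical.

# missing_extensions EXPECTED_TYPE...
# Reads installed extension types from stdin (one per line), prints each
# expected type that is absent, and returns 1 if any are missing.
missing_extensions() {
  local installed expected rc=0
  installed="$(cat)"
  for expected in "$@"; do
    # -x: whole-line match, -F: fixed string, -i: ignore case.
    if ! printf '%s\n' "$installed" | grep -qixF "$expected"; then
      echo "$expected"
      rc=1
    fi
  done
  return "$rc"
}

# Example usage (requires the Azure CLI k8s-extension extension; the
# RESOURCE_GROUP and CLUSTER_NAME variables are placeholders):
#
#   az k8s-extension list \
#     --resource-group "$RESOURCE_GROUP" \
#     --cluster-name "$CLUSTER_NAME" \
#     --cluster-type connectedClusters \
#     --query "[].extensionType" --output tsv |
#     missing_extensions microsoft.certmanagement microsoft.arc.containerstorage
```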

----
#### AI description  (iteration 3)
#### PR Classification
This pull request implements a new feature and refactors container storage by introducing an Arc Extensions componen...
- regenerate terraform and bicep module README.md files
- apply markdown lint fixes via npm run mdlint-fix
- update 10 blueprint and 10 component README files

📝 - Generated by Copilot
@github-actions

📚 Documentation Health Report

Generated on: 2026-01-18 01:32:40 UTC

📈 Documentation Statistics

| Category | File Count |
|----------|------------|
| Main Documentation | 235 |
| Infrastructure Components | 183 |
| Blueprints | 38 |
| Learning Platform | 89 |
| GitHub Resources | 72 |
| AI Assistant Guides (Copilot) | 22 |
| **Total** | **639** |

🏗️ Three-Tree Architecture Status

  • ✅ Bicep Documentation Tree: Auto-generated navigation
  • ✅ Terraform Documentation Tree: Auto-generated navigation
  • ✅ README Documentation Tree: Manual README organization

🔍 Quality Metrics

  • Frontmatter Validation: success
  • Sidebar Generation: success
  • Link Validation: success
  • Build Test: skipped

This report is automatically generated by the Documentation Automation workflow.

@WilliamBerryiii WilliamBerryiii merged commit c9e185a into main Jan 18, 2026
27 of 28 checks passed
@WilliamBerryiii WilliamBerryiii deleted the release/2.4.0 branch January 18, 2026 04:52