Add Intel Xe GPU driver support #1457

deveworld · 2026-01-10T12:29:25Z

Summary

This PR adds support for Intel GPUs using the Xe kernel driver, which is required for newer hardware like Lunar Lake, Battlemage, and other recent Intel GPUs.

Closes #1407

Changes

Add Xe namespace in btop_collect.cpp with PMU-based GPU monitoring
Add minimal xe_drm.h header with UAPI definitions from Linux kernel
Detect driver type (xe vs i915) and route to appropriate code path
Dynamic PMU device discovery for discrete Intel GPUs (also addresses #938, "[BUG] Failed to find Intel GPU engines. Intel Arc A310. Linux x86_64 6.6.65")

Implementation Approach

Per maintainer feedback on PR #1408, this implementation:

Does NOT modify any files in src/linux/intel_gpu_top/ (IGT files)
Adds Xe support entirely within the existing btop_collect.cpp structure
Uses the same integration pattern as the existing i915 code

Supported Metrics (Xe driver)

| Metric | Support | Source |
| --- | --- | --- |
| GPU Utilization | ✅ | `engine-active-ticks` / `engine-total-ticks` PMU events |
| GPU Clock Speed | ✅ | `gt-actual-frequency` PMU event |
| Memory Usage | ❌ | Not available via PMU |
| Power Usage | ❌ | Not available via PMU |
| Temperature | ❌ | Not available via PMU |

Testing

✅ Compiles successfully with make GPU_SUPPORT=true
Tested only on Arc 130V.
⚠️ Needs testing on actual hardware (Arc A310, etc.)

Technical Notes

Uses perf_event_open() syscall directly for PMU counter access
Enumerates engines via DRM_IOCTL_XE_DEVICE_QUERY ioctl
Reports max utilization across all GPU engines (render, copy, compute, etc.)
Properly handles both integrated and discrete Intel GPUs with device-specific PMU names
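The utilization notes above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `EngineSample`, `engine_busy_percent` and `gpu_busy_percent` are hypothetical names, and the counter values are assumed to have already been read from the `engine-active-ticks` and `engine-total-ticks` perf events.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of the two PMU counters for one engine.
struct EngineSample {
	uint64_t active_ticks;  // value of engine-active-ticks
	uint64_t total_ticks;   // value of engine-total-ticks
};

// Busy percentage of one engine between two snapshots.
double engine_busy_percent(const EngineSample& prev, const EngineSample& cur) {
	// If the counters did not advance or went backwards, there is no usable delta.
	if (cur.total_ticks <= prev.total_ticks or cur.active_ticks < prev.active_ticks)
		return 0.0;
	const double active = static_cast<double>(cur.active_ticks - prev.active_ticks);
	const double total = static_cast<double>(cur.total_ticks - prev.total_ticks);
	return std::clamp(100.0 * active / total, 0.0, 100.0);
}

// Overall GPU utilization, taken as the busiest engine (render, copy, compute, ...).
double gpu_busy_percent(const std::vector<EngineSample>& prev,
                        const std::vector<EngineSample>& cur) {
	double busiest = 0.0;
	const std::size_t count = std::min(prev.size(), cur.size());
	for (std::size_t i = 0; i < count; ++i)
		busiest = std::max(busiest, engine_busy_percent(prev[i], cur[i]));
	return busiest;
}
```

Taking the maximum rather than the average keeps a single saturated engine visible even when the other engines are idle.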

Copilot

Pull request overview

This PR adds support for Intel GPUs using the Xe kernel driver, which is required for newer Intel hardware like Lunar Lake and Battlemage that don't use the legacy i915 driver.

Changes:

Added minimal xe_drm.h header with UAPI definitions for interfacing with the Xe driver
Implemented Xe namespace with PMU-based GPU monitoring using perf events
Added dynamic driver detection and PMU device discovery to support both i915 and Xe drivers
Routes collection to appropriate code path based on detected driver type

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 14 comments.

File	Description
src/linux/xe_drm.h	New header file containing minimal Xe DRM UAPI definitions (structs and ioctls) for device query and engine enumeration
src/linux/btop_collect.cpp	Added Xe namespace with PMU-based monitoring, dynamic driver detection, PMU device discovery, and routing logic to switch between i915 and Xe code paths

Comments suppressed due to low confidence (1)

src/linux/btop_collect.cpp:2103

Missing free() call for gpu_path on error path. If get_intel_device_id() returns null, the function returns false without freeing gpu_path that was allocated by find_intel_gpu_dir().

			char *gpu_device_id = get_intel_device_id(gpu_path);
			if (!gpu_device_id) {
				Logger::debug("Failed to find Intel GPU device ID, Intel GPUs will not be detected");
				return false;


src/linux/btop_collect.cpp

Add support for Intel GPUs using the new Xe kernel driver, which is required for newer hardware like Lunar Lake. This addresses issue aristocratos#1407.

Implementation approach (per maintainer feedback on PR aristocratos#1408):
- Add Xe namespace in btop_collect.cpp (does not modify IGT files)
- Add minimal xe_drm.h header with UAPI definitions from Linux kernel
- Detect driver type (xe vs i915) and route to appropriate code path
- Use Xe PMU perf events for GPU utilization and clock speed

Supported metrics for Xe driver:
- GPU utilization (via engine-active-ticks/engine-total-ticks)
- GPU clock speed (via gt-actual-frequency)

Also fixes dynamic PMU device discovery for discrete Intel GPUs (addresses issue aristocratos#938) by checking for device-specific PMU names like 'xe_0000_03_00.0' before falling back to generic 'xe' or 'i915'.

Closes aristocratos#1407
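As a rough sketch of the PMU name discovery described in this commit: the device-specific name appears to be the driver name followed by the PCI slot with colons replaced by underscores, with the bare driver name as the fallback. The helper `pmu_name_candidates` is a hypothetical name for illustration; the PR's actual lookup may be structured differently.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns PMU device names to try, most specific first.
// driver:   "xe" or "i915"
// pci_slot: PCI address as found in sysfs, e.g. "0000:03:00.0"
std::vector<std::string> pmu_name_candidates(const std::string& driver,
                                             const std::string& pci_slot) {
	std::string slot = pci_slot;
	// "0000:03:00.0" becomes "0000_03_00.0"
	std::replace(slot.begin(), slot.end(), ':', '_');
	return {driver + "_" + slot, driver};
}
```

A caller would then probe each candidate under `/sys/bus/event_source/devices/` and use the first one that exists.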

- Fix critical heap corruption: remove free() on static buffer from find_intel_gpu_dir()
- Fix GPU clock unit: convert Hz to MHz to match other drivers
- Fix MAX_GPU_CLOCK overflow: change from 10e9 Hz to 10000 MHz
- Add bounds validation for num_engines to prevent OOB access
- Add first-sample baselining to prevent initial utilization spike
- Add error handling for ioctl PERF_EVENT_IOC_ENABLE
- Add error handling for clock_gettime in init() and collect()
- Add group_fd validation before assignment
- Add dt minimum clamping to prevent division issues
- Add stoull exception handling
- Fix memory leak: call free_engines() when pmu_init fails
- Replace magic number 14 with strlen(PCI_SLOT_PREFIX)
- Use ull suffix and static_cast for type safety in build_config()

The PMU gt-actual-frequency event requires complex time-weighted calculation that wasn't working correctly (always showed 0 MHz).

Switch to reading frequency directly from sysfs:
/sys/class/drm/cardX/device/tile0/gtN/freq0/cur_freq

This matches how nvtop reads Xe GPU frequency and provides accurate real-time clock speed values.
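A minimal sketch of the sysfs read described in this commit, assuming the file holds a single unsigned integer. The helper names `xe_cur_freq_path` and `read_sysfs_uint` are hypothetical, and the card name and GT index (the `cardX` and `gtN` placeholders above) are left to the caller.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Builds the cur_freq path for a given card (e.g. "card0") and GT index.
std::string xe_cur_freq_path(const std::string& card, int gt) {
	return "/sys/class/drm/" + card + "/device/tile0/gt" + std::to_string(gt)
	     + "/freq0/cur_freq";
}

// Reads one unsigned integer from a sysfs-style file.
// Returns std::nullopt if the file is missing, unreadable, or not numeric.
std::optional<unsigned long long> read_sysfs_uint(const std::string& path) {
	std::ifstream file(path);
	unsigned long long value = 0;
	if (not (file >> value)) return std::nullopt;
	return value;
}
```

Returning `std::optional` lets the collector distinguish "no reading available" from a genuine value of 0.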

Restore original 2-tab indentation that was accidentally changed to 1-tab in previous commit.

- Replace PMU-based engine-active-ticks with sysfs idle_residency_ms (works on Battlemage without CAP_PERFMON, fixes 0% utilization)
- Add DRM_XE_DEVICE_QUERY_MEM_REGIONS for VRAM usage reporting
- Implement GT separation: RC (Render/Compute) and MC (Media) tracking for architectures with split GT layout (Lunar Lake, Battlemage)
- Update UI to show RC/MC labels instead of ENC/DEC when gt_utilization is enabled, with separate graphs for each GT type

Tested on: Lunar Lake (Core Ultra 5 228V)

Fixes: aristocratos#1407 (partial - needs Battlemage testing)

- Add first_sample flag to skip first gtidle calculation (fixes 100% spike on startup)
- Add pci.ids database lookup for accurate GPU product names (e.g. 'Intel Arc B580' instead of 'Intel Battlemage (Gen20)')
- Fallback to codename-based naming if pci.ids lookup fails

- Read idle_status sysfs to detect power gating state (gt-c6 vs gt-c0)
- When idle counter doesn't advance AND GT is power-gated: report 0% (not 100%)
- When idle counter doesn't advance AND GT is active: report 100% (real load)
- Add EMA smoothing (alpha=0.3) to reduce transient spikes from compositor
- Handle counter wrap/reset by preserving previous smoothed value

This fixes false 100% utilization spikes that occurred when the GPU entered power gating (RC6/MC6) and the idle_residency_ms counter stopped advancing, which was incorrectly interpreted as 100% busy.
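The rules in this commit can be combined into one update step, sketched below under some assumptions: `GtIdleState` and `update_gt_utilization` are hypothetical names, `idle_ms` is the value read from idle_residency_ms, `elapsed_ms` is the wall-clock time since the previous sample, and `idle_status` is the string read from the idle_status file.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical per-GT state carried between samples.
struct GtIdleState {
	uint64_t prev_idle_ms = 0;
	double smoothed = 0.0;     // EMA-smoothed utilization in percent
	bool first_sample = true;
};

constexpr double ema_alpha = 0.3;

double update_gt_utilization(GtIdleState& st, uint64_t idle_ms,
                             double elapsed_ms, const std::string& idle_status) {
	if (st.first_sample) {
		// Only record a baseline; no delta exists yet.
		st.prev_idle_ms = idle_ms;
		st.first_sample = false;
		return st.smoothed;
	}
	if (idle_ms < st.prev_idle_ms or elapsed_ms <= 0.0) {
		// Counter wrap/reset or unusable interval: keep the previous value.
		st.prev_idle_ms = idle_ms;
		return st.smoothed;
	}
	const uint64_t idle_delta = idle_ms - st.prev_idle_ms;
	st.prev_idle_ms = idle_ms;

	double raw = 0.0;
	if (idle_delta == 0) {
		// Stalled counter: power-gated means idle, otherwise genuinely busy.
		raw = (idle_status == "gt-c6") ? 0.0 : 100.0;
	} else {
		const double idle_fraction = static_cast<double>(idle_delta) / elapsed_ms;
		raw = std::clamp(100.0 * (1.0 - idle_fraction), 0.0, 100.0);
	}
	st.smoothed = ema_alpha * raw + (1.0 - ema_alpha) * st.smoothed;
	return st.smoothed;
}
```

With alpha = 0.3, a single 100% sample after an idle period moves the reported value to only 30%, which is how the smoothing damps short compositor spikes.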

- Fix indentation (2 tabs -> 3 tabs in structs/globals)
- Replace !, &&, || with not, and, or operators
- Add comments explaining DRM ioctl patterns, EMA smoothing, and power gating detection logic
- Wrap long lines for readability

- Refactor Intel namespace to support multiple GPUs via GpuInstance struct
- Add discover_intel_gpus() to find all Intel GPUs via sysfs vendor ID
- Refactor Xe namespace with per-GPU state (states vector, gpu_index params)
- Update Intel::init/shutdown/collect to iterate over gpu_instances
- Add has_pmu_permissions() to prevent crash from assert in i915 PMU code
- Add empty gpus vector early-return defense in Gpu::collect()
- Fix division by zero guards and typo (mem_total -> pwr_total)
- Initialize gpu-vram-totals and gpu-pwr-totals in Xe first_sample block

Fixes: Multi-GPU not detected (Issue aristocratos#1407)
Fixes: Crash without sudo (Aborted core dumped)

Implement fdinfo-based GPU utilization measurement for Intel Xe GPUs:
- Add FdinfoCycles struct and collect_fdinfo_cycles function
- Parse /proc/*/fdinfo/* for drm-cycles-rcs/vcs data
- Use client-id deduplication to prevent double-counting
- Apply EMA smoothing for stable readings
- Fall back to gtidle when fdinfo unavailable

This provides more accurate utilization data compared to residency-based gtidle measurements.
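A rough sketch of the parsing and deduplication steps, operating on the text of a single fdinfo file. `FdinfoSample`, `CycleTotals`, `parse_fdinfo` and `accumulate` are hypothetical names (the PR's FdinfoCycles struct and collect_fdinfo_cycles function may differ), and walking `/proc/*/fdinfo/*` itself is left out.

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <sstream>
#include <string>

// Hypothetical parsed view of one DRM fdinfo file.
struct FdinfoSample {
	uint64_t client_id = 0;
	uint64_t cycles_rcs = 0;  // render/compute engine cycles
	uint64_t cycles_vcs = 0;  // video engine cycles
};

// Parses "key:<whitespace>value" lines; returns nullopt without a drm-client-id.
std::optional<FdinfoSample> parse_fdinfo(const std::string& text) {
	FdinfoSample sample;
	bool has_client = false;
	std::istringstream in(text);
	std::string line;
	while (std::getline(in, line)) {
		const auto colon = line.find(':');
		if (colon == std::string::npos) continue;
		const std::string key = line.substr(0, colon);
		std::istringstream value_in(line.substr(colon + 1));
		uint64_t value = 0;
		if (not (value_in >> value)) continue;  // non-numeric, e.g. drm-driver
		if (key == "drm-client-id") { sample.client_id = value; has_client = true; }
		else if (key == "drm-cycles-rcs") sample.cycles_rcs = value;
		else if (key == "drm-cycles-vcs") sample.cycles_vcs = value;
	}
	if (not has_client) return std::nullopt;
	return sample;
}

struct CycleTotals {
	uint64_t rcs = 0;
	uint64_t vcs = 0;
};

// Adds a sample to the totals once per client id, so a DRM client that is
// visible through several file descriptors is not counted twice.
void accumulate(const FdinfoSample& sample, std::set<uint64_t>& seen,
                CycleTotals& totals) {
	if (not seen.insert(sample.client_id).second) return;
	totals.rcs += sample.cycles_rcs;
	totals.vcs += sample.cycles_vcs;
}
```

Utilization would then come from the change in these totals between two collection passes, with the `seen` set cleared at the start of each pass.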

TheSovietPancakes · 2026-02-01T09:16:05Z

Please do not reformat code if you can avoid it.

Copilot AI review requested due to automatic review settings January 10, 2026 12:29

Copilot started reviewing on behalf of deveworld January 10, 2026 12:29

deveworld force-pushed the feature/xe-gpu-support branch from e5e3853 to 1f8f375 Compare January 10, 2026 12:33

Copilot AI reviewed Jan 10, 2026


deveworld mentioned this pull request Jan 11, 2026

[Feature Request] Add Intel Xe driver support (Lunar Lake, etc.) #1407

Open

aristocratos added the "AI generated" label (Majority of included code is AI generated) Jan 15, 2026

aristocratos changed the title ~~[AI generated] Add Intel Xe GPU driver support~~ Add Intel Xe GPU driver support Jan 15, 2026

deckstose added the "gpu" label (Issues or pull requests related to GPU functionality) Jan 25, 2026

deveworld added 9 commits January 31, 2026 02:44

[AI generated] Fix XeState struct indentation

7a48873

Restore original 2-tab indentation that was accidentally changed to 1-tab in previous commit.

deveworld force-pushed the feature/xe-gpu-support branch from 178426b to c4341c0 Compare January 30, 2026 17:58
