@deveworld deveworld commented Jan 10, 2026

Summary

This PR adds support for Intel GPUs using the Xe kernel driver, which is required for newer hardware like Lunar Lake, Battlemage, and other recent Intel GPUs.

Closes #1407

Changes

Implementation Approach

Per maintainer feedback on PR #1408, this implementation:

  • Does NOT modify any files in src/linux/intel_gpu_top/ (IGT files)
  • Adds Xe support entirely within the existing btop_collect.cpp structure
  • Uses the same integration pattern as the existing i915 code

Supported Metrics (Xe driver)

| Metric | Support | Source |
|---|---|---|
| GPU Utilization | Yes | engine-active-ticks / engine-total-ticks PMU events |
| GPU Clock Speed | Yes | gt-actual-frequency PMU event |
| Memory Usage | No | Not available via PMU |
| Power Usage | No | Not available via PMU |
| Temperature | No | Not available via PMU |

Testing

  • ✅ Compiles successfully with make GPU_SUPPORT=true
  • Tested only on an Arc 130V.
  • ⚠️ Needs testing on other hardware (Arc A310, etc.)

Technical Notes

  • Uses perf_event_open() syscall directly for PMU counter access
  • Enumerates engines via DRM_IOCTL_XE_DEVICE_QUERY ioctl
  • Reports max utilization across all GPU engines (render, copy, compute, etc.)
  • Properly handles both integrated and discrete Intel GPUs with device-specific PMU names
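
To illustrate the "max utilization across all GPU engines" note, here is a minimal sketch of how a single figure could be derived from per-engine engine-active-ticks / engine-total-ticks deltas. The `EngineSample` struct and both function names are hypothetical and are not taken from the PR; the sketch assumes the counters are monotonically increasing between two samples.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of one engine's PMU counters (names are illustrative).
struct EngineSample {
	uint64_t active_ticks;
	uint64_t total_ticks;
};

// Busy percentage of one engine between two snapshots.
// Returns 0 when the total counter did not advance or a counter went backwards.
inline double engine_busy_percent(const EngineSample& prev, const EngineSample& cur) {
	if (cur.total_ticks <= prev.total_ticks or cur.active_ticks < prev.active_ticks) return 0.0;
	const double active = static_cast<double>(cur.active_ticks - prev.active_ticks);
	const double total = static_cast<double>(cur.total_ticks - prev.total_ticks);
	return std::clamp(100.0 * active / total, 0.0, 100.0);
}

// Reported utilization: the busiest engine wins (render, copy, compute, ...).
inline double max_engine_busy_percent(const std::vector<EngineSample>& prev,
		const std::vector<EngineSample>& cur) {
	double busiest = 0.0;
	const std::size_t count = std::min(prev.size(), cur.size());
	for (std::size_t i = 0; i < count; ++i) {
		busiest = std::max(busiest, engine_busy_percent(prev[i], cur[i]));
	}
	return busiest;
}
```

Taking the maximum rather than the average keeps a fully loaded render engine from being hidden by idle copy or media engines.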

Copilot AI review requested due to automatic review settings January 10, 2026 12:29
@deveworld force-pushed the feature/xe-gpu-support branch from e5e3853 to 1f8f375 on January 10, 2026 12:33

Copilot AI left a comment

Pull request overview

This PR adds support for Intel GPUs using the Xe kernel driver, which is required for newer Intel hardware such as Lunar Lake and Battlemage that does not use the legacy i915 driver.

Changes:

  • Added minimal xe_drm.h header with UAPI definitions for interfacing with the Xe driver
  • Implemented Xe namespace with PMU-based GPU monitoring using perf events
  • Added dynamic driver detection and PMU device discovery to support both i915 and Xe drivers
  • Routes collection to appropriate code path based on detected driver type

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 14 comments.

| File | Description |
|---|---|
| src/linux/xe_drm.h | New header file containing minimal Xe DRM UAPI definitions (structs and ioctls) for device query and engine enumeration |
| src/linux/btop_collect.cpp | Added Xe namespace with PMU-based monitoring, dynamic driver detection, PMU device discovery, and routing logic to switch between i915 and Xe code paths |

Comments suppressed due to low confidence (1)

src/linux/btop_collect.cpp:2103

  • Missing free() call for gpu_path on error path. If get_intel_device_id() returns null, the function returns false without freeing gpu_path that was allocated by find_intel_gpu_dir().
			char *gpu_device_id = get_intel_device_id(gpu_path);
			if (!gpu_device_id) {
				Logger::debug("Failed to find Intel GPU device ID, Intel GPUs will not be detected");
				return false;

@aristocratos added the "AI generated" label (Majority of included code is AI generated) Jan 15, 2026
@aristocratos changed the title from "[AI generated] Add Intel Xe GPU driver support" to "Add Intel Xe GPU driver support" Jan 15, 2026
@deckstose added the "gpu" label (Issues or pull requests related to GPU functionality) Jan 25, 2026

Add support for Intel GPUs using the new Xe kernel driver, which is
required for newer hardware like Lunar Lake. This addresses issue aristocratos#1407.

Implementation approach (per maintainer feedback on PR aristocratos#1408):
- Add Xe namespace in btop_collect.cpp (does not modify IGT files)
- Add minimal xe_drm.h header with UAPI definitions from Linux kernel
- Detect driver type (xe vs i915) and route to appropriate code path
- Use Xe PMU perf events for GPU utilization and clock speed

Supported metrics for Xe driver:
- GPU utilization (via engine-active-ticks/engine-total-ticks)
- GPU clock speed (via gt-actual-frequency)

Also fixes dynamic PMU device discovery for discrete Intel GPUs
(addresses issue aristocratos#938) by checking for device-specific PMU names
like 'xe_0000_03_00.0' before falling back to generic 'xe' or 'i915'.
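
A minimal sketch of the lookup order described above, assuming the device-specific PMU name is formed by replacing the colons of the PCI slot with underscores (which matches the 'xe_0000_03_00.0' example) and that PMUs are listed under /sys/bus/event_source/devices. Both function names are hypothetical and not taken from the PR.

```cpp
#include <algorithm>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

// Candidate PMU names in priority order: device-specific first, generic second.
// Example: ("xe", "0000:03:00.0") yields {"xe_0000_03_00.0", "xe"}.
inline std::vector<std::string> pmu_name_candidates(const std::string& driver,
		const std::string& pci_slot) {
	std::string slot = pci_slot;
	std::replace(slot.begin(), slot.end(), ':', '_');
	return {driver + "_" + slot, driver};
}

// Returns the first candidate that exists under pmu_root, if any.
inline std::optional<std::string> find_pmu_name(const std::string& driver,
		const std::string& pci_slot,
		const std::filesystem::path& pmu_root = "/sys/bus/event_source/devices") {
	std::error_code ec;
	for (const auto& name : pmu_name_candidates(driver, pci_slot)) {
		if (std::filesystem::exists(pmu_root / name, ec)) return name;
	}
	return std::nullopt;
}
```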

Closes aristocratos#1407

- Fix critical heap corruption: remove free() on static buffer from find_intel_gpu_dir()
- Fix GPU clock unit: convert Hz to MHz to match other drivers
- Fix MAX_GPU_CLOCK overflow: change from 10e9 Hz to 10000 MHz
- Add bounds validation for num_engines to prevent OOB access
- Add first-sample baselining to prevent initial utilization spike
- Add error handling for ioctl PERF_EVENT_IOC_ENABLE
- Add error handling for clock_gettime in init() and collect()
- Add group_fd validation before assignment
- Add dt minimum clamping to prevent division issues
- Add stoull exception handling
- Fix memory leak: call free_engines() when pmu_init fails
- Replace magic number 14 with strlen(PCI_SLOT_PREFIX)
- Use ull suffix and static_cast for type safety in build_config()
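
Two of the fixes in this list (stoull exception handling and the Hz to MHz conversion) can be sketched as follows. This is an illustration under assumptions, not the PR's code: `parse_u64` and `hz_to_mhz` are hypothetical helper names, and the sketch accepts trailing whitespace because sysfs-style values usually end in a newline.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Parse an unsigned integer without letting std::stoull exceptions escape.
// Returns std::nullopt for non-numeric input, overflow, or trailing garbage.
inline std::optional<uint64_t> parse_u64(const std::string& text) {
	try {
		std::size_t pos = 0;
		const unsigned long long value = std::stoull(text, &pos);
		for (; pos < text.size(); ++pos) {
			// Only whitespace (such as a trailing newline) may follow the number.
			if (not std::isspace(static_cast<unsigned char>(text[pos]))) return std::nullopt;
		}
		return static_cast<uint64_t>(value);
	} catch (const std::invalid_argument&) {
		return std::nullopt;
	} catch (const std::out_of_range&) {
		return std::nullopt;
	}
}

// Convert a frequency in Hz to whole MHz so it matches the unit used elsewhere.
inline uint64_t hz_to_mhz(uint64_t hz) {
	return hz / 1000000ull;
}
```
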
The PMU gt-actual-frequency event requires complex time-weighted
calculation that wasn't working correctly (always showed 0 MHz).

Switch to reading frequency directly from sysfs:
/sys/class/drm/cardX/device/tile0/gtN/freq0/cur_freq

This matches how nvtop reads Xe GPU frequency and provides
accurate real-time clock speed values.
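
A minimal sketch of the sysfs read described in this commit message. The path layout is the one quoted above; the helper names are hypothetical, and the sketch assumes the file holds a single unsigned integer that can be used as the clock speed without further conversion.

```cpp
#include <fstream>
#include <optional>
#include <string>

// Build the cur_freq path for a given card and GT index (tile0 only, as in the message).
inline std::string xe_cur_freq_path(int card, int gt) {
	return "/sys/class/drm/card" + std::to_string(card) + "/device/tile0/gt"
		+ std::to_string(gt) + "/freq0/cur_freq";
}

// Read one unsigned integer from a sysfs-style file.
// Returns std::nullopt if the file is missing, unreadable, or not numeric.
inline std::optional<unsigned long> read_sysfs_uint(const std::string& path) {
	std::ifstream file(path);
	unsigned long value = 0;
	if (not (file >> value)) return std::nullopt;
	return value;
}
```

A caller could combine the two as `read_sysfs_uint(xe_cur_freq_path(0, 0))` and treat an empty result as "clock speed unavailable".
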
Restore original 2-tab indentation that was accidentally changed
to 1-tab in previous commit.

- Replace PMU-based engine-active-ticks with sysfs idle_residency_ms
  (works on Battlemage without CAP_PERFMON, fixes 0% utilization)
- Add DRM_XE_DEVICE_QUERY_MEM_REGIONS for VRAM usage reporting
- Implement GT separation: RC (Render/Compute) and MC (Media) tracking
  for architectures with split GT layout (Lunar Lake, Battlemage)
- Update UI to show RC/MC labels instead of ENC/DEC when gt_utilization
  is enabled, with separate graphs for each GT type

Tested on: Lunar Lake (Core Ultra 5 228V)
Fixes: aristocratos#1407 (partial - needs Battlemage testing)

- Add first_sample flag to skip first gtidle calculation (fixes 100% spike on startup)
- Add pci.ids database lookup for accurate GPU product names
  (e.g. 'Intel Arc B580' instead of 'Intel Battlemage (Gen20)')
- Fallback to codename-based naming if pci.ids lookup fails

- Read idle_status sysfs to detect power gating state (gt-c6 vs gt-c0)
- When idle counter doesn't advance AND GT is power-gated: report 0% (not 100%)
- When idle counter doesn't advance AND GT is active: report 100% (real load)
- Add EMA smoothing (alpha=0.3) to reduce transient spikes from compositor
- Handle counter wrap/reset by preserving previous smoothed value

This fixes false 100% utilization spikes that occurred when the GPU
entered power gating (RC6/MC6) and the idle_residency_ms counter stopped
advancing, which was incorrectly interpreted as 100% busy.
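
The rules in this commit message can be sketched as one update function. This is an illustration under assumptions, not the PR's implementation: `GtIdleTracker` and `update_gt_utilization` are hypothetical names, the caller is assumed to pass `power_gated = true` when idle_status reads gt-c6, and `dt_ms` is the wall-clock time since the previous sample.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical per-GT state carried between samples.
struct GtIdleTracker {
	bool first_sample = true;
	uint64_t prev_idle_ms = 0;
	double smoothed = 0.0;
};

// Returns the EMA-smoothed utilization in percent for one GT.
inline double update_gt_utilization(GtIdleTracker& st, uint64_t idle_ms, double dt_ms,
		bool power_gated) {
	constexpr double alpha = 0.3;  // EMA weight given in the commit message
	if (st.first_sample) {
		// Baseline only: there is no previous counter value to diff against.
		st.first_sample = false;
		st.prev_idle_ms = idle_ms;
		return st.smoothed;
	}
	if (idle_ms < st.prev_idle_ms) {
		// Counter wrap or reset: keep the previous smoothed value.
		st.prev_idle_ms = idle_ms;
		return st.smoothed;
	}
	const uint64_t idle_delta = idle_ms - st.prev_idle_ms;
	st.prev_idle_ms = idle_ms;
	double raw = 0.0;
	if (idle_delta == 0) {
		// Stalled counter: power-gated means idle, otherwise the GT was busy throughout.
		raw = power_gated ? 0.0 : 100.0;
	} else {
		const double idle_fraction = static_cast<double>(idle_delta) / std::max(dt_ms, 1.0);
		raw = std::clamp(100.0 * (1.0 - idle_fraction), 0.0, 100.0);
	}
	st.smoothed = alpha * raw + (1.0 - alpha) * st.smoothed;
	return st.smoothed;
}
```

With alpha = 0.3, a single fully busy sample after an idle period moves the reported value to about 30% rather than 100%, which is the damping effect the commit describes.
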
- Fix indentation (2 tabs -> 3 tabs in structs/globals)
- Replace !, &&, || with not, and, or operators
- Add comments explaining DRM ioctl patterns, EMA smoothing,
  and power gating detection logic
- Wrap long lines for readability

- Refactor Intel namespace to support multiple GPUs via GpuInstance struct
- Add discover_intel_gpus() to find all Intel GPUs via sysfs vendor ID
- Refactor Xe namespace with per-GPU state (states vector, gpu_index params)
- Update Intel::init/shutdown/collect to iterate over gpu_instances
- Add has_pmu_permissions() to prevent crash from assert in i915 PMU code
- Add empty gpus vector early-return defense in Gpu::collect()
- Fix division by zero guards and typo (mem_total -> pwr_total)
- Initialize gpu-vram-totals and gpu-pwr-totals in Xe first_sample block

Fixes: Multi-GPU not detected (Issue aristocratos#1407)
Fixes: Crash without sudo (Aborted core dumped)
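
A minimal sketch of discovery by sysfs vendor ID, in the spirit of the discover_intel_gpus() change above but not taken from it. It assumes each card exposes its PCI vendor in /sys/class/drm/cardN/device/vendor and that Intel devices report 0x8086; `is_card_node` and `find_intel_cards` are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

// True for "card0", "card1", ... but not for connector nodes such as "card0-DP-1".
inline bool is_card_node(const std::string& name) {
	if (name.size() <= 4 or name.compare(0, 4, "card") != 0) return false;
	return std::all_of(name.begin() + 4, name.end(),
		[](unsigned char c) { return std::isdigit(c) != 0; });
}

// Returns the names of all DRM card nodes whose vendor file reads 0x8086 (Intel).
inline std::vector<std::string> find_intel_cards(
		const std::filesystem::path& drm_root = "/sys/class/drm") {
	std::vector<std::string> cards;
	std::error_code ec;
	for (std::filesystem::directory_iterator it(drm_root, ec), end;
			not ec and it != end; it.increment(ec)) {
		const std::string name = it->path().filename().string();
		if (not is_card_node(name)) continue;
		std::ifstream vendor_file(it->path() / "device" / "vendor");
		std::string vendor;
		if (vendor_file >> vendor and vendor == "0x8086") cards.push_back(name);
	}
	// Lexicographic order, so "card10" sorts before "card2".
	std::sort(cards.begin(), cards.end());
	return cards;
}
```

An empty result maps naturally onto the early-return defense mentioned above: with no Intel cards found, collection can be skipped entirely.
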
@deveworld force-pushed the feature/xe-gpu-support branch from 178426b to c4341c0 on January 30, 2026 17:58

Implement fdinfo-based GPU utilization measurement for Intel Xe GPUs:
- Add FdinfoCycles struct and collect_fdinfo_cycles function
- Parse /proc/*/fdinfo/* for drm-cycles-rcs/vcs data
- Use client-id deduplication to prevent double-counting
- Apply EMA smoothing for stable readings
- Fall back to gtidle when fdinfo unavailable

This provides more accurate utilization data compared to
residency-based gtidle measurements.
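
A minimal sketch of the client-id deduplication step, assuming each /proc/*/fdinfo/* file has already been read into a string and uses the `key:<whitespace>value` layout of DRM fdinfo. `CyclesTotal` and `sum_unique_client_cycles` are hypothetical names and do not correspond to the PR's FdinfoCycles struct or collect_fdinfo_cycles function.

```cpp
#include <cstdint>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Summed cycle counters across all unique DRM clients.
struct CyclesTotal {
	uint64_t rcs = 0;
	uint64_t vcs = 0;
};

// Sums drm-cycles-rcs / drm-cycles-vcs, counting each drm-client-id only once.
// The same client can appear under several file descriptors (dup, fork).
inline CyclesTotal sum_unique_client_cycles(const std::vector<std::string>& fdinfo_texts) {
	CyclesTotal total;
	std::set<uint64_t> seen_clients;
	for (const auto& text : fdinfo_texts) {
		std::istringstream in(text);
		std::string line;
		bool has_id = false;
		uint64_t id = 0, rcs = 0, vcs = 0;
		while (std::getline(in, line)) {
			const auto colon = line.find(':');
			if (colon == std::string::npos) continue;
			const std::string key = line.substr(0, colon);
			std::istringstream value(line.substr(colon + 1));
			uint64_t number = 0;
			if (not (value >> number)) continue;  // non-numeric values such as drm-driver
			if (key == "drm-client-id") { id = number; has_id = true; }
			else if (key == "drm-cycles-rcs") rcs = number;
			else if (key == "drm-cycles-vcs") vcs = number;
		}
		// Skip non-DRM fds and clients that were already counted.
		if (not has_id or not seen_clients.insert(id).second) continue;
		total.rcs += rcs;
		total.vcs += vcs;
	}
	return total;
}
```

Client IDs are only unique per DRM device, so a multi-GPU version would likely need to key the set on the device as well as the ID.
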
@TheSovietPancakes
Contributor


Development

Successfully merging this pull request may close these issues.

[Feature Request] Add Intel Xe driver support (Lunar Lake, etc.)

4 participants