base-defconfig: enable DRM_XE #91

Merged
merged 1 commit into from
Apr 25, 2024

@fredoh9 fredoh9 commented Apr 22, 2024

Enable DRM for Intel Xe series GPUs.

Signed-off-by: Fred Oh <[email protected]>
@plbossart
Member

@kv2019i do we also need

DRM_XE_DEBUG
DRM_XE_DISPLAY  <<< That sounds important, no?

and there are tons of other DRM_XE options....

@fredoh9
Author

fredoh9 commented Apr 22, 2024

Device tests are in progress:

  • main_ace: planresultdetail/40144
  • main_cavs: planresultdetail/40145
  • stable-v2.2: planresultdetail/40146

@kv2019i

kv2019i commented Apr 23, 2024

@plbossart wrote:

@kv2019i do we also need

DRM_XE_DEBUG
DRM_XE_DISPLAY  <<< That sounds important, no?

and there are tons of other DRM_XE options....

Here's the kconfig for Fedora test kernel for XE driver:
https://copr-dist-git.fedorainfracloud.org/cgit/ulissesf/kernel-xekmd/kernel.git/tree/kernel-x86_64-fedora.config#n1805

So yes, definitely for DRM_XE_DISPLAY=y

@@ -237,6 +237,7 @@ CONFIG_SND_SOC_ACPI_INTEL_MATCH=m
# DRM
CONFIG_DRM=m
CONFIG_DRM_I915_ALPHA_SUPPORT=y
CONFIG_DRM_XE=m

This should be good. We also need "CONFIG_DRM_XE_DISPLAY=y", but that is the default anyway.

Member

best to not add something that's default :-)

Author

Confirmed, CONFIG_DRM_XE_DISPLAY=y is in the generated .config file.
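
For anyone repeating that check, here is a minimal Python sketch. It assumes the usual `.config` text format (`CONFIG_FOO=value` lines, and `# CONFIG_FOO is not set` for disabled options). The helper name `parse_kconfig` and the sample text are made up for illustration and are not part of this repository.

```python
import re


def parse_kconfig(text):
    """Return {option: value} for kernel .config style text.

    Disabled options ("# CONFIG_FOO is not set") are mapped to "n".
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# Hypothetical excerpt of a generated .config, for illustration only.
SAMPLE = """\
# DRM
CONFIG_DRM=m
CONFIG_DRM_XE=m
CONFIG_DRM_XE_DISPLAY=y
# CONFIG_DRM_XE_DEBUG is not set
"""

config = parse_kconfig(SAMPLE)
for name in ("CONFIG_DRM_XE", "CONFIG_DRM_XE_DISPLAY", "CONFIG_DRM_XE_DEBUG"):
    print(name, "=", config.get(name, "<missing>"))
```

Reading the generated `.config` rather than the defconfig fragment is what matters here, since defaults such as the display option only show up after Kconfig has resolved them.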

Author

It looks like CONFIG_DRM_I915_ALPHA_SUPPORT has been removed from the i915 driver.

@plbossart
Member

@fredoh9 let us know what the test results are, and we can merge. Thanks for starting this!

@fredoh9
Author

fredoh9 commented Apr 23, 2024

  • main_ace: planresultdetail/40144 ==> Good
  • main_cavs: planresultdetail/40145 ==> Good
  • stable-v2.2: planresultdetail/40146 ==> deploy failed due to FW

@fredoh9
Author

fredoh9 commented Apr 23, 2024

re-run for stable-v2.2
stable-v2.2: planresultdetail/40195

@fredoh9 fredoh9 changed the title [DRAFT]base-defconfig: enable DRM_XE base-defconfig: enable DRM_XE Apr 23, 2024
@fredoh9 fredoh9 marked this pull request as ready for review April 23, 2024 19:27
@fredoh9
Author

fredoh9 commented Apr 23, 2024

stable-v2.2 looks good too

@fredoh9
Author

fredoh9 commented Apr 25, 2024

Will merge this soon to be included in tomorrow's daily build.

@fredoh9 fredoh9 merged commit a635023 into thesofproject:master Apr 25, 2024
4 checks passed
@marc-hb

marc-hb commented Apr 26, 2024

I tried to test this.

Apr 26 17:00:07 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: Your graphics device 64a0 is not officially supported
                                              by xe driver in this kernel version. To force Xe probe,
                                              use xe.force_probe='64a0' and i915.force_probe='!64a0'
                                              module parameters or CONFIG_DRM_XE_FORCE_PROBE='64a0' and
                                              CONFIG_DRM_I915_FORCE_PROBE='!64a0' configuration options.

I don't understand the i915 part of the advice. After adding both force_probe:

Apr 26 22:48:00 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: Direct firmware load for xe/lnl_guc_70.bin failed with error -2
Apr 26 22:48:00 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: [drm] GuC firmware xe/lnl_guc_70.bin: fetch failed with error -2
Apr 26 22:48:00 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: [drm] GuC firmware(s) can be downloaded from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
Apr 26 22:48:00 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: [drm] *ERROR* GuC init failed with -2
Apr 26 22:48:00 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: probe with driver xe failed with error -2

After adding the GuC firmware, and re-enabling gdm3, wayland fails to start with "no GPU found" + the Ubuntu 22 infinite loop documented in thesofproject/sof-test#998

gt0 and gt1 appear in /sys/kernel/debug/dri/0/, but there is no display engine there.

Maybe it's missing some other firmware? The kernel logs don't have any error anymore.

To be continued.



× org.gnome.Shell@wayland.service - GNOME Shell on Wayland
     Loaded: loaded (/usr/lib/systemd/user/org.gnome.Shell@wayland.service; static)
     Active: failed (Result: protocol) since Fri 2024-04-26 23:04:23 UTC; 7min ago
    Process: 732 ExecCondition=/bin/sh -c test "$XDG_SESSION_TYPE" = "wayland" || exit 2 (code=exited, status=0/SUCCESS)
    Process: 734 ExecStart=/usr/bin/gnome-shell (code=exited, status=1/FAILURE)
    Process: 741 ExecStopPost=/bin/sh -c test "$SERVICE_RESULT" != "exec-condition" && systemctl --user unset-environment GNOME_SETUP_DISPLAY WAYLAND_DISPLAY DISPLAY XAUTHORITY (code=exited, status=0/SUCCESS)
   Main PID: 734 (code=exited, status=1/FAILURE)
        CPU: 64ms

Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 systemd[616]: Starting GNOME Shell on Wayland...
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 gnome-shell[734]: Running GNOME Shell (using mutter 42.9) as a Wayland display server
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 gnome-shell[734]: g_hash_table_destroy: assertion 'hash_table != NULL' failed
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 gnome-shell[734]: Failed to open gpu '/dev/dri/card0': No suitable mode setting backend found
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 gnome-shell[734]: Failed to setup: No GPUs found
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 systemd[616]: org.gnome.Shell@wayland.service: Failed with result 'protocol'.
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 systemd[616]: Failed to start GNOME Shell on Wayland.
Apr 26 23:04:23 jf-lnlm-rvp-nocodec-2 systemd[616]: org.gnome.Shell@wayland.service: Triggering OnFailure= dependencies.
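
On the i915 half of the force_probe advice quoted at the top of this comment: both drivers can claim the same PCI ID, so xe has to be asked to bind the device and i915 has to be told to leave it alone, which is what the `!` prefix does. A minimal sketch of building the two module parameters from a device ID follows. The helper name `force_probe_params` is hypothetical; only the parameter syntax shown in the kernel message is assumed.

```python
def force_probe_params(pci_id):
    """Build kernel command-line parameters that hand one GPU to xe.

    xe is asked to probe the device; the '!' prefix tells i915 to skip it.
    """
    # Accept "64a0", "64A0" or "0x64a0" and normalize to bare lowercase hex.
    pci_id = pci_id.lower().removeprefix("0x")
    return [f"xe.force_probe={pci_id}", f"i915.force_probe=!{pci_id}"]


print(" ".join(force_probe_params("64a0")))
# prints: xe.force_probe=64a0 i915.force_probe=!64a0
```

The same two values can go into CONFIG_DRM_XE_FORCE_PROBE and CONFIG_DRM_I915_FORCE_PROBE instead, as the kernel message suggests.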

marc-hb added a commit to marc-hb/sof-test that referenced this pull request Apr 26, 2024
Searching for "(dis)connected" in i915_display_info shows useful
connector info.

Also show /sys/class/drm/ and /sys/kernel/debug/dri/0/, this will be
useful for DRM_XE, see thesofproject/kconfig#91

Signed-off-by: Marc Herbert <[email protected]>
@marc-hb

marc-hb commented Apr 29, 2024

Smoking gun display:no found by Clint (thanks!)

journalctl -b | grep display
Apr 29 21:52:48 jf-lnlm-rvp-nocodec-2 kernel: xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE  64a0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00)
        display:no dma_m_s:46 tc:1 gscfi:0

Still no idea why, to be continued.
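
The check done by eye above can be sketched in Python: split the xe probe message into its `key:value` fields and look at `display`. The field layout is taken only from the log line above, and the function name `parse_xe_probe_fields` is made up for this example.

```python
import re


def parse_xe_probe_fields(message):
    """Extract simple key:value flags (such as display:no) from an xe probe message."""
    # Keys are lowercase words directly followed by ':' and a token.
    # The PCI ID pair "64a0:0001" is skipped because it starts with a digit.
    return dict(re.findall(r"\b([a-z_]+):(\w+)", message))


# Payload of the probe line quoted above, joined onto one line.
SAMPLE = (
    "XE_LUNARLAKE  64a0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00) "
    "display:no dma_m_s:46 tc:1 gscfi:0"
)

fields = parse_xe_probe_fields(SAMPLE)
if fields.get("display") != "yes":
    print("xe probed this device without display support:", fields.get("display"))
```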

@marc-hb

marc-hb commented Apr 29, 2024

I got used to misnamed disable_FOO options and the corresponding double negations but this one really takes the cake:

modinfo xe | grep _display

parm:           disable_display:Disable display (default: false) (bool)
parm:           enable_display:Enable display (bool)

Wow!

The undocumented default for enable_display is YES so this is hopefully not the problem but it was worth mentioning at least for the laugh.

@plbossart
Member

Intel only supports a Schroedinger display which can be enabled and disabled at the same time.

@marc-hb

marc-hb commented Apr 30, 2024

Thanks Clint and RK for root causing this. And the issue is... LNL display is simply not enabled upstream yet! This one-line patch is missing:

https://cgit.freedesktop.org/drm-tip/commit/?id=79263e4b3f0ed5928a1622300d32ed35f7d8fc24

--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -333,6 +333,7 @@ static const struct xe_device_desc mtl_desc = {
 
 static const struct xe_device_desc lnl_desc = {
 	PLATFORM(XE_LUNARLAKE),
+	.has_display = true,
 	.require_force_probe = true,
 };

Now why would you hold this line back when an explicit force_probe is required anyway?

The answer is: display code tends to lag behind GPU code for various reasons. So there is a time window where you want users to beta test the GPU with force_probe but NOT the display yet.

Lesson learned: next time I'll just... wait.

@plbossart
Copy link
Member

plbossart commented Apr 30, 2024

so what's the conclusion? revert this PR, or add something on LNL test devices?

@lucasdemarchi

The answer is: display code tends to lag behind GPU code for various reasons. So there is a time window where you want users to beta test the GPU with force_probe but NOT the display yet.

yes, that is correct. We had the core of LNL upstream around November/2023, but the display part is more recent. It is already upstream, but not in a released kernel. You are missing this commit:

commit 79263e4b3f0ed5928a1622300d32ed35f7d8fc24
Author: Balasubramani Vivekanandan <[email protected]>
Date:   Tue Mar 12 13:36:39 2024 -0300

    drm/xe/lnl: Enable display support
    
    Enable display support for Lunar Lake.
    
    Signed-off-by: Balasubramani Vivekanandan <[email protected]>
    Reviewed-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Gustavo Sousa <[email protected]>
    Reviewed-by: Matt Roper <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Lucas De Marchi <[email protected]>

@marc-hb

marc-hb commented Apr 30, 2024

so what's the conclusion? revert this PR, or add something on LNL test devices?

  1. Wait for 6.10
  2. Test again (with force_probe etc.)

marc-hb added a commit to thesofproject/sof-test that referenced this pull request May 1, 2024
@marc-hb

marc-hb commented May 20, 2024

Wow, the transition is actually documented!
https://www.kernel.org/doc/html//next/gpu/rfc/xe.html

I don't think anyone ever pointed me to that page...

@marc-hb

marc-hb commented Aug 5, 2024

New xe development worth noticing:
