ACPI/HMAT: Move HMAT messages to pr_debug() #34

clsotog · 2024-12-06T22:06:38Z

ACPI/HMAT: Move HMAT messages to pr_debug()

The HMAT messages printed at boot, beyond being noisy, can also print details for nodes that are not yet enabled. The primary method to consume HMAT details is via sysfs, and the sysfs interface gates what is emitted by whether the node is online or not. Hide the messages by default by moving them from "info" to "debug" log level.

Otherwise, these prints are just a pretty-print way to dump the ACPI HMAT table. It has always been the case that post-analysis was required for these messages to map proximity-domains to Linux NUMA nodes, and as Priya points out that analysis also needs to consider whether the proximity domain is marked "enabled" in the SRAT.

Reported-by: Priya Autee <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Link: https://patch.msgid.link/170668982094.318782.2963631284830500182.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <[email protected]>
(cherry picked from commit e2b952ffafced49fa6bd5cdc90f472b8bd932b5d cxl-next)
Signed-off-by: Carol L Soto <[email protected]>
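
The effect of moving a message from the "info" to the "debug" level can be sketched in user space. This is only an illustration, not the kernel's printk implementation: `log_msg`, `enum log_level`, and `current_level` are hypothetical names, and the numeric values merely mirror the kernel's info (6) and debug (7) log levels. In the kernel itself, pr_debug() output is also subject to build-time `DEBUG` or dynamic debug settings, which this sketch does not model.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels; a lower value means more important. */
enum log_level { LOG_INFO = 6, LOG_DEBUG = 7 };

/* Hypothetical threshold: messages above this level are dropped. */
static enum log_level current_level = LOG_INFO;

/*
 * Print the message only if its level passes the threshold.
 * Returns 1 if the message was emitted, 0 if it was suppressed.
 */
static int log_msg(enum log_level level, const char *fmt, ...)
{
    va_list ap;

    if (level > current_level)
        return 0;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

With the default threshold, a table dump logged at `LOG_DEBUG` stays hidden, while raising `current_level` to `LOG_DEBUG` makes it visible again for anyone who wants the raw dump.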

BugLink: https://bugs.launchpad.net/bugs/2048183 We currently build an empty package linux-nvidia-tools-host, stop that. Also, remove hooks.mk which a) should not exist at all for a regular package and b) only contains unneeded/obsolete build options. Signed-off-by: Juerg Haefliger <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Acked-by: Ivan Hu <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2083936 When going through pools to download the DKMS debs, support both one-line (*.list) and DEB822 (*.sources) formats when parsing source lists. Signed-off-by: Noah Wager <[email protected]> Signed-off-by: Paolo Pisati <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2084598 Change the Kconfig dependency, so this driver can be built and run on ARM64 with 4K page size. 16/64K page sizes are not supported yet. Signed-off-by: Haiyang Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit 40a1d11) Signed-off-by: John Cabaj <[email protected]> Acked-by: Tim Gardner <[email protected]> Acked-by: Paolo Pisati <[email protected]> Signed-off-by: John Cabaj <[email protected]> (cherry picked from commit 775969387fec7d04adcc705b656ee5a4396a0579 noble:linux-azure/master-next) Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2084598 As defined by the MANA Hardware spec, the queue size for DMA is 4KB minimal, and power of 2. And, the HWC queue size has to be exactly 4KB. To support page sizes other than 4KB on ARM64, define the minimal queue size as a macro separately from the PAGE_SIZE, which we always assumed it to be 4KB before supporting ARM64. Also, add MANA specific macros and update code related to size alignment, DMA region calculations, etc. Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit 382d174) Signed-off-by: John Cabaj <[email protected]> Acked-by: Marcelo Henrique Cerri <[email protected]> Acked-by: Thibault Ferrante <[email protected]> Signed-off-by: John Cabaj <[email protected]> (cherry picked from commit 4191de20636ee76151172bf5c88bf0cdb1bafc05 noble:linux-azure/master-next) Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>
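
The queue-size constraint described in this commit message (at least 4 KB and a power of two, independent of the system page size) can be sketched as follows. This is a minimal sketch, not the driver's code: `HW_MIN_QUEUE_SIZE`, `queue_size_valid`, and `queue_size_round_up` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: minimum queue size fixed by hardware, not by PAGE_SIZE. */
#define HW_MIN_QUEUE_SIZE ((size_t)4096)

/* True when n has exactly one bit set. */
static bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* A queue size is acceptable when it is >= 4 KB and a power of two. */
static bool queue_size_valid(size_t size)
{
    return size >= HW_MIN_QUEUE_SIZE && is_power_of_two(size);
}

/*
 * Round a requested size up to the next valid queue size.
 * The loop stops doubling before the value could overflow size_t.
 */
static size_t queue_size_round_up(size_t size)
{
    size_t q = HW_MIN_QUEUE_SIZE;

    while (q < size && q <= SIZE_MAX / 2)
        q <<= 1;
    return q;
}
```

Because the minimum is its own constant, the same checks give the same answer on a kernel built with 4K, 16K, or 64K pages.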

… size BugLink: https://bugs.launchpad.net/bugs/2084598 MANA hardware uses 4k page size. When calculating the page table index, it should use the hardware page size, not the system page size. Cc: [email protected] Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Long Li <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit 9e517a8) Signed-off-by: John Cabaj <[email protected]> Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>
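
The difference between indexing by hardware page size and by system page size can be illustrated with a small sketch. The names `HW_PAGE_SHIFT`, `hw_page_index`, and `hw_page_count` are hypothetical and are not taken from the driver; the only fact assumed from the commit message is that the hardware page size is 4 KB.

```c
#include <stdint.h>

/* Hypothetical constants: the hardware always uses 4 KB pages. */
#define HW_PAGE_SHIFT 12u
#define HW_PAGE_SIZE  (1ull << HW_PAGE_SHIFT)

/* Index of the hardware page that contains the given byte offset. */
static uint64_t hw_page_index(uint64_t offset)
{
    return offset >> HW_PAGE_SHIFT;
}

/* Number of hardware pages touched by the range [start, start + len). */
static uint64_t hw_page_count(uint64_t start, uint64_t len)
{
    uint64_t first, last;

    if (len == 0)
        return 0;

    first = start >> HW_PAGE_SHIFT;
    last = (start + len - 1) >> HW_PAGE_SHIFT;
    return last - first + 1;
}
```

On a system with 64 KB pages, byte offset 65536 is the start of system page 1, but it falls in hardware page 16, so a table indexed with the system page size would point at the wrong entry.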

…l page BugLink: https://bugs.launchpad.net/bugs/2084598 When mapping doorbell page from user-mode, the driver should use the system page size as this memory is allocated via mmap() from user-mode. Cc: [email protected] Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Long Li <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit 4a3b99b) Signed-off-by: John Cabaj <[email protected]> Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2084598 Set the following configs on x86 and arm64: CONFIG_MANA_INFINIBAND=m CONFIG_MICROSOFT_MANA=m Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

…ackage BugLink: https://bugs.launchpad.net/bugs/2084598 Include mana.ko in linux-modules-ABIVER, rather than linux-modules-extra-ABIVER. Signed-off-by: Jacob Martin <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: John Cabaj <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2086233 The CPPC performance feedback counters could be 0 or unchanged when the target cpu is in a low-power idle state, e.g. power-gated or clock-gated. When the counters are 0, cppc_cpufreq_get_rate() returns 0 KHz, which makes cpufreq_online() get a false error and fail to generate a cpufreq policy. When the counters are unchanged, the existing cppc_perf_from_fbctrs() returns a cached desired perf, but some platforms may update the real frequency back to the desired perf reg. For the above cases in cppc_cpufreq_get_rate(), get the latest desired perf from the CPPC reg to reflect the frequency because some platforms may update the actual frequency back there; if failed, use the cached desired perf. Fixes: 6a4fec4 ("cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.") Signed-off-by: Jie Zhan <[email protected]> Reviewed-by: Zeng Heng <[email protected]> Reviewed-by: Ionela Voinescu <[email protected]> Reviewed-by: Huisong Li <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> (cherry picked from commit c471956 linux-next) Signed-off-by: Jamie Nguyen <[email protected]> Tested-by: Carol Soto <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Carol L Soto <[email protected]> Acked-by: Koba Ko <[email protected]> Signed-off-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]>
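
The fallback order described in this commit message can be sketched as below. This is a simplified user-space illustration under stated assumptions, not the cppc_cpufreq code: `struct fb_ctrs`, `perf_for_rate`, `read_desired_fn`, and the two stub readers are hypothetical, and the delivered-performance formula (reference perf scaled by the ratio of counter deltas) is a simplification.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two feedback counters. */
struct fb_ctrs {
    uint64_t reference;
    uint64_t delivered;
};

/* Hypothetical reader for the desired perf register; returns 0 on success. */
typedef int (*read_desired_fn)(uint64_t *perf);

/* Stub readers used only to exercise the sketch. */
static int read_desired_ok(uint64_t *perf)   { *perf = 80; return 0; }
static int read_desired_fail(uint64_t *perf) { (void)perf; return -1; }

/*
 * Pick the performance value used to compute the reported frequency:
 * 1. if both counters advanced, derive it from the counter deltas;
 * 2. otherwise read the latest desired perf from the register;
 * 3. if that read fails, fall back to the cached desired perf.
 */
static uint64_t perf_for_rate(struct fb_ctrs t0, struct fb_ctrs t1,
                              uint64_t reference_perf,
                              uint64_t cached_desired,
                              read_desired_fn read_desired)
{
    uint64_t delta_ref = t1.reference - t0.reference;
    uint64_t delta_del = t1.delivered - t0.delivered;
    uint64_t perf;

    if (delta_ref != 0 && delta_del != 0)
        return reference_perf * delta_del / delta_ref;

    if (read_desired && read_desired(&perf) == 0)
        return perf;

    return cached_desired;
}
```

Counters that are both zero or simply unchanged between samples produce zero deltas, so both idle cases from the commit message take the same fallback path instead of yielding a rate of 0.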

BugLink: https://bugs.launchpad.net/bugs/2086233 Since commit 6c8d750 ("cpufreq / cppc: Work around for Hisilicon CPPC cpufreq"), we introduce a workaround for HiSilicon platforms that do not support performance feedback counters, whereas they can get the actual frequency from the desired perf register. Later on, FIE is disabled in that workaround as well. Now the workaround can be handled by the common code. Desired perf would be read and converted to frequency if feedback counters don't change. FIE would be disabled if the CPPC regs are in PCC region. Hence, the workaround is no longer needed and can be safely removed, in an effort to consolidate the driver procedure. Signed-off-by: Jie Zhan <[email protected]> Reviewed-by: Xiongfeng Wang <[email protected]> Reviewed-by: Huisong Li <[email protected]> [ Viresh: Move fie_disabled within CONFIG option to fix warning ] Signed-off-by: Viresh Kumar <[email protected]> (cherry picked from commit ea1829d linux-next) Signed-off-by: Jamie Nguyen <[email protected]> Tested-by: Carol Soto <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Carol L Soto <[email protected]> Acked-by: Koba Ko <[email protected]> Signed-off-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2088114 There is a corner case where the desired_perf is exactly the same as the old perf, but the actual current freq is not. This happens during S3 while the cpufreq governor is set to powersave. During cpufreq resume process, the booting CPU's new_freq obtained via .get() is the highest frequency, while the policy->cur and cpu->perf_ctrls.desired_perf are set to the lowest level (powersave governor). This causes the warning: "CPU frequency out of sync:", and the cpufreq core sets policy->cur to new_freq. Then the governor->limits() calls cppc_cpufreq_set_target() to configure the CPU frequency and returns directly because the desired_perf converted from target_freq is the same as the cpu->perf_ctrls.desired_perf and both are the lowest_perf. Since target_freq and policy->cur have been already compared in __cpufreq_driver_target(), there's no need to compare them again here. Drop the comparison. Signed-off-by: Riwen Lu <[email protected]> [ Viresh: Updated commit message / subject ] Signed-off-by: Viresh Kumar <[email protected]> (cherry picked from commit 90e4ed6) Signed-off-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Noah Wager <[email protected]> Signed-off-by: Brad Figg <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2089306 This reverts commit "vfio/pci: Insert full vma on mmap'd MMIO fault". The original commit changes vfio_pci to pre-fault the entire vma when handling a fault. For PCIe devices with large BAR regions, this can take a very long time to complete, causing kernel soft lockup warnings. This is particularly noticeable when launching a virtual machine with a passthrough PCIe GPU. Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2089306 This reverts commit "vfio/pci: Use unmap_mapping_range()". The original commit rewrote the vfio_pci mmap'd MMIO fault handler to use the "unmap_mapping_range()" and "vmf_insert_pfn()" functions in place of vfio_pci tracking its own mapped areas and using "zap_vma_ptes()" and "io_remap_pfn_range()". Use of "vmf_insert_pfn()" is significantly slower than the prior implementation. With large BAR region passthrough PCIe devices, this causes host soft lockup warnings if the commit "vfio/pci: Insert full vma on mmap'd MMIO fault" is present, or an extremely slow guest boot if it is not. Signed-off-by: Jacob Martin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2091107 Recent changes to the devlink reload (commit 9b2348e ("devlink: warn about existing entities during reload-reinit")) force the drivers to destroy devlink ports during reinit. Adjust ice driver to this requirement, unregister netdevice, destroy devlink port. ice_init_eth() was removed and all the common code between probe and reload was moved to ice_load(). During devlink reload we can't take devl_lock (it's already taken) and in ice_probe() we have to lock it. Use devl_* variant of the API which does not acquire and release devl_lock. Guard ice_load() with devl_lock only in case of probe. Suggested-by: Jiri Pirko <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: Vadim Fedorenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Signed-off-by: Wojciech Drewek <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> (cherry picked from commit 41cc4e5) Signed-off-by: Jacob Martin <[email protected]>

The HMAT messages printed at boot, beyond being noisy, can also print details for nodes that are not yet enabled. The primary method to consume HMAT details is via sysfs, and the sysfs interface gates what is emitted by whether the node is online or not. Hide the messages by default by moving them from "info" to "debug" log level. Otherwise, these prints are just a pretty-print way to dump the ACPI HMAT table. It has always been the case that post-analysis was required for these messages to map proximity-domains to Linux NUMA nodes, and as Priya points out that analysis also needs to consider whether the proximity domain is marked "enabled" in the SRAT. Reported-by: Priya Autee <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Link: https://patch.msgid.link/170668982094.318782.2963631284830500182.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <[email protected]> (cherry picked from commit e2b952ffafced49fa6bd5cdc90f472b8bd932b5d cxl-next) Signed-off-by: Carol L Soto <[email protected]>

khfeng

Acked-by: Kai-Heng Feng <[email protected]>

ianmay81 and others added 30 commits November 20, 2024 14:54

UBUNTU: [Packaging] Initialize linux-nvidia-6.5

8a53e93

Signed-off-by: Ian May <[email protected]>

Revert "UBUNTU: SAUCE: modpost: support arbitrary symbol length in mo…

a3888cd

…dversion" This reverts commit 47d27f2. We need to revert this to avoid regressing any modules used in Jammy. Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] update variants

88131a5

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] update Ubuntu.md

773a75d

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

1299c7d

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: [Config] nvidia-6.5: update annotations

36afdaf

Signed-off-by: Ian May <[email protected]>

UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1001.1

ce04b01

Signed-off-by: Ian May <[email protected]>

UBUNTU: [Packaging] nvidia-6.5: disable rust support

d00f916

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

fe19009

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

50d9546

BugLink: https://bugs.launchpad.net/bugs/2038972 Properties: no-test-build Signed-off-by: Ian May <[email protected]>

UBUNTU: [Config] nvidia-6.5: update annotations

2cfe4e5

Ignore: yes Signed-off-by: Ian May <[email protected]>

UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1004.4

28ee622

Signed-off-by: Ian May <[email protected]>

UBUNTU: Start new release

229c09b

Ignore: yes Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: rename debian.nvidia-6.6 to debian.nvidia

069cb89

UBUNTU: link-to-tracker: update tracking bug

aab1100

BugLink: https://bugs.launchpad.net/bugs/2046137 Properties: no-test-build Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] update variants

ad9fa49

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] update update.conf

2c13d97

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] move to gcc-13 by default

bc7d31f

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: rebase on Ubuntu-6.6.0-14.14

84ca88e

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Config] updateconfigs following Ubuntu-6.6.0-14.14 rebase

38465d2

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: Ubuntu-nvidia-6.6.0-1001.1

f2f48dd

Signed-off-by: Paolo Pisati <[email protected]>

UBUNTU: [Packaging] move to linux 6.8

268fbb8

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: update dropped.txt

a1d9b90

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: Start new release

5276e07

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

5360a44

BugLink: https://bugs.launchpad.net/bugs/2055128 Properties: no-test-build Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: debian.nvidia/dkms-versions -- update from kernel-versions (m…

68011a4

…ain/d2024.02.07) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Packaging] add Rust build dependencies

7289b9f

Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Config] update annotations after rebase to v6.8

049b8a3

Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: [Packaging] clean ABI check files

7a3912d

Ignore: yes Signed-off-by: Andrea Righi <[email protected]>

UBUNTU: Ubuntu-nvidia-6.8.0-1001.1

5bbfe10

Signed-off-by: Andrea Righi <[email protected]>

juergh and others added 26 commits November 20, 2024 14:54

UBUNTU: Start new release

310956f

Ignore: yes Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

9476794

BugLink: https://bugs.launchpad.net/bugs/2084817 Properties: no-test-build Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: Ubuntu-nvidia-6.8.0-1017.19

158d529

Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: Start new release

0b220b0

Ignore: yes Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

6cccaca

BugLink: https://bugs.launchpad.net/bugs/2085928 Properties: no-test-build Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: Ubuntu-nvidia-6.8.0-1018.20

fa145d8

Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: [Packaging] resync git-ubuntu-log

de8c8f1

BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: Start new release

a911845

Ignore: yes Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: link-to-tracker: update tracking bug

9c3e33f

BugLink: https://bugs.launchpad.net/bugs/2086287 Properties: no-test-build Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: [Packaging] debian.nvidia/dkms-versions -- update from kernel…

b511ad9

…-versions (main/2024.10.28) BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Jacob Martin <[email protected]>

UBUNTU: Ubuntu-nvidia-6.8.0-1019.21

54316b8

Signed-off-by: Jacob Martin <[email protected]>

clsotog force-pushed the clsotog/hmat_printk branch from 2b906f4 to 7e938a5 Compare December 6, 2024 22:11

khfeng approved these changes Dec 9, 2024


nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from c42e6dc to 5dfc765 Compare December 10, 2024 16:01

nvidia-bfigg force-pushed the 24.04_linux-nvidia branch from 5dfc765 to 5ac091f Compare December 19, 2024 16:00
