summaryrefslogtreecommitdiff
path: root/drivers/accel/amdxdna
AgeCommit message (Collapse)Author
2025-12-15accel/amdxdna: Block running under a hypervisorMario Limonciello (AMD)
SVA support is required, which isn't configured by hypervisor solutions. Closes: https://github.com/QubesOS/qubes-issues/issues/10275 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4656 Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251213054513.87925-1-superm1@kernel.org Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-11-13accel/amdxdna: Fix deadlock between context destroy and job timeoutLizhi Hou
Hardware context destroy function holds dev_lock while waiting for all jobs to complete. The timeout job also needs to acquire dev_lock, this leads to a deadlock. Fix the issue by temporarily releasing dev_lock before waiting for all jobs to finish, and reacquiring it afterward. Fixes: 4fd6ca90fc7f ("accel/amdxdna: Refactor hardware context destroy routine") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251107181050.1293125-1-lizhi.hou@amd.com
2025-11-13accel/amdxdna: Clear mailbox interrupt register during channel creationLizhi Hou
The mailbox interrupt register is not always cleared when a mailbox channel is created. This can leave stale interrupt states from previous operations. Fix this by explicitly clearing the interrupt register in the mailbox channel creation function. Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251107181115.1293158-1-lizhi.hou@amd.com
2025-11-07accel/amdxdna: Treat power-off failure as unrecoverable errorLizhi Hou
Failing to set power off indicates an unrecoverable hardware or firmware error. Update the driver to treat such a failure as a fatal condition and stop further operations that depend on successful power state transition. This prevents undefined behavior when the hardware remains in an unexpected state after a failed power-off attempt. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251106180521.1095218-1-lizhi.hou@amd.com
2025-11-06accel/amdxdna: Fix dma_fence leak when job is canceledLizhi Hou
Currently, dma_fence_put(job->fence) is called in job notification callback. However, if a job is canceled, the notification callback is never invoked, leading to a memory leak. Move dma_fence_put(job->fence) to the job cleanup function to ensure the fence is always released. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251105194140.1004314-1-lizhi.hou@amd.com
2025-11-05accel/amdxdna: Support preemption requestsLizhi Hou
The driver checks the firmware version during initialization.If preemption is supported, the driver configures preemption accordingly and handles userspace preemption requests. Otherwise, the driver returns an error for userspace preemption requests. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104185340.897560-1-lizhi.hou@amd.com
2025-11-04accel/amdxdna: Add IOCTL parameter for telemetry dataLizhi Hou
Extend DRM_IOCTL_AMDXDNA_GET_INFO to include additional parameters that allow collection of telemetry data. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-3-lizhi.hou@amd.com
2025-11-04accel/amdxdna: Add IOCTL parameter for resource dataLizhi Hou
Extend DRM_IOCTL_AMDXDNA_GET_INFO to include additional parameters that allow collection of resource data. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-2-lizhi.hou@amd.com
2025-11-04accel/amdxdna: Add hardware specific attributesLizhi Hou
Add three hardware specific attributes to describe device capabilities: hwctx_limit: The maximum number of hardware context supported. max_tops: The maximum TOPS supported. curr_tops: The TOPS achievable with the current power and frequency configuration. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251104062546.833771-1-lizhi.hou@amd.com
2025-11-03accel/amdxdna: Use MSG_OP_CHAIN_EXEC_NPU when supportedLizhi Hou
MSG_OP_CHAIN_EXEC_NPU is a unified mailbox message that replaces MSG_OP_CHAIN_EXEC_BUFFER_CF and MSG_OP_CHAIN_EXEC_DPU. Add driver logic to check firmware version, and if MSG_OP_CHAIN_EXEC_NPU is supported, uses it to submit firmware commands. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251031014700.2919349-1-lizhi.hou@amd.com
2025-10-31drm: include drm_print.h where neededJani Nikula
There are a gazillion files that depend on drm_print.h being indirectly included via drm_buddy.h, drm_mm.h, or ttm/ttm_resource.h. In preparation for removing those includes, explicitly include drm_print.h where needed. Cc: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/5fe67395907be33eb5199ea6d540e29fddee71c8.1761734313.git.jani.nikula@intel.com
2025-10-30accel/amdxdna: Fix incorrect command state for timed out jobLizhi Hou
When a command times out, mark it as ERT_CMD_STATE_TIMEOUT. Any other commands that are canceled due to this timeout should be marked as ERT_CMD_STATE_ABORT. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251029193423.2430463-1-lizhi.hou@amd.com
2025-10-24accel/amdxdna: Fix uninitialized return valueLizhi Hou
In aie2_get_hwctx_status() and aie2_query_ctx_status_array(), the functions could return an uninitialized value in some cases. Update them to always return 0. The amount of valid results is indicated by the returned buffer_size, element_size, and num_element fields. Fixes: 2f509fe6a42c ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251024165503.1548131-1-lizhi.hou@amd.com
2025-10-24accel/amdxdna: Fix incorrect return value in aie2_hwctx_sync_debug_bo()Lizhi Hou
When the driver issues the SYNC_DEBUG_BO command, it currently returns 0 even if the firmware fails to execute the command. Update the driver to return -EINVAL in this case to properly indicate the failure. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aPsadTBXunUSBByV@stanley.mountain/ Fixes: 7ea046838021 ("accel/amdxdna: Support firmware debug buffer") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20251024162608.1544842-1-lizhi.hou@amd.com
2025-10-20accel/amdxdna: Support firmware debug bufferLizhi Hou
To collect firmware debug information, the userspace application allocates a AMDXDNA_BO_DEV buffer object through DRM_IOCTL_AMDXDNA_CREATE_BO. Then it associates the buffer with the hardware context through DRM_IOCTL_AMDXDNA_CONFIG_HWCTX which requests firmware to bind the buffer through a mailbox command. The firmware then writes the debug data into this buffer. The buffer can be mapped into userspace so that applications can retrieve and analyze the firmware debug information. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251016203016.819441-1-lizhi.hou@amd.com
2025-10-16accel/amdxdna: Support getting last hardware errorLizhi Hou
Add new parameter DRM_AMDXDNA_HW_LAST_ASYNC_ERR to get array IOCTL. When hardware reports an error, the driver save the error information and timestamp. This new get array parameter retrieves the last error. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251014234119.628453-1-lizhi.hou@amd.com
2025-10-08accel/amdxdna: Resume power for creating and destroying hardware contextLizhi Hou
When the hardware is powered down by auto-suspend, creating or destroying a hardware context without resuming power will fail. Call amdxdna_pm_resume_get() before requesting the hardware to create or destroy a hardware context. Fixes: 063db451832b ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20251008045324.4171807-1-lizhi.hou@amd.com
2025-09-24accel/amdxdna: Enhance runtime power managementLizhi Hou
Currently, pm_runtime_resume_and_get() is invoked in the driver's open callback, and pm_runtime_put_autosuspend() is called in the close callback. As a result, the device remains active whenever an application opens it, even if no I/O is performed, leading to unnecessary power consumption. Move the runtime PM calls to the AIE2 callbacks that actually interact with the hardware. The device will automatically suspend after 5 seconds of inactivity (no hardware accesses and no pending commands), and it will be resumed on the next hardware access. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250923152229.1303625-1-lizhi.hou@amd.com
2025-09-17accel/amdxdna: Call dma_buf_vmap_unlocked() for imported objectLizhi Hou
In amdxdna_gem_obj_vmap(), calling dma_buf_vmap() triggers a kernel warning if LOCKDEP is enabled. So for imported object, use dma_buf_vmap_unlocked(). Then, use drm_gem_vmap() for other objects. The similar change applies to vunmap code. Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250916174842.234709-1-lizhi.hou@amd.com
2025-09-11accel/amdxdna: Fix an integer overflow in aie2_query_ctx_status_array()Lizhi Hou
The unpublished smatch static checker reported a warning. drivers/accel/amdxdna/aie2_pci.c:904 aie2_query_ctx_status_array() warn: potential user controlled sizeof overflow 'args->num_element * args->element_size' '1-u32max(user) * 1-u32max(user)' Even this will not cause a real issue, it is better to put a reasonable limitation for element_size and num_element. Add condition to make sure the input element_size <= 4K and num_element <= 1K. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/aL56ZCLyl3tLQM1e@stanley.mountain/ Fixes: 2f509fe6a42c ("accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAY") Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250909154531.3469979-1-lizhi.hou@amd.com
2025-09-04accel/amdxdna: Add ioctl DRM_IOCTL_AMDXDNA_GET_ARRAYLizhi Hou
Add interface for applications to get information array. The application provides a buffer pointer along with information type, maximum number of entries and maximum size of each entry. The buffer may also contain match conditions based on the information type. After the ioctl completes, the actual number of entries and entry size are returned. (see [1], used by driver runtime library) [1] https://github.com/amd/xdna-driver/blob/main/src/shim/host/platform_host.cpp#L337 Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250903053402.2103196-1-lizhi.hou@amd.com
2025-08-29accel/amdxdna: Use int instead of u32 to store error codesQianfeng Rong
Change the 'ret' variable from u32 to int to store -EINVAL. Storing the negative error codes in unsigned type, doesn't cause an issue at runtime but it's ugly as pants. Additionally, assigning -EINVAL to u32 ret (i.e., u32 ret = -EINVAL) may trigger a GCC warning when the -Wsign-conversion flag is enabled. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250828033917.113364-1-rongqianfeng@vivo.com
2025-08-26accel/amdxdna: Fix incorrect type used for a local variableLizhi Hou
drivers/accel/amdxdna/aie2_pci.c:794:13: sparse: sparse: incorrect type in assignment (different address spaces) Fixes: c8cea4371e5e ("accel/amdxdna: Add a function to walk hardware contexts") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508230855.0b9efFl6-lkp@intel.com/ Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250826171951.801585-1-lizhi.hou@amd.com
2025-08-18accel/amdxdna: Add a function to walk hardware contextsLizhi Hou
Walking hardware contexts created by a process is duplicated in multiple spots. Add a function, amdxdna_hwctx_walk(), and replace all spots. hwctx_srcu and dev_lock are good enough to protect hardware context list. Remove hwctx_lock. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250815171634.3417487-1-lizhi.hou@amd.com
2025-08-06accel/amdxdna: Unify pm and rpm suspend and resume callbacksLizhi Hou
The suspend and resume callbacks for pm and runtime pm should be same. During suspending, it needs to stop all hardware contexts first. And the hardware contexts will be restarted after the device is resumed. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250803191450.1568851-1-lizhi.hou@amd.com
2025-07-22accel/amdxdna: Delete pci_free_irq_vectors()Salah Triki
The device is managed so pci_free_irq_vectors() is called automatically no need to do it manually. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Salah Triki <salah.triki@gmail.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/aHs8QAfUlFeNp7qL@pc
2025-07-22accel/amdxdna: Support user space allocated bufferLizhi Hou
Enhance DRM_IOCTL_AMDXDNA_CREATE_BO to accept user space allocated buffer pointer. The buffer pages will be pinned in memory. Unless the CAP_IPC_LOCK is enabled for the application process, the total pinned memory can not beyond rlimit_memlock. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250716164414.112091-1-lizhi.hou@amd.com
2025-07-15drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESETMaíra Canal
Among the scheduler's statuses, the only one that indicates an error is DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV signifies that the operation succeeded and the GPU is in a nominal state. However, to provide more information about the GPU's status, it is needed to convey more information than just "OK". Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has hung, but it has been successfully reset and is now in a nominal state again. Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-06-20Merge tag 'drm-misc-next-2025-06-19' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: - Add Task Information for the wedge API Cross-subsystem Changes: Core Changes: - Fix warnings related to export.h - fbdev: Make CONFIG_FIRMWARE_EDID available on all architectures - fence: Fix UAF issues - format-helper: Improve tests Driver Changes: - ivpu: Add turbo flag, Add Wildcat Lake Support - rz-du: Improve MIPI-DSI Support - vmwgfx: fence improvement Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250619-perfect-industrious-whippet-8ed3db@houat
2025-06-18Merge tag 'drm-misc-next-2025-06-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic-helpers: Tune the enable / disable sequence - bridge: Add destroy hook - color management: Add helpers for hardware gamma LUT handling - HDMI: Add CEC handling, YUV420 output support - sched: tracing improvements Driver Changes: - hyperv: Move out of simple-kms, drm_panic support - i915: drm_panel_follower support - imx: Add IMX8qxq Display Controller Support - lima: Add Rockchip RK3528 GPU Support - nouveau: fence handling cleanup - panfrost: Add BO labeling, 64-bit registers access - qaic: Add RAS Support - rz-du: Add RZ/V2H(P) Support, MIPI-DSI DCS Support - sun4i: Add H616 Support - tidss: Add TI AM62L Support - vkms: YUV and R* formats support - bridges: - Switched to reference counted drm_bridge allocations - panels: - Switched to reference counted drm_panel allocations - Add support for fwnode-based panel lookup - himax-hx8394: Support for Huiling hl055fhv028c - ilitek-ili9881c: Support for 7" Raspberry Pi 720x1280 - panel-edp: Support for KDC KD116N3730A05, N160JCE-ELL CMN, - panel-simple: Support for AUO P238HAN01 - st7701: Support for Winstar wf40eswaa6mnn0 - visionox-rm69299: Support for rm69299-shift - New panels: Renesas R61307, Renesas R69328 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250612-coucal-of-impossible-cleaning-a5eecf@houat
2025-06-16accel/amdxdna: Revise device bo creation and freeLizhi Hou
The device bo is allocated from the device heap memory. (a trunk of memory dedicated to device) Rename amdxdna_gem_insert_node_locked to amdxdna_gem_heap_alloc and move related sanity checks into it. Add amdxdna_gem_dev_obj_free and move device bo free code into it. Calculate the kernel virtual address of device bo by the device heap memory address and offset. Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250616091418.2605476-1-lizhi.hou@amd.com
2025-06-11Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to forward to v6.16-rc1 Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-06-09accel/amdxdna: Fix incorrect PSP firmware sizeLizhi Hou
The incorrect PSP firmware size is used for initializing. It may cause error for newer version firmware. Fixes: 8c9ff1b181ba ("accel/amdxdna: Add a new driver for AMD AI Engine") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250604143217.1386272-1-lizhi.hou@amd.com
2025-05-30Merge tag 'iommu-updates-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core: - Introduction of iommu-pages infrastructure to consolitate page-table allocation code among hardware drivers. This is ground-work for more generalization in the future - Remove IOMMU_DEV_FEAT_SVA and IOMMU_DEV_FEAT_IOPF feature flags - Convert virtio-iommu to domain_alloc_paging() - KConfig cleanups - Some small fixes for possible overflows and race conditions Intel VT-d driver: - Restore WO permissions on second-level paging entries - Use ida to manage domain id - Miscellaneous cleanups AMD-Vi: - Make sure notifiers finish running before module unload - Add support for HTRangeIgnore feature - Allow matching ACPI HID devices without matching UIDs ARM-SMMU: - SMMUv2: - Recognise the compatible string for SAR2130P MDSS in the Qualcomm driver, as this device requires an identity domain - Fix Adreno stall handling so that GPU debugging is more robust and doesn't e.g. result in deadlock - SMMUv3: - Fix ->attach_dev() error reporting for unrecognised domains - IO-pgtable: - Allow clients (notably, drivers that process requests from userspace) to silence warnings when mapping an already-mapped IOVA S390: - Add support for additional table regions Mediatek: - Add support for MT6893 MM IOMMU And some smaller fixes and improvements in various other drivers" * tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (75 commits) iommu/vt-d: Restore context entry setup order for aliased devices iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm iommu/arm-smmu-qcom: Make set_stall work when the device is on iommu/arm-smmu: Move handing of RESUME to the context fault handler iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500 iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() iommu: Clear the freelist after iommu_put_pages_list() iommu/vt-d: Change dmar_ats_supported() to return boolean iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit() iommu/vt-d: Replace spin_lock with mutex to protect domain ida iommu/vt-d: Use ida to manage domain id iommu/vt-d: Restore WO permissions on second-level paging entries iommu/amd: Allow matching ACPI HID devices without matching UIDs iommu: make inclusion of arm/arm-smmu-v3 directory conditional iommu: make inclusion of riscv directory conditional iommu: make inclusion of amd directory conditional iommu: make inclusion of intel directory conditional iommu: remove duplicate selection of DMAR_TABLE iommu/fsl_pamu: remove trailing space after \n iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible ...
2025-05-28drm/sched: Store the drm client_id in drm_sched_fencePierre-Eric Pelloux-Prayer
This will be used in a later commit to trace the drm client_id in some of the gpu_scheduler trace events. This requires changing all the users of drm_sched_job_init to add an extra parameter. The newly added drm_client_id field in the drm_sched_fence is a bit of a duplicate of the owner one. One suggestion I received was to merge those 2 fields - this can't be done right now as amdgpu uses some special values (AMDGPU_FENCE_OWNER_*) that can't really be translated into a client id. Christian is working on getting rid of those; when it's done we should be able to squash owner/drm_client_id together. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
2025-05-08accel/amdxdna: Support submit commands without argumentsLizhi Hou
The latest userspace runtime allows generating commands which do not have any argument. Remove the corresponding check in driver IOCTL to enable this use case. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com Link: https://lore.kernel.org/r/20250507161500.2339701-1-lizhi.hou@amd.com
2025-04-28iommu: Remove IOMMU_DEV_FEAT_SVAJason Gunthorpe
None of the drivers implement anything here anymore, remove the dead code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2025-04-10accel/amdxdna: Fix incorrect size of ERT_START_NPU commandsLizhi Hou
When multiple ERT_START_NPU commands are combined in one buffer, the buffer size calculation is incorrect. Also, the condition to make sure the buffer size is not beyond 4K is also fixed. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250409210013.10854-1-lizhi.hou@amd.com
2025-03-28accel/amdxdna: Add BO import and exportLizhi Hou
Add amdxdna_gem_prime_export() and amdxdna_gem_prime_import() for BO import and export. Register mmu notifier for imported BO as well. When MMU_NOTIFIER_UNMAP event is received, queue work to remove the notifier. The same BO could be mapped multiple times if it is exported and imported by an application. Use a link list to track VMAs the BO been mapped. v2: Rebased and call get_dma_buf() before dma_buf_attach() v3: Removed import_attach usage Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250325200105.2744079-1-lizhi.hou@amd.com
2025-03-27accel/amdxdna: s/drm_gem_v[un]map_unlocked/drm_gem_v[un]map/Boris Brezillon
Commit 8f5c4871a014 ("drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function names") dropped the _unlocked suffix, but accel drivers were left behind. Fixes: 8f5c4871a014 ("drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function names") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Min Ma <min.ma@amd.com> Cc: Lizhi Hou <lizhi.hou@amd.com> Cc: Oded Gabbay <ogabbay@kernel.org> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250327104300.1982058-3-boris.brezillon@collabora.com
2025-02-27accel/amdxdna: Check interrupt register before mailbox_rx_worker exitsLizhi Hou
There is a timeout failure been found during stress tests. If the firmware generates a mailbox response right after driver clears the mailbox channel interrupt register, the hardware will not generate an interrupt for the response. This causes the unexpected mailbox command timeout. To handle this failure, driver checks the interrupt register before exiting mailbox_rx_worker(). If there is a new response, driver goes back to process it. Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226161810.4188334-1-lizhi.hou@amd.com
2025-02-25Merge tag 'v6.14-rc4' into drm-nextDave Airlie
Backmerge Linux 6.14-rc4 at the request of tzimmermann so misc-next can base on rc4. Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-02-19accel/amdxdna: Add missing include linux/slab.hSu Hui
When compiling without CONFIG_IA32_EMULATION, there can be some errors: drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘mailbox_release_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:197:2: error: implicit declaration of function ‘kfree’. 197 | kfree(mb_msg); | ^~~~~ drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:418:11: error:implicit declaration of function ‘kzalloc’. 418 | mb_msg = kzalloc(sizeof(*mb_msg) + pkg_size, GFP_KERNEL); | ^~~~~~~ Add the missing include. Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250211015354.3388171-1-suhui@nfschina.com
2025-02-18Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to get bugfixes from v6.14-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-02-14accel/amdxdna: Refactor hardware context destroy routineLizhi Hou
It is required by firmware to wait up to 2 seconds for pending commands before sending the destroy hardware context command. After 2 seconds wait, if there are still pending commands, driver needs to cancel them. So the context destroy steps need to be: 1. Stop drm scheduler. (drm_sched_entity_destroy) 2. Wait up to 2 seconds for pending commands. 3. Destroy hardware context and cancel the rest pending requests. 4. Wait all jobs associated with the hwctx are freed. 5. Free job resources. Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250124173536.148676-1-lizhi.hou@amd.com
2025-02-14Merge tag 'drm-misc-next-2025-02-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.15: UAPI Changes: fourcc: - Add modifiers for MediaTek tiled formats Cross-subsystem Changes: bus: - mhi: Enable image transfer via BHIe in PBL dma-buf: - Add fast-path for single-fence merging Core Changes: atomic helper: - Allow full modeset on connector changes - Clarify semantics of allow_modeset - Clarify semantics of drm_atomic_helper_check() buddy allocator: - Fix multi-root cleanup ci: - Update IGT display: - dp: Support Extendeds Wake Timeout - dp_mst: Fix RAD-to-string conversion panic: - Encode QR code according to Fido 2.2 probe helper: - Cleanups scheduler: - Cleanups ttm: - Refactor pool-allocation code - Cleanups Driver Changes: amdxdma: - Fix error handling - Cleanups ast: - Refactor detection of transmitter chips - Refactor support of VBIOS display-mode handling - astdp: Fix connection status; Filter unsupported display modes bridge: - adv7511: Report correct capabilities - it6505: Fix HDCP V compare - sn65dsi86: Fix device IDs - Cleanups i915: - Enable Extendeds Wake Timeout imagination: - Check job dependencies with DRM-sched helper ivpu: - Improve command-queue handling - Use workqueue for IRQ handling - Add suport for HW fault injection - Locking fixes - Cleanups mgag200: - Add support for G200eH5 chips msm: - dpu: Add concurrent writeback support for DPU 10.x+ nouveau: - Move drm_slave_encoder interface into driver - nvkm: Refactor GSP RPC omapdrm: - Cleanups panel: - Convert several panels to multi-style functions to improve error handling - edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3, LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry 116KHD024006, Lenovo T14s Gen6 Snapdragon - himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay kd110n11-51ie, Starry 2082109qfh040022-50e panthor: - Expose sizes of intenral BOs via fdinfo - Fix race between reset and suspend - Cleanups qaic: - Add support for AIC200 - Cleanups renesas: - Fix limits in DT bindings rockchip: - rk3576: Add HDMI support - vop2: Add new display modes on RK3588 HDMI0 up to 4K - Don't change HDMI reference clock rate - Fix DT bindings solomon: - Set SPI device table to silence warnings - Fix pixel and scanline encoding v3d: - Cleanups vc4: - Use drm_exec - Use dma-resv for wait-BO ioctl - Remove seqno infrastructure virtgpu: - Support partial mappings of GEM objects - Reserve VGA resources during initialization - Fix UAF in virtgpu_dma_buf_free_obj() - Add panic support vkms: - Switch to a managed modesetting pipeline - Add support for ARGB8888 xlnx: - Set correct DMA segment size - Fix error handling - Fix docs Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250212090625.GA24865@linux.fritz.box
2025-02-12drm/sched: Use struct for drm_sched_init() paramsPhilipp Stanner
drm_sched_init() has a great many parameters and upcoming new functionality for the scheduler might add even more. Generally, the great number of parameters reduces readability and has already caused one missnaming, addressed in: commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in nouveau_sched_init()"). Introduce a new struct for the scheduler init parameters and port all users. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
2025-02-04accel/amdxdna: Add MODULE_FIRMWARE() declarationsMario Limonciello
Initramfs building tools such as dracut will look for a MODULE_FIRMWARE() declaration to determine which firmware to include in the initramfs when a driver is included in the initramfs. As amdxdna doesn't declare any firmware this causes the driver to fail to load with -ENOENT when in the initramfs. Add the missing declaration for possible firmware. Reported-by: Renjith Pananchikkal <Renjith.Pananchikkal@amd.com> Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com> Fixes: 8c9ff1b181ba ("accel/amdxdna: Add a new driver for AMD AI Engine") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250204174031.3425762-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204174031.3425762-1-superm1@kernel.org
2025-01-13accel/amdxdna: Declare sched_ops as staticLizhi Hou
Fix sparse warning: symbol 'sched_ops' was not declared. Should it be static? Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113182617.1256094-2-lizhi.hou@amd.com
2025-01-13accel/amdxdna: Remove casting mailbox payload pointerLizhi Hou
The mailbox payload pointer is void __iomem *. Casting it to u32 * is incorrect and causes sparse warning. cast removes address space '__iomem' of expression Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501130921.ktqwsMLH-lkp@intel.com/ Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113182617.1256094-1-lizhi.hou@amd.com