summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2024-09-04drm/amdkfd: don't allow mapping the MMIO HDP page with large pagesAlex Deucher
commit be4a2a81b6b90d1a47eaeaace4cc8e2cb57b96c7 upstream. We don't get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space. Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_relocJesse Zhang
commit 88a9a467c548d0b3c7761b4fd54a68e70f9c0944 upstream. Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001. V2: To really improve the handling we would actually need to have a separate value of 0xffffffff.(Christian) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Vamsi Krishna Brahmajosyula <vamsi-krishna.brahmajosyula@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04Revert "drm/amd/display: Validate hw_points_num before using it"Alex Hung
commit 8f4bdbc8e99db6ec9cb0520748e49a2f2d7d1727 upstream. This reverts commit 58c3b3341cea4f75dc8c003b89f8a6dd8ec55e50. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04drm/msm/dpu: cleanup FB if dpu_format_populate_layout failsDmitry Baryshkov
[ Upstream commit bfa1a6283be390947d3649c482e5167186a37016 ] If the dpu_format_populate_layout() fails, then FB is prepared, but not cleaned up. This ends up leaking the pin_count on the GEM object and causes a splat during DRM file closure: msm_obj->pin_count WARNING: CPU: 2 PID: 569 at drivers/gpu/drm/msm/msm_gem.c:121 update_lru_locked+0xc4/0xcc [...] Call trace: update_lru_locked+0xc4/0xcc put_pages+0xac/0x100 msm_gem_free_object+0x138/0x180 drm_gem_object_free+0x1c/0x30 drm_gem_object_handle_put_unlocked+0x108/0x10c drm_gem_object_release_handle+0x58/0x70 idr_for_each+0x68/0xec drm_gem_release+0x28/0x40 drm_file_free+0x174/0x234 drm_release+0xb0/0x160 __fput+0xc0/0x2c8 __fput_sync+0x50/0x5c __arm64_sys_close+0x38/0x7c invoke_syscall+0x48/0x118 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x4c/0x120 el0t_64_sync_handler+0x100/0x12c el0t_64_sync+0x190/0x194 irq event stamp: 129818 hardirqs last enabled at (129817): [<ffffa5f6d953fcc0>] console_unlock+0x118/0x124 hardirqs last disabled at (129818): [<ffffa5f6da7dcf04>] el1_dbg+0x24/0x8c softirqs last enabled at (129808): [<ffffa5f6d94afc18>] handle_softirqs+0x4c8/0x4e8 softirqs last disabled at (129785): [<ffffa5f6d94105e4>] __do_softirq+0x14/0x20 Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/600714/ Link: https://lore.kernel.org/r/20240625-dpu-mode-config-width-v5-1-501d984d634f@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/msm/dp: reset the link phy params before link trainingAbhinav Kumar
[ Upstream commit 319aca883bfa1b85ee08411541b51b9a934ac858 ] Before re-starting link training reset the link phy params namely the pre-emphasis and voltage swing levels otherwise the next link training begins at the previously cached levels which can result in link training failures. Fixes: 8ede2ecc3e5e ("drm/msm/dp: Add DP compliance tests on Snapdragon Chipsets") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # SM8350-HDK Reviewed-by: Stephen Boyd <swboyd@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/605946/ Link: https://lore.kernel.org/r/20240725220450.131245-1-quic_abhinavk@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/msm/dpu: don't play tricks with debug macrosDmitry Baryshkov
[ Upstream commit df24373435f5899a2a98b7d377479c8d4376613b ] DPU debugging macros need to be converted to a proper drm_debug_* macros, however this is a going an intrusive patch, not suitable for a fix. Wire DPU_DEBUG and DPU_DEBUG_DRIVER to always use DRM_DEBUG_DRIVER to make sure that DPU debugging messages always end up in the drm debug messages and are controlled via the usual drm.debug mask. I don't think that it is a good idea for a generic DPU_DEBUG macro to be tied to DRM_UT_KMS. It is used to report a debug message from driver, so by default it should go to the DRM_UT_DRIVER channel. While refactoring debug macros later on we might end up with particular messages going to ATOMIC or KMS, but DRIVER should be the default. Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/606932/ Link: https://lore.kernel.org/r/20240802-dpu-fix-wb-v2-2-7eac9eb8e895@linaro.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/lima: set gp bus_stop bit before hard resetErico Nunes
[ Upstream commit 27aa58ec85f973d98d336df7b7941149308db80f ] This is required for reliable hard resets. Otherwise, doing a hard reset while a task is still running (such as a task which is being stopped by the drm_sched timeout handler) may result in random mmu write timeouts or lockups which cause the entire gpu to hang. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-5-nunes.erico@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/amd/display: Validate hw_points_num before using itAlex Hung
[ Upstream commit 58c3b3341cea4f75dc8c003b89f8a6dd8ec55e50 ] [WHAT] hw_points_num is 0 before ogam LUT is programmed; however, function "dwb3_program_ogam_pwl" assumes hw_points_num is always greater than 0, i.e. substracting it by 1 as an array index. [HOW] Check hw_points_num is not equal to 0 before using it. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/amdgpu/jpeg2: properly set atomics vmid fieldAlex Deucher
commit e414a304f2c5368a84f03ad34d29b89f965a33c9 upstream. This needs to be set as well if the IB uses atomics. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 35c628774e50b3784c59e8ca7973f03bcb067132) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04drm/amdgpu: Actually check flags for all context ops.Bas Nieuwenhuizen
commit 0573a1e2ea7e35bff08944a40f1adf2bb35cea61 upstream. Missing validation ... Checked libdrm and it clears all the structs, so we should be safe to just check everything. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c6b86421f1f9ddf9d706f2453159813ee39d0cf9) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/i915/gem: Fix Virtual Memory mapping boundaries calculationAndi Shyti
commit 8bdd9ef7e9b1b2a73e394712b72b22055e0e26c3 upstream. Calculating the size of the mapped area as the lesser value between the requested size and the actual size does not consider the partial mapping offset. This can cause page fault access. Fix the calculation of the starting and ending addresses, the total size is now deduced from the difference between the end and start addresses. Additionally, the calculations have been rewritten in a clearer and more understandable form. Fixes: c58305af1835 ("drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass") Reported-by: Jann Horn <jannh@google.com> Co-developed-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Jonathan Cavitt <Jonathan.cavitt@intel.com> [Joonas: Add Requires: tag] Requires: 60a2066c5005 ("drm/i915/gem: Adjust vma offset for framebuffer mmap offset") Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240802083850.103694-3-andi.shyti@linux.intel.com (cherry picked from commit 97b6784753da06d9d40232328efc5c5367e53417) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/mgag200: Set DDC timeout in millisecondsThomas Zimmermann
commit ecde5db1598aecab54cc392282c15114f526f05f upstream. Compute the i2c timeout in jiffies from a value in milliseconds. The original values of 2 jiffies equals 2 milliseconds if HZ has been configured to a value of 1000. This corresponds to 2.2 milliseconds used by most other DRM drivers. Update mgag200 accordingly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Fixes: 414c45310625 ("mgag200: initial g200se driver (v2)") Cc: Dave Airlie <airlied@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.5+ Link: https://patchwork.freedesktop.org/patch/msgid/20240513125620.6337-2-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/bridge: analogix_dp: properly handle zero sized AUX transactionsLucas Stach
commit e82290a2e0e8ec5e836ecad1ca025021b3855c2d upstream. Address only transactions without any data are valid and should not be flagged as short transactions. Simply return the message size when no transaction errors occured. CC: stable@vger.kernel.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Robert Foss <rfoss@kernel.org> Signed-off-by: Robert Foss <rfoss@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240318203925.2837689-1-l.stach@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/client: fix null pointer dereference in drm_client_modeset_probeMa Ke
commit 113fd6372a5bb3689aba8ef5b8a265ed1529a78f upstream. In drm_client_modeset_probe(), the return value of drm_mode_duplicate() is assigned to modeset->mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: cf13909aee05 ("drm/fb-helper: Move out modeset config code") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240802044736.1570345-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/amd/display: Add null checker before passing variablesAlex Hung
[ Upstream commit 8092aa3ab8f7b737a34b71f91492c676a843043a ] Checks null pointer before passing variables to functions. This fixes 3 NULL_RETURNS issues reported by Coverity. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/amdgpu/pm: Fix the null pointer dereference in apply_state_adjust_rulesMa Jun
[ Upstream commit d19fb10085a49b77578314f69fff21562f7cd054 ] Check the pointer value to fix potential null pointer dereference Acked-by: Yang Wang<kevinyang.wang@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/amdgpu: Fix the null pointer dereference to ras_managerMa Jun
[ Upstream commit 4c11d30c95576937c6c35e6f29884761f2dddb43 ] Check ras_manager before using it Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/amdgpu/pm: Fix the null pointer dereference for smu7Ma Jun
[ Upstream commit c02c1960c93eede587576625a1221205a68a904f ] optimize the code to avoid pass a null pointer (hwmgr->backend) to function smu7_update_edc_leakage_table. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/vmwgfx: Fix a deadlock in dma buf fence pollingZack Rusin
commit e58337100721f3cc0c7424a18730e4f39844934f upstream. Introduce a version of the fence ops that on release doesn't remove the fence from the pending list, and thus doesn't require a lock to fix poll->fence wait->fence unref deadlocks. vmwgfx overwrites the wait callback to iterate over the list of all fences and update their status, to do that it holds a lock to prevent the list modifcations from other threads. The fence destroy callback both deletes the fence and removes it from the list of pending fences, for which it holds a lock. dma buf polling cb unrefs a fence after it's been signaled: so the poll calls the wait, which signals the fences, which are being destroyed. The destruction tries to acquire the lock on the pending fences list which it can never get because it's held by the wait from which it was called. Old bug, but not a lot of userspace apps were using dma-buf polling interfaces. Fix those, in particular this fixes KDE stalls/deadlock. Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Fixes: 2298e804e96e ("drm/vmwgfx: rework to new fence interface, v2") Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.2+ Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-2-zack.rusin@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/vmwgfx: Fix overlay when using Screen TargetsIan Forbes
[ Upstream commit cb372a505a994cb39aa75acfb8b3bcf94787cf94 ] This code was never updated to support Screen Targets. Fixes a bug where Xv playback displays a green screen instead of actual video contents when 3D acceleration is disabled in the guest. Fixes: c8261a961ece ("vmwgfx: Major KMS refactoring / cleanup in preparation of screen targets") Reported-by: Doug Brown <doug@schmorgal.com> Closes: https://lore.kernel.org/all/bd9cb3c7-90e8-435d-bc28-0e38fee58977@schmorgal.com Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Tested-by: Doug Brown <doug@schmorgal.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719163627.20888-1-ian.forbes@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/nouveau: prime: fix refcount underflowDanilo Krummrich
[ Upstream commit a9bf3efc33f1fbf88787a277f7349459283c9b95 ] Calling nouveau_bo_ref() on a nouveau_bo without initializing it (and hence the backing ttm_bo) leads to a refcount underflow. Instead of calling nouveau_bo_ref() in the unwind path of drm_gem_object_init(), clean things up manually. Fixes: ab9ccb96a6e6 ("drm/nouveau: use prime helpers") Reviewed-by: Ben Skeggs <bskeggs@nvidia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240718165959.3983-2-dakr@kernel.org (cherry picked from commit 1b93f3e89d03cfc576636e195466a0d728ad8de5) Signed-off-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/dp_mst: Fix all mstb marked as not probed after suspend/resumeWayne Lin
[ Upstream commit d63d81094d208abb20fc444514b2d9ec2f4b7c4e ] [Why] After supend/resume, with topology unchanged, observe that link_address_sent of all mstb are marked as false even the topology probing is done without any error. It is caused by wrongly also include "ret == 0" case as a probing failure case. [How] Remove inappropriate checking conditions. Cc: Lyude Paul <lyude@redhat.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: stable@vger.kernel.org Fixes: 37dfdc55ffeb ("drm/dp_mst: Cleanup drm_dp_send_link_address() a bit") Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626084825.878565-2-Wayne.Lin@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/panfrost: Mark simple_ondemand governor as softdepDragan Simic
commit 80f4e62730a91572b7fdc657f7bb747e107ae308 upstream. Panfrost DRM driver uses devfreq to perform DVFS, while using simple_ondemand devfreq governor by default. This causes driver initialization to fail on boot when simple_ondemand governor isn't built into the kernel statically, as a result of the missing module dependency and, consequently, the required governor module not being included in the initial ramdisk. Thus, let's mark simple_ondemand governor as a softdep for Panfrost, to have its kernel module included in the initial ramdisk. This is a rather longstanding issue that has forced distributions to build devfreq governors statically into their kernels, [1][2] or has forced users to introduce some unnecessary workarounds. [3] For future reference, not having support for the simple_ondemand governor in the initial ramdisk produces errors in the kernel log similar to these below, which were taken from a Pine64 RockPro64: panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init [panfrost]] *ERROR* Couldn't initialize GPU devfreq panfrost ff9a0000.gpu: Fatal error during GPU init panfrost: probe of ff9a0000.gpu failed with error -22 Having simple_ondemand marked as a softdep for Panfrost may not resolve this issue for all Linux distributions. In particular, it will remain unresolved for the distributions whose utilities for the initial ramdisk generation do not handle the available softdep information [4] properly yet. However, some Linux distributions already handle softdeps properly while generating their initial ramdisks, [5] and this is a prerequisite step in the right direction for the distributions that don't handle them properly yet. [1] https://gitlab.manjaro.org/manjaro-arm/packages/core/linux/-/blob/linux61/config?ref_type=heads#L8180 [2] https://salsa.debian.org/kernel-team/linux/-/merge_requests/1066 [3] https://forum.pine64.org/showthread.php?tid=15458 [4] https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git/commit/?id=49d8e0b59052999de577ab732b719cfbeb89504d [5] https://github.com/archlinux/mkinitcpio/commit/97ac4d37aae084a050be512f6d8f4489054668ad Cc: Diederik de Haas <didi.debian@cknow.org> Cc: Furkan Kardame <f.kardame@manjaro.org> Cc: stable@vger.kernel.org Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Dragan Simic <dsimic@manjaro.org> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/4e1e00422a14db4e2a80870afb704405da16fd1b.1718655077.git.dsimic@manjaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/i915/dp: Reset intel_dp->link_trained before retraining the linkImre Deak
commit d13e2a6e95e6b87f571c837c71a3d05691def9bb upstream. Regularly retraining a link during an atomic commit happens with the given pipe/link already disabled and hence intel_dp->link_trained being false. Ensure this also for retraining a DP SST link via direct calls to the link training functions (vs. an actual commit as for DP MST). So far nothing depended on this, however the next patch will depend on link_trained==false for changing the LTTPR mode to non-transparent. Cc: <stable@vger.kernel.org> # v5.15+ Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-2-imre.deak@intel.com (cherry picked from commit a4d5ce61765c08ab364aa4b327f6739b646e6cfa) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/amdgpu/sdma5.2: Update wptr registers as well as doorbellAlex Deucher
commit a03ebf116303e5d13ba9a2b65726b106cb1e96f6 upstream. We seem to have a case where SDMA will sometimes miss a doorbell if GFX is entering the powergating state when the doorbell comes in. To workaround this, we can update the wptr via MMIO, however, this is only safe because we disallow gfxoff in begin_ring() for SDMA 5.2 and then allow it again in end_ring(). Enable this workaround while we are root causing the issue with the HW team. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440 Tested-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8Nitin Gote
commit 65564157ae64cec0f527583f96e32f484f730f92 upstream. We're seeing a GPU hang issue on a CHV platform, which was caused by commit bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for Gen8"). The Gen8 platform only supports timeslicing and doesn't have a preemption mechanism, as its engines do not have a preemption timer. Commit 751f82b353a6 ("drm/i915/gt: Only disable preemption on Gen8 render engines") addressed this issue only for render engines. This patch extends that fix by ensuring that preemption is not considered for all engines on Gen8 platforms. v4: - Use the correct Fixes tag (Rodrigo Vivi) - Reworded commit log (Andi Shyti) v3: - Inside need_preempt(), condition of can_preempt() is not required as simplified can_preempt() is enough. (Chris Wilson) v2: Simplify can_preempt() function (Tvrtko Ursulin) Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8 render engines") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396 Suggested-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> CC: <stable@vger.kernel.org> # v5.12+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711163208.1355736-1-nitin.r.gote@intel.com (cherry picked from commit 7df0be6e6280c6fca01d039864bb123e5e36604b) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/amd/display: Check for NULL pointerSung Joon Kim
commit 4ab68e168ae1695f7c04fae98930740aaf7c50fa upstream. [why & how] Need to make sure plane_state is initialized before accessing its members. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 295d91cbc700651782a60572f83c24861607b648) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/gma500: fix null pointer dereference in psb_intel_lvds_get_modesMa Ke
commit 2df7aac81070987b0f052985856aa325a38debf6 upstream. In psb_intel_lvds_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: 89c78134cc54 ("gma500: Add Poulsbo support") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240709092011.3204970-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/gma500: fix null pointer dereference in cdv_intel_lvds_get_modesMa Ke
commit cb520c3f366c77e8d69e4e2e2781a8ce48d98e79 upstream. In cdv_intel_lvds_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: 6a227d5fd6c4 ("gma500: Add support for Cedarview") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240709113311.37168-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19drm/qxl: Add check for drm_cvt_modeChen Ni
[ Upstream commit 7bd09a2db0f617377027a2bb0b9179e6959edff3 ] Add check for the return value of drm_cvt_mode() and return the error if it fails in order to avoid NULL pointer dereference. Fixes: 1b043677d4be ("drm/qxl: add qxl_add_mode helper function") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240621071031.1987974-1-nichen@iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/etnaviv: fix DMA direction handling for cached RW buffersLucas Stach
[ Upstream commit 58979ad6330a70450ed78837be3095107d022ea9 ] The dma sync operation needs to be done with DMA_BIDIRECTIONAL when the BO is prepared for both read and write operations. Fixes: a8c21a5451d8 ("drm/etnaviv: add initial etnaviv DRM driver") Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/mediatek: Add DRM_MODE_ROTATE_0 to rotation propertyHsiao Chien Sung
[ Upstream commit 74608d8feefd1675388f23362aac8df4ac3af931 ] Always add DRM_MODE_ROTATE_0 to rotation property to meet IGT's (Intel GPU Tools) requirement. Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240620-igt-v3-8-a9d62d2e2c7e@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/mediatek: Add missing plane settings when async updateHsiao Chien Sung
[ Upstream commit 86b89dc669c400576dc23aa923bcf302f99e8e3a ] Fix an issue that plane coordinate was not saved when calling async update. Fixes: 920fffcc8912 ("drm/mediatek: update cursors by using async atomic update") Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240620-igt-v3-1-a9d62d2e2c7e@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/panel: boe-tv101wum-nl6: Check for errors on the NOP in prepare()Douglas Anderson
[ Upstream commit 6320b9199dd99622668649c234d4e8a99e44a9c8 ] The mipi_dsi_dcs_nop() function returns an error but we weren't checking it in boe_panel_prepare(). Add a check. This is highly unlikely to matter in practice. If the NOP failed then likely later MIPI commands would fail too. Found by code inspection. Fixes: 812562b8d881 ("drm/panel: boe-tv101wum-nl6: Fine tune the panel power sequence") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240517143643.3.Ibffbaa5b4999ac0e55f43bf353144433b099d727@changeid Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240517143643.3.Ibffbaa5b4999ac0e55f43bf353144433b099d727@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/panel: boe-tv101wum-nl6: If prepare fails, disable GPIO before regulatorsDouglas Anderson
[ Upstream commit 587c48f622374e5d47b1d515c6006a4df4dee882 ] The enable GPIO should clearly be set low before turning off regulators. That matches both the inverse order that things were enabled and also the order in unprepare(). Fixes: a869b9db7adf ("drm/panel: support for boe tv101wum-nl6 wuxga dsi video mode panel") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240517143643.2.Ieac346cd0f1606948ba39ceea06b55359fe972b6@changeid Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240517143643.2.Ieac346cd0f1606948ba39ceea06b55359fe972b6@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/amdgpu: Check if NBIO funcs are NULL in amdgpu_device_baco_exitFriedrich Vock
[ Upstream commit 0cdb3f9740844b9d95ca413e3fcff11f81223ecf ] The special case for VM passthrough doesn't check adev->nbio.funcs before dereferencing it. If GPUs that don't have an NBIO block are passed through, this leads to a NULL pointer dereference on startup. Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Fixes: 1bece222eabe ("drm/amdgpu: Clear doorbell interrupt status for Sienna Cichlid") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/amd/pm: Fix aldebaran pcie speed reportingLijo Lazar
[ Upstream commit b6420021e17e262c57bb289d0556ee181b014f9c ] Fix the field definitions for LC_CURRENT_DATA_RATE. Fixes: c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19drm/meson: fix canvas release in bind functionYao Zi
[ Upstream commit a695949b2e9bb6b6700a764c704731a306c4bebf ] Allocated canvases may not be released on the error exit path of meson_drv_bind_master(), leading to resource leaking. Rewrite exit path to release canvases on error. Fixes: 2bf6b5b0e374 ("drm/meson: exclusively use the canvas provider module") Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240703155826.10385-2-ziyao@disroot.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240703155826.10385-2-ziyao@disroot.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-27drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()Dan Carpenter
commit 6769a23697f17f9bf9365ca8ed62fe37e361a05a upstream. The "instance" variable needs to be signed for the error handling to work. Fixes: 8b2faf1a4f3b ("drm/amdgpu: add error handle to avoid out-of-bounds") Reviewed-by: Bob Zhou <bob.zhou@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-27drm/radeon: check bo_va->bo is non-NULL before using itPierre-Eric Pelloux-Prayer
[ Upstream commit 6fb15dcbcf4f212930350eaee174bb60ed40a536 ] The call to radeon_vm_clear_freed might clear bo_va->bo, so we have to check it before dereferencing it. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-27drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependencyAlexey Makhalov
[ Upstream commit 8c4d6945fe5bd04ff847c3c788abd34ca354ecee ] VMWARE_HYPERCALL alternative will not work as intended without VMware guest code initialization. [ bp: note that this doesn't reproduce with newer gccs so it must be something gcc-9-specific. ] Closes: https://lore.kernel.org/oe-kbuild-all/202406152104.FxakP1MB-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240616012511.198243-1-alexey.makhalov@broadcom.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18drm/amdgpu/atomfirmware: silence UBSAN warningAlex Deucher
commit d0417264437a8fa05f894cabba5a26715b32d78e upstream. This is a variable sized array. Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/110420.html Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-18drm/nouveau: fix null pointer dereference in nouveau_connector_get_modesMa Ke
commit 80bec6825b19d95ccdfd3393cf8ec15ff2a749b4 upstream. In nouveau_connector_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Fixes: 6ee738610f41 ("drm/nouveau: Add DRM driver for NVIDIA GPUs") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627074204.3023776-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-18drm/amd/display: Skip finding free audio for unknown engine_idAlex Hung
[ Upstream commit 1357b2165d9ad94faa4c4a20d5e2ce29c2ff29c3 ] [WHY] ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it also means it is uninitialized and does not need free audio. [HOW] Skip and return NULL. This fixes 2 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18drm/amd/display: Check pipe offset before setting vblankAlex Hung
[ Upstream commit 5396a70e8cf462ec5ccf2dc8de103c79de9489e6 ] pipe_ctx has a size of MAX_PIPES so checking its index before accessing the array. This fixes an OVERRUN issue reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18drm/amd/display: Check index msg_id before read or writeAlex Hung
[ Upstream commit 59d99deb330af206a4541db0c4da8f73880fba03 ] [WHAT] msg_id is used as an array index and it cannot be a negative value, and therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1). [HOW] Check whether msg_id is valid before reading and setting. This fixes 4 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18drm/amdgpu: Initialize timestamp for some legacy SOCsMa Jun
[ Upstream commit 2e55bcf3d742a4946d862b86e39e75a95cc6f1c0 ] Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18drm/lima: fix shared irq handling on driver removeErico Nunes
[ Upstream commit a6683c690bbfd1f371510cb051e8fa49507f3f5e ] lima uses a shared interrupt, so the interrupt handlers must be prepared to be called at any time. At driver removal time, the clocks are disabled early and the interrupts stay registered until the very end of the remove process due to the devm usage. This is potentially a bug as the interrupts access device registers which assumes clocks are enabled. A crash can be triggered by removing the driver in a kernel with CONFIG_DEBUG_SHIRQ enabled. This patch frees the interrupts at each lima device finishing callback so that the handlers are already unregistered by the time we fully disable clocks. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240401224329.1228468-2-nunes.erico@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modesMa Ke
commit 6d411c8ccc0137a612e0044489030a194ff5c843 upstream. In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode(). Add a check to avoid null pointer dereference. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081029.2619437-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/i915/gt: Fix potential UAF by revoke of fence registersJanusz Krzysztofik
commit 996c3412a06578e9d779a16b9e79ace18125ab50 upstream. CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines: <6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence ... <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled <3> [414.070354] Unable to pin Y-tiled fence; err:-4 <3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active)) ... <4>[ 609.603992] ------------[ cut here ]------------ <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301! <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1 <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 <4>[ 609.604010] Workqueue: i915 __i915_gem_free_work [i915] <4>[ 609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915] ... <4>[ 609.604271] Call Trace: <4>[ 609.604273] <TASK> ... <4>[ 609.604716] __i915_vma_evict+0x2e9/0x550 [i915] <4>[ 609.604852] __i915_vma_unbind+0x7c/0x160 [i915] <4>[ 609.604977] force_unbind+0x24/0xa0 [i915] <4>[ 609.605098] i915_vma_destroy+0x2f/0xa0 [i915] <4>[ 609.605210] __i915_gem_object_pages_fini+0x51/0x2f0 [i915] <4>[ 609.605330] __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915] <4>[ 609.605440] process_scheduled_works+0x351/0x690 ... In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms. Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for idleness of vma->active via fence_update(). That commit introduced vma->fence->active in order for the fence_update() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fence_update() in i915_vma_revoke_fence() with only fence_write(), and also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it. The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915_vma_revoke_fence(). Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021 Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: stable@vger.kernel.org # v5.8+ Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>