<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c, branch v5.10.67</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.67</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.67'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-03-09T10:11:13Z</updated>
<entry>
<title>drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie</title>
<updated>2021-03-09T10:11:13Z</updated>
<author>
<name>Kevin Wang</name>
<email>kevin1.wang@amd.com</email>
</author>
<published>2021-03-02T07:54:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3868a277e6fcd36eeb175ee43773f24388417fa1'/>
<id>urn:sha1:3868a277e6fcd36eeb175ee43773f24388417fa1</id>
<content type='text'>
commit 1aa46901ee51c1c5779b3b239ea0374a50c6d9ff upstream.

the register offset isn't needed division by 4 to pass RREG32_PCIE()

Signed-off-by: Kevin Wang &lt;kevin1.wang@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: support indirect access reg outside of mmio bar (v2)</title>
<updated>2020-10-01T14:42:55Z</updated>
<author>
<name>Hawking Zhang</name>
<email>Hawking.Zhang@amd.com</email>
</author>
<published>2020-09-18T12:32:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7ee1874b06cfda6295ef236116ef189fdb9bd06'/>
<id>urn:sha1:f7ee1874b06cfda6295ef236116ef189fdb9bd06</id>
<content type='text'>
support both direct and indirect accessor in unified
helper functions.

v2: Retire indirect mmio access via mm_index/data

Signed-off-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Kevin Wang &lt;kevin1.wang@amd.com&gt;
Reviewed-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Get DRM dev from adev by inline-f</title>
<updated>2020-08-24T17:06:06Z</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-24T16:29:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4a580877bdcb837e7a3754ae20798dcfccb44e80'/>
<id>urn:sha1:4a580877bdcb837e7a3754ae20798dcfccb44e80</id>
<content type='text'>
Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)</title>
<updated>2020-08-24T17:06:06Z</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-24T16:27:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1348969ab68cb864034ec4fe48a86c157cc4e10d'/>
<id>urn:sha1:1348969ab68cb864034ec4fe48a86c157cc4e10d</id>
<content type='text'>
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: change reset lock from mutex to rw_semaphore</title>
<updated>2020-08-24T16:23:48Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-08-20T02:06:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6049db43d6dd9cbb9d7f897839d15b0bc600e23c'/>
<id>urn:sha1:6049db43d6dd9cbb9d7f897839d15b0bc600e23c</id>
<content type='text'>
clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu &lt;monk.liu@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: refine codes to avoid reentering GPU recovery</title>
<updated>2020-08-24T16:22:56Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-08-19T09:23:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=53b3f8f40e6cff36ae12b11e6d6b308af3c7e53f'/>
<id>urn:sha1:53b3f8f40e6cff36ae12b11e6d6b308af3c7e53f</id>
<content type='text'>
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive-&gt;in_reset
and adev-&gt;in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: revert "fix system hang issue during GPU reset"</title>
<updated>2020-08-14T20:22:40Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-08-12T15:48:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f1403342ebdfcff3c3cf57ae476f19d3078f2767'/>
<id>urn:sha1:f1403342ebdfcff3c3cf57ae476f19d3078f2767</id>
<content type='text'>
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add debugfs interface for RAP test</title>
<updated>2020-08-14T20:22:40Z</updated>
<author>
<name>Wenhui Sheng</name>
<email>Wenhui.Sheng@amd.com</email>
</author>
<published>2020-08-11T03:02:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a4322e1881bed80ddb904482f0b2e948fa7fd47e'/>
<id>urn:sha1:a4322e1881bed80ddb904482f0b2e948fa7fd47e</id>
<content type='text'>
After amdgpu driver loading successfully, we can use
RAP debugfs interface &lt;debugfs_dir&gt;/dri/xxx/rap_test
to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng &lt;Wenhui.Sheng@amd.com&gt;
Reviewed-by: Guchun Chen &lt;Guchun.Chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix system hang issue during GPU reset</title>
<updated>2020-07-27T20:21:37Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-07-08T07:07:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df9c8d1aa278c435c30a69b8f2418b4a52fcb929'/>
<id>urn:sha1:df9c8d1aa278c435c30a69b8f2418b4a52fcb929</id>
<content type='text'>
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev-&gt;in_gpu_reset and hive-&gt;in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev-&gt;reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm-&gt;is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev-&gt;in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev-&gt;reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive-&gt;in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Suggested-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Suggested-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Suggested-by: Lijo Lazar &lt;Lijo.Lazar@amd.com&gt;
Suggested-by: Luben Tukov &lt;luben.tuikov@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add read amdgpu_gfxoff status in debugfs</title>
<updated>2020-07-21T19:37:37Z</updated>
<author>
<name>Jinzhou.Su</name>
<email>Jinzhou.Su@amd.com</email>
</author>
<published>2020-07-07T10:52:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=443c7f3c3641c790a7c306f9d9d54a2a5e3021b9'/>
<id>urn:sha1:443c7f3c3641c790a7c306f9d9d54a2a5e3021b9</id>
<content type='text'>
 Add interface for SMU12 device, used by UMR.

v2: fix code style

Signed-off-by: Jinzhou.Su &lt;Jinzhou.Su@amd.com&gt;
Reviewed-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Huang Rui &lt;ray.huang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
