* [PATCH] drm/amdkfd: dqm fence memory corruption
@ 2021-01-27 12:33 Qu Huang
From: Qu Huang @ 2021-01-27 12:33 UTC
  To: Felix.Kuehling
  Cc: alexander.deucher, christian.koenig, airlied, daniel, amd-gfx,
	dri-devel, linux-kernel, Qu Huang

Amdgpu driver uses 4-byte data type as DQM fence memory,
and transmits GPU address of fence memory to microcode
through query status PM4 message. However, query status
PM4 message definition and microcode processing are all
processed according to 8 bytes. Fence memory only allocates
4 bytes of memory, but microcode does write 8 bytes of memory,
so there is a memory corruption.

Signed-off-by: Qu Huang <jinsdb@126.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e686ce2..8b38d0c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1161,7 +1161,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	pr_debug("Allocating fence memory\n");
 
 	/* allocate fence memory on the gart */
-	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
+	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(uint64_t),
 					&dqm->fence_mem);
 
 	if (retval)
-- 
1.8.3.1
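The size mismatch described in the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not driver code: `firmware_write_fence`, `struct arena`, `overrun_bytes` and `neighbour_corrupted` are hypothetical names, the 4-byte fence type is modeled as `uint32_t`, and the "microcode" is modeled as a plain 8-byte store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the microcode: it always stores 8 bytes. */
static void firmware_write_fence(void *fence_mem, uint64_t value)
{
    memcpy(fence_mem, &value, sizeof(value));
}

/* Hypothetical model of sub-allocated memory: the fence slot sits at
 * offset 0 and whatever follows it belongs to some other allocation. */
struct arena {
    uint8_t bytes[16];
};

/* Number of bytes an 8-byte store touches beyond an allocation of
 * alloc_size bytes. */
static size_t overrun_bytes(size_t alloc_size)
{
    return alloc_size < sizeof(uint64_t) ? sizeof(uint64_t) - alloc_size : 0;
}

/* Returns 1 if bytes located after the fence allocation were modified
 * by the 8-byte store, 0 otherwise. */
static int neighbour_corrupted(size_t alloc_size)
{
    struct arena a;
    size_t i;

    assert(alloc_size <= sizeof(a.bytes));

    /* Fill the arena with a marker pattern, then let the "microcode"
     * write a 64-bit zero into the fence slot. */
    memset(a.bytes, 0xAA, sizeof(a.bytes));
    firmware_write_fence(a.bytes, 0);

    /* Any marker byte past the allocation that changed is corruption. */
    for (i = alloc_size; i < sizeof(a.bytes); i++)
        if (a.bytes[i] != 0xAA)
            return 1;
    return 0;
}
```

In this model, `neighbour_corrupted(sizeof(uint32_t))` reports corruption because the store spills 4 bytes past the slot, while `neighbour_corrupted(sizeof(uint64_t))` does not, which mirrors the one-line change in the patch.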
* Re: [PATCH] drm/amdkfd: dqm fence memory corruption
  2021-01-27 12:33 Qu Huang
@ 2021-01-27 21:50 Felix Kuehling
From: Felix Kuehling @ 2021-01-27 21:50 UTC
  To: Qu Huang
  Cc: alexander.deucher, christian.koenig, airlied, daniel, amd-gfx,
	dri-devel, linux-kernel

On 2021-01-27 at 7:33 a.m., Qu Huang wrote:
> Amdgpu driver uses 4-byte data type as DQM fence memory,
> and transmits GPU address of fence memory to microcode
> through query status PM4 message. However, query status
> PM4 message definition and microcode processing are all
> processed according to 8 bytes. Fence memory only allocates
> 4 bytes of memory, but microcode does write 8 bytes of memory,
> so there is a memory corruption.

Thank you for pointing out that discrepancy. That's a good catch!

I'd prefer to fix this properly by making dqm->fence_addr a u64 pointer.
We should probably also fix up the query_status and
amdkfd_fence_wait_timeout function interfaces to use 64-bit fence
values everywhere to be consistent.

Regards,
  Felix

>
> Signed-off-by: Qu Huang <jinsdb@126.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index e686ce2..8b38d0c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1161,7 +1161,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
>  	pr_debug("Allocating fence memory\n");
>
>  	/* allocate fence memory on the gart */
> -	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
> +	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(uint64_t),
>  					&dqm->fence_mem);
>
>  	if (retval)
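The suggestion to use 64-bit fence values consistently can be sketched in user space as follows. This is an illustrative model only, not the kernel's `amdkfd_fence_wait_timeout` or the actual v2 patch: `fence_wait_polls`, `demo_wait`, `matches_as_u32` and `matches_as_u64` are hypothetical names, and the timeout is modeled as a bounded number of polls rather than a time value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a fence wait that reads the fence through a
 * 64-bit pointer. Returns 0 once the fence holds fence_value, or -1
 * if max_polls reads pass without a match. */
static int fence_wait_polls(const volatile uint64_t *fence_addr,
                            uint64_t fence_value, unsigned int max_polls)
{
    unsigned int i;

    assert(fence_addr != NULL);
    for (i = 0; i < max_polls; i++) {
        if (*fence_addr == fence_value)
            return 0;
    }
    return -1;
}

/* Convenience wrapper for the model: waits on a local fence that
 * already holds the value "stored". */
static int demo_wait(uint64_t stored, uint64_t expected)
{
    volatile uint64_t fence = stored;

    return fence_wait_polls(&fence, expected, 3);
}

/* Comparison as a 32-bit reader would see it: the upper half of the
 * 8-byte fence is ignored. */
static int matches_as_u32(uint64_t stored, uint64_t expected)
{
    return (uint32_t)stored == (uint32_t)expected;
}

/* Comparison over the full 8 bytes. */
static int matches_as_u64(uint64_t stored, uint64_t expected)
{
    return stored == expected;
}
```

The two comparison helpers show one reason for keeping the width consistent: a stored value of `0x100000001` matches an expected value of `1` when only the low 32 bits are compared, but not when all 64 bits are.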
* Re: [PATCH] drm/amdkfd: dqm fence memory corruption
  2021-01-27 21:50 Felix Kuehling
@ 2021-03-26  9:38 Qu Huang
From: Qu Huang @ 2021-03-26 9:38 UTC
  To: Felix Kuehling
  Cc: alexander.deucher, christian.koenig, airlied, daniel, amd-gfx,
	dri-devel, linux-kernel

On 2021/1/28 5:50, Felix Kuehling wrote:
> On 2021-01-27 at 7:33 a.m., Qu Huang wrote:
>> Amdgpu driver uses 4-byte data type as DQM fence memory,
>> and transmits GPU address of fence memory to microcode
>> through query status PM4 message. However, query status
>> PM4 message definition and microcode processing are all
>> processed according to 8 bytes. Fence memory only allocates
>> 4 bytes of memory, but microcode does write 8 bytes of memory,
>> so there is a memory corruption.
>
> Thank you for pointing out that discrepancy. That's a good catch!
>
> I'd prefer to fix this properly by making dqm->fence_addr a u64 pointer.
> We should probably also fix up the query_status and
> amdkfd_fence_wait_timeout function interfaces to use 64-bit fence
> values everywhere to be consistent.
>
> Regards,
>   Felix

Hi Felix, Thanks for your advice, please check v2 at
https://lore.kernel.org/patchwork/patch/1372584/

Thanks,
Qu.

>
>
>>
>> Signed-off-by: Qu Huang <jinsdb@126.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index e686ce2..8b38d0c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -1161,7 +1161,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
>>  	pr_debug("Allocating fence memory\n");
>>
>>  	/* allocate fence memory on the gart */
>> -	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
>> +	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(uint64_t),
>>  					&dqm->fence_mem);
>>
>>  	if (retval)
* Re: [PATCH] drm/amdkfd: dqm fence memory corruption
  2021-03-26  9:38 Qu Huang
@ 2021-03-26 19:23 Felix Kuehling
From: Felix Kuehling @ 2021-03-26 19:23 UTC
  To: Qu Huang
  Cc: alexander.deucher, christian.koenig, airlied, daniel, amd-gfx,
	dri-devel, linux-kernel

On 2021-03-26 at 5:38 a.m., Qu Huang wrote:
> On 2021/1/28 5:50, Felix Kuehling wrote:
>> On 2021-01-27 at 7:33 a.m., Qu Huang wrote:
>>> Amdgpu driver uses 4-byte data type as DQM fence memory,
>>> and transmits GPU address of fence memory to microcode
>>> through query status PM4 message. However, query status
>>> PM4 message definition and microcode processing are all
>>> processed according to 8 bytes. Fence memory only allocates
>>> 4 bytes of memory, but microcode does write 8 bytes of memory,
>>> so there is a memory corruption.
>>
>> Thank you for pointing out that discrepancy. That's a good catch!
>>
>> I'd prefer to fix this properly by making dqm->fence_addr a u64 pointer.
>> We should probably also fix up the query_status and
>> amdkfd_fence_wait_timeout function interfaces to use 64-bit fence
>> values everywhere to be consistent.
>>
>> Regards,
>>   Felix
> Hi Felix, Thanks for your advice, please check v2 at
> https://lore.kernel.org/patchwork/patch/1372584/

Thank you for the reminder. I somehow missed your v2 patch on the
mailing list. I have reviewed and applied it to amd-staging-drm-next now.

Regards,
  Felix

> Thanks,
> Qu.
>>
>>
>>>
>>> Signed-off-by: Qu Huang <jinsdb@126.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> index e686ce2..8b38d0c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>>> @@ -1161,7 +1161,7 @@ static int start_cpsch(struct
>>> device_queue_manager *dqm)
>>>       pr_debug("Allocating fence memory\n");
>>>       /* allocate fence memory on the gart */
>>> -    retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
>>> +    retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(uint64_t),
>>>                       &dqm->fence_mem);
>>>       if (retval)
>
end of thread (newest message: ~2021-03-26 19:24 UTC)

Thread overview (unique messages):
2021-01-27 12:33 [PATCH] drm/amdkfd: dqm fence memory corruption Qu Huang
2021-01-27 21:50 ` Felix Kuehling
2021-03-26  9:38   ` Qu Huang
2021-03-26 19:23     ` Felix Kuehling