* Screen corruption using radeon kernel driver @ 2022-04-23 16:31 Krylov Michael 2022-04-25 17:22 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Krylov Michael @ 2022-04-23 16:31 UTC (permalink / raw) To: amd-gfx [-- Attachment #1: Type: text/plain, Size: 1966 bytes --] Hello! After updating my Linux kernel from version 4.19 (Debian 10 version) to 5.10 (packaged with Debian 11), I've noticed that the image displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 AGP video card is severely corrupted in the graphical (Xorg and Wayland) mode: all kinds of black and white stripes across the screen, some letters missing, etc. I've checked several options (Xorg drivers, Wayland instead of Xorg, radeon.agpmode=-1 in kernel command line and so on), but the problem persisted. I've managed to find that the problem was in the kernel, as everything worked well with 4.19 kernel with everything else being from Debian 11. I have managed to find the culprit of that corruption, that is the commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. Reverting this commit and building the kernel with that commit reverted fixes the problem. Disabling HIMEM also gets rid of that problem. But it also leaves the system with less that 1G of RAM, which is, of course, undesirable. Apparently this problem is somewhat known, as I can tell after googling for the commit id, see this link for example: https://lkml.org/lkml/2020/1/9/518 Mageia distro, for example, reverted this commit in the kernel they are building: http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 I've reported this bug to Debian bugtracker, checked the recent verion of the kernel (5.17), bug still persists. Here's a link to the Debian bug page: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 I'm not sure if reverting this commit is the correct way to go, so if you need to check any changes/patches that I could apply and test on the real hardware, I'll be glad to do that (but please keep in mind that testing could take some time, I don't have access to this computer 24/7, but I'll do my best to respond ASAP). Thanks in advance! [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-04-23 16:31 Screen corruption using radeon kernel driver Krylov Michael @ 2022-04-25 17:22 ` Alex Deucher 2022-05-16 13:01 ` Mikhail Krylov 2022-11-28 14:31 ` Mikhail Krylov 0 siblings, 2 replies; 38+ messages in thread From: Alex Deucher @ 2022-04-25 17:22 UTC (permalink / raw) To: Krylov Michael, Maling list - DRI developers; +Cc: amd-gfx list + dri-devel On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael <sqarert@gmail.com> wrote: > > Hello! > > After updating my Linux kernel from version 4.19 (Debian 10 version) to > 5.10 (packaged with Debian 11), I've noticed that the image > displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 > AGP video card is severely corrupted in the graphical (Xorg and Wayland) > mode: all kinds of black and white stripes across the screen, some > letters missing, etc. > > I've checked several options (Xorg drivers, Wayland instead of > Xorg, radeon.agpmode=-1 in kernel command line and so on), but the > problem persisted. I've managed to find that the problem was in the > kernel, as everything worked well with 4.19 kernel with everything > else being from Debian 11. > > I have managed to find the culprit of that corruption, that is the > commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. > Reverting this commit and building the kernel with that commit reverted > fixes the problem. Disabling HIMEM also gets rid of that problem. But it > also leaves the system with less that 1G of RAM, which is, of course, > undesirable. > > Apparently this problem is somewhat known, as I can tell after googling > for the commit id, see this link for example: > https://lkml.org/lkml/2020/1/9/518 > > Mageia distro, for example, reverted this commit in the kernel they are > building: > > http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 > > I've reported this bug to Debian bugtracker, checked the recent verion > of the kernel (5.17), bug still persists. Here's a link to the Debian > bug page: > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 > > I'm not sure if reverting this commit is the correct way to go, so if > you need to check any changes/patches that I could apply and test on > the real hardware, I'll be glad to do that (but please keep in mind > that testing could take some time, I don't have access to this computer > 24/7, but I'll do my best to respond ASAP). I would be happy to revert that commit. I attempted to revert it a year or so ago, but Christoph didn't want to. He was going to look further into it. I was not able to repro the issue. It seemed to be related to highmem support. You might try disabling that. Here is the previous thread for reference: https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html Alex ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-04-25 17:22 ` Alex Deucher @ 2022-05-16 13:01 ` Mikhail Krylov 2022-11-28 14:31 ` Mikhail Krylov 1 sibling, 0 replies; 38+ messages in thread From: Mikhail Krylov @ 2022-05-16 13:01 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx list, Maling list - DRI developers [-- Attachment #1: Type: text/plain, Size: 2830 bytes --] On Mon, Apr 25, 2022 at 01:22:04PM -0400, Alex Deucher wrote: > + dri-devel > > On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael <sqarert@gmail.com> wrote: > > > > Hello! > > > > After updating my Linux kernel from version 4.19 (Debian 10 version) to > > 5.10 (packaged with Debian 11), I've noticed that the image > > displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 > > AGP video card is severely corrupted in the graphical (Xorg and Wayland) > > mode: all kinds of black and white stripes across the screen, some > > letters missing, etc. > > > > I've checked several options (Xorg drivers, Wayland instead of > > Xorg, radeon.agpmode=-1 in kernel command line and so on), but the > > problem persisted. I've managed to find that the problem was in the > > kernel, as everything worked well with 4.19 kernel with everything > > else being from Debian 11. > > > > I have managed to find the culprit of that corruption, that is the > > commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. > > Reverting this commit and building the kernel with that commit reverted > > fixes the problem. Disabling HIMEM also gets rid of that problem. But it > > also leaves the system with less that 1G of RAM, which is, of course, > > undesirable. > > > > Apparently this problem is somewhat known, as I can tell after googling > > for the commit id, see this link for example: > > https://lkml.org/lkml/2020/1/9/518 > > > > Mageia distro, for example, reverted this commit in the kernel they are > > building: > > > > http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 > > > > I've reported this bug to Debian bugtracker, checked the recent verion > > of the kernel (5.17), bug still persists. Here's a link to the Debian > > bug page: > > > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 > > > > I'm not sure if reverting this commit is the correct way to go, so if > > you need to check any changes/patches that I could apply and test on > > the real hardware, I'll be glad to do that (but please keep in mind > > that testing could take some time, I don't have access to this computer > > 24/7, but I'll do my best to respond ASAP). > > I would be happy to revert that commit. I attempted to revert it a > year or so ago, but Christoph didn't want to. He was going to look > further into it. I was not able to repro the issue. It seemed to be > related to highmem support. You might try disabling that. Here is > the previous thread for reference: > https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html > > Alex Yeah, I tried to disable HIMEM, and that indeed fixes the problem, but it leaves me with less than 1G of available memory which is undesirable. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-04-25 17:22 ` Alex Deucher 2022-05-16 13:01 ` Mikhail Krylov @ 2022-11-28 14:31 ` Mikhail Krylov 2022-11-28 14:50 ` Alex Deucher 1 sibling, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-11-28 14:31 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx list, Maling list - DRI developers [-- Attachment #1: Type: text/plain, Size: 3100 bytes --] On Mon, Apr 25, 2022 at 01:22:04PM -0400, Alex Deucher wrote: > + dri-devel > > On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael <sqarert@gmail.com> wrote: > > > > Hello! > > > > After updating my Linux kernel from version 4.19 (Debian 10 version) to > > 5.10 (packaged with Debian 11), I've noticed that the image > > displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 > > AGP video card is severely corrupted in the graphical (Xorg and Wayland) > > mode: all kinds of black and white stripes across the screen, some > > letters missing, etc. > > > > I've checked several options (Xorg drivers, Wayland instead of > > Xorg, radeon.agpmode=-1 in kernel command line and so on), but the > > problem persisted. I've managed to find that the problem was in the > > kernel, as everything worked well with 4.19 kernel with everything > > else being from Debian 11. > > > > I have managed to find the culprit of that corruption, that is the > > commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. > > Reverting this commit and building the kernel with that commit reverted > > fixes the problem. Disabling HIMEM also gets rid of that problem. But it > > also leaves the system with less that 1G of RAM, which is, of course, > > undesirable. > > > > Apparently this problem is somewhat known, as I can tell after googling > > for the commit id, see this link for example: > > https://lkml.org/lkml/2020/1/9/518 > > > > Mageia distro, for example, reverted this commit in the kernel they are > > building: > > > > http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 > > > > I've reported this bug to Debian bugtracker, checked the recent verion > > of the kernel (5.17), bug still persists. Here's a link to the Debian > > bug page: > > > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 > > > > I'm not sure if reverting this commit is the correct way to go, so if > > you need to check any changes/patches that I could apply and test on > > the real hardware, I'll be glad to do that (but please keep in mind > > that testing could take some time, I don't have access to this computer > > 24/7, but I'll do my best to respond ASAP). > > I would be happy to revert that commit. I attempted to revert it a > year or so ago, but Christoph didn't want to. He was going to look > further into it. I was not able to repro the issue. It seemed to be > related to highmem support. You might try disabling that. Here is > the previous thread for reference: > https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html > > Alex So, is there any progress on this issue? I do understand it's not a high priority one, and today I've checked it on 6.0 kernel, and unfortunately, it still persists... I'm considering writing a patch that will allow user to override need_dma32/dma_bits setting with a module parameter. I'll have some time after the New Year for that. Is it at all possible that such a patch will be merged into kernel? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-28 14:31 ` Mikhail Krylov @ 2022-11-28 14:50 ` Alex Deucher 2022-11-28 20:48 ` Mikhail Krylov 0 siblings, 1 reply; 38+ messages in thread From: Alex Deucher @ 2022-11-28 14:50 UTC (permalink / raw) To: Mikhail Krylov; +Cc: amd-gfx list, Maling list - DRI developers On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > On Mon, Apr 25, 2022 at 01:22:04PM -0400, Alex Deucher wrote: > > + dri-devel > > > > On Mon, Apr 25, 2022 at 3:33 AM Krylov Michael <sqarert@gmail.com> wrote: > > > > > > Hello! > > > > > > After updating my Linux kernel from version 4.19 (Debian 10 version) to > > > 5.10 (packaged with Debian 11), I've noticed that the image > > > displayed on my older computer, 32-bit Pentium 4 using ATI Radeon X1950 > > > AGP video card is severely corrupted in the graphical (Xorg and Wayland) > > > mode: all kinds of black and white stripes across the screen, some > > > letters missing, etc. > > > > > > I've checked several options (Xorg drivers, Wayland instead of > > > Xorg, radeon.agpmode=-1 in kernel command line and so on), but the > > > problem persisted. I've managed to find that the problem was in the > > > kernel, as everything worked well with 4.19 kernel with everything > > > else being from Debian 11. > > > > > > I have managed to find the culprit of that corruption, that is the > > > commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713 on the linux kernel. > > > Reverting this commit and building the kernel with that commit reverted > > > fixes the problem. Disabling HIMEM also gets rid of that problem. But it > > > also leaves the system with less that 1G of RAM, which is, of course, > > > undesirable. > > > > > > Apparently this problem is somewhat known, as I can tell after googling > > > for the commit id, see this link for example: > > > https://lkml.org/lkml/2020/1/9/518 > > > > > > Mageia distro, for example, reverted this commit in the kernel they are > > > building: > > > > > > http://sophie.zarb.org/distrib/Mageia/7/i586/by-pkgid/b9193a4f85192bc57f4d770fb9bb399c/files/32 > > > > > > I've reported this bug to Debian bugtracker, checked the recent verion > > > of the kernel (5.17), bug still persists. Here's a link to the Debian > > > bug page: > > > > > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993670 > > > > > > I'm not sure if reverting this commit is the correct way to go, so if > > > you need to check any changes/patches that I could apply and test on > > > the real hardware, I'll be glad to do that (but please keep in mind > > > that testing could take some time, I don't have access to this computer > > > 24/7, but I'll do my best to respond ASAP). > > > > I would be happy to revert that commit. I attempted to revert it a > > year or so ago, but Christoph didn't want to. He was going to look > > further into it. I was not able to repro the issue. It seemed to be > > related to highmem support. You might try disabling that. Here is > > the previous thread for reference: > > https://lists.freedesktop.org/archives/amd-gfx/2020-September/053922.html > > > > Alex > > So, is there any progress on this issue? I do understand it's not a high > priority one, and today I've checked it on 6.0 kernel, and > unfortunately, it still persists... > > I'm considering writing a patch that will allow user to override > need_dma32/dma_bits setting with a module parameter. I'll have some time > after the New Year for that. > > Is it at all possible that such a patch will be merged into kernel? Unless someone familiar with HIMEM can figure out what is going wrong we should just revert the patch. Alex ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-28 14:50 ` Alex Deucher @ 2022-11-28 20:48 ` Mikhail Krylov 2022-11-29 14:44 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-11-28 20:48 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx list, Maling list - DRI developers [-- Attachment #1: Type: text/plain, Size: 2420 bytes --] On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: >>> [excessive quoting removed] >> So, is there any progress on this issue? I do understand it's not a high >> priority one, and today I've checked it on 6.0 kernel, and >> unfortunately, it still persists... >> >> I'm considering writing a patch that will allow user to override >> need_dma32/dma_bits setting with a module parameter. I'll have some time >> after the New Year for that. >> >> Is it at all possible that such a patch will be merged into kernel? >> > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > Unless someone familiar with HIMEM can figure out what is going wrong > we should just revert the patch. > > Alex Okay, I was suggesting that mostly because a) it works for me with dma_bits = 40 (I understand that's what it is without the original patch applied); b) there's a hint of uncertainity on this line https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 saying that for AGP dma_bits = 32 is the safest option, so apparently there are setups, unlike mine, where dma_bits = 32 is better than 40. But I'm in no position to argue, just wanted to make myself clear. I'm okay with rebuilding the kernel for my machine until the original patch is reverted or any other fix is applied. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-28 20:48 ` Mikhail Krylov @ 2022-11-29 14:44 ` Alex Deucher 2022-11-29 15:59 ` Mikhail Krylov 0 siblings, 1 reply; 38+ messages in thread From: Alex Deucher @ 2022-11-29 14:44 UTC (permalink / raw) To: Mikhail Krylov; +Cc: amd-gfx list, Maling list - DRI developers On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > >>> [excessive quoting removed] > > >> So, is there any progress on this issue? I do understand it's not a high > >> priority one, and today I've checked it on 6.0 kernel, and > >> unfortunately, it still persists... > >> > >> I'm considering writing a patch that will allow user to override > >> need_dma32/dma_bits setting with a module parameter. I'll have some time > >> after the New Year for that. > >> > >> Is it at all possible that such a patch will be merged into kernel? > >> > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > Unless someone familiar with HIMEM can figure out what is going wrong > > we should just revert the patch. > > > > Alex > > > Okay, I was suggesting that mostly because > > a) it works for me with dma_bits = 40 (I understand that's what it is > without the original patch applied); > > b) there's a hint of uncertainity on this line > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > setups, unlike mine, where dma_bits = 32 is better than 40. > > But I'm in no position to argue, just wanted to make myself clear. > I'm okay with rebuilding the kernel for my machine until the original > patch is reverted or any other fix is applied. What GPU do you have and is it AGP? If it is AGP, does setting radeon.agpmode=-1 also fix it? Alex ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-29 14:44 ` Alex Deucher @ 2022-11-29 15:59 ` Mikhail Krylov 2022-11-29 16:05 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-11-29 15:59 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx list, Maling list - DRI developers [-- Attachment #1: Type: text/plain, Size: 1943 bytes --] On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > > > >>> [excessive quoting removed] > > > > >> So, is there any progress on this issue? I do understand it's not a high > > >> priority one, and today I've checked it on 6.0 kernel, and > > >> unfortunately, it still persists... > > >> > > >> I'm considering writing a patch that will allow user to override > > >> need_dma32/dma_bits setting with a module parameter. I'll have some time > > >> after the New Year for that. > > >> > > >> Is it at all possible that such a patch will be merged into kernel? > > >> > > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > Unless someone familiar with HIMEM can figure out what is going wrong > > > we should just revert the patch. > > > > > > Alex > > > > > > Okay, I was suggesting that mostly because > > > > a) it works for me with dma_bits = 40 (I understand that's what it is > > without the original patch applied); > > > > b) there's a hint of uncertainity on this line > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > setups, unlike mine, where dma_bits = 32 is better than 40. > > > > But I'm in no position to argue, just wanted to make myself clear. > > I'm okay with rebuilding the kernel for my machine until the original > > patch is reverted or any other fix is applied. > > What GPU do you have and is it AGP? If it is AGP, does setting > radeon.agpmode=-1 also fix it? > > Alex That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't help, it just makes 3D acceleration in games such as OpenArena stop working. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-29 15:59 ` Mikhail Krylov @ 2022-11-29 16:05 ` Alex Deucher 2022-11-29 17:11 ` Mikhail Krylov 0 siblings, 1 reply; 38+ messages in thread From: Alex Deucher @ 2022-11-29 16:05 UTC (permalink / raw) To: Mikhail Krylov; +Cc: amd-gfx list, Maling list - DRI developers On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > > > > > >>> [excessive quoting removed] > > > > > > >> So, is there any progress on this issue? I do understand it's not a high > > > >> priority one, and today I've checked it on 6.0 kernel, and > > > >> unfortunately, it still persists... > > > >> > > > >> I'm considering writing a patch that will allow user to override > > > >> need_dma32/dma_bits setting with a module parameter. I'll have some time > > > >> after the New Year for that. > > > >> > > > >> Is it at all possible that such a patch will be merged into kernel? > > > >> > > > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > Unless someone familiar with HIMEM can figure out what is going wrong > > > > we should just revert the patch. > > > > > > > > Alex > > > > > > > > > Okay, I was suggesting that mostly because > > > > > > a) it works for me with dma_bits = 40 (I understand that's what it is > > > without the original patch applied); > > > > > > b) there's a hint of uncertainity on this line > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > > setups, unlike mine, where dma_bits = 32 is better than 40. > > > > > > But I'm in no position to argue, just wanted to make myself clear. > > > I'm okay with rebuilding the kernel for my machine until the original > > > patch is reverted or any other fix is applied. > > > > What GPU do you have and is it AGP? If it is AGP, does setting > > radeon.agpmode=-1 also fix it? > > > > Alex > > That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > help, it just makes 3D acceleration in games such as OpenArena stop > working. Just to confirm, is the board AGP or PCIe? Alex ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-29 16:05 ` Alex Deucher @ 2022-11-29 17:11 ` Mikhail Krylov 2022-11-30 12:54 ` Robin Murphy 0 siblings, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-11-29 17:11 UTC (permalink / raw) To: Alex Deucher; +Cc: amd-gfx list, Maling list - DRI developers [-- Attachment #1: Type: text/plain, Size: 2369 bytes --] On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > > On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > > > > > > > >>> [excessive quoting removed] > > > > > > > > >> So, is there any progress on this issue? I do understand it's not a high > > > > >> priority one, and today I've checked it on 6.0 kernel, and > > > > >> unfortunately, it still persists... > > > > >> > > > > >> I'm considering writing a patch that will allow user to override > > > > >> need_dma32/dma_bits setting with a module parameter. I'll have some time > > > > >> after the New Year for that. > > > > >> > > > > >> Is it at all possible that such a patch will be merged into kernel? > > > > >> > > > > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > Unless someone familiar with HIMEM can figure out what is going wrong > > > > > we should just revert the patch. > > > > > > > > > > Alex > > > > > > > > > > > > Okay, I was suggesting that mostly because > > > > > > > > a) it works for me with dma_bits = 40 (I understand that's what it is > > > > without the original patch applied); > > > > > > > > b) there's a hint of uncertainity on this line > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > > > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > > > setups, unlike mine, where dma_bits = 32 is better than 40. > > > > > > > > But I'm in no position to argue, just wanted to make myself clear. > > > > I'm okay with rebuilding the kernel for my machine until the original > > > > patch is reverted or any other fix is applied. > > > > > > What GPU do you have and is it AGP? If it is AGP, does setting > > > radeon.agpmode=-1 also fix it? > > > > > > Alex > > > > That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > > help, it just makes 3D acceleration in games such as OpenArena stop > > working. > > Just to confirm, is the board AGP or PCIe? > > Alex It is AGP. That's an old machine. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-29 17:11 ` Mikhail Krylov @ 2022-11-30 12:54 ` Robin Murphy 2022-11-30 14:28 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Robin Murphy @ 2022-11-30 12:54 UTC (permalink / raw) To: Mikhail Krylov, Alex Deucher; +Cc: Maling list - DRI developers, amd-gfx list On 2022-11-29 17:11, Mikhail Krylov wrote: > On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: >> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>> >>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: >>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>> >>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: >>>>> >>>>>>>> [excessive quoting removed] >>>>> >>>>>>> So, is there any progress on this issue? I do understand it's not a high >>>>>>> priority one, and today I've checked it on 6.0 kernel, and >>>>>>> unfortunately, it still persists... >>>>>>> >>>>>>> I'm considering writing a patch that will allow user to override >>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time >>>>>>> after the New Year for that. >>>>>>> >>>>>>> Is it at all possible that such a patch will be merged into kernel? >>>>>>> >>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>> Unless someone familiar with HIMEM can figure out what is going wrong >>>>>> we should just revert the patch. >>>>>> >>>>>> Alex >>>>> >>>>> >>>>> Okay, I was suggesting that mostly because >>>>> >>>>> a) it works for me with dma_bits = 40 (I understand that's what it is >>>>> without the original patch applied); >>>>> >>>>> b) there's a hint of uncertainity on this line >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 >>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are >>>>> setups, unlike mine, where dma_bits = 32 is better than 40. >>>>> >>>>> But I'm in no position to argue, just wanted to make myself clear. >>>>> I'm okay with rebuilding the kernel for my machine until the original >>>>> patch is reverted or any other fix is applied. >>>> >>>> What GPU do you have and is it AGP? If it is AGP, does setting >>>> radeon.agpmode=-1 also fix it? >>>> >>>> Alex >>> >>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't >>> help, it just makes 3D acceleration in games such as OpenArena stop >>> working. >> >> Just to confirm, is the board AGP or PCIe? >> >> Alex > > It is AGP. That's an old machine. Can you check whether dma_addressing_limited() is actually returning the expected result at the point of radeon_ttm_init()? Disabling highmem is presumably just hiding whatever problem exists, by throwing away all >32-bit RAM such that use_dma32 doesn't matter. Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 12:54 ` Robin Murphy @ 2022-11-30 14:28 ` Alex Deucher 2022-11-30 15:42 ` Robin Murphy 0 siblings, 1 reply; 38+ messages in thread From: Alex Deucher @ 2022-11-30 14:28 UTC (permalink / raw) To: Robin Murphy; +Cc: Mikhail Krylov, Maling list - DRI developers, amd-gfx list On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > > On 2022-11-29 17:11, Mikhail Krylov wrote: > > On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > >> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>> > >>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > >>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>> > >>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > >>>>> > >>>>>>>> [excessive quoting removed] > >>>>> > >>>>>>> So, is there any progress on this issue? I do understand it's not a high > >>>>>>> priority one, and today I've checked it on 6.0 kernel, and > >>>>>>> unfortunately, it still persists... > >>>>>>> > >>>>>>> I'm considering writing a patch that will allow user to override > >>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time > >>>>>>> after the New Year for that. > >>>>>>> > >>>>>>> Is it at all possible that such a patch will be merged into kernel? > >>>>>>> > >>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>> Unless someone familiar with HIMEM can figure out what is going wrong > >>>>>> we should just revert the patch. > >>>>>> > >>>>>> Alex > >>>>> > >>>>> > >>>>> Okay, I was suggesting that mostly because > >>>>> > >>>>> a) it works for me with dma_bits = 40 (I understand that's what it is > >>>>> without the original patch applied); > >>>>> > >>>>> b) there's a hint of uncertainity on this line > >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > >>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are > >>>>> setups, unlike mine, where dma_bits = 32 is better than 40. > >>>>> > >>>>> But I'm in no position to argue, just wanted to make myself clear. > >>>>> I'm okay with rebuilding the kernel for my machine until the original > >>>>> patch is reverted or any other fix is applied. > >>>> > >>>> What GPU do you have and is it AGP? If it is AGP, does setting > >>>> radeon.agpmode=-1 also fix it? > >>>> > >>>> Alex > >>> > >>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > >>> help, it just makes 3D acceleration in games such as OpenArena stop > >>> working. > >> > >> Just to confirm, is the board AGP or PCIe? > >> > >> Alex > > > > It is AGP. That's an old machine. > > Can you check whether dma_addressing_limited() is actually returning the > expected result at the point of radeon_ttm_init()? Disabling highmem is > presumably just hiding whatever problem exists, by throwing away all > >32-bit RAM such that use_dma32 doesn't matter. The device in question only supports a 32 bit DMA mask so dma_addressing_limited() should return true. Bounce buffers are not really usable on GPUs because they map so much memory. If dma_addressing_limited() returns false, that would explain it. Alex ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 14:28 ` Alex Deucher @ 2022-11-30 15:42 ` Robin Murphy 2022-11-30 16:07 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Robin Murphy @ 2022-11-30 15:42 UTC (permalink / raw) To: Alex Deucher; +Cc: Mikhail Krylov, Maling list - DRI developers, amd-gfx list On 2022-11-30 14:28, Alex Deucher wrote: > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: >> >> On 2022-11-29 17:11, Mikhail Krylov wrote: >>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: >>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>> >>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: >>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>> >>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: >>>>>>> >>>>>>>>>> [excessive quoting removed] >>>>>>> >>>>>>>>> So, is there any progress on this issue? I do understand it's not a high >>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and >>>>>>>>> unfortunately, it still persists... >>>>>>>>> >>>>>>>>> I'm considering writing a patch that will allow user to override >>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time >>>>>>>>> after the New Year for that. >>>>>>>>> >>>>>>>>> Is it at all possible that such a patch will be merged into kernel? >>>>>>>>> >>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong >>>>>>>> we should just revert the patch. >>>>>>>> >>>>>>>> Alex >>>>>>> >>>>>>> >>>>>>> Okay, I was suggesting that mostly because >>>>>>> >>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is >>>>>>> without the original patch applied); >>>>>>> >>>>>>> b) there's a hint of uncertainity on this line >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 >>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are >>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. >>>>>>> >>>>>>> But I'm in no position to argue, just wanted to make myself clear. >>>>>>> I'm okay with rebuilding the kernel for my machine until the original >>>>>>> patch is reverted or any other fix is applied. >>>>>> >>>>>> What GPU do you have and is it AGP? If it is AGP, does setting >>>>>> radeon.agpmode=-1 also fix it? >>>>>> >>>>>> Alex >>>>> >>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't >>>>> help, it just makes 3D acceleration in games such as OpenArena stop >>>>> working. >>>> >>>> Just to confirm, is the board AGP or PCIe? >>>> >>>> Alex >>> >>> It is AGP. That's an old machine. >> >> Can you check whether dma_addressing_limited() is actually returning the >> expected result at the point of radeon_ttm_init()? Disabling highmem is >> presumably just hiding whatever problem exists, by throwing away all >> >32-bit RAM such that use_dma32 doesn't matter. > > The device in question only supports a 32 bit DMA mask so > dma_addressing_limited() should return true. Bounce buffers are not > really usable on GPUs because they map so much memory. If > dma_addressing_limited() returns false, that would explain it. Right, it appears to be the only part of the offending commit that *could* reasonably make any difference, so I'm primarily wondering if dma_get_required_mask() somehow gets confused. Thanks, Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 15:42 ` Robin Murphy @ 2022-11-30 16:07 ` Alex Deucher 2022-11-30 19:59 ` Mikhail Krylov 2022-12-10 15:32 ` Mikhail Krylov 0 siblings, 2 replies; 38+ messages in thread From: Alex Deucher @ 2022-11-30 16:07 UTC (permalink / raw) To: Robin Murphy; +Cc: Mikhail Krylov, Maling list - DRI developers, amd-gfx list On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > > On 2022-11-30 14:28, Alex Deucher wrote: > > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > >> > >> On 2022-11-29 17:11, Mikhail Krylov wrote: > >>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > >>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>> > >>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > >>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>>> > >>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > >>>>>>> > >>>>>>>>>> [excessive quoting removed] > >>>>>>> > >>>>>>>>> So, is there any progress on this issue? I do understand it's not a high > >>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and > >>>>>>>>> unfortunately, it still persists... > >>>>>>>>> > >>>>>>>>> I'm considering writing a patch that will allow user to override > >>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time > >>>>>>>>> after the New Year for that. > >>>>>>>>> > >>>>>>>>> Is it at all possible that such a patch will be merged into kernel? > >>>>>>>>> > >>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong > >>>>>>>> we should just revert the patch. > >>>>>>>> > >>>>>>>> Alex > >>>>>>> > >>>>>>> > >>>>>>> Okay, I was suggesting that mostly because > >>>>>>> > >>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is > >>>>>>> without the original patch applied); > >>>>>>> > >>>>>>> b) there's a hint of uncertainity on this line > >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > >>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are > >>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. > >>>>>>> > >>>>>>> But I'm in no position to argue, just wanted to make myself clear. > >>>>>>> I'm okay with rebuilding the kernel for my machine until the original > >>>>>>> patch is reverted or any other fix is applied. > >>>>>> > >>>>>> What GPU do you have and is it AGP? If it is AGP, does setting > >>>>>> radeon.agpmode=-1 also fix it? > >>>>>> > >>>>>> Alex > >>>>> > >>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > >>>>> help, it just makes 3D acceleration in games such as OpenArena stop > >>>>> working. > >>>> > >>>> Just to confirm, is the board AGP or PCIe? > >>>> > >>>> Alex > >>> > >>> It is AGP. That's an old machine. > >> > >> Can you check whether dma_addressing_limited() is actually returning the > >> expected result at the point of radeon_ttm_init()? Disabling highmem is > >> presumably just hiding whatever problem exists, by throwing away all > >> >32-bit RAM such that use_dma32 doesn't matter. > > > > The device in question only supports a 32 bit DMA mask so > > dma_addressing_limited() should return true. Bounce buffers are not > > really usable on GPUs because they map so much memory. If > > dma_addressing_limited() returns false, that would explain it. > > Right, it appears to be the only part of the offending commit that > *could* reasonably make any difference, so I'm primarily wondering if > dma_get_required_mask() somehow gets confused. Mikhail, Can you see that dma_addressing_limited() and dma_get_required_mask() return in this case? Alex > > Thanks, > Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 16:07 ` Alex Deucher @ 2022-11-30 19:59 ` Mikhail Krylov 2022-12-01 14:00 ` Robin Murphy 2022-12-10 15:32 ` Mikhail Krylov 1 sibling, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-11-30 19:59 UTC (permalink / raw) To: Alex Deucher; +Cc: Robin Murphy, Maling list - DRI developers, amd-gfx list [-- Attachment #1: Type: text/plain, Size: 4868 bytes --] On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: > On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > On 2022-11-30 14:28, Alex Deucher wrote: > > > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > > >> > > >> On 2022-11-29 17:11, Mikhail Krylov wrote: > > >>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > > >>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>> > > >>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > >>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>>>> > > >>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > >>>>>>> > > >>>>>>>>>> [excessive quoting removed] > > >>>>>>> > > >>>>>>>>> So, is there any progress on this issue? I do understand it's not a high > > >>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and > > >>>>>>>>> unfortunately, it still persists... > > >>>>>>>>> > > >>>>>>>>> I'm considering writing a patch that will allow user to override > > >>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time > > >>>>>>>>> after the New Year for that. > > >>>>>>>>> > > >>>>>>>>> Is it at all possible that such a patch will be merged into kernel? > > >>>>>>>>> > > >>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong > > >>>>>>>> we should just revert the patch. > > >>>>>>>> > > >>>>>>>> Alex > > >>>>>>> > > >>>>>>> > > >>>>>>> Okay, I was suggesting that mostly because > > >>>>>>> > > >>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is > > >>>>>>> without the original patch applied); > > >>>>>>> > > >>>>>>> b) there's a hint of uncertainity on this line > > >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > >>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > >>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. > > >>>>>>> > > >>>>>>> But I'm in no position to argue, just wanted to make myself clear. > > >>>>>>> I'm okay with rebuilding the kernel for my machine until the original > > >>>>>>> patch is reverted or any other fix is applied. > > >>>>>> > > >>>>>> What GPU do you have and is it AGP? If it is AGP, does setting > > >>>>>> radeon.agpmode=-1 also fix it? > > >>>>>> > > >>>>>> Alex > > >>>>> > > >>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > > >>>>> help, it just makes 3D acceleration in games such as OpenArena stop > > >>>>> working. > > >>>> > > >>>> Just to confirm, is the board AGP or PCIe? > > >>>> > > >>>> Alex > > >>> > > >>> It is AGP. That's an old machine. > > >> > > >> Can you check whether dma_addressing_limited() is actually returning the > > >> expected result at the point of radeon_ttm_init()? Disabling highmem is > > >> presumably just hiding whatever problem exists, by throwing away all > > >> >32-bit RAM such that use_dma32 doesn't matter. > > > > > > The device in question only supports a 32 bit DMA mask so > > > dma_addressing_limited() should return true. Bounce buffers are not > > > really usable on GPUs because they map so much memory. If > > > dma_addressing_limited() returns false, that would explain it. > > > > Right, it appears to be the only part of the offending commit that > > *could* reasonably make any difference, so I'm primarily wondering if > > dma_get_required_mask() somehow gets confused. > > Mikhail, > > Can you see that dma_addressing_limited() and dma_get_required_mask() > return in this case? > > Alex > > > > > > Thanks, > > Robin. Unfortunately, right now I don't have enough time for kernel modifications and rebuilds (I will later!), so I did a quick-and-dirty research with kprobe. The problem is that dma_addressing_limited() seems to be inlined and kprobe fails to intercept it. But I managed to get the result of dma_get_required_mask(). It returns 0x7fffffff (!) on the vanilla (with the patch, buggy) kernel: $ sudo kprobe-perf 'r:dma_get_required_mask $retval' Tracing kprobe dma_get_required_mask. Ctrl-C to end. modprobe-1244 [000] d... 105.582816: dma_get_required_mask: (radeon_ttm_init+0x61/0x240 [radeon] <- dma_get_required_mask) arg1=0x7fffffff This function does not even get called in the kernel without the patch that I built myself. I believe that's because ttm_bo_device_init() doesn't call it without the patch. Hope that helps at least a bit. If not, I'll be able to do more thorough research in a couple of weeks, probably. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 19:59 ` Mikhail Krylov @ 2022-12-01 14:00 ` Robin Murphy 2022-12-01 14:06 ` Alex Deucher 2022-12-01 15:28 ` Mikhail Krylov 0 siblings, 2 replies; 38+ messages in thread From: Robin Murphy @ 2022-12-01 14:00 UTC (permalink / raw) To: Mikhail Krylov, Alex Deucher; +Cc: Maling list - DRI developers, amd-gfx list On 2022-11-30 19:59, Mikhail Krylov wrote: > On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: >> On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: >>> >>> On 2022-11-30 14:28, Alex Deucher wrote: >>>> On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: >>>>> >>>>> On 2022-11-29 17:11, Mikhail Krylov wrote: >>>>>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: >>>>>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>> >>>>>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: >>>>>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: >>>>>>>>>> >>>>>>>>>>>>> [excessive quoting removed] >>>>>>>>>> >>>>>>>>>>>> So, is there any progress on this issue? I do understand it's not a high >>>>>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and >>>>>>>>>>>> unfortunately, it still persists... >>>>>>>>>>>> >>>>>>>>>>>> I'm considering writing a patch that will allow user to override >>>>>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time >>>>>>>>>>>> after the New Year for that. >>>>>>>>>>>> >>>>>>>>>>>> Is it at all possible that such a patch will be merged into kernel? >>>>>>>>>>>> >>>>>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong >>>>>>>>>>> we should just revert the patch. >>>>>>>>>>> >>>>>>>>>>> Alex >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Okay, I was suggesting that mostly because >>>>>>>>>> >>>>>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is >>>>>>>>>> without the original patch applied); >>>>>>>>>> >>>>>>>>>> b) there's a hint of uncertainity on this line >>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 >>>>>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are >>>>>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. >>>>>>>>>> >>>>>>>>>> But I'm in no position to argue, just wanted to make myself clear. >>>>>>>>>> I'm okay with rebuilding the kernel for my machine until the original >>>>>>>>>> patch is reverted or any other fix is applied. >>>>>>>>> >>>>>>>>> What GPU do you have and is it AGP? If it is AGP, does setting >>>>>>>>> radeon.agpmode=-1 also fix it? >>>>>>>>> >>>>>>>>> Alex >>>>>>>> >>>>>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't >>>>>>>> help, it just makes 3D acceleration in games such as OpenArena stop >>>>>>>> working. >>>>>>> >>>>>>> Just to confirm, is the board AGP or PCIe? >>>>>>> >>>>>>> Alex >>>>>> >>>>>> It is AGP. That's an old machine. >>>>> >>>>> Can you check whether dma_addressing_limited() is actually returning the >>>>> expected result at the point of radeon_ttm_init()? Disabling highmem is >>>>> presumably just hiding whatever problem exists, by throwing away all >>>>> >32-bit RAM such that use_dma32 doesn't matter. >>>> >>>> The device in question only supports a 32 bit DMA mask so >>>> dma_addressing_limited() should return true. Bounce buffers are not >>>> really usable on GPUs because they map so much memory. If >>>> dma_addressing_limited() returns false, that would explain it. >>> >>> Right, it appears to be the only part of the offending commit that >>> *could* reasonably make any difference, so I'm primarily wondering if >>> dma_get_required_mask() somehow gets confused. >> >> Mikhail, >> >> Can you see that dma_addressing_limited() and dma_get_required_mask() >> return in this case? >> >> Alex >> >> >>> >>> Thanks, >>> Robin. > > Unfortunately, right now I don't have enough time for kernel > modifications and rebuilds (I will later!), so I did a quick-and-dirty > research with kprobe. > > The problem is that dma_addressing_limited() seems to be inlined and > kprobe fails to intercept it. > > But I managed to get the result of dma_get_required_mask(). It returns > 0x7fffffff (!) on the vanilla (with the patch, buggy) kernel: > > $ sudo kprobe-perf 'r:dma_get_required_mask $retval' > Tracing kprobe dma_get_required_mask. Ctrl-C to end. > modprobe-1244 [000] d... 105.582816: dma_get_required_mask: (radeon_ttm_init+0x61/0x240 [radeon] <- dma_get_required_mask) arg1=0x7fffffff > > This function does not even get called in the kernel without the patch > that I built myself. I believe that's because ttm_bo_device_init() > doesn't call it without the patch. > > Hope that helps at least a bit. If not, I'll be able to do more thorough > research in a couple of weeks, probably. Hmm, just to clarify, what's your actual RAM layout? I've been assuming that the issue must be caused by unexpected DMA address truncation, but double-checking the older threads it seems that might not be the case. I just did a quick sanity-check of both HIGHMEM4G and HIGHMEM64G configs in a VM with either 2GB or 4GB of RAM assigned, and the dma_direct_get_required_mask() calculation seemed to return the appropriate result for all combinations. Otherwise, the only significant difference of use_dma32 seems to be to switch TTM's allocation flags from GFP_HIGHUSER to GFP_DMA32. Could it just be that the highmem support somewhere between TTM and radeon has bitrotted, and it hasn't been noticed until this change because everyone still using a 32-bit system with highmem also happens not to be using a newer 40-bit-capable GPU? Or perhaps it never worked for AGP at all, in which case an explicit special case might be clearer? diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index d33fec488713..acb2d534bff5 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -696,6 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) rdev->ddev->anon_inode->i_mapping, rdev->ddev->vma_offset_manager, rdev->need_swiotlb, + rdev->flags & RADEON_IS_AGP || dma_addressing_limited(&rdev->pdev->dev)); if (r) { DRM_ERROR("failed initializing buffer object driver(%d).\n", r); Robin. ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-12-01 14:00 ` Robin Murphy @ 2022-12-01 14:06 ` Alex Deucher 2022-12-01 15:28 ` Mikhail Krylov 1 sibling, 0 replies; 38+ messages in thread From: Alex Deucher @ 2022-12-01 14:06 UTC (permalink / raw) To: Robin Murphy; +Cc: Mikhail Krylov, Maling list - DRI developers, amd-gfx list On Thu, Dec 1, 2022 at 9:01 AM Robin Murphy <robin.murphy@arm.com> wrote: > > On 2022-11-30 19:59, Mikhail Krylov wrote: > > On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: > >> On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > >>> > >>> On 2022-11-30 14:28, Alex Deucher wrote: > >>>> On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > >>>>> > >>>>> On 2022-11-29 17:11, Mikhail Krylov wrote: > >>>>>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > >>>>>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>>>> > >>>>>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > >>>>>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>>>>>> > >>>>>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > >>>>>>>>>> > >>>>>>>>>>>>> [excessive quoting removed] > >>>>>>>>>> > >>>>>>>>>>>> So, is there any progress on this issue? I do understand it's not a high > >>>>>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and > >>>>>>>>>>>> unfortunately, it still persists... > >>>>>>>>>>>> > >>>>>>>>>>>> I'm considering writing a patch that will allow user to override > >>>>>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time > >>>>>>>>>>>> after the New Year for that. > >>>>>>>>>>>> > >>>>>>>>>>>> Is it at all possible that such a patch will be merged into kernel? > >>>>>>>>>>>> > >>>>>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > >>>>>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong > >>>>>>>>>>> we should just revert the patch. > >>>>>>>>>>> > >>>>>>>>>>> Alex > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Okay, I was suggesting that mostly because > >>>>>>>>>> > >>>>>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is > >>>>>>>>>> without the original patch applied); > >>>>>>>>>> > >>>>>>>>>> b) there's a hint of uncertainity on this line > >>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > >>>>>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are > >>>>>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. > >>>>>>>>>> > >>>>>>>>>> But I'm in no position to argue, just wanted to make myself clear. > >>>>>>>>>> I'm okay with rebuilding the kernel for my machine until the original > >>>>>>>>>> patch is reverted or any other fix is applied. > >>>>>>>>> > >>>>>>>>> What GPU do you have and is it AGP? If it is AGP, does setting > >>>>>>>>> radeon.agpmode=-1 also fix it? > >>>>>>>>> > >>>>>>>>> Alex > >>>>>>>> > >>>>>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > >>>>>>>> help, it just makes 3D acceleration in games such as OpenArena stop > >>>>>>>> working. > >>>>>>> > >>>>>>> Just to confirm, is the board AGP or PCIe? > >>>>>>> > >>>>>>> Alex > >>>>>> > >>>>>> It is AGP. That's an old machine. > >>>>> > >>>>> Can you check whether dma_addressing_limited() is actually returning the > >>>>> expected result at the point of radeon_ttm_init()? Disabling highmem is > >>>>> presumably just hiding whatever problem exists, by throwing away all > >>>>> >32-bit RAM such that use_dma32 doesn't matter. > >>>> > >>>> The device in question only supports a 32 bit DMA mask so > >>>> dma_addressing_limited() should return true. Bounce buffers are not > >>>> really usable on GPUs because they map so much memory. If > >>>> dma_addressing_limited() returns false, that would explain it. > >>> > >>> Right, it appears to be the only part of the offending commit that > >>> *could* reasonably make any difference, so I'm primarily wondering if > >>> dma_get_required_mask() somehow gets confused. > >> > >> Mikhail, > >> > >> Can you see that dma_addressing_limited() and dma_get_required_mask() > >> return in this case? > >> > >> Alex > >> > >> > >>> > >>> Thanks, > >>> Robin. > > > > Unfortunately, right now I don't have enough time for kernel > > modifications and rebuilds (I will later!), so I did a quick-and-dirty > > research with kprobe. > > > > The problem is that dma_addressing_limited() seems to be inlined and > > kprobe fails to intercept it. > > > > But I managed to get the result of dma_get_required_mask(). It returns > > 0x7fffffff (!) on the vanilla (with the patch, buggy) kernel: > > > > $ sudo kprobe-perf 'r:dma_get_required_mask $retval' > > Tracing kprobe dma_get_required_mask. Ctrl-C to end. > > modprobe-1244 [000] d... 105.582816: dma_get_required_mask: (radeon_ttm_init+0x61/0x240 [radeon] <- dma_get_required_mask) arg1=0x7fffffff > > > > This function does not even get called in the kernel without the patch > > that I built myself. I believe that's because ttm_bo_device_init() > > doesn't call it without the patch. > > > > Hope that helps at least a bit. If not, I'll be able to do more thorough > > research in a couple of weeks, probably. > > Hmm, just to clarify, what's your actual RAM layout? I've been assuming > that the issue must be caused by unexpected DMA address truncation, but > double-checking the older threads it seems that might not be the case. > I just did a quick sanity-check of both HIGHMEM4G and HIGHMEM64G configs > in a VM with either 2GB or 4GB of RAM assigned, and the > dma_direct_get_required_mask() calculation seemed to return the > appropriate result for all combinations. > > Otherwise, the only significant difference of use_dma32 seems to be to > switch TTM's allocation flags from GFP_HIGHUSER to GFP_DMA32. Could it > just be that the highmem support somewhere between TTM and radeon has > bitrotted, and it hasn't been noticed until this change because everyone > still using a 32-bit system with highmem also happens not to be using a > newer 40-bit-capable GPU? Or perhaps it never worked for AGP at all, in > which case an explicit special case might be clearer? WIth AGP, the driver just sets up an aperture on the GPU to point to the AGP aperture in the system. The platform AGP drivers handle the DMA mappings into their aperture. It's possible the AGP drivers are doing something wrong with respect to their DMA masks? Alex > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > index d33fec488713..acb2d534bff5 100644 > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > @@ -696,6 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) > rdev->ddev->anon_inode->i_mapping, > rdev->ddev->vma_offset_manager, > rdev->need_swiotlb, > + rdev->flags & RADEON_IS_AGP || > dma_addressing_limited(&rdev->pdev->dev)); > if (r) { > DRM_ERROR("failed initializing buffer object driver(%d).\n", r); > > > Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-12-01 14:00 ` Robin Murphy @ 2022-12-01 15:28 ` Mikhail Krylov 2022-12-01 15:28 ` Mikhail Krylov 1 sibling, 0 replies; 38+ messages in thread From: Mikhail Krylov @ 2022-12-01 15:28 UTC (permalink / raw) To: Robin Murphy; +Cc: Alex Deucher, Maling list - DRI developers, amd-gfx list [-- Attachment #1: Type: text/plain, Size: 7775 bytes --] On Thu, Dec 01, 2022 at 02:00:58PM +0000, Robin Murphy wrote: > On 2022-11-30 19:59, Mikhail Krylov wrote: > > On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: > > > On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > > > > > On 2022-11-30 14:28, Alex Deucher wrote: > > > > > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > > > > > > > > > On 2022-11-29 17:11, Mikhail Krylov wrote: > > > > > > > On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > > > > > > > > On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > > > > > > > > > On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > > > > > > > > > > > > > > > > > > > > > > > > [excessive quoting removed] > > > > > > > > > > > > > > > > > > > > > > > > So, is there any progress on this issue? I do understand it's not a high > > > > > > > > > > > > > priority one, and today I've checked it on 6.0 kernel, and > > > > > > > > > > > > > unfortunately, it still persists... > > > > > > > > > > > > > > > > > > > > > > > > > > I'm considering writing a patch that will allow user to override > > > > > > > > > > > > > need_dma32/dma_bits setting with a module parameter. I'll have some time > > > > > > > > > > > > > after the New Year for that. > > > > > > > > > > > > > > > > > > > > > > > > > > Is it at all possible that such a patch will be merged into kernel? > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > Unless someone familiar with HIMEM can figure out what is going wrong > > > > > > > > > > > > we should just revert the patch. > > > > > > > > > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Okay, I was suggesting that mostly because > > > > > > > > > > > > > > > > > > > > > > a) it works for me with dma_bits = 40 (I understand that's what it is > > > > > > > > > > > without the original patch applied); > > > > > > > > > > > > > > > > > > > > > > b) there's a hint of uncertainity on this line > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > > > > > > > > > > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > > > > > > > > > > setups, unlike mine, where dma_bits = 32 is better than 40. > > > > > > > > > > > > > > > > > > > > > > But I'm in no position to argue, just wanted to make myself clear. > > > > > > > > > > > I'm okay with rebuilding the kernel for my machine until the original > > > > > > > > > > > patch is reverted or any other fix is applied. > > > > > > > > > > > > > > > > > > > > What GPU do you have and is it AGP? If it is AGP, does setting > > > > > > > > > > radeon.agpmode=-1 also fix it? > > > > > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > > > > > > > > > help, it just makes 3D acceleration in games such as OpenArena stop > > > > > > > > > working. > > > > > > > > > > > > > > > > Just to confirm, is the board AGP or PCIe? > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > It is AGP. That's an old machine. > > > > > > > > > > > > Can you check whether dma_addressing_limited() is actually returning the > > > > > > expected result at the point of radeon_ttm_init()? Disabling highmem is > > > > > > presumably just hiding whatever problem exists, by throwing away all > > > > > > >32-bit RAM such that use_dma32 doesn't matter. > > > > > > > > > > The device in question only supports a 32 bit DMA mask so > > > > > dma_addressing_limited() should return true. Bounce buffers are not > > > > > really usable on GPUs because they map so much memory. If > > > > > dma_addressing_limited() returns false, that would explain it. > > > > > > > > Right, it appears to be the only part of the offending commit that > > > > *could* reasonably make any difference, so I'm primarily wondering if > > > > dma_get_required_mask() somehow gets confused. > > > > > > Mikhail, > > > > > > Can you see that dma_addressing_limited() and dma_get_required_mask() > > > return in this case? > > > > > > Alex > > > > > > > > > > > > > > Thanks, > > > > Robin. > > > > Unfortunately, right now I don't have enough time for kernel > > modifications and rebuilds (I will later!), so I did a quick-and-dirty > > research with kprobe. > > > > The problem is that dma_addressing_limited() seems to be inlined and > > kprobe fails to intercept it. > > > > But I managed to get the result of dma_get_required_mask(). It returns > > 0x7fffffff (!) on the vanilla (with the patch, buggy) kernel: > > $ sudo kprobe-perf 'r:dma_get_required_mask $retval' > > Tracing kprobe dma_get_required_mask. Ctrl-C to end. > > modprobe-1244 [000] d... 105.582816: dma_get_required_mask: (radeon_ttm_init+0x61/0x240 [radeon] <- dma_get_required_mask) arg1=0x7fffffff > > > > This function does not even get called in the kernel without the patch > > that I built myself. I believe that's because ttm_bo_device_init() > > doesn't call it without the patch. > > > > Hope that helps at least a bit. If not, I'll be able to do more thorough > > research in a couple of weeks, probably. > > Hmm, just to clarify, what's your actual RAM layout? I've been assuming > that the issue must be caused by unexpected DMA address truncation, but > double-checking the older threads it seems that might not be the case. > I just did a quick sanity-check of both HIGHMEM4G and HIGHMEM64G configs > in a VM with either 2GB or 4GB of RAM assigned, and the > dma_direct_get_required_mask() calculation seemed to return the > appropriate result for all combinations. > > Otherwise, the only significant difference of use_dma32 seems to be to > switch TTM's allocation flags from GFP_HIGHUSER to GFP_DMA32. Could it > just be that the highmem support somewhere between TTM and radeon has > bitrotted, and it hasn't been noticed until this change because everyone > still using a 32-bit system with highmem also happens not to be using a > newer 40-bit-capable GPU? Or perhaps it never worked for AGP at all, in > which case an explicit special case might be clearer? > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > index d33fec488713..acb2d534bff5 100644 > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > @@ -696,6 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) > rdev->ddev->anon_inode->i_mapping, > rdev->ddev->vma_offset_manager, > rdev->need_swiotlb, > + rdev->flags & RADEON_IS_AGP || > dma_addressing_limited(&rdev->pdev->dev)); > if (r) { > DRM_ERROR("failed initializing buffer object driver(%d).\n", r); > > Robin. Sorry, not sure what you mean, I'll try to guess: The bug exists on the stock 32-bit non-pae debian kernel (pae one also works, but bug persists even there): https://packages.debian.org/stable/kernel/linux-image-5.10.0-18-686 It has the following memory layout related options: CONFIG_HIGHMEM4G=y CONFIG_VMSPLIT_3G=y CONFIG_HIGHMEM=y The machine itself has 1.5G of RAM (1024M + 512M sticks). [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver @ 2022-12-01 15:28 ` Mikhail Krylov 0 siblings, 0 replies; 38+ messages in thread From: Mikhail Krylov @ 2022-12-01 15:28 UTC (permalink / raw) To: Robin Murphy; +Cc: Maling list - DRI developers, amd-gfx list [-- Attachment #1: Type: text/plain, Size: 7775 bytes --] On Thu, Dec 01, 2022 at 02:00:58PM +0000, Robin Murphy wrote: > On 2022-11-30 19:59, Mikhail Krylov wrote: > > On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: > > > On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > > > > > On 2022-11-30 14:28, Alex Deucher wrote: > > > > > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > > > > > > > > > On 2022-11-29 17:11, Mikhail Krylov wrote: > > > > > > > On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > > > > > > > > On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > > > > > > > > > On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > > > > > > > > > > > > > > > > > > > > > > > > [excessive quoting removed] > > > > > > > > > > > > > > > > > > > > > > > > So, is there any progress on this issue? I do understand it's not a high > > > > > > > > > > > > > priority one, and today I've checked it on 6.0 kernel, and > > > > > > > > > > > > > unfortunately, it still persists... > > > > > > > > > > > > > > > > > > > > > > > > > > I'm considering writing a patch that will allow user to override > > > > > > > > > > > > > need_dma32/dma_bits setting with a module parameter. I'll have some time > > > > > > > > > > > > > after the New Year for that. > > > > > > > > > > > > > > > > > > > > > > > > > > Is it at all possible that such a patch will be merged into kernel? > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > > > > > > > > > > > Unless someone familiar with HIMEM can figure out what is going wrong > > > > > > > > > > > > we should just revert the patch. > > > > > > > > > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Okay, I was suggesting that mostly because > > > > > > > > > > > > > > > > > > > > > > a) it works for me with dma_bits = 40 (I understand that's what it is > > > > > > > > > > > without the original patch applied); > > > > > > > > > > > > > > > > > > > > > > b) there's a hint of uncertainity on this line > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > > > > > > > > > > saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > > > > > > > > > > setups, unlike mine, where dma_bits = 32 is better than 40. > > > > > > > > > > > > > > > > > > > > > > But I'm in no position to argue, just wanted to make myself clear. > > > > > > > > > > > I'm okay with rebuilding the kernel for my machine until the original > > > > > > > > > > > patch is reverted or any other fix is applied. > > > > > > > > > > > > > > > > > > > > What GPU do you have and is it AGP? If it is AGP, does setting > > > > > > > > > > radeon.agpmode=-1 also fix it? > > > > > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > > That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > > > > > > > > > help, it just makes 3D acceleration in games such as OpenArena stop > > > > > > > > > working. > > > > > > > > > > > > > > > > Just to confirm, is the board AGP or PCIe? > > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > It is AGP. That's an old machine. > > > > > > > > > > > > Can you check whether dma_addressing_limited() is actually returning the > > > > > > expected result at the point of radeon_ttm_init()? Disabling highmem is > > > > > > presumably just hiding whatever problem exists, by throwing away all > > > > > > >32-bit RAM such that use_dma32 doesn't matter. > > > > > > > > > > The device in question only supports a 32 bit DMA mask so > > > > > dma_addressing_limited() should return true. Bounce buffers are not > > > > > really usable on GPUs because they map so much memory. If > > > > > dma_addressing_limited() returns false, that would explain it. > > > > > > > > Right, it appears to be the only part of the offending commit that > > > > *could* reasonably make any difference, so I'm primarily wondering if > > > > dma_get_required_mask() somehow gets confused. > > > > > > Mikhail, > > > > > > Can you see that dma_addressing_limited() and dma_get_required_mask() > > > return in this case? > > > > > > Alex > > > > > > > > > > > > > > Thanks, > > > > Robin. > > > > Unfortunately, right now I don't have enough time for kernel > > modifications and rebuilds (I will later!), so I did a quick-and-dirty > > research with kprobe. > > > > The problem is that dma_addressing_limited() seems to be inlined and > > kprobe fails to intercept it. > > > > But I managed to get the result of dma_get_required_mask(). It returns > > 0x7fffffff (!) on the vanilla (with the patch, buggy) kernel: > > $ sudo kprobe-perf 'r:dma_get_required_mask $retval' > > Tracing kprobe dma_get_required_mask. Ctrl-C to end. > > modprobe-1244 [000] d... 105.582816: dma_get_required_mask: (radeon_ttm_init+0x61/0x240 [radeon] <- dma_get_required_mask) arg1=0x7fffffff > > > > This function does not even get called in the kernel without the patch > > that I built myself. I believe that's because ttm_bo_device_init() > > doesn't call it without the patch. > > > > Hope that helps at least a bit. If not, I'll be able to do more thorough > > research in a couple of weeks, probably. > > Hmm, just to clarify, what's your actual RAM layout? I've been assuming > that the issue must be caused by unexpected DMA address truncation, but > double-checking the older threads it seems that might not be the case. > I just did a quick sanity-check of both HIGHMEM4G and HIGHMEM64G configs > in a VM with either 2GB or 4GB of RAM assigned, and the > dma_direct_get_required_mask() calculation seemed to return the > appropriate result for all combinations. > > Otherwise, the only significant difference of use_dma32 seems to be to > switch TTM's allocation flags from GFP_HIGHUSER to GFP_DMA32. Could it > just be that the highmem support somewhere between TTM and radeon has > bitrotted, and it hasn't been noticed until this change because everyone > still using a 32-bit system with highmem also happens not to be using a > newer 40-bit-capable GPU? Or perhaps it never worked for AGP at all, in > which case an explicit special case might be clearer? > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > index d33fec488713..acb2d534bff5 100644 > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > @@ -696,6 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) > rdev->ddev->anon_inode->i_mapping, > rdev->ddev->vma_offset_manager, > rdev->need_swiotlb, > + rdev->flags & RADEON_IS_AGP || > dma_addressing_limited(&rdev->pdev->dev)); > if (r) { > DRM_ERROR("failed initializing buffer object driver(%d).\n", r); > > Robin. Sorry, not sure what you mean, I'll try to guess: The bug exists on the stock 32-bit non-pae debian kernel (pae one also works, but bug persists even there): https://packages.debian.org/stable/kernel/linux-image-5.10.0-18-686 It has the following memory layout related options: CONFIG_HIGHMEM4G=y CONFIG_VMSPLIT_3G=y CONFIG_HIGHMEM=y The machine itself has 1.5G of RAM (1024M + 512M sticks). [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-11-30 16:07 ` Alex Deucher 2022-11-30 19:59 ` Mikhail Krylov @ 2022-12-10 15:32 ` Mikhail Krylov 2022-12-11 5:52 ` Luben Tuikov 1 sibling, 1 reply; 38+ messages in thread From: Mikhail Krylov @ 2022-12-10 15:32 UTC (permalink / raw) To: Alex Deucher; +Cc: Robin Murphy, Maling list - DRI developers, amd-gfx list [-- Attachment #1: Type: text/plain, Size: 4155 bytes --] On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: > On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: > > > > On 2022-11-30 14:28, Alex Deucher wrote: > > > On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: > > >> > > >> On 2022-11-29 17:11, Mikhail Krylov wrote: > > >>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: > > >>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>> > > >>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: > > >>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>>>> > > >>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: > > >>>>>>> > > >>>>>>>>>> [excessive quoting removed] > > >>>>>>> > > >>>>>>>>> So, is there any progress on this issue? I do understand it's not a high > > >>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and > > >>>>>>>>> unfortunately, it still persists... > > >>>>>>>>> > > >>>>>>>>> I'm considering writing a patch that will allow user to override > > >>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time > > >>>>>>>>> after the New Year for that. > > >>>>>>>>> > > >>>>>>>>> Is it at all possible that such a patch will be merged into kernel? > > >>>>>>>>> > > >>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: > > >>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong > > >>>>>>>> we should just revert the patch. > > >>>>>>>> > > >>>>>>>> Alex > > >>>>>>> > > >>>>>>> > > >>>>>>> Okay, I was suggesting that mostly because > > >>>>>>> > > >>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is > > >>>>>>> without the original patch applied); > > >>>>>>> > > >>>>>>> b) there's a hint of uncertainity on this line > > >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 > > >>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are > > >>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. > > >>>>>>> > > >>>>>>> But I'm in no position to argue, just wanted to make myself clear. > > >>>>>>> I'm okay with rebuilding the kernel for my machine until the original > > >>>>>>> patch is reverted or any other fix is applied. > > >>>>>> > > >>>>>> What GPU do you have and is it AGP? If it is AGP, does setting > > >>>>>> radeon.agpmode=-1 also fix it? > > >>>>>> > > >>>>>> Alex > > >>>>> > > >>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't > > >>>>> help, it just makes 3D acceleration in games such as OpenArena stop > > >>>>> working. > > >>>> > > >>>> Just to confirm, is the board AGP or PCIe? > > >>>> > > >>>> Alex > > >>> > > >>> It is AGP. That's an old machine. > > >> > > >> Can you check whether dma_addressing_limited() is actually returning the > > >> expected result at the point of radeon_ttm_init()? Disabling highmem is > > >> presumably just hiding whatever problem exists, by throwing away all > > >> >32-bit RAM such that use_dma32 doesn't matter. > > > > > > The device in question only supports a 32 bit DMA mask so > > > dma_addressing_limited() should return true. Bounce buffers are not > > > really usable on GPUs because they map so much memory. If > > > dma_addressing_limited() returns false, that would explain it. > > > > Right, it appears to be the only part of the offending commit that > > *could* reasonably make any difference, so I'm primarily wondering if > > dma_get_required_mask() somehow gets confused. > > Mikhail, > > Can you see that dma_addressing_limited() and dma_get_required_mask() > return in this case? > > Alex > > > > > > Thanks, > > Robin. Hello again, I was able to confirm by adding printk() to the functions and recompiling the kernel that dma_addressing_limited() returns *false* on the kernel with the bug. And dma_get_required_mask() returns 0x7fffffff, as I said before. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Screen corruption using radeon kernel driver 2022-12-10 15:32 ` Mikhail Krylov @ 2022-12-11 5:52 ` Luben Tuikov 2022-12-11 11:42 ` [PATCH] drm/radeon: Fix screen corruption Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-11 5:52 UTC (permalink / raw) To: Mikhail Krylov, Alex Deucher Cc: Robin Murphy, amd-gfx list, Maling list - DRI developers On 2022-12-10 10:32, Mikhail Krylov wrote: > On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote: >> On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@arm.com> wrote: >>> >>> On 2022-11-30 14:28, Alex Deucher wrote: >>>> On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@arm.com> wrote: >>>>> >>>>> On 2022-11-29 17:11, Mikhail Krylov wrote: >>>>>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote: >>>>>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>> >>>>>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote: >>>>>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote: >>>>>>>>>> >>>>>>>>>>>>> [excessive quoting removed] >>>>>>>>>> >>>>>>>>>>>> So, is there any progress on this issue? I do understand it's not a high >>>>>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and >>>>>>>>>>>> unfortunately, it still persists... >>>>>>>>>>>> >>>>>>>>>>>> I'm considering writing a patch that will allow user to override >>>>>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time >>>>>>>>>>>> after the New Year for that. >>>>>>>>>>>> >>>>>>>>>>>> Is it at all possible that such a patch will be merged into kernel? >>>>>>>>>>>> >>>>>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@gmail.com> wrote: >>>>>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong >>>>>>>>>>> we should just revert the patch. >>>>>>>>>>> >>>>>>>>>>> Alex >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Okay, I was suggesting that mostly because >>>>>>>>>> >>>>>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is >>>>>>>>>> without the original patch applied); >>>>>>>>>> >>>>>>>>>> b) there's a hint of uncertainity on this line >>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359 >>>>>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are >>>>>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40. >>>>>>>>>> >>>>>>>>>> But I'm in no position to argue, just wanted to make myself clear. >>>>>>>>>> I'm okay with rebuilding the kernel for my machine until the original >>>>>>>>>> patch is reverted or any other fix is applied. >>>>>>>>> >>>>>>>>> What GPU do you have and is it AGP? If it is AGP, does setting >>>>>>>>> radeon.agpmode=-1 also fix it? >>>>>>>>> >>>>>>>>> Alex >>>>>>>> >>>>>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't >>>>>>>> help, it just makes 3D acceleration in games such as OpenArena stop >>>>>>>> working. >>>>>>> >>>>>>> Just to confirm, is the board AGP or PCIe? >>>>>>> >>>>>>> Alex >>>>>> >>>>>> It is AGP. That's an old machine. >>>>> >>>>> Can you check whether dma_addressing_limited() is actually returning the >>>>> expected result at the point of radeon_ttm_init()? Disabling highmem is >>>>> presumably just hiding whatever problem exists, by throwing away all >>>>> >32-bit RAM such that use_dma32 doesn't matter. >>>> >>>> The device in question only supports a 32 bit DMA mask so >>>> dma_addressing_limited() should return true. Bounce buffers are not >>>> really usable on GPUs because they map so much memory. If >>>> dma_addressing_limited() returns false, that would explain it. >>> >>> Right, it appears to be the only part of the offending commit that >>> *could* reasonably make any difference, so I'm primarily wondering if >>> dma_get_required_mask() somehow gets confused. >> >> Mikhail, >> >> Can you see that dma_addressing_limited() and dma_get_required_mask() >> return in this case? >> >> Alex >> >> >>> >>> Thanks, >>> Robin. > > Hello again, I was able to confirm by adding printk() to the functions > and recompiling the kernel that dma_addressing_limited() returns > *false* on the kernel with the bug. > > And dma_get_required_mask() returns 0x7fffffff, as I said before. Yes, dma_addressing_limited() evaluates to "false" in your case, and this is the correct answer according to the function's comment: "Return %true if the devices DMA mask is too small to address all memory in the system, else %false." In this case the device's DMA mask is 0xFFFFFFFF and the mask for the 1.5 GiB memory is 0x7FFFFFFF, so the static inline returns "false". (dma_direct_get_required_mask() returns this for your memory size.) It would appear that dma_addressing_limited() isn't answering the question which the last parameter to ttm_device_init(), "use GFP_DMA32", wants answered. Perhaps we should use another method to make sure that that parameter is set in the scenario in question. Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH] drm/radeon: Fix screen corruption 2022-12-11 5:52 ` Luben Tuikov @ 2022-12-11 11:42 ` Luben Tuikov 2022-12-12 2:08 ` [PATCH] drm/radeon: Fix screen corruption (v2) Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-11 11:42 UTC (permalink / raw) To: AMD Graphics Cc: Alex Deucher, Mikhail Krylov, Luben Tuikov, Robin Murphy, Direct Rendering Infrastructure - Development Fix screen corruption on older 32-bit systems using AGP chips. Partially revert commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713. Cc: Mikhail Krylov <sqarert@gmail.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Fixes: 33b3ad3788aba8 ("drm/radeon: handle PCIe root ports with addressing limitations") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> --- drivers/gpu/drm/radeon/radeon.h | 1 + drivers/gpu/drm/radeon/radeon_device.c | 2 +- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 37dec92339b16a..4fe38fd9be3267 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -2426,6 +2426,7 @@ struct radeon_device { struct radeon_wb wb; struct radeon_dummy_page dummy_page; bool shutdown; + bool need_dma32; bool need_swiotlb; bool accel_working; bool fastfb_working; /* IGP feature*/ diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 6344454a772172..3643a3cfe061bd 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1370,7 +1370,7 @@ int radeon_device_init(struct radeon_device *rdev, if (rdev->family == CHIP_CEDAR) dma_bits = 32; #endif - + rdev->need_dma32 = dma_bits == 32; r = dma_set_mask_and_coherent(&rdev->pdev->dev, DMA_BIT_MASK(dma_bits)); if (r) { pr_warn("radeon: No suitable DMA available\n"); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index bdb4c0e0736ba2..3debaeb720d173 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -696,7 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) rdev->ddev->anon_inode->i_mapping, rdev->ddev->vma_offset_manager, rdev->need_swiotlb, - dma_addressing_limited(&rdev->pdev->dev)); + rdev->need_dma32); if (r) { DRM_ERROR("failed initializing buffer object driver(%d).\n", r); return r; base-commit: 20e03e7f6e8efd42168db6d3fe044b804e0ede8f -- 2.39.0.rc2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-11 11:42 ` [PATCH] drm/radeon: Fix screen corruption Luben Tuikov @ 2022-12-12 2:08 ` Luben Tuikov 2022-12-14 21:53 ` Robin Murphy 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-12 2:08 UTC (permalink / raw) To: AMD Graphics Cc: Alex Deucher, Mikhail Krylov, Luben Tuikov, Robin Murphy, Direct Rendering Infrastructure - Development Fix screen corruption on older 32-bit systems using AGP chips. On older systems with little memory, for instance 1.5 GiB, using an AGP chip, the device's DMA mask is 0xFFFFFFFF, but the memory mask is 0x7FFFFFF, and subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, false. As such the result of this static inline isn't suitable for the last argument to ttm_device_init()--it simply needs to now whether to use GFP_DMA32 when allocating DMA buffers. Partially reverts commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713. v2: Amend the commit description. Cc: Mikhail Krylov <sqarert@gmail.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> Fixes: 33b3ad3788aba8 ("drm/radeon: handle PCIe root ports with addressing limitations") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> --- drivers/gpu/drm/radeon/radeon.h | 1 + drivers/gpu/drm/radeon/radeon_device.c | 2 +- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 37dec92339b16a..4fe38fd9be3267 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -2426,6 +2426,7 @@ struct radeon_device { struct radeon_wb wb; struct radeon_dummy_page dummy_page; bool shutdown; + bool need_dma32; bool need_swiotlb; bool accel_working; bool fastfb_working; /* IGP feature*/ diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 6344454a772172..3643a3cfe061bd 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1370,7 +1370,7 @@ int radeon_device_init(struct radeon_device *rdev, if (rdev->family == CHIP_CEDAR) dma_bits = 32; #endif - + rdev->need_dma32 = dma_bits == 32; r = dma_set_mask_and_coherent(&rdev->pdev->dev, DMA_BIT_MASK(dma_bits)); if (r) { pr_warn("radeon: No suitable DMA available\n"); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index bdb4c0e0736ba2..3debaeb720d173 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -696,7 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) rdev->ddev->anon_inode->i_mapping, rdev->ddev->vma_offset_manager, rdev->need_swiotlb, - dma_addressing_limited(&rdev->pdev->dev)); + rdev->need_dma32); if (r) { DRM_ERROR("failed initializing buffer object driver(%d).\n", r); return r; base-commit: 20e03e7f6e8efd42168db6d3fe044b804e0ede8f -- 2.39.0.rc2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-12 2:08 ` [PATCH] drm/radeon: Fix screen corruption (v2) Luben Tuikov @ 2022-12-14 21:53 ` Robin Murphy 2022-12-14 22:02 ` Alex Deucher 0 siblings, 1 reply; 38+ messages in thread From: Robin Murphy @ 2022-12-14 21:53 UTC (permalink / raw) To: Luben Tuikov, AMD Graphics Cc: Alex Deucher, Mikhail Krylov, Direct Rendering Infrastructure - Development On 2022-12-12 02:08, Luben Tuikov wrote: > Fix screen corruption on older 32-bit systems using AGP chips. > > On older systems with little memory, for instance 1.5 GiB, using an AGP chip, > the device's DMA mask is 0xFFFFFFFF, but the memory mask is 0x7FFFFFF, and > subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, > false. As such the result of this static inline isn't suitable for the last > argument to ttm_device_init()--it simply needs to now whether to use GFP_DMA32 > when allocating DMA buffers. This sounds wrong to me. If the issues happen on systems without PAE it clearly can't have anything to with the actual DMA address size. Not to mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so GFP_DMA32 would be functionally meaningless anyway. Although the reported symptoms initially sounded like they could be caused by DMA going to the wrong place, that is also equally consistent with a loss of cache coherency. My (limited) understanding of AGP is that the GART can effectively alias memory to a second physical address, so I could well believe that something somewhere in the driver stack needs to perform some cache maintenance to avoid coherency issues, and that in these particular setups whatever that is might be assuming the memory is direct-mapped and thus going wrong for highmem pages. So as I said before, I really think this is not about using GFP_DMA32 at all, but about *not* using GFP_HIGHUSER. Thanks, Robin. > Partially reverts commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713. > > v2: Amend the commit description. > > Cc: Mikhail Krylov <sqarert@gmail.com> > Cc: Alex Deucher <Alexander.Deucher@amd.com> > Cc: Robin Murphy <robin.murphy@arm.com> > Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> > Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> > Fixes: 33b3ad3788aba8 ("drm/radeon: handle PCIe root ports with addressing limitations") > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> > --- > drivers/gpu/drm/radeon/radeon.h | 1 + > drivers/gpu/drm/radeon/radeon_device.c | 2 +- > drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- > 3 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index 37dec92339b16a..4fe38fd9be3267 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -2426,6 +2426,7 @@ struct radeon_device { > struct radeon_wb wb; > struct radeon_dummy_page dummy_page; > bool shutdown; > + bool need_dma32; > bool need_swiotlb; > bool accel_working; > bool fastfb_working; /* IGP feature*/ > diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c > index 6344454a772172..3643a3cfe061bd 100644 > --- a/drivers/gpu/drm/radeon/radeon_device.c > +++ b/drivers/gpu/drm/radeon/radeon_device.c > @@ -1370,7 +1370,7 @@ int radeon_device_init(struct radeon_device *rdev, > if (rdev->family == CHIP_CEDAR) > dma_bits = 32; > #endif > - > + rdev->need_dma32 = dma_bits == 32; > r = dma_set_mask_and_coherent(&rdev->pdev->dev, DMA_BIT_MASK(dma_bits)); > if (r) { > pr_warn("radeon: No suitable DMA available\n"); > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > index bdb4c0e0736ba2..3debaeb720d173 100644 > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > @@ -696,7 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) > rdev->ddev->anon_inode->i_mapping, > rdev->ddev->vma_offset_manager, > rdev->need_swiotlb, > - dma_addressing_limited(&rdev->pdev->dev)); > + rdev->need_dma32); > if (r) { > DRM_ERROR("failed initializing buffer object driver(%d).\n", r); > return r; > > base-commit: 20e03e7f6e8efd42168db6d3fe044b804e0ede8f ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-14 21:53 ` Robin Murphy @ 2022-12-14 22:02 ` Alex Deucher 2022-12-14 23:08 ` Robin Murphy 0 siblings, 1 reply; 38+ messages in thread From: Alex Deucher @ 2022-12-14 22:02 UTC (permalink / raw) To: Robin Murphy Cc: Alex Deucher, Mikhail Krylov, Luben Tuikov, Direct Rendering Infrastructure - Development, AMD Graphics On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> wrote: > > On 2022-12-12 02:08, Luben Tuikov wrote: > > Fix screen corruption on older 32-bit systems using AGP chips. > > > > On older systems with little memory, for instance 1.5 GiB, using an AGP chip, > > the device's DMA mask is 0xFFFFFFFF, but the memory mask is 0x7FFFFFF, and > > subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, > > false. As such the result of this static inline isn't suitable for the last > > argument to ttm_device_init()--it simply needs to now whether to use GFP_DMA32 > > when allocating DMA buffers. > > This sounds wrong to me. If the issues happen on systems without PAE it > clearly can't have anything to with the actual DMA address size. Not to > mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so > GFP_DMA32 would be functionally meaningless anyway. Although the > reported symptoms initially sounded like they could be caused by DMA > going to the wrong place, that is also equally consistent with a loss of > cache coherency. > > My (limited) understanding of AGP is that the GART can effectively alias > memory to a second physical address, so I could well believe that > something somewhere in the driver stack needs to perform some cache > maintenance to avoid coherency issues, and that in these particular > setups whatever that is might be assuming the memory is direct-mapped > and thus going wrong for highmem pages. > > So as I said before, I really think this is not about using GFP_DMA32 at > all, but about *not* using GFP_HIGHUSER. One of the wonderful features of AGP is that it has to be used with uncached memory. The aperture basically just provides a remapping of physical pages into a linear aperture that you point the GPU at. TTM has to jump through quite a few hoops to get uncached memory in the first place, so it's likely that that somehow isn't compatible with HIGHMEM. Can you get uncached HIGHMEM? Alex > > Thanks, > Robin. > > > Partially reverts commit 33b3ad3788aba846fc8b9a065fe2685a0b64f713. > > > > v2: Amend the commit description. > > > > Cc: Mikhail Krylov <sqarert@gmail.com> > > Cc: Alex Deucher <Alexander.Deucher@amd.com> > > Cc: Robin Murphy <robin.murphy@arm.com> > > Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org> > > Cc: AMD Graphics <amd-gfx@lists.freedesktop.org> > > Fixes: 33b3ad3788aba8 ("drm/radeon: handle PCIe root ports with addressing limitations") > > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> > > --- > > drivers/gpu/drm/radeon/radeon.h | 1 + > > drivers/gpu/drm/radeon/radeon_device.c | 2 +- > > drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- > > 3 files changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > > index 37dec92339b16a..4fe38fd9be3267 100644 > > --- a/drivers/gpu/drm/radeon/radeon.h > > +++ b/drivers/gpu/drm/radeon/radeon.h > > @@ -2426,6 +2426,7 @@ struct radeon_device { > > struct radeon_wb wb; > > struct radeon_dummy_page dummy_page; > > bool shutdown; > > + bool need_dma32; > > bool need_swiotlb; > > bool accel_working; > > bool fastfb_working; /* IGP feature*/ > > diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c > > index 6344454a772172..3643a3cfe061bd 100644 > > --- a/drivers/gpu/drm/radeon/radeon_device.c > > +++ b/drivers/gpu/drm/radeon/radeon_device.c > > @@ -1370,7 +1370,7 @@ int radeon_device_init(struct radeon_device *rdev, > > if (rdev->family == CHIP_CEDAR) > > dma_bits = 32; > > #endif > > - > > + rdev->need_dma32 = dma_bits == 32; > > r = dma_set_mask_and_coherent(&rdev->pdev->dev, DMA_BIT_MASK(dma_bits)); > > if (r) { > > pr_warn("radeon: No suitable DMA available\n"); > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > > index bdb4c0e0736ba2..3debaeb720d173 100644 > > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > > @@ -696,7 +696,7 @@ int radeon_ttm_init(struct radeon_device *rdev) > > rdev->ddev->anon_inode->i_mapping, > > rdev->ddev->vma_offset_manager, > > rdev->need_swiotlb, > > - dma_addressing_limited(&rdev->pdev->dev)); > > + rdev->need_dma32); > > if (r) { > > DRM_ERROR("failed initializing buffer object driver(%d).\n", r); > > return r; > > > > base-commit: 20e03e7f6e8efd42168db6d3fe044b804e0ede8f ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-14 22:02 ` Alex Deucher @ 2022-12-14 23:08 ` Robin Murphy 2022-12-15 8:07 ` Christian König 0 siblings, 1 reply; 38+ messages in thread From: Robin Murphy @ 2022-12-14 23:08 UTC (permalink / raw) To: Alex Deucher Cc: Alex Deucher, Mikhail Krylov, Luben Tuikov, Direct Rendering Infrastructure - Development, AMD Graphics On 2022-12-14 22:02, Alex Deucher wrote: > On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> wrote: >> >> On 2022-12-12 02:08, Luben Tuikov wrote: >>> Fix screen corruption on older 32-bit systems using AGP chips. >>> >>> On older systems with little memory, for instance 1.5 GiB, using an AGP chip, >>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is 0x7FFFFFF, and >>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>> false. As such the result of this static inline isn't suitable for the last >>> argument to ttm_device_init()--it simply needs to now whether to use GFP_DMA32 >>> when allocating DMA buffers. >> >> This sounds wrong to me. If the issues happen on systems without PAE it >> clearly can't have anything to with the actual DMA address size. Not to >> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >> GFP_DMA32 would be functionally meaningless anyway. Although the >> reported symptoms initially sounded like they could be caused by DMA >> going to the wrong place, that is also equally consistent with a loss of >> cache coherency. >> >> My (limited) understanding of AGP is that the GART can effectively alias >> memory to a second physical address, so I could well believe that >> something somewhere in the driver stack needs to perform some cache >> maintenance to avoid coherency issues, and that in these particular >> setups whatever that is might be assuming the memory is direct-mapped >> and thus going wrong for highmem pages. >> >> So as I said before, I really think this is not about using GFP_DMA32 at >> all, but about *not* using GFP_HIGHUSER. > > One of the wonderful features of AGP is that it has to be used with > uncached memory. The aperture basically just provides a remapping of > physical pages into a linear aperture that you point the GPU at. TTM > has to jump through quite a few hoops to get uncached memory in the > first place, so it's likely that that somehow isn't compatible with > HIGHMEM. Can you get uncached HIGHMEM? I guess in principle yes, if you're careful not to use regular kmap()/kmap_atomic(), and always use pgprot_noncached() for userspace/vmalloc mappings, but clearly that leaves lots of scope for slipping up. Working backwards from primitives like set_memory_uc(), I see various paths in TTM where manipulating the caching state is skipped for highmem pages, but I wouldn't even know where to start looking for whether the right state is propagated to all the places where they might eventually be mapped somewhere. Cheers, Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-14 23:08 ` Robin Murphy @ 2022-12-15 8:07 ` Christian König 2022-12-15 9:08 ` Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Christian König @ 2022-12-15 8:07 UTC (permalink / raw) To: Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, Luben Tuikov, AMD Graphics, Direct Rendering Infrastructure - Development Am 15.12.22 um 00:08 schrieb Robin Murphy: > On 2022-12-14 22:02, Alex Deucher wrote: >> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >> wrote: >>> >>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>> >>>> On older systems with little memory, for instance 1.5 GiB, using an >>>> AGP chip, >>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>> 0x7FFFFFF, and >>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>> false. As such the result of this static inline isn't suitable for >>>> the last >>>> argument to ttm_device_init()--it simply needs to now whether to >>>> use GFP_DMA32 >>>> when allocating DMA buffers. >>> >>> This sounds wrong to me. If the issues happen on systems without PAE it >>> clearly can't have anything to with the actual DMA address size. Not to >>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>> GFP_DMA32 would be functionally meaningless anyway. Although the >>> reported symptoms initially sounded like they could be caused by DMA >>> going to the wrong place, that is also equally consistent with a >>> loss of >>> cache coherency. >>> >>> My (limited) understanding of AGP is that the GART can effectively >>> alias >>> memory to a second physical address, so I could well believe that >>> something somewhere in the driver stack needs to perform some cache >>> maintenance to avoid coherency issues, and that in these particular >>> setups whatever that is might be assuming the memory is direct-mapped >>> and thus going wrong for highmem pages. >>> >>> So as I said before, I really think this is not about using >>> GFP_DMA32 at >>> all, but about *not* using GFP_HIGHUSER. >> >> One of the wonderful features of AGP is that it has to be used with >> uncached memory. The aperture basically just provides a remapping of >> physical pages into a linear aperture that you point the GPU at. TTM >> has to jump through quite a few hoops to get uncached memory in the >> first place, so it's likely that that somehow isn't compatible with >> HIGHMEM. Can you get uncached HIGHMEM? > > I guess in principle yes, if you're careful not to use regular > kmap()/kmap_atomic(), and always use pgprot_noncached() for > userspace/vmalloc mappings, but clearly that leaves lots of scope for > slipping up. I theory we should do exactly that in TTM, but we have very few users who actually still exercise that functionality. > > Working backwards from primitives like set_memory_uc(), I see various > paths in TTM where manipulating the caching state is skipped for > highmem pages, but I wouldn't even know where to start looking for > whether the right state is propagated to all the places where they > might eventually be mapped somewhere. The tt object has the caching state for the pages and ttm_prot_from_caching() then uses pgprot_noncached() and co for the userspace/vmalloc mappings. Regards, Christian. > > Cheers, > Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 8:07 ` Christian König @ 2022-12-15 9:08 ` Luben Tuikov 2022-12-15 9:46 ` Christian König 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-15 9:08 UTC (permalink / raw) To: Christian König, Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development On 2022-12-15 03:07, Christian König wrote: > Am 15.12.22 um 00:08 schrieb Robin Murphy: >> On 2022-12-14 22:02, Alex Deucher wrote: >>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>> wrote: >>>> >>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>> >>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>> AGP chip, >>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>> 0x7FFFFFF, and >>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>> false. As such the result of this static inline isn't suitable for >>>>> the last >>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>> use GFP_DMA32 >>>>> when allocating DMA buffers. >>>> >>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>> clearly can't have anything to with the actual DMA address size. Not to >>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>> reported symptoms initially sounded like they could be caused by DMA >>>> going to the wrong place, that is also equally consistent with a >>>> loss of >>>> cache coherency. >>>> >>>> My (limited) understanding of AGP is that the GART can effectively >>>> alias >>>> memory to a second physical address, so I could well believe that >>>> something somewhere in the driver stack needs to perform some cache >>>> maintenance to avoid coherency issues, and that in these particular >>>> setups whatever that is might be assuming the memory is direct-mapped >>>> and thus going wrong for highmem pages. >>>> >>>> So as I said before, I really think this is not about using >>>> GFP_DMA32 at >>>> all, but about *not* using GFP_HIGHUSER. >>> >>> One of the wonderful features of AGP is that it has to be used with >>> uncached memory. The aperture basically just provides a remapping of >>> physical pages into a linear aperture that you point the GPU at. TTM >>> has to jump through quite a few hoops to get uncached memory in the >>> first place, so it's likely that that somehow isn't compatible with >>> HIGHMEM. Can you get uncached HIGHMEM? >> >> I guess in principle yes, if you're careful not to use regular >> kmap()/kmap_atomic(), and always use pgprot_noncached() for >> userspace/vmalloc mappings, but clearly that leaves lots of scope for >> slipping up. > > I theory we should do exactly that in TTM, but we have very few users > who actually still exercise that functionality. > >> >> Working backwards from primitives like set_memory_uc(), I see various >> paths in TTM where manipulating the caching state is skipped for >> highmem pages, but I wouldn't even know where to start looking for >> whether the right state is propagated to all the places where they >> might eventually be mapped somewhere. > > The tt object has the caching state for the pages and > ttm_prot_from_caching() then uses pgprot_noncached() and co for the > userspace/vmalloc mappings. > The point of this patch is that dma_addressing_limited() is unsuitable as the last parameter to ttm_pool_init(), since if it is "false"--as it is in this particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) Is there an objection to this patch, if it fixes the screen corruption? Or does TTM need fixing, in that what we really need is to specify whether caching is desired and/or DMA32 when we allocate a TTM pool (ttm_pool_init(), called from ttm_device_init(), called from radeon_ttm_init.c)? Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 9:08 ` Luben Tuikov @ 2022-12-15 9:46 ` Christian König 2022-12-15 10:19 ` Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Christian König @ 2022-12-15 9:46 UTC (permalink / raw) To: Luben Tuikov, Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development Am 15.12.22 um 10:08 schrieb Luben Tuikov: > On 2022-12-15 03:07, Christian König wrote: >> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>> On 2022-12-14 22:02, Alex Deucher wrote: >>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>> wrote: >>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>> >>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>> AGP chip, >>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>> 0x7FFFFFF, and >>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>> false. As such the result of this static inline isn't suitable for >>>>>> the last >>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>> use GFP_DMA32 >>>>>> when allocating DMA buffers. >>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>> reported symptoms initially sounded like they could be caused by DMA >>>>> going to the wrong place, that is also equally consistent with a >>>>> loss of >>>>> cache coherency. >>>>> >>>>> My (limited) understanding of AGP is that the GART can effectively >>>>> alias >>>>> memory to a second physical address, so I could well believe that >>>>> something somewhere in the driver stack needs to perform some cache >>>>> maintenance to avoid coherency issues, and that in these particular >>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>> and thus going wrong for highmem pages. >>>>> >>>>> So as I said before, I really think this is not about using >>>>> GFP_DMA32 at >>>>> all, but about *not* using GFP_HIGHUSER. >>>> One of the wonderful features of AGP is that it has to be used with >>>> uncached memory. The aperture basically just provides a remapping of >>>> physical pages into a linear aperture that you point the GPU at. TTM >>>> has to jump through quite a few hoops to get uncached memory in the >>>> first place, so it's likely that that somehow isn't compatible with >>>> HIGHMEM. Can you get uncached HIGHMEM? >>> I guess in principle yes, if you're careful not to use regular >>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>> slipping up. >> I theory we should do exactly that in TTM, but we have very few users >> who actually still exercise that functionality. >> >>> Working backwards from primitives like set_memory_uc(), I see various >>> paths in TTM where manipulating the caching state is skipped for >>> highmem pages, but I wouldn't even know where to start looking for >>> whether the right state is propagated to all the places where they >>> might eventually be mapped somewhere. >> The tt object has the caching state for the pages and >> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >> userspace/vmalloc mappings. >> > The point of this patch is that dma_addressing_limited() is unsuitable as > the last parameter to ttm_pool_init(), since if it is "false"--as it is in this > particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. > (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) Well I would rather say that dma_addressing_limited() works, but the default value from dma_get_required_mask() is broken. 32 bits only work with bounce buffers and we can't use those on graphics hardware. > Is there an objection to this patch, if it fixes the screen corruption? Not from my side, but fixing the underlying issues would be better I think. > Or does TTM need fixing, in that what we really need is to specify whether > caching is desired and/or DMA32 when we allocate a TTM pool (ttm_pool_init(), > called from ttm_device_init(), called from radeon_ttm_init.c)? Could be, but it's more likely that the problem is in the DMA layer because we fail to recognize that the device can't access all of the memory. Regards, Christian. > > Regards, > Luben > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 9:46 ` Christian König @ 2022-12-15 10:19 ` Luben Tuikov 2022-12-15 11:27 ` Christian König 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-15 10:19 UTC (permalink / raw) To: Christian König, Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development On 2022-12-15 04:46, Christian König wrote: > Am 15.12.22 um 10:08 schrieb Luben Tuikov: >> On 2022-12-15 03:07, Christian König wrote: >>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>>> wrote: >>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>>> >>>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>>> AGP chip, >>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>>> 0x7FFFFFF, and >>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>>> false. As such the result of this static inline isn't suitable for >>>>>>> the last >>>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>>> use GFP_DMA32 >>>>>>> when allocating DMA buffers. >>>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>>> reported symptoms initially sounded like they could be caused by DMA >>>>>> going to the wrong place, that is also equally consistent with a >>>>>> loss of >>>>>> cache coherency. >>>>>> >>>>>> My (limited) understanding of AGP is that the GART can effectively >>>>>> alias >>>>>> memory to a second physical address, so I could well believe that >>>>>> something somewhere in the driver stack needs to perform some cache >>>>>> maintenance to avoid coherency issues, and that in these particular >>>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>>> and thus going wrong for highmem pages. >>>>>> >>>>>> So as I said before, I really think this is not about using >>>>>> GFP_DMA32 at >>>>>> all, but about *not* using GFP_HIGHUSER. >>>>> One of the wonderful features of AGP is that it has to be used with >>>>> uncached memory. The aperture basically just provides a remapping of >>>>> physical pages into a linear aperture that you point the GPU at. TTM >>>>> has to jump through quite a few hoops to get uncached memory in the >>>>> first place, so it's likely that that somehow isn't compatible with >>>>> HIGHMEM. Can you get uncached HIGHMEM? >>>> I guess in principle yes, if you're careful not to use regular >>>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>>> slipping up. >>> I theory we should do exactly that in TTM, but we have very few users >>> who actually still exercise that functionality. >>> >>>> Working backwards from primitives like set_memory_uc(), I see various >>>> paths in TTM where manipulating the caching state is skipped for >>>> highmem pages, but I wouldn't even know where to start looking for >>>> whether the right state is propagated to all the places where they >>>> might eventually be mapped somewhere. >>> The tt object has the caching state for the pages and >>> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >>> userspace/vmalloc mappings. >>> >> The point of this patch is that dma_addressing_limited() is unsuitable as >> the last parameter to ttm_pool_init(), since if it is "false"--as it is in this >> particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. >> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) > > Well I would rather say that dma_addressing_limited() works, but the > default value from dma_get_required_mask() is broken. > dma_get_required_mask() for his setup of 1.5 GiB of memory returns 0x7FFFFFF. While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in radeon_device_init(). > 32 bits only work with bounce buffers and we can't use those on graphics > hardware. > >> Is there an objection to this patch, if it fixes the screen corruption? > > Not from my side, but fixing the underlying issues would be better I think. > Have they been identified? >> Or does TTM need fixing, in that what we really need is to specify whether >> caching is desired and/or DMA32 when we allocate a TTM pool (ttm_pool_init(), >> called from ttm_device_init(), called from radeon_ttm_init.c)? > > Could be, but it's more likely that the problem is in the DMA layer > because we fail to recognize that the device can't access all of the memory. > Right, I agree. Ideally, setting dev->{dma_mask, coherent_dma_mask, bus_dma_limit}, should be sufficient to tell the DMA layer what kind of memory the device can handle. But this patch doesn't change non-local behaviour and as such is local and safe to apply. Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 10:19 ` Luben Tuikov @ 2022-12-15 11:27 ` Christian König 2022-12-15 11:40 ` Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Christian König @ 2022-12-15 11:27 UTC (permalink / raw) To: Luben Tuikov, Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development Am 15.12.22 um 11:19 schrieb Luben Tuikov: > On 2022-12-15 04:46, Christian König wrote: >> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>> On 2022-12-15 03:07, Christian König wrote: >>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>>>> wrote: >>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>>>> >>>>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>>>> AGP chip, >>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>>>> 0x7FFFFFF, and >>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>>>> false. As such the result of this static inline isn't suitable for >>>>>>>> the last >>>>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>>>> use GFP_DMA32 >>>>>>>> when allocating DMA buffers. >>>>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>>>> reported symptoms initially sounded like they could be caused by DMA >>>>>>> going to the wrong place, that is also equally consistent with a >>>>>>> loss of >>>>>>> cache coherency. >>>>>>> >>>>>>> My (limited) understanding of AGP is that the GART can effectively >>>>>>> alias >>>>>>> memory to a second physical address, so I could well believe that >>>>>>> something somewhere in the driver stack needs to perform some cache >>>>>>> maintenance to avoid coherency issues, and that in these particular >>>>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>>>> and thus going wrong for highmem pages. >>>>>>> >>>>>>> So as I said before, I really think this is not about using >>>>>>> GFP_DMA32 at >>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>> One of the wonderful features of AGP is that it has to be used with >>>>>> uncached memory. The aperture basically just provides a remapping of >>>>>> physical pages into a linear aperture that you point the GPU at. TTM >>>>>> has to jump through quite a few hoops to get uncached memory in the >>>>>> first place, so it's likely that that somehow isn't compatible with >>>>>> HIGHMEM. Can you get uncached HIGHMEM? >>>>> I guess in principle yes, if you're careful not to use regular >>>>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>>>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>>>> slipping up. >>>> I theory we should do exactly that in TTM, but we have very few users >>>> who actually still exercise that functionality. >>>> >>>>> Working backwards from primitives like set_memory_uc(), I see various >>>>> paths in TTM where manipulating the caching state is skipped for >>>>> highmem pages, but I wouldn't even know where to start looking for >>>>> whether the right state is propagated to all the places where they >>>>> might eventually be mapped somewhere. >>>> The tt object has the caching state for the pages and >>>> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >>>> userspace/vmalloc mappings. >>>> >>> The point of this patch is that dma_addressing_limited() is unsuitable as >>> the last parameter to ttm_pool_init(), since if it is "false"--as it is in this >>> particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. >>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >> Well I would rather say that dma_addressing_limited() works, but the >> default value from dma_get_required_mask() is broken. >> > dma_get_required_mask() for his setup of 1.5 GiB of memory returns 0x7FFFFFF. This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB addressable memory (27 bits set)? Or is there another F missing? > While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in radeon_device_init(). > >> 32 bits only work with bounce buffers and we can't use those on graphics >> hardware. >> >>> Is there an objection to this patch, if it fixes the screen corruption? >> Not from my side, but fixing the underlying issues would be better I think. >> > Have they been identified? I'm not 100% sure. I think by using GFP_DMA32 we just work around the issue somehow. >>> Or does TTM need fixing, in that what we really need is to specify whether >>> caching is desired and/or DMA32 when we allocate a TTM pool (ttm_pool_init(), >>> called from ttm_device_init(), called from radeon_ttm_init.c)? >> Could be, but it's more likely that the problem is in the DMA layer >> because we fail to recognize that the device can't access all of the memory. >> > Right, I agree. Ideally, setting dev->{dma_mask, coherent_dma_mask, bus_dma_limit}, > should be sufficient to tell the DMA layer what kind of memory the device > can handle. > > But this patch doesn't change non-local behaviour and as such is local and safe > to apply. Yeah, agree. It's pretty hard to find such old hardware for testing anyway. I do still have a working AGP system somewhere in my basement, but dusting that of just for testing this doesn't sounds like valuable time spend. Regards, Christian. > > Regards, > Luben > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 11:27 ` Christian König @ 2022-12-15 11:40 ` Luben Tuikov 2022-12-15 11:53 ` Robin Murphy 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-15 11:40 UTC (permalink / raw) To: Christian König, Robin Murphy, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development On 2022-12-15 06:27, Christian König wrote: > Am 15.12.22 um 11:19 schrieb Luben Tuikov: >> On 2022-12-15 04:46, Christian König wrote: >>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>>> On 2022-12-15 03:07, Christian König wrote: >>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>>>>> wrote: >>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>>>>> >>>>>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>>>>> AGP chip, >>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>>>>> 0x7FFFFFF, and >>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>>>>> false. As such the result of this static inline isn't suitable for >>>>>>>>> the last >>>>>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>>>>> use GFP_DMA32 >>>>>>>>> when allocating DMA buffers. >>>>>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>>>>> reported symptoms initially sounded like they could be caused by DMA >>>>>>>> going to the wrong place, that is also equally consistent with a >>>>>>>> loss of >>>>>>>> cache coherency. >>>>>>>> >>>>>>>> My (limited) understanding of AGP is that the GART can effectively >>>>>>>> alias >>>>>>>> memory to a second physical address, so I could well believe that >>>>>>>> something somewhere in the driver stack needs to perform some cache >>>>>>>> maintenance to avoid coherency issues, and that in these particular >>>>>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>>>>> and thus going wrong for highmem pages. >>>>>>>> >>>>>>>> So as I said before, I really think this is not about using >>>>>>>> GFP_DMA32 at >>>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>>> One of the wonderful features of AGP is that it has to be used with >>>>>>> uncached memory. The aperture basically just provides a remapping of >>>>>>> physical pages into a linear aperture that you point the GPU at. TTM >>>>>>> has to jump through quite a few hoops to get uncached memory in the >>>>>>> first place, so it's likely that that somehow isn't compatible with >>>>>>> HIGHMEM. Can you get uncached HIGHMEM? >>>>>> I guess in principle yes, if you're careful not to use regular >>>>>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>>>>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>>>>> slipping up. >>>>> I theory we should do exactly that in TTM, but we have very few users >>>>> who actually still exercise that functionality. >>>>> >>>>>> Working backwards from primitives like set_memory_uc(), I see various >>>>>> paths in TTM where manipulating the caching state is skipped for >>>>>> highmem pages, but I wouldn't even know where to start looking for >>>>>> whether the right state is propagated to all the places where they >>>>>> might eventually be mapped somewhere. >>>>> The tt object has the caching state for the pages and >>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >>>>> userspace/vmalloc mappings. >>>>> >>>> The point of this patch is that dma_addressing_limited() is unsuitable as >>>> the last parameter to ttm_pool_init(), since if it is "false"--as it is in this >>>> particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. >>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >>> Well I would rather say that dma_addressing_limited() works, but the >>> default value from dma_get_required_mask() is broken. >>> >> dma_get_required_mask() for his setup of 1.5 GiB of memory returns 0x7FFFFFF. > > This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB > addressable memory (27 bits set)? Or is there another F missing? Yeah, I'm missing an F--it is correctly described at the top of the thread above, i.e. in the commit of v2 patch. 0x7FFF_FFFF, which seems correct, no? >> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in radeon_device_init(). >> >>> 32 bits only work with bounce buffers and we can't use those on graphics >>> hardware. >>> >>>> Is there an objection to this patch, if it fixes the screen corruption? >>> Not from my side, but fixing the underlying issues would be better I think. >>> >> Have they been identified? > > I'm not 100% sure. I think by using GFP_DMA32 we just work around the > issue somehow. Right. Using GFP_DMA32, we don't touch high-mem. I was looking at the DRM code trying to understand what we do when GFP_DMA32 is not set, and the immediate thing I see is that we set GFP_HIGHUSER when use_dma32 is unset in the device struct. (Then I got down to the caching attributes...) It's be nice if we can find the actual issue--what else would it show us that needs fixing...? So what do we do with this patch? Shouldn't leave it in a limbo--some OSes ship their kernel with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with addressing limitations") wholly reverted. Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 11:40 ` Luben Tuikov @ 2022-12-15 11:53 ` Robin Murphy 2022-12-15 12:07 ` Luben Tuikov 0 siblings, 1 reply; 38+ messages in thread From: Robin Murphy @ 2022-12-15 11:53 UTC (permalink / raw) To: Luben Tuikov, Christian König, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development On 2022-12-15 11:40, Luben Tuikov wrote: > On 2022-12-15 06:27, Christian König wrote: >> Am 15.12.22 um 11:19 schrieb Luben Tuikov: >>> On 2022-12-15 04:46, Christian König wrote: >>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>>>> On 2022-12-15 03:07, Christian König wrote: >>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>>>>>> wrote: >>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>>>>>> >>>>>>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>>>>>> AGP chip, >>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>>>>>> 0x7FFFFFF, and >>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>>>>>> false. As such the result of this static inline isn't suitable for >>>>>>>>>> the last >>>>>>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>>>>>> use GFP_DMA32 >>>>>>>>>> when allocating DMA buffers. >>>>>>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>>>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>>>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>>>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>>>>>> reported symptoms initially sounded like they could be caused by DMA >>>>>>>>> going to the wrong place, that is also equally consistent with a >>>>>>>>> loss of >>>>>>>>> cache coherency. >>>>>>>>> >>>>>>>>> My (limited) understanding of AGP is that the GART can effectively >>>>>>>>> alias >>>>>>>>> memory to a second physical address, so I could well believe that >>>>>>>>> something somewhere in the driver stack needs to perform some cache >>>>>>>>> maintenance to avoid coherency issues, and that in these particular >>>>>>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>>>>>> and thus going wrong for highmem pages. >>>>>>>>> >>>>>>>>> So as I said before, I really think this is not about using >>>>>>>>> GFP_DMA32 at >>>>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>>>> One of the wonderful features of AGP is that it has to be used with >>>>>>>> uncached memory. The aperture basically just provides a remapping of >>>>>>>> physical pages into a linear aperture that you point the GPU at. TTM >>>>>>>> has to jump through quite a few hoops to get uncached memory in the >>>>>>>> first place, so it's likely that that somehow isn't compatible with >>>>>>>> HIGHMEM. Can you get uncached HIGHMEM? >>>>>>> I guess in principle yes, if you're careful not to use regular >>>>>>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>>>>>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>>>>>> slipping up. >>>>>> I theory we should do exactly that in TTM, but we have very few users >>>>>> who actually still exercise that functionality. >>>>>> >>>>>>> Working backwards from primitives like set_memory_uc(), I see various >>>>>>> paths in TTM where manipulating the caching state is skipped for >>>>>>> highmem pages, but I wouldn't even know where to start looking for >>>>>>> whether the right state is propagated to all the places where they >>>>>>> might eventually be mapped somewhere. >>>>>> The tt object has the caching state for the pages and >>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >>>>>> userspace/vmalloc mappings. >>>>>> >>>>> The point of this patch is that dma_addressing_limited() is unsuitable as >>>>> the last parameter to ttm_pool_init(), since if it is "false"--as it is in this >>>>> particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. >>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >>>> Well I would rather say that dma_addressing_limited() works, but the >>>> default value from dma_get_required_mask() is broken. >>>> >>> dma_get_required_mask() for his setup of 1.5 GiB of memory returns 0x7FFFFFF. >> >> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB >> addressable memory (27 bits set)? Or is there another F missing? > > Yeah, I'm missing an F--it is correctly described at the top of the thread above, > i.e. in the commit of v2 patch. > > 0x7FFF_FFFF, which seems correct, no? > >>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in radeon_device_init(). >>> >>>> 32 bits only work with bounce buffers and we can't use those on graphics >>>> hardware. >>>> >>>>> Is there an objection to this patch, if it fixes the screen corruption? >>>> Not from my side, but fixing the underlying issues would be better I think. >>>> >>> Have they been identified? >> >> I'm not 100% sure. I think by using GFP_DMA32 we just work around the >> issue somehow. > > Right. Using GFP_DMA32, we don't touch high-mem. I was looking at the DRM > code trying to understand what we do when GFP_DMA32 is not set, and the immediate > thing I see is that we set GFP_HIGHUSER when use_dma32 is unset in the device struct. > (Then I got down to the caching attributes...) > > It's be nice if we can find the actual issue--what else would it show us that needs fixing...? > > So what do we do with this patch? > > Shouldn't leave it in a limbo--some OSes ship their kernel > with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with addressing limitations") wholly > reverted. Removing dma_addressing_limited() is still wrong, for the reasons given in that commit. What we need is an *additional* condition that encapsulates "also pass use_dma32 for AGP devices because it avoids some weird coherency issue with 32-bit highmem that isn't worth trying to debug further". Thanks, Robin. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 11:53 ` Robin Murphy @ 2022-12-15 12:07 ` Luben Tuikov 2023-01-19 16:56 ` Krylov Michael 0 siblings, 1 reply; 38+ messages in thread From: Luben Tuikov @ 2022-12-15 12:07 UTC (permalink / raw) To: Robin Murphy, Christian König, Alex Deucher Cc: Alex Deucher, Mikhail Krylov, AMD Graphics, Direct Rendering Infrastructure - Development On 2022-12-15 06:53, Robin Murphy wrote: > On 2022-12-15 11:40, Luben Tuikov wrote: >> On 2022-12-15 06:27, Christian König wrote: >>> Am 15.12.22 um 11:19 schrieb Luben Tuikov: >>>> On 2022-12-15 04:46, Christian König wrote: >>>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>>>>> On 2022-12-15 03:07, Christian König wrote: >>>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy@arm.com> >>>>>>>>> wrote: >>>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP chips. >>>>>>>>>>> >>>>>>>>>>> On older systems with little memory, for instance 1.5 GiB, using an >>>>>>>>>>> AGP chip, >>>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is >>>>>>>>>>> 0x7FFFFFF, and >>>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF, >>>>>>>>>>> false. As such the result of this static inline isn't suitable for >>>>>>>>>>> the last >>>>>>>>>>> argument to ttm_device_init()--it simply needs to now whether to >>>>>>>>>>> use GFP_DMA32 >>>>>>>>>>> when allocating DMA buffers. >>>>>>>>>> This sounds wrong to me. If the issues happen on systems without PAE it >>>>>>>>>> clearly can't have anything to with the actual DMA address size. Not to >>>>>>>>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so >>>>>>>>>> GFP_DMA32 would be functionally meaningless anyway. Although the >>>>>>>>>> reported symptoms initially sounded like they could be caused by DMA >>>>>>>>>> going to the wrong place, that is also equally consistent with a >>>>>>>>>> loss of >>>>>>>>>> cache coherency. >>>>>>>>>> >>>>>>>>>> My (limited) understanding of AGP is that the GART can effectively >>>>>>>>>> alias >>>>>>>>>> memory to a second physical address, so I could well believe that >>>>>>>>>> something somewhere in the driver stack needs to perform some cache >>>>>>>>>> maintenance to avoid coherency issues, and that in these particular >>>>>>>>>> setups whatever that is might be assuming the memory is direct-mapped >>>>>>>>>> and thus going wrong for highmem pages. >>>>>>>>>> >>>>>>>>>> So as I said before, I really think this is not about using >>>>>>>>>> GFP_DMA32 at >>>>>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>>>>> One of the wonderful features of AGP is that it has to be used with >>>>>>>>> uncached memory. The aperture basically just provides a remapping of >>>>>>>>> physical pages into a linear aperture that you point the GPU at. TTM >>>>>>>>> has to jump through quite a few hoops to get uncached memory in the >>>>>>>>> first place, so it's likely that that somehow isn't compatible with >>>>>>>>> HIGHMEM. Can you get uncached HIGHMEM? >>>>>>>> I guess in principle yes, if you're careful not to use regular >>>>>>>> kmap()/kmap_atomic(), and always use pgprot_noncached() for >>>>>>>> userspace/vmalloc mappings, but clearly that leaves lots of scope for >>>>>>>> slipping up. >>>>>>> I theory we should do exactly that in TTM, but we have very few users >>>>>>> who actually still exercise that functionality. >>>>>>> >>>>>>>> Working backwards from primitives like set_memory_uc(), I see various >>>>>>>> paths in TTM where manipulating the caching state is skipped for >>>>>>>> highmem pages, but I wouldn't even know where to start looking for >>>>>>>> whether the right state is propagated to all the places where they >>>>>>>> might eventually be mapped somewhere. >>>>>>> The tt object has the caching state for the pages and >>>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co for the >>>>>>> userspace/vmalloc mappings. >>>>>>> >>>>>> The point of this patch is that dma_addressing_limited() is unsuitable as >>>>>> the last parameter to ttm_pool_init(), since if it is "false"--as it is in this >>>>>> particular case--then TTM ends up using HIGHUSER, and we get the screen corruption. >>>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >>>>> Well I would rather say that dma_addressing_limited() works, but the >>>>> default value from dma_get_required_mask() is broken. >>>>> >>>> dma_get_required_mask() for his setup of 1.5 GiB of memory returns 0x7FFFFFF. >>> >>> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB >>> addressable memory (27 bits set)? Or is there another F missing? >> >> Yeah, I'm missing an F--it is correctly described at the top of the thread above, >> i.e. in the commit of v2 patch. >> >> 0x7FFF_FFFF, which seems correct, no? >> >>>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in radeon_device_init(). >>>> >>>>> 32 bits only work with bounce buffers and we can't use those on graphics >>>>> hardware. >>>>> >>>>>> Is there an objection to this patch, if it fixes the screen corruption? >>>>> Not from my side, but fixing the underlying issues would be better I think. >>>>> >>>> Have they been identified? >>> >>> I'm not 100% sure. I think by using GFP_DMA32 we just work around the >>> issue somehow. >> >> Right. Using GFP_DMA32, we don't touch high-mem. I was looking at the DRM >> code trying to understand what we do when GFP_DMA32 is not set, and the immediate >> thing I see is that we set GFP_HIGHUSER when use_dma32 is unset in the device struct. >> (Then I got down to the caching attributes...) >> >> It's be nice if we can find the actual issue--what else would it show us that needs fixing...? >> >> So what do we do with this patch? >> >> Shouldn't leave it in a limbo--some OSes ship their kernel >> with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with addressing limitations") wholly >> reverted. > > Removing dma_addressing_limited() is still wrong, for the reasons given > in that commit. What we need is an *additional* condition that > encapsulates "also pass use_dma32 for AGP devices because it avoids some > weird coherency issue with 32-bit highmem that isn't worth trying to > debug further". Yes, you had a patch earlier which did exactly that--why not push that patch? Q: If host memory is 1.5 GiB, i.e. mask of 0x7FFF_FFFF, but the device's mask is 0xFFFF_FFFF, shouldn't we use GFP_DMA32, instead of GFP_HIGHUSER? Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2022-12-15 12:07 ` Luben Tuikov @ 2023-01-19 16:56 ` Krylov Michael 0 siblings, 0 replies; 38+ messages in thread From: Krylov Michael @ 2023-01-19 16:56 UTC (permalink / raw) To: Luben Tuikov Cc: Christian König, AMD Graphics, Alex Deucher, Direct Rendering Infrastructure - Development, Alex Deucher, Robin Murphy [-- Attachment #1: Type: text/plain, Size: 7062 bytes --] On Thu, 15 Dec 2022 07:07:33 -0500 Luben Tuikov <luben.tuikov@amd.com> wrote: > On 2022-12-15 06:53, Robin Murphy wrote: > > On 2022-12-15 11:40, Luben Tuikov wrote: > >> On 2022-12-15 06:27, Christian König wrote: > >>> Am 15.12.22 um 11:19 schrieb Luben Tuikov: > >>>> On 2022-12-15 04:46, Christian König wrote: > >>>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: > >>>>>> On 2022-12-15 03:07, Christian König wrote: > >>>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: > >>>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: > >>>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy > >>>>>>>>> <robin.murphy@arm.com> wrote: > >>>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: > >>>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP > >>>>>>>>>>> chips. > >>>>>>>>>>> > >>>>>>>>>>> On older systems with little memory, for instance 1.5 > >>>>>>>>>>> GiB, using an AGP chip, > >>>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask > >>>>>>>>>>> is 0x7FFFFFF, and > >>>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF > >>>>>>>>>>> < 0x7FFFFFFF, false. As such the result of this static > >>>>>>>>>>> inline isn't suitable for the last > >>>>>>>>>>> argument to ttm_device_init()--it simply needs to now > >>>>>>>>>>> whether to use GFP_DMA32 > >>>>>>>>>>> when allocating DMA buffers. > >>>>>>>>>> This sounds wrong to me. If the issues happen on systems > >>>>>>>>>> without PAE it clearly can't have anything to with the > >>>>>>>>>> actual DMA address size. Not to mention that AFAICS 32-bit > >>>>>>>>>> x86 doesn't even have ZONE_DMA32, so GFP_DMA32 would be > >>>>>>>>>> functionally meaningless anyway. Although the reported > >>>>>>>>>> symptoms initially sounded like they could be caused by > >>>>>>>>>> DMA going to the wrong place, that is also equally > >>>>>>>>>> consistent with a loss of cache coherency. > >>>>>>>>>> > >>>>>>>>>> My (limited) understanding of AGP is that the GART can > >>>>>>>>>> effectively alias > >>>>>>>>>> memory to a second physical address, so I could well > >>>>>>>>>> believe that something somewhere in the driver stack needs > >>>>>>>>>> to perform some cache maintenance to avoid coherency > >>>>>>>>>> issues, and that in these particular setups whatever that > >>>>>>>>>> is might be assuming the memory is direct-mapped and thus > >>>>>>>>>> going wrong for highmem pages. > >>>>>>>>>> > >>>>>>>>>> So as I said before, I really think this is not about using > >>>>>>>>>> GFP_DMA32 at > >>>>>>>>>> all, but about *not* using GFP_HIGHUSER. > >>>>>>>>> One of the wonderful features of AGP is that it has to be > >>>>>>>>> used with uncached memory. The aperture basically just > >>>>>>>>> provides a remapping of physical pages into a linear > >>>>>>>>> aperture that you point the GPU at. TTM has to jump > >>>>>>>>> through quite a few hoops to get uncached memory in the > >>>>>>>>> first place, so it's likely that that somehow isn't > >>>>>>>>> compatible with HIGHMEM. Can you get uncached HIGHMEM? > >>>>>>>> I guess in principle yes, if you're careful not to use > >>>>>>>> regular kmap()/kmap_atomic(), and always use > >>>>>>>> pgprot_noncached() for userspace/vmalloc mappings, but > >>>>>>>> clearly that leaves lots of scope for slipping up. > >>>>>>> I theory we should do exactly that in TTM, but we have very > >>>>>>> few users who actually still exercise that functionality. > >>>>>>> > >>>>>>>> Working backwards from primitives like set_memory_uc(), I > >>>>>>>> see various paths in TTM where manipulating the caching > >>>>>>>> state is skipped for highmem pages, but I wouldn't even know > >>>>>>>> where to start looking for whether the right state is > >>>>>>>> propagated to all the places where they might eventually be > >>>>>>>> mapped somewhere. > >>>>>>> The tt object has the caching state for the pages and > >>>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co > >>>>>>> for the userspace/vmalloc mappings. > >>>>>>> > >>>>>> The point of this patch is that dma_addressing_limited() is > >>>>>> unsuitable as the last parameter to ttm_pool_init(), since if > >>>>>> it is "false"--as it is in this particular case--then TTM ends > >>>>>> up using HIGHUSER, and we get the screen corruption. > >>>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) > >>>>> Well I would rather say that dma_addressing_limited() works, > >>>>> but the default value from dma_get_required_mask() is broken. > >>>>> > >>>> dma_get_required_mask() for his setup of 1.5 GiB of memory > >>>> returns 0x7FFFFFF. > >>> > >>> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB > >>> addressable memory (27 bits set)? Or is there another F missing? > >> > >> Yeah, I'm missing an F--it is correctly described at the top of > >> the thread above, i.e. in the commit of v2 patch. > >> > >> 0x7FFF_FFFF, which seems correct, no? > >> > >>>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in > >>>> radeon_device_init(). > >>>> > >>>>> 32 bits only work with bounce buffers and we can't use those on > >>>>> graphics hardware. > >>>>> > >>>>>> Is there an objection to this patch, if it fixes the screen > >>>>>> corruption? > >>>>> Not from my side, but fixing the underlying issues would be > >>>>> better I think. > >>>>> > >>>> Have they been identified? > >>> > >>> I'm not 100% sure. I think by using GFP_DMA32 we just work around > >>> the issue somehow. > >> > >> Right. Using GFP_DMA32, we don't touch high-mem. I was looking at > >> the DRM code trying to understand what we do when GFP_DMA32 is not > >> set, and the immediate thing I see is that we set GFP_HIGHUSER > >> when use_dma32 is unset in the device struct. (Then I got down to > >> the caching attributes...) > >> > >> It's be nice if we can find the actual issue--what else would it > >> show us that needs fixing...? > >> > >> So what do we do with this patch? > >> > >> Shouldn't leave it in a limbo--some OSes ship their kernel > >> with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with > >> addressing limitations") wholly reverted. > > > > Removing dma_addressing_limited() is still wrong, for the reasons > > given in that commit. What we need is an *additional* condition > > that encapsulates "also pass use_dma32 for AGP devices because it > > avoids some weird coherency issue with 32-bit highmem that isn't > > worth trying to debug further". > > Yes, you had a patch earlier which did exactly that--why not push > that patch? > > Q: If host memory is 1.5 GiB, i.e. mask of 0x7FFF_FFFF, but the > device's mask is 0xFFFF_FFFF, shouldn't we use GFP_DMA32, instead of > GFP_HIGHUSER? > > Regards, > Luben > Sorry for being pushy, but given that we are so close to the finish, is there any chance that one of the variants will be merged to the kernel sources any time soon and if so, can I help with testing? I would really benefit from this patch making it into Debian 12. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) @ 2023-01-19 16:56 ` Krylov Michael 0 siblings, 0 replies; 38+ messages in thread From: Krylov Michael @ 2023-01-19 16:56 UTC (permalink / raw) To: Luben Tuikov Cc: Christian König, AMD Graphics, Alex Deucher, Direct Rendering Infrastructure - Development, Robin Murphy [-- Attachment #1: Type: text/plain, Size: 7062 bytes --] On Thu, 15 Dec 2022 07:07:33 -0500 Luben Tuikov <luben.tuikov@amd.com> wrote: > On 2022-12-15 06:53, Robin Murphy wrote: > > On 2022-12-15 11:40, Luben Tuikov wrote: > >> On 2022-12-15 06:27, Christian König wrote: > >>> Am 15.12.22 um 11:19 schrieb Luben Tuikov: > >>>> On 2022-12-15 04:46, Christian König wrote: > >>>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: > >>>>>> On 2022-12-15 03:07, Christian König wrote: > >>>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: > >>>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: > >>>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy > >>>>>>>>> <robin.murphy@arm.com> wrote: > >>>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: > >>>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP > >>>>>>>>>>> chips. > >>>>>>>>>>> > >>>>>>>>>>> On older systems with little memory, for instance 1.5 > >>>>>>>>>>> GiB, using an AGP chip, > >>>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask > >>>>>>>>>>> is 0x7FFFFFF, and > >>>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF > >>>>>>>>>>> < 0x7FFFFFFF, false. As such the result of this static > >>>>>>>>>>> inline isn't suitable for the last > >>>>>>>>>>> argument to ttm_device_init()--it simply needs to now > >>>>>>>>>>> whether to use GFP_DMA32 > >>>>>>>>>>> when allocating DMA buffers. > >>>>>>>>>> This sounds wrong to me. If the issues happen on systems > >>>>>>>>>> without PAE it clearly can't have anything to with the > >>>>>>>>>> actual DMA address size. Not to mention that AFAICS 32-bit > >>>>>>>>>> x86 doesn't even have ZONE_DMA32, so GFP_DMA32 would be > >>>>>>>>>> functionally meaningless anyway. Although the reported > >>>>>>>>>> symptoms initially sounded like they could be caused by > >>>>>>>>>> DMA going to the wrong place, that is also equally > >>>>>>>>>> consistent with a loss of cache coherency. > >>>>>>>>>> > >>>>>>>>>> My (limited) understanding of AGP is that the GART can > >>>>>>>>>> effectively alias > >>>>>>>>>> memory to a second physical address, so I could well > >>>>>>>>>> believe that something somewhere in the driver stack needs > >>>>>>>>>> to perform some cache maintenance to avoid coherency > >>>>>>>>>> issues, and that in these particular setups whatever that > >>>>>>>>>> is might be assuming the memory is direct-mapped and thus > >>>>>>>>>> going wrong for highmem pages. > >>>>>>>>>> > >>>>>>>>>> So as I said before, I really think this is not about using > >>>>>>>>>> GFP_DMA32 at > >>>>>>>>>> all, but about *not* using GFP_HIGHUSER. > >>>>>>>>> One of the wonderful features of AGP is that it has to be > >>>>>>>>> used with uncached memory. The aperture basically just > >>>>>>>>> provides a remapping of physical pages into a linear > >>>>>>>>> aperture that you point the GPU at. TTM has to jump > >>>>>>>>> through quite a few hoops to get uncached memory in the > >>>>>>>>> first place, so it's likely that that somehow isn't > >>>>>>>>> compatible with HIGHMEM. Can you get uncached HIGHMEM? > >>>>>>>> I guess in principle yes, if you're careful not to use > >>>>>>>> regular kmap()/kmap_atomic(), and always use > >>>>>>>> pgprot_noncached() for userspace/vmalloc mappings, but > >>>>>>>> clearly that leaves lots of scope for slipping up. > >>>>>>> I theory we should do exactly that in TTM, but we have very > >>>>>>> few users who actually still exercise that functionality. > >>>>>>> > >>>>>>>> Working backwards from primitives like set_memory_uc(), I > >>>>>>>> see various paths in TTM where manipulating the caching > >>>>>>>> state is skipped for highmem pages, but I wouldn't even know > >>>>>>>> where to start looking for whether the right state is > >>>>>>>> propagated to all the places where they might eventually be > >>>>>>>> mapped somewhere. > >>>>>>> The tt object has the caching state for the pages and > >>>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co > >>>>>>> for the userspace/vmalloc mappings. > >>>>>>> > >>>>>> The point of this patch is that dma_addressing_limited() is > >>>>>> unsuitable as the last parameter to ttm_pool_init(), since if > >>>>>> it is "false"--as it is in this particular case--then TTM ends > >>>>>> up using HIGHUSER, and we get the screen corruption. > >>>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) > >>>>> Well I would rather say that dma_addressing_limited() works, > >>>>> but the default value from dma_get_required_mask() is broken. > >>>>> > >>>> dma_get_required_mask() for his setup of 1.5 GiB of memory > >>>> returns 0x7FFFFFF. > >>> > >>> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB > >>> addressable memory (27 bits set)? Or is there another F missing? > >> > >> Yeah, I'm missing an F--it is correctly described at the top of > >> the thread above, i.e. in the commit of v2 patch. > >> > >> 0x7FFF_FFFF, which seems correct, no? > >> > >>>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in > >>>> radeon_device_init(). > >>>> > >>>>> 32 bits only work with bounce buffers and we can't use those on > >>>>> graphics hardware. > >>>>> > >>>>>> Is there an objection to this patch, if it fixes the screen > >>>>>> corruption? > >>>>> Not from my side, but fixing the underlying issues would be > >>>>> better I think. > >>>>> > >>>> Have they been identified? > >>> > >>> I'm not 100% sure. I think by using GFP_DMA32 we just work around > >>> the issue somehow. > >> > >> Right. Using GFP_DMA32, we don't touch high-mem. I was looking at > >> the DRM code trying to understand what we do when GFP_DMA32 is not > >> set, and the immediate thing I see is that we set GFP_HIGHUSER > >> when use_dma32 is unset in the device struct. (Then I got down to > >> the caching attributes...) > >> > >> It's be nice if we can find the actual issue--what else would it > >> show us that needs fixing...? > >> > >> So what do we do with this patch? > >> > >> Shouldn't leave it in a limbo--some OSes ship their kernel > >> with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with > >> addressing limitations") wholly reverted. > > > > Removing dma_addressing_limited() is still wrong, for the reasons > > given in that commit. What we need is an *additional* condition > > that encapsulates "also pass use_dma32 for AGP devices because it > > avoids some weird coherency issue with 32-bit highmem that isn't > > worth trying to debug further". > > Yes, you had a patch earlier which did exactly that--why not push > that patch? > > Q: If host memory is 1.5 GiB, i.e. mask of 0x7FFF_FFFF, but the > device's mask is 0xFFFF_FFFF, shouldn't we use GFP_DMA32, instead of > GFP_HIGHUSER? > > Regards, > Luben > Sorry for being pushy, but given that we are so close to the finish, is there any chance that one of the variants will be merged to the kernel sources any time soon and if so, can I help with testing? I would really benefit from this patch making it into Debian 12. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) 2023-01-19 16:56 ` Krylov Michael @ 2023-01-20 4:31 ` Luben Tuikov -1 siblings, 0 replies; 38+ messages in thread From: Luben Tuikov @ 2023-01-20 4:31 UTC (permalink / raw) To: Krylov Michael Cc: Christian König, AMD Graphics, Alex Deucher, Direct Rendering Infrastructure - Development, Robin Murphy On 2023-01-19 11:56, Krylov Michael wrote: > On Thu, 15 Dec 2022 07:07:33 -0500 > Luben Tuikov <luben.tuikov@amd.com> wrote: > >> On 2022-12-15 06:53, Robin Murphy wrote: >>> On 2022-12-15 11:40, Luben Tuikov wrote: >>>> On 2022-12-15 06:27, Christian König wrote: >>>>> Am 15.12.22 um 11:19 schrieb Luben Tuikov: >>>>>> On 2022-12-15 04:46, Christian König wrote: >>>>>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>>>>>>> On 2022-12-15 03:07, Christian König wrote: >>>>>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy >>>>>>>>>>> <robin.murphy@arm.com> wrote: >>>>>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP >>>>>>>>>>>>> chips. >>>>>>>>>>>>> >>>>>>>>>>>>> On older systems with little memory, for instance 1.5 >>>>>>>>>>>>> GiB, using an AGP chip, >>>>>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask >>>>>>>>>>>>> is 0x7FFFFFF, and >>>>>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF >>>>>>>>>>>>> < 0x7FFFFFFF, false. As such the result of this static >>>>>>>>>>>>> inline isn't suitable for the last >>>>>>>>>>>>> argument to ttm_device_init()--it simply needs to now >>>>>>>>>>>>> whether to use GFP_DMA32 >>>>>>>>>>>>> when allocating DMA buffers. >>>>>>>>>>>> This sounds wrong to me. If the issues happen on systems >>>>>>>>>>>> without PAE it clearly can't have anything to with the >>>>>>>>>>>> actual DMA address size. Not to mention that AFAICS 32-bit >>>>>>>>>>>> x86 doesn't even have ZONE_DMA32, so GFP_DMA32 would be >>>>>>>>>>>> functionally meaningless anyway. Although the reported >>>>>>>>>>>> symptoms initially sounded like they could be caused by >>>>>>>>>>>> DMA going to the wrong place, that is also equally >>>>>>>>>>>> consistent with a loss of cache coherency. >>>>>>>>>>>> >>>>>>>>>>>> My (limited) understanding of AGP is that the GART can >>>>>>>>>>>> effectively alias >>>>>>>>>>>> memory to a second physical address, so I could well >>>>>>>>>>>> believe that something somewhere in the driver stack needs >>>>>>>>>>>> to perform some cache maintenance to avoid coherency >>>>>>>>>>>> issues, and that in these particular setups whatever that >>>>>>>>>>>> is might be assuming the memory is direct-mapped and thus >>>>>>>>>>>> going wrong for highmem pages. >>>>>>>>>>>> >>>>>>>>>>>> So as I said before, I really think this is not about using >>>>>>>>>>>> GFP_DMA32 at >>>>>>>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>>>>>>> One of the wonderful features of AGP is that it has to be >>>>>>>>>>> used with uncached memory. The aperture basically just >>>>>>>>>>> provides a remapping of physical pages into a linear >>>>>>>>>>> aperture that you point the GPU at. TTM has to jump >>>>>>>>>>> through quite a few hoops to get uncached memory in the >>>>>>>>>>> first place, so it's likely that that somehow isn't >>>>>>>>>>> compatible with HIGHMEM. Can you get uncached HIGHMEM? >>>>>>>>>> I guess in principle yes, if you're careful not to use >>>>>>>>>> regular kmap()/kmap_atomic(), and always use >>>>>>>>>> pgprot_noncached() for userspace/vmalloc mappings, but >>>>>>>>>> clearly that leaves lots of scope for slipping up. >>>>>>>>> I theory we should do exactly that in TTM, but we have very >>>>>>>>> few users who actually still exercise that functionality. >>>>>>>>> >>>>>>>>>> Working backwards from primitives like set_memory_uc(), I >>>>>>>>>> see various paths in TTM where manipulating the caching >>>>>>>>>> state is skipped for highmem pages, but I wouldn't even know >>>>>>>>>> where to start looking for whether the right state is >>>>>>>>>> propagated to all the places where they might eventually be >>>>>>>>>> mapped somewhere. >>>>>>>>> The tt object has the caching state for the pages and >>>>>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co >>>>>>>>> for the userspace/vmalloc mappings. >>>>>>>>> >>>>>>>> The point of this patch is that dma_addressing_limited() is >>>>>>>> unsuitable as the last parameter to ttm_pool_init(), since if >>>>>>>> it is "false"--as it is in this particular case--then TTM ends >>>>>>>> up using HIGHUSER, and we get the screen corruption. >>>>>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >>>>>>> Well I would rather say that dma_addressing_limited() works, >>>>>>> but the default value from dma_get_required_mask() is broken. >>>>>>> >>>>>> dma_get_required_mask() for his setup of 1.5 GiB of memory >>>>>> returns 0x7FFFFFF. >>>>> >>>>> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB >>>>> addressable memory (27 bits set)? Or is there another F missing? >>>> >>>> Yeah, I'm missing an F--it is correctly described at the top of >>>> the thread above, i.e. in the commit of v2 patch. >>>> >>>> 0x7FFF_FFFF, which seems correct, no? >>>> >>>>>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in >>>>>> radeon_device_init(). >>>>>> >>>>>>> 32 bits only work with bounce buffers and we can't use those on >>>>>>> graphics hardware. >>>>>>> >>>>>>>> Is there an objection to this patch, if it fixes the screen >>>>>>>> corruption? >>>>>>> Not from my side, but fixing the underlying issues would be >>>>>>> better I think. >>>>>>> >>>>>> Have they been identified? >>>>> >>>>> I'm not 100% sure. I think by using GFP_DMA32 we just work around >>>>> the issue somehow. >>>> >>>> Right. Using GFP_DMA32, we don't touch high-mem. I was looking at >>>> the DRM code trying to understand what we do when GFP_DMA32 is not >>>> set, and the immediate thing I see is that we set GFP_HIGHUSER >>>> when use_dma32 is unset in the device struct. (Then I got down to >>>> the caching attributes...) >>>> >>>> It's be nice if we can find the actual issue--what else would it >>>> show us that needs fixing...? >>>> >>>> So what do we do with this patch? >>>> >>>> Shouldn't leave it in a limbo--some OSes ship their kernel >>>> with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with >>>> addressing limitations") wholly reverted. >>> >>> Removing dma_addressing_limited() is still wrong, for the reasons >>> given in that commit. What we need is an *additional* condition >>> that encapsulates "also pass use_dma32 for AGP devices because it >>> avoids some weird coherency issue with 32-bit highmem that isn't >>> worth trying to debug further". >> >> Yes, you had a patch earlier which did exactly that--why not push >> that patch? >> >> Q: If host memory is 1.5 GiB, i.e. mask of 0x7FFF_FFFF, but the >> device's mask is 0xFFFF_FFFF, shouldn't we use GFP_DMA32, instead of >> GFP_HIGHUSER? >> >> Regards, >> Luben >> > > Sorry for being pushy, but given that we are so close to the finish, is > there any chance that one of the variants will be merged to the kernel > sources any time soon and if so, can I help with testing? I would really > benefit from this patch making it into Debian 12. Well, there's a couple of patches addressing this problem here in this thread. If consensus is reached, perhaps one of them could be picked up by upstream. -- Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] drm/radeon: Fix screen corruption (v2) @ 2023-01-20 4:31 ` Luben Tuikov 0 siblings, 0 replies; 38+ messages in thread From: Luben Tuikov @ 2023-01-20 4:31 UTC (permalink / raw) To: Krylov Michael Cc: Christian König, AMD Graphics, Alex Deucher, Direct Rendering Infrastructure - Development, Alex Deucher, Robin Murphy On 2023-01-19 11:56, Krylov Michael wrote: > On Thu, 15 Dec 2022 07:07:33 -0500 > Luben Tuikov <luben.tuikov@amd.com> wrote: > >> On 2022-12-15 06:53, Robin Murphy wrote: >>> On 2022-12-15 11:40, Luben Tuikov wrote: >>>> On 2022-12-15 06:27, Christian König wrote: >>>>> Am 15.12.22 um 11:19 schrieb Luben Tuikov: >>>>>> On 2022-12-15 04:46, Christian König wrote: >>>>>>> Am 15.12.22 um 10:08 schrieb Luben Tuikov: >>>>>>>> On 2022-12-15 03:07, Christian König wrote: >>>>>>>>> Am 15.12.22 um 00:08 schrieb Robin Murphy: >>>>>>>>>> On 2022-12-14 22:02, Alex Deucher wrote: >>>>>>>>>>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy >>>>>>>>>>> <robin.murphy@arm.com> wrote: >>>>>>>>>>>> On 2022-12-12 02:08, Luben Tuikov wrote: >>>>>>>>>>>>> Fix screen corruption on older 32-bit systems using AGP >>>>>>>>>>>>> chips. >>>>>>>>>>>>> >>>>>>>>>>>>> On older systems with little memory, for instance 1.5 >>>>>>>>>>>>> GiB, using an AGP chip, >>>>>>>>>>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask >>>>>>>>>>>>> is 0x7FFFFFF, and >>>>>>>>>>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF >>>>>>>>>>>>> < 0x7FFFFFFF, false. As such the result of this static >>>>>>>>>>>>> inline isn't suitable for the last >>>>>>>>>>>>> argument to ttm_device_init()--it simply needs to now >>>>>>>>>>>>> whether to use GFP_DMA32 >>>>>>>>>>>>> when allocating DMA buffers. >>>>>>>>>>>> This sounds wrong to me. If the issues happen on systems >>>>>>>>>>>> without PAE it clearly can't have anything to with the >>>>>>>>>>>> actual DMA address size. Not to mention that AFAICS 32-bit >>>>>>>>>>>> x86 doesn't even have ZONE_DMA32, so GFP_DMA32 would be >>>>>>>>>>>> functionally meaningless anyway. Although the reported >>>>>>>>>>>> symptoms initially sounded like they could be caused by >>>>>>>>>>>> DMA going to the wrong place, that is also equally >>>>>>>>>>>> consistent with a loss of cache coherency. >>>>>>>>>>>> >>>>>>>>>>>> My (limited) understanding of AGP is that the GART can >>>>>>>>>>>> effectively alias >>>>>>>>>>>> memory to a second physical address, so I could well >>>>>>>>>>>> believe that something somewhere in the driver stack needs >>>>>>>>>>>> to perform some cache maintenance to avoid coherency >>>>>>>>>>>> issues, and that in these particular setups whatever that >>>>>>>>>>>> is might be assuming the memory is direct-mapped and thus >>>>>>>>>>>> going wrong for highmem pages. >>>>>>>>>>>> >>>>>>>>>>>> So as I said before, I really think this is not about using >>>>>>>>>>>> GFP_DMA32 at >>>>>>>>>>>> all, but about *not* using GFP_HIGHUSER. >>>>>>>>>>> One of the wonderful features of AGP is that it has to be >>>>>>>>>>> used with uncached memory. The aperture basically just >>>>>>>>>>> provides a remapping of physical pages into a linear >>>>>>>>>>> aperture that you point the GPU at. TTM has to jump >>>>>>>>>>> through quite a few hoops to get uncached memory in the >>>>>>>>>>> first place, so it's likely that that somehow isn't >>>>>>>>>>> compatible with HIGHMEM. Can you get uncached HIGHMEM? >>>>>>>>>> I guess in principle yes, if you're careful not to use >>>>>>>>>> regular kmap()/kmap_atomic(), and always use >>>>>>>>>> pgprot_noncached() for userspace/vmalloc mappings, but >>>>>>>>>> clearly that leaves lots of scope for slipping up. >>>>>>>>> I theory we should do exactly that in TTM, but we have very >>>>>>>>> few users who actually still exercise that functionality. >>>>>>>>> >>>>>>>>>> Working backwards from primitives like set_memory_uc(), I >>>>>>>>>> see various paths in TTM where manipulating the caching >>>>>>>>>> state is skipped for highmem pages, but I wouldn't even know >>>>>>>>>> where to start looking for whether the right state is >>>>>>>>>> propagated to all the places where they might eventually be >>>>>>>>>> mapped somewhere. >>>>>>>>> The tt object has the caching state for the pages and >>>>>>>>> ttm_prot_from_caching() then uses pgprot_noncached() and co >>>>>>>>> for the userspace/vmalloc mappings. >>>>>>>>> >>>>>>>> The point of this patch is that dma_addressing_limited() is >>>>>>>> unsuitable as the last parameter to ttm_pool_init(), since if >>>>>>>> it is "false"--as it is in this particular case--then TTM ends >>>>>>>> up using HIGHUSER, and we get the screen corruption. >>>>>>>> (gfp_flags |= GFP_HIGHUSER in in ttm_pool_alloc()) >>>>>>> Well I would rather say that dma_addressing_limited() works, >>>>>>> but the default value from dma_get_required_mask() is broken. >>>>>>> >>>>>> dma_get_required_mask() for his setup of 1.5 GiB of memory >>>>>> returns 0x7FFFFFF. >>>>> >>>>> This 0x7FFFFFF mask looks fishy to me. That would only be 128MiB >>>>> addressable memory (27 bits set)? Or is there another F missing? >>>> >>>> Yeah, I'm missing an F--it is correctly described at the top of >>>> the thread above, i.e. in the commit of v2 patch. >>>> >>>> 0x7FFF_FFFF, which seems correct, no? >>>> >>>>>> While the dma mask is 0xFFFFFFFF, as set in radeon_device.c in >>>>>> radeon_device_init(). >>>>>> >>>>>>> 32 bits only work with bounce buffers and we can't use those on >>>>>>> graphics hardware. >>>>>>> >>>>>>>> Is there an objection to this patch, if it fixes the screen >>>>>>>> corruption? >>>>>>> Not from my side, but fixing the underlying issues would be >>>>>>> better I think. >>>>>>> >>>>>> Have they been identified? >>>>> >>>>> I'm not 100% sure. I think by using GFP_DMA32 we just work around >>>>> the issue somehow. >>>> >>>> Right. Using GFP_DMA32, we don't touch high-mem. I was looking at >>>> the DRM code trying to understand what we do when GFP_DMA32 is not >>>> set, and the immediate thing I see is that we set GFP_HIGHUSER >>>> when use_dma32 is unset in the device struct. (Then I got down to >>>> the caching attributes...) >>>> >>>> It's be nice if we can find the actual issue--what else would it >>>> show us that needs fixing...? >>>> >>>> So what do we do with this patch? >>>> >>>> Shouldn't leave it in a limbo--some OSes ship their kernel >>>> with 33b3ad3788ab ("drm/radeon: handle PCIe root ports with >>>> addressing limitations") wholly reverted. >>> >>> Removing dma_addressing_limited() is still wrong, for the reasons >>> given in that commit. What we need is an *additional* condition >>> that encapsulates "also pass use_dma32 for AGP devices because it >>> avoids some weird coherency issue with 32-bit highmem that isn't >>> worth trying to debug further". >> >> Yes, you had a patch earlier which did exactly that--why not push >> that patch? >> >> Q: If host memory is 1.5 GiB, i.e. mask of 0x7FFF_FFFF, but the >> device's mask is 0xFFFF_FFFF, shouldn't we use GFP_DMA32, instead of >> GFP_HIGHUSER? >> >> Regards, >> Luben >> > > Sorry for being pushy, but given that we are so close to the finish, is > there any chance that one of the variants will be merged to the kernel > sources any time soon and if so, can I help with testing? I would really > benefit from this patch making it into Debian 12. Well, there's a couple of patches addressing this problem here in this thread. If consensus is reached, perhaps one of them could be picked up by upstream. -- Regards, Luben ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2023-01-20 8:09 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-23 16:31 Screen corruption using radeon kernel driver Krylov Michael 2022-04-25 17:22 ` Alex Deucher 2022-05-16 13:01 ` Mikhail Krylov 2022-11-28 14:31 ` Mikhail Krylov 2022-11-28 14:50 ` Alex Deucher 2022-11-28 20:48 ` Mikhail Krylov 2022-11-29 14:44 ` Alex Deucher 2022-11-29 15:59 ` Mikhail Krylov 2022-11-29 16:05 ` Alex Deucher 2022-11-29 17:11 ` Mikhail Krylov 2022-11-30 12:54 ` Robin Murphy 2022-11-30 14:28 ` Alex Deucher 2022-11-30 15:42 ` Robin Murphy 2022-11-30 16:07 ` Alex Deucher 2022-11-30 19:59 ` Mikhail Krylov 2022-12-01 14:00 ` Robin Murphy 2022-12-01 14:06 ` Alex Deucher 2022-12-01 15:28 ` Mikhail Krylov 2022-12-01 15:28 ` Mikhail Krylov 2022-12-10 15:32 ` Mikhail Krylov 2022-12-11 5:52 ` Luben Tuikov 2022-12-11 11:42 ` [PATCH] drm/radeon: Fix screen corruption Luben Tuikov 2022-12-12 2:08 ` [PATCH] drm/radeon: Fix screen corruption (v2) Luben Tuikov 2022-12-14 21:53 ` Robin Murphy 2022-12-14 22:02 ` Alex Deucher 2022-12-14 23:08 ` Robin Murphy 2022-12-15 8:07 ` Christian König 2022-12-15 9:08 ` Luben Tuikov 2022-12-15 9:46 ` Christian König 2022-12-15 10:19 ` Luben Tuikov 2022-12-15 11:27 ` Christian König 2022-12-15 11:40 ` Luben Tuikov 2022-12-15 11:53 ` Robin Murphy 2022-12-15 12:07 ` Luben Tuikov 2023-01-19 16:56 ` Krylov Michael 2023-01-19 16:56 ` Krylov Michael 2023-01-20 4:31 ` Luben Tuikov 2023-01-20 4:31 ` Luben Tuikov
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.