* [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
From: Alex Bee @ 2021-08-24 13:40 UTC
  To: Catalin Marinas, Will Deacon, Andrew Morton, Anshuman Khandual
  Cc: Linux Kernel Mailing List, linux-mm, Linux ARM

Hi list(s),

there seems to be a regression in arm64 memory mapping in 5.14: it fails
on Rockchip RK3328 when the pl330 dmac tries to map with:

[    8.921909] ------------[ cut here ]------------
[    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
[    8.921973] Modules linked in: spi_rockchip(+) fuse
[    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
[    8.922004] Hardware name: Pine64 Rock64 (DT)
[    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[    8.922018] pc : dma_map_resource+0x68/0xc0
[    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
[    8.922040] sp : ffff800012102ae0
[    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
[    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
[    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
[    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
[    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
[    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
[    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
[    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
[    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
[    8.922158] Call trace:
[    8.922163]  dma_map_resource+0x68/0xc0
[    8.922173]  pl330_prep_slave_sg+0x58/0x220
[    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
[    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
[    8.922220]  spi_transfer_one_message+0x284/0x57c
[    8.922232]  __spi_pump_messages+0x3dc/0x650
[    8.922240]  __spi_sync+0x3e4/0x48c
[    8.922247]  spi_sync+0x30/0x54
[    8.922253]  spi_mem_exec_op+0x264/0x444
[    8.922260]  spi_nor_spimem_read_data+0x148/0x160
[    8.922269]  spi_nor_read_data+0x30/0x40
[    8.922276]  spi_nor_read_sfdp+0x74/0xe4
[    8.922285]  spi_nor_parse_sfdp+0x1d0/0x1130
[    8.922293]  spi_nor_sfdp_init_params+0x3c/0x90
[    8.922304]  spi_nor_scan+0x7b4/0xacc
[    8.922311]  spi_nor_probe+0x94/0x2d0
[    8.922317]  spi_mem_probe+0x6c/0xb0
[    8.922325]  spi_probe+0x84/0xe4
[    8.922335]  really_probe+0xb4/0x45c
[    8.922349]  __driver_probe_device+0x114/0x190
[    8.922358]  driver_probe_device+0x40/0x100
[    8.922367]  __device_attach_driver+0x98/0x130
[    8.922375]  bus_for_each_drv+0x78/0xd0
[    8.922383]  __device_attach+0xdc/0x1c0
[    8.922391]  device_initial_probe+0x14/0x20
[    8.922400]  bus_probe_device+0x98/0xa0
[    8.922408]  device_add+0x36c/0x8c0
[    8.922416]  __spi_add_device+0x74/0x170
[    8.922423]  spi_add_device+0x64/0xa4
[    8.922429]  of_register_spi_device+0x21c/0x36c
[    8.922436]  spi_register_controller+0x5e0/0x834
[    8.922443]  devm_spi_register_controller+0x24/0x80
[    8.922450]  rockchip_spi_probe+0x434/0x5b0 [spi_rockchip]
[    8.922468]  platform_probe+0x68/0xe0
[    8.922478]  really_probe+0xb4/0x45c
[    8.922487]  __driver_probe_device+0x114/0x190
[    8.922497]  driver_probe_device+0x40/0x100
[    8.922507]  __driver_attach+0xcc/0x1e0
[    8.922516]  bus_for_each_dev+0x70/0xd0
[    8.922524]  driver_attach+0x24/0x30
[    8.922533]  bus_add_driver+0x140/0x234
[    8.922540]  driver_register+0x78/0x130
[    8.922547]  __platform_driver_register+0x28/0x34
[    8.922554]  rockchip_spi_driver_init+0x20/0x1000 [spi_rockchip]
[    8.922566]  do_one_initcall+0x50/0x1b0
[    8.922579]  do_init_module+0x54/0x250
[    8.922589]  load_module+0x2230/0x285c
[    8.922597]  __do_sys_finit_module+0xbc/0x12c
[    8.922605]  __arm64_sys_finit_module+0x24/0x30
[    8.922613]  invoke_syscall+0x48/0x114
[    8.922625]  el0_svc_common+0x40/0xfc
[    8.922632]  do_el0_svc_compat+0x20/0x50
[    8.922640]  el0_svc_compat+0x2c/0x54
[    8.922652]  el0t_32_sync_handler+0x90/0x140
[    8.922660]  el0t_32_sync+0x19c/0x1a0
[    8.922669] ---[ end trace 2245d8ba23a1d75c ]---

Note: this is not specific to the spi driver - with that device disabled
in the device tree, the same failure occurs for any other device (i2s,
for instance) that uses dma.
Commenting out the failing check at [1], however, makes the mapping
work again.
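
For reference, the failing check at [1] is:

	/* Don't allow RAM to be mapped */
	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
		return DMA_MAPPING_ERROR;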

I tried to follow the recent arm64 mm changes that could relate to the
check failing at [1], and reverting
   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
makes it work again - but I'm not at all certain that commit is really
the culprit.

Note that the firmware (legacy U-Boot) injects the memory configuration
into the device tree as follows:

/memreserve/    0x00000000fcefc000 0x000000000000d000;
/ {
..
     compatible = "pine64,rock64\0rockchip,rk3328";
..
     memory {
         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
         device_type = "memory";
     };

..
}
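
(That reg value decodes to base 0x200000, size 0xfee00000, i.e. RAM
spans 0x00200000-0xfeffffff.)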

So: there is a "hole" in the mappable memory, and reading the commit
message of
   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()")
suggests there was a recent change for that case.

I also noticed a difference in the kernel log regarding memory init.
Up until 5.13.12 it says:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000] On node 0 totalpages: 1043968
[    0.000000]   DMA zone: 16312 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 1043968 pages, LIFO batch:63


In contrast, in 5.14-rc7 it says:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
[    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges

(note the "unavailable ranges")
Again, I'm uncertain whether that difference is expected behavior given
those recent mm changes for arm64.

After reverting
   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
the log changes to

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]

(no DMA zone messages here)

As you might have noticed, I have _zero_ clue about the memory mapping
and dma subsystems - so let me know if any more information is needed,
and thanks for your help.

Alex

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235



* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
From: Catalin Marinas @ 2021-08-24 17:37 UTC
  To: Alex Bee
  Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Mike Rapoport,
	Robin Murphy

Hi Alex,

Thanks for the report.

On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> it seems there is a regression in arm64 memory mapping in 5.14, since it
> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> 
> [    8.921909] ------------[ cut here ]------------
> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> [    8.921973] Modules linked in: spi_rockchip(+) fuse
> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> [    8.922004] Hardware name: Pine64 Rock64 (DT)
> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [    8.922018] pc : dma_map_resource+0x68/0xc0
> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
> [    8.922040] sp : ffff800012102ae0
> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> [    8.922158] Call trace:
> [    8.922163]  dma_map_resource+0x68/0xc0
> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
[...]
> Note: This does not relate to the spi driver - when disabling this device in
> the device tree it fails for any other (i2s, for instance) which uses dma.
> Commenting out the failing check at [1], however, helps and the mapping
> works again.

Do you know which address dma_map_resource() is trying to map (maybe
add some printk())? It's not supposed to map RAM, hence the warning.
Random guess: the address is 0xff190800 (based on x1 above, though the
regs may well be mangled).
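
Something like this (just a throwaway debug hack, not a fix) right
before the check in kernel/dma/mapping.c should tell us:

	pr_info("%s: phys_addr=%pa pfn_valid=%d\n", __func__,
		&phys_addr, pfn_valid(PHYS_PFN(phys_addr)));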

> I tried to follow the recent changes for arm64 mm which could relate to the
> check failing at [1] and reverting
>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> helps and makes it work again, but I'm 100% uncertain if that commit is
> really the culprit.
> 
> Note, that the firmware (legacy u-boot) injects memory configuration in the
> device tree as follows:
> 
> /memreserve/    0x00000000fcefc000 0x000000000000d000;
> / {
> ..
>     compatible = "pine64,rock64\0rockchip,rk3328";
> ..
>     memory {
>         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>         device_type = "memory";
>     };
> 
> ..
> }

Either pfn_valid() gets confused in 5.14 or something is wrong with the
DT. I have a suspicion it's the former since reverting the above commit
makes it disappear.

> So: there is a "hole" in the mappable memory and reading the commit message
> of
>   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
> pfn_valid()")
> suggests, there was a change for that case recently.

I think the change from the arm64 pfn_valid() to the generic one is
avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
returns true just because we have a struct page available but the memory
may have been reserved.
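
Roughly, if I remember the removed code right (a simplified sketch, not
the exact source), the arm64 pfn_valid() ended with an extra memblock
check - memblock_is_map_memory(), i.e. memblock_is_memory() minus the
NOMAP regions - that the generic sparsemem version doesn't do:

	/* arm64 pfn_valid() before commit 16c9afc77660, simplified */
	int pfn_valid(unsigned long pfn)
	{
		struct mem_section *ms;

		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
			return 0;
		ms = __pfn_to_section(pfn);
		if (!valid_section(ms))
			return 0;
		/* the part the generic version lacks: */
		return memblock_is_map_memory(PFN_PHYS(pfn));
	}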

Cc'ing Mike R.

> I also noticed there is a diff in the kernel log regarding memory init up
> until 5.13.12 it says
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] On node 0 totalpages: 1043968
> [    0.000000]   DMA zone: 16312 pages used for memmap
> [    0.000000]   DMA zone: 0 pages reserved
> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
> 
> In contrary in 5.14-rc7 it says:
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
> 
> (note the "unavailable ranges")
> I'm uncertain again here, if that diff is expected behavior because of those
> recent mm changes for arm64.
> 
> After reverting
>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> the log changes to
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem
> 0x0000000000200000-0x00000000feffffff]
> 
> (no DMA zones here)
> 
> As you might have noticed I have _zero_ clue about memory mapping and dma
> subsystem - so let me know if there is any more information needed for that
> and thanks for your help.

Adding Robin as well, he has a better clue than us on DMA ;).

> Alex
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235

-- 
Catalin


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
From: Robin Murphy @ 2021-08-24 18:06 UTC
  To: Catalin Marinas, Alex Bee
  Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Mike Rapoport

On 2021-08-24 18:37, Catalin Marinas wrote:
> Hi Alex,
> 
> Thanks for the report.
> 
> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>
>> [    8.921909] ------------[ cut here ]------------
>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>> [    8.922040] sp : ffff800012102ae0
>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>> [    8.922158] Call trace:
>> [    8.922163]  dma_map_resource+0x68/0xc0
>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> [...]
>> Note: This does not relate to the spi driver - when disabling this device in
>> the device tree it fails for any other (i2s, for instance) which uses dma.
>> Commenting out the failing check at [1], however, helps and the mapping
>> works again.
> 
> Do you know which address dma_map_resource() is trying to map (maybe
> add some printk())? It's not supposed to map RAM, hence the warning.
> Random guess, the address is 0xff190800 (based on the x1 above but the
> regs might as well be mangled).

Yup, that fits the signature of dma_map_resource(), and would indeed be 
right in the middle of the SPI peripheral on RK3328.

FWIW the comment about RAM there is a little inaccurate, but the point 
remains that anything which *is* backed by a page should probably be 
handled by dma_map_page() if at all.
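
To illustrate the intended split (hypothetical variables, purely for
illustration): MMIO goes through dma_map_resource(), anything with a
struct page behind it through dma_map_page():

	/* device FIFO: a physical address with no struct page behind it */
	dma_addr_t fifo = dma_map_resource(dev, fifo_phys, len,
					   DMA_TO_DEVICE, 0);

	/* kernel memory: backed by struct pages, so map it as a page */
	dma_addr_t buf = dma_map_page(dev, page, offset, len, DMA_TO_DEVICE);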

>> I tried to follow the recent changes for arm64 mm which could relate to the
>> check failing at [1] and reverting
>>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>> helps and makes it work again, but I'm 100% uncertain if that commit is
>> really the culprit.
>>
>> Note, that the firmware (legacy u-boot) injects memory configuration in the
>> device tree as follows:
>>
>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>> / {
>> ..
>>     compatible = "pine64,rock64\0rockchip,rk3328";
>> ..
>>     memory {
>>         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>         device_type = "memory";
>>     };
>>
>> ..
>> }
> 
> Either pfn_valid() gets confused in 5.14 or something is wrong with the
> DT. I have a suspicion it's the former since reverting the above commit
> makes it disappear.
> 
>> So: there is a "hole" in the mappable memory and reading the commit message
>> of
>>   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
>> pfn_valid()")
>> suggests, there was a change for that case recently.
> 
> I think the change from the arm64 pfn_valid() to the generic one is
> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
> returns true just because we have a struct page available but the memory
> may have been reserved.

Either way I think something's gone pretty badly wrong if Linux ends up 
thinking that an MMIO region beyond the bounds of any possible RAM 
should be page-backed :/

Robin.

> 
> Cc'ing Mike R.
> 
>> I also noticed there is a diff in the kernel log regarding memory init up
>> until 5.13.12 it says
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] On node 0 totalpages: 1043968
>> [    0.000000]   DMA zone: 16312 pages used for memmap
>> [    0.000000]   DMA zone: 0 pages reserved
>> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>>
>> In contrary in 5.14-rc7 it says:
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
>> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>>
>> (note the "unavailable ranges")
>> I'm uncertain again here, if that diff is expected behavior because of those
>> recent mm changes for arm64.
>>
>> After reverting
>>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>> the log changes to
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>
>> (no DMA zones here)
>>
>> As you might have noticed I have _zero_ clue about memory mapping and dma
>> subsystem - so let me know if there is any more information needed for that
>> and thanks for your help.
> 
> Adding Robin as well, he has a better clue than us on DMA ;).
> 
>> Alex
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
> 


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
From: Mike Rapoport @ 2021-08-24 18:28 UTC
  To: Catalin Marinas
  Cc: Alex Bee, Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Robin Murphy

On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> Hi Alex,
> 
> Thanks for the report.
> 
> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > 
> > [    8.921909] ------------[ cut here ]------------
> > [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > [    8.921973] Modules linked in: spi_rockchip(+) fuse
> > [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > [    8.922004] Hardware name: Pine64 Rock64 (DT)
> > [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > [    8.922018] pc : dma_map_resource+0x68/0xc0
> > [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
> > [    8.922040] sp : ffff800012102ae0
> > [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > [    8.922158] Call trace:
> > [    8.922163]  dma_map_resource+0x68/0xc0
> > [    8.922173]  pl330_prep_slave_sg+0x58/0x220
> > [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> [...]
> > Note: This does not relate to the spi driver - when disabling this device in
> > the device tree it fails for any other (i2s, for instance) which uses dma.
> > Commenting out the failing check at [1], however, helps and the mapping
> > works again.

> Do you know which address dma_map_resource() is trying to map (maybe
> add some printk())? It's not supposed to map RAM, hence the warning.
> Random guess, the address is 0xff190800 (based on the x1 above but the
> regs might as well be mangled).

0xff190800 will cause this warning for sure. It has a memory map, but it
is not RAM, so the old version of pfn_valid() would return 0 while the
new one returns 1.
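
(To put numbers on it: the DT memory node decodes to base 0x200000, size
0xfee00000, so RAM ends at 0xfeffffff and 0xff190800 lies past the end
of RAM. It presumably still has a memmap because it falls in the same
section as the end of RAM - which would also explain the "4096 pages in
unavailable ranges" message: 0xff000000-0xffffffff is exactly 4096 4K
pages.)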
 
> > I tried to follow the recent changes for arm64 mm which could relate to the
> > check failing at [1] and reverting
> >   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> > helps and makes it work again, but I'm 100% uncertain if that commit is
> > really the culprit.
> > 
> > Note, that the firmware (legacy u-boot) injects memory configuration in the
> > device tree as follows:
> > 
> > /memreserve/    0x00000000fcefc000 0x000000000000d000;
> > / {
> > ..
> >     compatible = "pine64,rock64\0rockchip,rk3328";
> > ..
> >     memory {
> >         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
> >         device_type = "memory";
> >     };
> > 
> > ..
> > }
> 
> Either pfn_valid() gets confused in 5.14 or something is wrong with the
> DT. I have a suspicion it's the former since reverting the above commit
> makes it disappear.

I think pfn_valid() actually behaves as expected, but the caller is
wrong: pfn_valid() does not imply RAM (and this applies to !arm64 as
well).

	/* Don't allow RAM to be mapped */
	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
		return DMA_MAPPING_ERROR;

Alex, can you please try this patch:

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 2b06a809d0b9..4715e9641a29 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 		return DMA_MAPPING_ERROR;
 
 	/* Don't allow RAM to be mapped */
-	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
+	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
 		return DMA_MAPPING_ERROR;
 
 	if (dma_map_direct(dev, ops))
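
memblock_is_memory() returns true only for ranges that were actually
registered as memory with memblock, so the FIFO address should no longer
trigger the warning regardless of whether a memmap exists for it.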
 
> > So: there is a "hole" in the mappable memory and reading the commit message
> > of
> >   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
> > pfn_valid()")
> > suggests, there was a change for that case recently.
> 
> I think the change from the arm64 pfn_valid() to the generic one is
> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
> returns true just because we have a struct page available but the memory
> may have been reserved.
> 
> Cc'ing Mike R.
> 
> > I also noticed there is a diff in the kernel log regarding memory init up
> > until 5.13.12 it says
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] On node 0 totalpages: 1043968
> > [    0.000000]   DMA zone: 16312 pages used for memmap
> > [    0.000000]   DMA zone: 0 pages reserved
> > [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
> > 
> > In contrary in 5.14-rc7 it says:
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
> > [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
> > 
> > (note the "unavailable ranges")
> > I'm uncertain again here, if that diff is expected behavior because of those
> > recent mm changes for arm64.
> > 
> > After reverting
> >   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> > the log changes to
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem
> > 0x0000000000200000-0x00000000feffffff]
> > 
> > (no DMA zones here)
> > 
> > As you might have noticed I have _zero_ clue about memory mapping and dma
> > subsystem - so let me know if there is any more information needed for that
> > and thanks for your help.
> 
> Adding Robin as well, he has a better clue than us on DMA ;).
> 
> > Alex
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
> 
> -- 
> Catalin

-- 
Sincerely yours,
Mike.

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
@ 2021-08-24 18:28     ` Mike Rapoport
  0 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-08-24 18:28 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Alex Bee, Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Robin Murphy

On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> Hi Alex,
> 
> Thanks for the report.
> 
> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > 
> > [    8.921909] ------------[ cut here ]------------
> > [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > [    8.921973] Modules linked in: spi_rockchip(+) fuse
> > [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > [    8.922004] Hardware name: Pine64 Rock64 (DT)
> > [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > [    8.922018] pc : dma_map_resource+0x68/0xc0
> > [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
> > [    8.922040] sp : ffff800012102ae0
> > [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > [    8.922158] Call trace:
> > [    8.922163]  dma_map_resource+0x68/0xc0
> > [    8.922173]  pl330_prep_slave_sg+0x58/0x220
> > [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> [...]
> > Note: This does not relate to the spi driver - when disabling this device in
> > the device tree it fails for any other (i2s, for instance) which uses dma.
> > Commenting out the failing check at [1], however, helps and the mapping
> > works again.

> Do you know which address dma_map_resource() is trying to map (maybe
> add some printk())? It's not supposed to map RAM, hence the warning.
> Random guess, the address is 0xff190800 (based on the x1 above but the
> regs might as well be mangled).

0xff190800 will cause this warning for sure. It has a memory map, but it is
not RAM so old version of pfn_valid() would return 0 and the new one
returns 1.
 
> > I tried to follow the recent changes for arm64 mm which could relate to the
> > check failing at [1] and reverting
> >   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> > helps and makes it work again, but I'm 100% uncertain if that commit is
> > really the culprit.
> > 
> > Note, that the firmware (legacy u-boot) injects memory configuration in the
> > device tree as follows:
> > 
> > /memreserve/    0x00000000fcefc000 0x000000000000d000;
> > / {
> > ..
> >     compatible = "pine64,rock64\0rockchip,rk3328";
> > ..
> >     memory {
> >         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
> >         device_type = "memory";
> >     };
> > 
> > ..
> > }
> 
> Either pfn_valid() gets confused in 5.14 or something is wrong with the
> DT. I have a suspicion it's the former since reverting the above commit
> makes it disappear.

I think pfn_valid() actually behaves as expected but the caller is wrong
because pfn_valid != RAM (this applies btw to !arm64 as well).

	/* Don't allow RAM to be mapped */
	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
		return DMA_MAPPING_ERROR;

Alex, can you please try this patch:

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 2b06a809d0b9..4715e9641a29 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 		return DMA_MAPPING_ERROR;
 
 	/* Don't allow RAM to be mapped */
-	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
+	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
 		return DMA_MAPPING_ERROR;
 
 	if (dma_map_direct(dev, ops))
 
> > So: there is a "hole" in the mappable memory and reading the commit message
> > of
> >   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
> > pfn_valid()")
> > suggests, there was a change for that case recently.
> 
> I think the change from the arm64 pfn_valid() to the generic one is
> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
> returns true just because we have a struct page available but the memory
> may have been reserved.
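
(For context: the arm64 pfn_valid() that commit removed boiled down to a
memblock check. Roughly, simplified from memory rather than the exact code:

	int pfn_valid(unsigned long pfn)
	{
		/* only firmware-reported, kernel-mapped RAM was valid */
		return memblock_is_map_memory(PFN_PHYS(pfn));
	}

so an MMIO address like 0xff190800 used to fail it even where a memmap
happens to cover the range.)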
> 
> Cc'ing Mike R.
> 
> > I also noticed there is a diff in the kernel log regarding memory init:
> > up until 5.13.12 it says:
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] On node 0 totalpages: 1043968
> > [    0.000000]   DMA zone: 16312 pages used for memmap
> > [    0.000000]   DMA zone: 0 pages reserved
> > [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
> > 
> > In contrast, in 5.14-rc7 it says:
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
> > [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
> > 
> > (note the "unavailable ranges")
> > I'm uncertain again here if that diff is expected behavior because of those
> > recent mm changes for arm64.
> > 
> > After reverting
> >   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> > the log changes to
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000]   DMA32    empty
> > [    0.000000]   Normal   empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> > [    0.000000] Initmem setup node 0 [mem
> > 0x0000000000200000-0x00000000feffffff]
> > 
> > (no DMA zones here)
> > 
> > As you might have noticed I have _zero_ clue about memory mapping and the dma
> > subsystem - so let me know if there is any more information needed for that
> > and thanks for your help.
> 
> Adding Robin as well, he has a better clue than us on DMA ;).
> 
> > Alex
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
> 
> -- 
> Catalin

-- 
Sincerely yours,
Mike.

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 18:28     ` Mike Rapoport
@ 2021-08-24 18:46       ` Robin Murphy
  -1 siblings, 0 replies; 64+ messages in thread
From: Robin Murphy @ 2021-08-24 18:46 UTC (permalink / raw)
  To: Mike Rapoport, Catalin Marinas
  Cc: Alex Bee, Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On 2021-08-24 19:28, Mike Rapoport wrote:
> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>> Hi Alex,
>>
>> Thanks for the report.
>>
>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>
>>> [    8.921909] ------------[ cut here ]------------
>>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>>> [    8.922040] sp : ffff800012102ae0
>>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>> [    8.922158] Call trace:
>>> [    8.922163]  dma_map_resource+0x68/0xc0
>>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>> [...]
>>> Note: This does not relate to the spi driver - when disabling this device in
>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>> Commenting out the failing check at [1], however, helps and the mapping
>>> works again.
> 
>> Do you know which address dma_map_resource() is trying to map (maybe
>> add some printk())? It's not supposed to map RAM, hence the warning.
>> Random guess, the address is 0xff190800 (based on the x1 above but the
>> regs might as well be mangled).
> 
> 0xff190800 will cause this warning for sure. It has a memory map, but it is
> not RAM, so the old version of pfn_valid() would return 0 and the new one
> returns 1.

How does that happen, though? It's not a memory address, and it's not 
even within the bounds of anywhere there should or could be memory. This 
SoC has a simple memory map - everything from 0 to 0xfeffffff goes to 
the DRAM controller (which may not all be populated, and may have pieces 
carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why 
do we have pages (or at least the assumption of pages) for somewhere 
which by all rights should not have them?
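
(For what it's worth, the DT from the report is at least self-consistent
with that layout: base 0x200000 plus size 0xfee00000 ends at 0xff000000,
so the reported RAM stops exactly where the MMIO window starts.)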

>>> I tried to follow the recent changes for arm64 mm which could relate to the
>>> check failing at [1] and reverting
>>>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> helps and makes it work again, but I'm 100% uncertain if that commit is
>>> really the culprit.
>>>
>>> Note that the firmware (legacy u-boot) injects memory configuration in the
>>> device tree as follows:
>>>
>>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>>> / {
>>> ..
>>>     compatible = "pine64,rock64\0rockchip,rk3328";
>>> ..
>>>     memory {
>>>         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>>         device_type = "memory";
>>>     };
>>>
>>> ..
>>> }
>>
>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>> DT. I have a suspicion it's the former since reverting the above commit
>> makes it disappear.
> 
> I think pfn_valid() actually behaves as expected but the caller is wrong
> because pfn_valid != RAM (this applies btw to !arm64 as well).
> 
> 	/* Don't allow RAM to be mapped */
> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> 		return DMA_MAPPING_ERROR;
> 
> Alex, can you please try this patch:

That will certainly paper over the issue, but it's avoiding the question 
of what went wrong with the memory map in the first place. The comment 
is indeed a bit inaccurate, but ultimately dma_map_resource() exists for 
addresses that would be wrong to pass to dma_map_page(), so I believe 
pfn_valid() is still the correct check.
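
To illustrate the split (a sketch with made-up names, not code from any
real driver): a slave transfer needs the peripheral's FIFO register as a
bus address, and that register has no struct page behind it, so it has to
go through dma_map_resource() in some hypothetical prep function:

	/* fifo_phys is an assumed MMIO register address, not RAM */
	dma_addr_t fifo_dma = dma_map_resource(dev, fifo_phys, sizeof(u32),
					       DMA_TO_DEVICE, 0);
	if (dma_mapping_error(dev, fifo_dma))
		return -EIO;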

Robin.

> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 2b06a809d0b9..4715e9641a29 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
>   		return DMA_MAPPING_ERROR;
>   
>   	/* Don't allow RAM to be mapped */
> -	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> +	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
>   		return DMA_MAPPING_ERROR;
>   
>   	if (dma_map_direct(dev, ops))
>   
>>> So: there is a "hole" in the mappable memory and reading the commit message
>>> of
>>>   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
>>> pfn_valid()")
>>> suggests there was a change for that case recently.
>>
>> I think the change from the arm64 pfn_valid() to the generic one is
>> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
>> returns true just because we have a struct page available but the memory
>> may have been reserved.
>>
>> Cc'ing Mike R.
>>
>>> I also noticed there is a diff in the kernel log regarding memory init:
>>> up until 5.13.12 it says:
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0 totalpages: 1043968
>>> [    0.000000]   DMA zone: 16312 pages used for memmap
>>> [    0.000000]   DMA zone: 0 pages reserved
>>> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>>>
>>> In contrast, in 5.14-rc7 it says:
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
>>> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>>>
>>> (note the "unavailable ranges")
>>> I'm uncertain again here if that diff is expected behavior because of those
>>> recent mm changes for arm64.
>>>
>>> After reverting
>>>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> the log changes to
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem
>>> 0x0000000000200000-0x00000000feffffff]
>>>
>>> (no DMA zones here)
>>>
>>> As you might have noticed I have _zero_ clue about memory mapping and the dma
>>> subsystem - so let me know if there is any more information needed for that
>>> and thanks for your help.
>>
>> Adding Robin as well, he has a better clue than us on DMA ;).
>>
>>> Alex
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
>>
>> -- 
>> Catalin
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 18:46       ` Robin Murphy
@ 2021-08-24 18:59         ` David Hildenbrand
  -1 siblings, 0 replies; 64+ messages in thread
From: David Hildenbrand @ 2021-08-24 18:59 UTC (permalink / raw)
  To: Robin Murphy, Mike Rapoport, Catalin Marinas
  Cc: Alex Bee, Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On 24.08.21 20:46, Robin Murphy wrote:
> On 2021-08-24 19:28, Mike Rapoport wrote:
>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>> Hi Alex,
>>>
>>> Thanks for the report.
>>>
>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>
>>>> [    8.921909] ------------[ cut here ]------------
>>>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>>>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>>>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>>>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>>>> [    8.922040] sp : ffff800012102ae0
>>>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>> [    8.922158] Call trace:
>>>> [    8.922163]  dma_map_resource+0x68/0xc0
>>>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>>>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>> [...]
>>>> Note: This does not relate to the spi driver - when disabling this device in
>>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>>> Commenting out the failing check at [1], however, helps and the mapping
>>>> works again.
>>
>>> Do you know which address dma_map_resource() is trying to map (maybe
>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>> Random guess, the address is 0xff190800 (based on the x1 above but the
>>> regs might as well be mangled).
>>
>> 0xff190800 will cause this warning for sure. It has a memory map, but it is
>> not RAM, so the old version of pfn_valid() would return 0 and the new one
>> returns 1.
> 
> How does that happen, though? It's not a memory address, and it's not
> even within the bounds of anywhere there should or could be memory. This
> SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> the DRAM controller (which may not all be populated, and may have pieces
> carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> do we have pages (or at least the assumption of pages) for somewhere
> which by all rights should not have them?

Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to 
avoid any such hacks. If there is a memory hole, it gets a memmap as well.

Tricking pfn_valid() into returning "false" where we actually have a 
memmap only makes it look like there is no memmap; but there is one, and
it's PG_reserved.
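
So if the question a caller wants answered is "is this RAM?", it would
have to look at the memmap content rather than merely test its existence.
Something like this untested sketch (ignoring that PG_reserved also
covers genuinely reserved RAM):

	static bool pfn_is_ram(unsigned long pfn)
	{
		/* a hole inside a section still has a memmap, but its
		 * pages were initialized as PG_reserved */
		return pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn));
	}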

[...]

>>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>>> DT. I have a suspicion it's the former since reverting the above commit
>>> makes it disappear.
>>
>> I think pfn_valid() actually behaves as expected but the caller is wrong
>> because pfn_valid != RAM (this applies btw to !arm64 as well).
>>
>> 	/* Don't allow RAM to be mapped */
>> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>> 		return DMA_MAPPING_ERROR;
>>
>> Alex, can you please try this patch:
> 
> That will certainly paper over the issue, but it's avoiding the question
> of what went wrong with the memory map in the first place. The comment
> is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> addresses that would be wrong to pass to dma_map_page(), so I believe
> pfn_valid() is still the correct check.

If we want to check for RAM, pfn_valid() would be wrong. If we want to 
check for "is there a memmap, for whatever lives or does not live 
there", pfn_valid() is the right check.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 17:37   ` Catalin Marinas
@ 2021-08-24 20:07     ` Alex Bee
  -1 siblings, 0 replies; 64+ messages in thread
From: Alex Bee @ 2021-08-24 20:07 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Mike Rapoport,
	Robin Murphy

Hi Catalin,

Thanks for your quick reply.

On 24.08.21 19:37, Catalin Marinas wrote:
> Hi Alex,
>
> Thanks for the report.
>
> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>
>> [    8.921909] ------------[ cut here ]------------
>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>> [    8.922040] sp : ffff800012102ae0
>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>> [    8.922158] Call trace:
>> [    8.922163]  dma_map_resource+0x68/0xc0
>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> [...]
>> Note: This does not relate to the spi driver - when disabling this device in
>> the device tree it fails for any other (i2s, for instance) which uses dma.
>> Commenting out the failing check at [1], however, helps and the mapping
>> works again.
> Do you know which address dma_map_resource() is trying to map (maybe
> add some printk())? It's not supposed to map RAM, hence the warning.
> Random guess, the address is 0xff190800 (based on the x1 above but the
> regs might as well be mangled).

Confirmed:

[..]

[    8.353879] dma_map_resource Failed to map address 0xff190800

[..]

Alex

>
>> I tried to follow the recent changes for arm64 mm which could relate to the
>> check failing at [1] and reverting
>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>> helps and makes it work again, but I'm 100% uncertain if that commit is
>> really the culprit.
>>
>> Note that the firmware (legacy u-boot) injects memory configuration in the
>> device tree as follows:
>>
>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>> / {
>> ..
>>      compatible = "pine64,rock64\0rockchip,rk3328";
>> ..
>>      memory {
>>          reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>          device_type = "memory";
>>      };
>>
>> ..
>> }
> Either pfn_valid() gets confused in 5.14 or something is wrong with the
> DT. I have a suspicion it's the former since reverting the above commit
> makes it disappear.
>
>> So: there is a "hole" in the mappable memory and reading the commit message
>> of
>>    commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
>> pfn_valid()")
>> suggests there was a change for that case recently.
> I think the change from the arm64 pfn_valid() to the generic one is
> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
> returns true just because we have a struct page available but the memory
> may have been reserved.
>
> Cc'ing Mike R.
>
>> I also noticed there is a diff in the kernel log regarding memory init:
>> up until 5.13.12 it says:
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] On node 0 totalpages: 1043968
>> [    0.000000]   DMA zone: 16312 pages used for memmap
>> [    0.000000]   DMA zone: 0 pages reserved
>> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>>
>> In contrast, in 5.14-rc7 it says:
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
>> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>>
>> (note the "unavailable ranges")
>> I'm uncertain again here if that diff is expected behavior because of those
>> recent mm changes for arm64.
>>
>> After reverting
>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>> the log changes to
>>
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000]   DMA32    empty
>> [    0.000000]   Normal   empty
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>> [    0.000000] Initmem setup node 0 [mem
>> 0x0000000000200000-0x00000000feffffff]
>>
>> (no DMA zones here)
>>
>> As you might have noticed I have _zero_ clue about memory mapping and the dma
>> subsystem - so let me know if there is any more information needed for that
>> and thanks for your help.
> Adding Robin as well, he has a better clue than us on DMA ;).
>
>> Alex
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 18:28     ` Mike Rapoport
@ 2021-08-24 20:14       ` Alex Bee
  -1 siblings, 0 replies; 64+ messages in thread
From: Alex Bee @ 2021-08-24 20:14 UTC (permalink / raw)
  To: Mike Rapoport, Catalin Marinas
  Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Robin Murphy

Hi Mike,

Thanks for your reply.

On 24.08.21 20:28, Mike Rapoport wrote:
> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>> Hi Alex,
>>
>> Thanks for the report.
>>
>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>
>>> [    8.921909] ------------[ cut here ]------------
>>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>>> [    8.922040] sp : ffff800012102ae0
>>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>> [    8.922158] Call trace:
>>> [    8.922163]  dma_map_resource+0x68/0xc0
>>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>> [...]
>>> Note: This does not relate to the spi driver - when disabling this device in
>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>> Commenting out the failing check at [1], however, helps and the mapping
>>> works again.
>> Do you know which address dma_map_resource() is trying to map (maybe
>> add some printk())? It's not supposed to map RAM, hence the warning.
>> Random guess, the address is 0xff190800 (based on the x1 above but the
>> regs might as well be mangled).
> 0xff190800 will cause this warning for sure. It has a memory map, but it is
> not RAM, so the old version of pfn_valid() would return 0 and the new one
> returns 1.
>   
>>> I tried to follow the recent changes for arm64 mm which could relate to the
>>> check failing at [1] and reverting
>>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> helps and makes it work again, but I'm 100% uncertain if that commit is
>>> really the culprit.
>>>
>>> Note that the firmware (legacy u-boot) injects memory configuration in the
>>> device tree as follows:
>>>
>>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>>> / {
>>> ..
>>>      compatible = "pine64,rock64\0rockchip,rk3328";
>>> ..
>>>      memory {
>>>          reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>>          device_type = "memory";
>>>      };
>>>
>>> ..
>>> }
>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>> DT. I have a suspicion it's the former since reverting the above commit
>> makes it disappear.
> I think pfn_valid() actually behaves as expected but the caller is wrong
> because pfn_valid != RAM (this applies btw to !arm64 as well).
>
> 	/* Don't allow RAM to be mapped */
> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> 		return DMA_MAPPING_ERROR;
>
> Alex, can you please try this patch:
>
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 2b06a809d0b9..4715e9641a29 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
>   		return DMA_MAPPING_ERROR;
>   
>   	/* Don't allow RAM to be mapped */
> -	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> +	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
>   		return DMA_MAPPING_ERROR;
>   
>   	if (dma_map_direct(dev, ops))
>   

Nope, doesn't help:

[    8.353879] dma_map_resource Failed to map address 0xff190800
[    8.353886] dma_map_resource pfn_valid(PHYS_PFN(0xff190800)): 1
[    8.353892] dma_map_resource memblock_is_memory(0xff190800): 0

If I understand the comment for that check correctly (we _don't_ want
RAM to be mapped), shouldn't that be:

+	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))

?
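
(That way 0xff190800 would pass, since it is not memblock memory, while a
genuine RAM address would trip the warning - which seems to be what the
comment intends.)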

Alex

>>> So: there is a "hole" in the mappable memory and reading the commit message
>>> of
>>>    commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
>>> pfn_valid()")
>>> suggests there was a change for that case recently.
>> I think the change from the arm64 pfn_valid() to the generic one is
>> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
>> returns true just because we have a struct page available but the memory
>> may have been reserved.
>>
>> Cc'ing Mike R.
>>
>>> I also noticed there is a diff in the kernel log regarding memory init:
>>> up until 5.13.12 it says:
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0 totalpages: 1043968
>>> [    0.000000]   DMA zone: 16312 pages used for memmap
>>> [    0.000000]   DMA zone: 0 pages reserved
>>> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>>>
>>> In contrast, in 5.14-rc7 it says:
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
>>> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>>>
>>> (note the "unavailable ranges")
>>> I'm uncertain again here if that diff is expected behavior because of those
>>> recent mm changes for arm64.
>>>
>>> After reverting
>>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> the log changes to
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem
>>> 0x0000000000200000-0x00000000feffffff]
>>>
>>> (no DMA zones here)
>>>
>>> As you might have noticed I have _zero_ clue about memory mapping and the dma
>>> subsystem - so let me know if there is any more information needed for that
>>> and thanks for your help.
>> Adding Robin as well, he has a better clue than us on DMA ;).
>>
>>> Alex
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
>> -- 
>> Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
@ 2021-08-24 20:14       ` Alex Bee
  0 siblings, 0 replies; 64+ messages in thread
From: Alex Bee @ 2021-08-24 20:14 UTC (permalink / raw)
  To: Mike Rapoport, Catalin Marinas
  Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Robin Murphy

Hi Mike,

thanks for your reply

Am 24.08.21 um 20:28 schrieb Mike Rapoport:
> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>> Hi Alex,
>>
>> Thanks for the report.
>>
>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>
>>> [    8.921909] ------------[ cut here ]------------
>>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>>> [    8.922040] sp : ffff800012102ae0
>>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>> [    8.922158] Call trace:
>>> [    8.922163]  dma_map_resource+0x68/0xc0
>>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>> [...]
>>> Note: This does not relate to the spi driver - when disabling this device in
>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>> Commenting out the failing check at [1], however, helps and the mapping
>>> works again.
>> Do you know which address dma_map_resource() is trying to map (maybe
>> add some printk())? It's not supposed to map RAM, hence the warning.
>> Random guess, the address is 0xff190800 (based on the x1 above but the
>> regs might as well be mangled).
> 0xff190800 will cause this warning for sure. It has a memory map, but it is
> not RAM so old version of pfn_valid() would return 0 and the new one
> returns 1.
>   
>>> I tried to follow the recent changes for arm64 mm which could relate to the
>>> check failing at [1] and reverting
>>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> helps and makes it work again, but I'm 100% uncertain if that commit is
>>> really the culprit.
>>>
>>> Note, that the firmware (legacy u-boot) injects memory configuration in the
>>> device tree as follows:
>>>
>>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>>> / {
>>> ..
>>>      compatible = "pine64,rock64\0rockchip,rk3328";
>>> ..
>>>      memory {
>>>          reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>>          device_type = "memory";
>>>      };
>>>
>>> ..
>>> }
>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>> DT. I have a suspicion it's the former since reverting the above commit
>> makes it disappear.
> I think pfn_valid() actually behaves as expected but the caller is wrong
> because pfn_valid != RAM (this applies btw to !arm64 as well).
>
> 	/* Don't allow RAM to be mapped */
> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> 		return DMA_MAPPING_ERROR;
>
> Alex, can you please try this patch:
>
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 2b06a809d0b9..4715e9641a29 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
>   		return DMA_MAPPING_ERROR;
>   
>   	/* Don't allow RAM to be mapped */
> -	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> +	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
>   		return DMA_MAPPING_ERROR;
>   
>   	if (dma_map_direct(dev, ops))
>   

Nope, doesn't help:

[    8.353879] dma_map_resource Failed to map address 0xff190800
[    8.353886] dma_map_resource pfn_valid(PHYS_PFN(0xff190800)): 1
[    8.353892] dma_map_resource memblock_is_memory(0xff190800): 0

If I understand the comment for that check correctly, we _don't_ want 
RAM to be mapped - shouldn't that be:

+	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))

?
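
For clarity, the full hunk with the check inverted would read as below - a 
sketch derived from the diff above, not a tested patch:

	/* Don't allow RAM to be mapped */
-	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
+	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
 		return DMA_MAPPING_ERROR;

i.e. warn and bail out when the address _is_ RAM, and let everything else 
- including MMIO such as 0xff190800 - be mapped.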

Alex

>>> So: there is a "hole" in the mappable memory and reading the commit message
>>> of
>>>    commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
>>> pfn_valid()")
>>> suggests, there was a change for that case recently.
>> I think the change from the arm64 pfn_valid() to the generic one is
>> avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
>> returns true just because we have a struct page available but the memory
>> may have been reserved.
>>
>> Cc'ing Mike R.
>>
>>> I also noticed there is a diff in the kernel log regarding memory init up
>>> until 5.13.12 it says
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0 totalpages: 1043968
>>> [    0.000000]   DMA zone: 16312 pages used for memmap
>>> [    0.000000]   DMA zone: 0 pages reserved
>>> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>>>
>>> In contrary in 5.14-rc7 it says:
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
>>> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>>>
>>> (note the "unavailable ranges")
>>> I'm uncertain again here, if that diff is expected behavior because of those
>>> recent mm changes for arm64.
>>>
>>> After reverting
>>>    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>> the log changes to
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
>>> [    0.000000] Initmem setup node 0 [mem
>>> 0x0000000000200000-0x00000000feffffff]
>>>
>>> (no DMA zones here)
>>>
>>> As you might have noticed I have _zero_ clue about memory mapping and dma
>>> subsystem - so let me know if there is any more information needed for that
>>> and thanks for your help.
>> Adding Robin as well, he has a better clue than us on DMA ;).
>>
>>> Alex
>>>
>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235
>> -- 
>> Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 20:14       ` Alex Bee
@ 2021-08-25  4:39         ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-08-25  4:39 UTC (permalink / raw)
  To: Alex Bee
  Cc: Catalin Marinas, Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM, Robin Murphy

On Tue, Aug 24, 2021 at 10:14:01PM +0200, Alex Bee wrote:
> Hi Mike,
> 
> thanks for your reply
> 
> Am 24.08.21 um 20:28 schrieb Mike Rapoport:
> > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > Hi Alex,
> > > 
> > > Thanks for the report.
> > > 
> > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > 
> > > > [    8.921909] ------------[ cut here ]------------
> > > > [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > [    8.921973] Modules linked in: spi_rockchip(+) fuse
> > > > [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > [    8.922004] Hardware name: Pine64 Rock64 (DT)
> > > > [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > [    8.922018] pc : dma_map_resource+0x68/0xc0
> > > > [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > [    8.922040] sp : ffff800012102ae0
> > > > [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > [    8.922158] Call trace:
> > > > [    8.922163]  dma_map_resource+0x68/0xc0
> > > > [    8.922173]  pl330_prep_slave_sg+0x58/0x220
> > > > [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > [...]
> > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > works again.
> > > Do you know which address dma_map_resource() is trying to map (maybe
> > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > regs might as well be mangled).
> > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > not RAM so old version of pfn_valid() would return 0 and the new one
> > returns 1.
> > > > I tried to follow the recent changes for arm64 mm which could relate to the
> > > > check failing at [1] and reverting
> > > >    commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> > > > helps and makes it work again, but I'm 100% uncertain if that commit is
> > > > really the culprit.
> > > > 
> > > > Note, that the firmware (legacy u-boot) injects memory configuration in the
> > > > device tree as follows:
> > > > 
> > > > /memreserve/    0x00000000fcefc000 0x000000000000d000;
> > > > / {
> > > > ..
> > > >      compatible = "pine64,rock64\0rockchip,rk3328";
> > > > ..
> > > >      memory {
> > > >          reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
> > > >          device_type = "memory";
> > > >      };
> > > > 
> > > > ..
> > > > }
> > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > DT. I have a suspicion it's the former since reverting the above commit
> > > makes it disappear.
> > I think pfn_valid() actually behaves as expected but the caller is wrong
> > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > 
> > 	/* Don't allow RAM to be mapped */
> > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > 		return DMA_MAPPING_ERROR;
> > 
> > Alex, can you please try this patch:
> > 
> > diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> > index 2b06a809d0b9..4715e9641a29 100644
> > --- a/kernel/dma/mapping.c
> > +++ b/kernel/dma/mapping.c
> > @@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
> >   		return DMA_MAPPING_ERROR;
> >   	/* Don't allow RAM to be mapped */
> > -	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > +	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
> >   		return DMA_MAPPING_ERROR;
> >   	if (dma_map_direct(dev, ops))
> 
> Nope, doesn't help:
> 
> [    8.353879] dma_map_resource Failed to map address 0xff190800
> [    8.353886] dma_map_resource pfn_valid(PHYS_PFN(0xff190800)): 1
> [    8.353892] dma_map_resource memblock_is_memory(0xff190800): 0
> 
> If I understand the comment for that check correctly, we _don't_ want RAM
> to be mapped - shouldn't that be:
> 
> +	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
> 
> ?

Right, we don't want RAM to be mapped, the negation was wrong and it should
be 

	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
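
To spell out the polarity with the numbers reported above - a sketch of 
the expected values on this SoC:

	/*
	 * phys_addr                        pfn_valid()  memblock_is_memory()
	 * RAM, e.g. 0x00200000                  1                1
	 * MMIO with a memmap, 0xff190800        1                0
	 *
	 * WARN_ON_ONCE(!memblock_is_memory()) wrongly rejects the MMIO
	 * address; WARN_ON_ONCE(memblock_is_memory()) rejects only RAM.
	 */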
 
-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25  4:39         ` Mike Rapoport
@ 2021-08-25 10:00           ` Alex Bee
  -1 siblings, 0 replies; 64+ messages in thread
From: Alex Bee @ 2021-08-25 10:00 UTC (permalink / raw)
  To: Mike Rapoport, Robin Murphy, Andrew Morton, Catalin Marinas
  Cc: Will Deacon, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM


Am 25.08.21 um 06:39 schrieb Mike Rapoport:
> On Tue, Aug 24, 2021 at 10:14:01PM +0200, Alex Bee wrote:
>> Hi Mike,
>>
>> thanks for your reply
>>
>> Am 24.08.21 um 20:28 schrieb Mike Rapoport:
>>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>>> Hi Alex,
>>>>
>>>> Thanks for the report.
>>>>
>>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>>
>>>>> [    8.921909] ------------[ cut here ]------------
>>>>> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>>>> [    8.921973] Modules linked in: spi_rockchip(+) fuse
>>>>> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>>> [    8.922004] Hardware name: Pine64 Rock64 (DT)
>>>>> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>>> [    8.922018] pc : dma_map_resource+0x68/0xc0
>>>>> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
>>>>> [    8.922040] sp : ffff800012102ae0
>>>>> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>>> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>>> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>>> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>>> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>>> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>>> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>>> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>>> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>>> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>>> [    8.922158] Call trace:
>>>>> [    8.922163]  dma_map_resource+0x68/0xc0
>>>>> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
>>>>> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>>> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>>> [...]
>>>>> Note: This does not relate to the spi driver - when disabling this device in
>>>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>>>> Commenting out the failing check at [1], however, helps and the mapping
>>>>> works again.
>>>> Do you know which address dma_map_resource() is trying to map (maybe
>>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>>> Random guess, the address is 0xff190800 (based on the x1 above but the
>>>> regs might as well be mangled).
>>> 0xff190800 will cause this warning for sure. It has a memory map, but it is
>>> not RAM so old version of pfn_valid() would return 0 and the new one
>>> returns 1.
>>>>> I tried to follow the recent changes for arm64 mm which could relate to the
>>>>> check failing at [1] and reverting
>>>>>     commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
>>>>> helps and makes it work again, but I'm 100% uncertain if that commit is
>>>>> really the culprit.
>>>>>
>>>>> Note, that the firmware (legacy u-boot) injects memory configuration in the
>>>>> device tree as follows:
>>>>>
>>>>> /memreserve/    0x00000000fcefc000 0x000000000000d000;
>>>>> / {
>>>>> ..
>>>>>       compatible = "pine64,rock64\0rockchip,rk3328";
>>>>> ..
>>>>>       memory {
>>>>>           reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>>>>>           device_type = "memory";
>>>>>       };
>>>>>
>>>>> ..
>>>>> }
>>>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>>>> DT. I have a suspicion it's the former since reverting the above commit
>>>> makes it disappear.
>>> I think pfn_valid() actually behaves as expected but the caller is wrong
>>> because pfn_valid != RAM (this applies btw to !arm64 as well).
>>>
>>> 	/* Don't allow RAM to be mapped */
>>> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>>> 		return DMA_MAPPING_ERROR;
>>>
>>> Alex, can you please try this patch:
>>>
>>> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
>>> index 2b06a809d0b9..4715e9641a29 100644
>>> --- a/kernel/dma/mapping.c
>>> +++ b/kernel/dma/mapping.c
>>> @@ -232,7 +232,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
>>>    		return DMA_MAPPING_ERROR;
>>>    	/* Don't allow RAM to be mapped */
>>> -	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>>> +	if (WARN_ON_ONCE(!memblock_is_memory(phys_addr)))
>>>    		return DMA_MAPPING_ERROR;
>>>    	if (dma_map_direct(dev, ops))
>> Nope, doesn't help:
>>
>> [    8.353879] dma_map_resource Failed to map address 0xff190800
>> [    8.353886] dma_map_resource pfn_valid(PHYS_PFN(0xff190800)): 1
>> [    8.353892] dma_map_resource memblock_is_memory(0xff190800): 0
>>
>> If I understand the comment for that check correctly, we _don't_ want RAM
>> to be mapped - shouldn't that be:
>>
>> +	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
>>
>> ?
> Right, we don't want RAM to be mapped, the negation was wrong and it should
> be
>
> 	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
>   

OK, great: that helps, and there are no other complaints - at least on 
this (arm64) SoC.

Since this change affects all other archs, I doubt it will land in 5.14 - 
but it would be great if there could be a fix for that in 5.14, since it 
is a system-breaking bug. I'm not sure why it wasn't reported yet - is there any 
chance that at least

   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")

can be reverted in 5.14?

Alex


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-24 18:59         ` David Hildenbrand
@ 2021-08-25 10:20           ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2021-08-25 10:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

+ hch

On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
> On 24.08.21 20:46, Robin Murphy wrote:
> > On 2021-08-24 19:28, Mike Rapoport wrote:
> > > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > > 
> > > > >  ------------[ cut here ]------------
> > > > >  WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > >  Modules linked in: spi_rockchip(+) fuse
> > > > >  CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > >  Hardware name: Pine64 Rock64 (DT)
> > > > >  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > >  pc : dma_map_resource+0x68/0xc0
> > > > >  lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > >  sp : ffff800012102ae0
> > > > >  x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > >  x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > >  x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > >  x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > >  x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > >  x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > >  x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > >  x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > >  x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > >  x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > >  Call trace:
> > > > >    dma_map_resource+0x68/0xc0
> > > > >    pl330_prep_slave_sg+0x58/0x220
> > > > >    rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > >    rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > > [...]
> > > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > > works again.
> > > 
> > > > Do you know which address dma_map_resource() is trying to map (maybe
> > > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > > regs might as well be mangled).
> > > 
> > > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > > not RAM so old version of pfn_valid() would return 0 and the new one
> > > returns 1.
> > 
> > How does that happen, though? It's not a memory address, and it's not
> > even within the bounds of anywhere there should or could be memory. This
> > SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> > the DRAM controller (which may not all be populated, and may have pieces
> > carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> > do we have pages (or at least the assumption of pages) for somewhere
> > which by all rights should not have them?
> 
> Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
> any such hacks. If there is a memory hole, it gets a memmap as well.
> 
> Tricking pfn_valid() into returning "false" where we actually have a memmap
> only makes it look like there is no memmap; but there is one, and
> it's PG_reserved.
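
To make that concrete for the memory map quoted earlier in the thread, 
assuming the 128 MiB section size mentioned above - a back-of-the-envelope 
sketch:

	/*
	 * RAM:  0x00200000 - 0xfeffffff   (from the DT "memory" node)
	 * MMIO: 0xff000000 - 0xffffffff
	 *
	 * With 128 MiB (0x8000000) sections, 0xff190800 lies in the
	 * section spanning 0xf8000000 - 0xffffffff. That section also
	 * contains RAM up to 0xfeffffff, so it is a valid early section
	 * with a vmemmap, and the generic pfn_valid() returns 1 even for
	 * the MMIO hole at its top.
	 */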

I can see the documentation for pfn_valid() does not claim anything more
than the presence of a memmap entry. But I wonder whether the confusion
is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
assumes __va() can be used on any pfn for which pfn_valid() is true, though
I suspect it relies on the calling function to check that the resource was
RAM. The arm64 kern_addr_valid() returns true based on pfn_valid(), and
kcore.c uses standard memcpy on it, which wouldn't work for I/O (should we
change this check to pfn_is_map_memory() for arm64?).
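
A sketch of that suggestion, assuming kern_addr_valid() in 
arch/arm64/mm/mmu.c still ends by checking the leaf pte (untested, 
page-table walk elided):

-	return pfn_valid(pte_pfn(pte));
+	return pfn_is_map_memory(pte_pfn(pte));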

> > > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > > DT. I have a suspicion it's the former since reverting the above commit
> > > > makes it disappear.
> > > 
> > > I think pfn_valid() actually behaves as expected but the caller is wrong
> > > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > > 
> > > 	/* Don't allow RAM to be mapped */
> > > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > > 		return DMA_MAPPING_ERROR;
> > > 
> > > Alex, can you please try this patch:
> > 
> > That will certainly paper over the issue, but it's avoiding the question
> > of what went wrong with the memory map in the first place. The comment
> > is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> > addresses that would be wrong to pass to dma_map_page(), so I believe
> > pfn_valid() is still the correct check.
> 
> If we want to check for RAM, pfn_valid() would be wrong. If we want to check
> for "is there a memmap, for whatever lives or does not live there",
> pfn_valid() is the right check.

So what should the DMA code use instead? Last time we needed something
similar, the recommendation was to use pfn_to_online_page(). Mike is
suggesting memblock_is_memory().

Given how late we are in the -rc cycle, I suggest we revert Anshuman's
commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
assess the implications in 5.15 (the patch doesn't seem to have the
arm64 maintainers' ack anyway ;)).

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:20           ` Catalin Marinas
@ 2021-08-25 10:28             ` Will Deacon
  -1 siblings, 0 replies; 64+ messages in thread
From: Will Deacon @ 2021-08-25 10:28 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Robin Murphy, Mike Rapoport, Alex Bee,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 11:20:46AM +0100, Catalin Marinas wrote:
> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
> assess the implications in 5.15 (the patch doesn't seem to have the
> arm64 maintainers' ack anyway ;)).

I'll stick the revert (below) into kernelci now so we can get some coverage
in case it breaks something else.

Will

--->8

From e97ba0e39e486c20d8f76f3e632e4b7d933602cd Mon Sep 17 00:00:00 2001
From: Will Deacon <will@kernel.org>
Date: Wed, 25 Aug 2021 11:10:07 +0100
Subject: [PATCH] Revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"

This reverts commit 16c9afc776608324ca71c0bc354987bab532f51d.

Alex Bee reports a regression in 5.14 on their RK3328 SoC when
configuring the PL330 DMA controller:

 | ------------[ cut here ]------------
 | WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
 | Modules linked in: spi_rockchip(+) fuse
 | CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
 | Hardware name: Pine64 Rock64 (DT)
 | pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
 | pc : dma_map_resource+0x68/0xc0
 | lr : pl330_prep_slave_fifo+0x78/0xd0

This appears to be because dma_map_resource() is being called for a
physical address which does not correspond to a memory address yet does
have a valid 'struct page' due to the way in which the vmemmap is
constructed.

Prior to 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), the arm64
implementation of pfn_valid() called memblock_is_memory() to return
'false' for such regions and the DMA mapping request would proceed.
However, now that we are using the generic implementation where only the
presence of the memory map entry is considered, we return 'true' and
erroneously fail with DMA_MAPPING_ERROR because we identify the region
as DRAM.

Although fixing this in the DMA mapping code is arguably the right fix,
it is a risky, cross-architecture change at this stage in the cycle. So
just revert arm64 back to its old pfn_valid() implementation for v5.14.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Reported-by: Alex Bee <knaerzche@gmail.com>
Link: https://lore.kernel.org/r/d3a3c828-b777-faf8-e901-904995688437@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/Kconfig            |  1 +
 arch/arm64/include/asm/page.h |  1 +
 arch/arm64/mm/init.c          | 37 +++++++++++++++++++++++++++++++++++
 include/linux/mmzone.h        |  9 ---------
 4 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fdcd54d39c1e..62c3c1d2190f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -156,6 +156,7 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
+	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
 	select HAVE_ARCH_SECCOMP_FILTER
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 993a27ea6f54..f98c91bbd7c1 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -41,6 +41,7 @@ void tag_clear_highpage(struct page *to);
 
 typedef struct page *pgtable_t;
 
+int pfn_valid(unsigned long pfn);
 int pfn_is_map_memory(unsigned long pfn);
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8490ed2917ff..1fdb7bb7c198 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -219,6 +219,43 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }
 
+int pfn_valid(unsigned long pfn)
+{
+	phys_addr_t addr = PFN_PHYS(pfn);
+	struct mem_section *ms;
+
+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(addr) != pfn)
+		return 0;
+
+	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+		return 0;
+
+	ms = __pfn_to_section(pfn);
+	if (!valid_section(ms))
+		return 0;
+
+	/*
+	 * ZONE_DEVICE memory does not have the memblock entries.
+	 * memblock_is_map_memory() check for ZONE_DEVICE based
+	 * addresses will always fail. Even the normal hotplugged
+	 * memory will never have MEMBLOCK_NOMAP flag set in their
+	 * memblock entries. Skip memblock search for all non early
+	 * memory sections covering all of hotplug memory including
+	 * both normal and ZONE_DEVICE based.
+	 */
+	if (!early_section(ms))
+		return pfn_section_valid(ms, pfn);
+
+	return memblock_is_memory(addr);
+}
+EXPORT_SYMBOL(pfn_valid);
+
 int pfn_is_map_memory(unsigned long pfn)
 {
 	phys_addr_t addr = PFN_PHYS(pfn);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fcb535560028..ee70f21a79d5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1463,15 +1463,6 @@ static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;
 
-	/*
-	 * Ensure the upper PAGE_SHIFT bits are clear in the
-	 * pfn. Else it might lead to false positives when
-	 * some of the upper bits are set, but the lower bits
-	 * match a valid pfn.
-	 */
-	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
-		return 0;
-
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __nr_to_section(pfn_to_section_nr(pfn));
-- 
2.33.0.rc2.250.ged5fa647cd-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:28             ` Will Deacon
@ 2021-08-25 10:32               ` Will Deacon
  -1 siblings, 0 replies; 64+ messages in thread
From: Will Deacon @ 2021-08-25 10:32 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Robin Murphy, Mike Rapoport, Alex Bee,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 11:28:56AM +0100, Will Deacon wrote:
> On Wed, Aug 25, 2021 at 11:20:46AM +0100, Catalin Marinas wrote:
> > Given how late we are in the -rc cycle, I suggest we revert Anshuman's
> > commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
> > assess the implications in 5.15 (the patch doesn't seem to have the
> > arm64 maintainers' ack anyway ;)).
> 
> I'll stick the revert (below) into kernelci now so we can get some coverage
> in case it breaks something else.

Bah, having said that...

> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fcb535560028..ee70f21a79d5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1463,15 +1463,6 @@ static inline int pfn_valid(unsigned long pfn)
>  {
>  	struct mem_section *ms;
>  
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
> -		return 0;
> -

I suppose we should leave this bit as-is, since the whole point here is
trying to minimise the impact on other architectures.
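
The case that guard catches is easy to demonstrate - a sketch assuming 
4 KiB pages (PAGE_SHIFT == 12) and a 64-bit phys_addr_t:

	unsigned long pfn = (1UL << 52) | 0x1234;	/* upper bits set */

	/*
	 * PFN_PHYS(pfn) == pfn << 12 overflows to 0x1234000, so
	 * PHYS_PFN(PFN_PHYS(pfn)) == 0x1234 != pfn, and pfn_valid()
	 * returns 0 instead of aliasing the valid low pfn 0x1234.
	 */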

Will

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:28             ` Will Deacon
@ 2021-08-25 10:33               ` Robin Murphy
  -1 siblings, 0 replies; 64+ messages in thread
From: Robin Murphy @ 2021-08-25 10:33 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas
  Cc: David Hildenbrand, Mike Rapoport, Alex Bee, Andrew Morton,
	Anshuman Khandual, Linux Kernel Mailing List, linux-mm,
	Linux ARM, Christoph Hellwig

On 2021-08-25 11:28, Will Deacon wrote:
> On Wed, Aug 25, 2021 at 11:20:46AM +0100, Catalin Marinas wrote:
>> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
>> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
>> assess the implications in 5.15 (the patch doesn't seem to have the
>> arm64 maintainers' ack anyway ;)).
> 
> I'll stick the revert (below) into kernelci now so we can get some coverage
> in case it breaks something else.
> 
> Will
> 
> --->8
> 
>  From e97ba0e39e486c20d8f76f3e632e4b7d933602cd Mon Sep 17 00:00:00 2001
> From: Will Deacon <will@kernel.org>
> Date: Wed, 25 Aug 2021 11:10:07 +0100
> Subject: [PATCH] Revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"
> 
> This reverts commit 16c9afc776608324ca71c0bc354987bab532f51d.
> 
> Alex Bee reports a regression in 5.14 on their RK3328 SoC when
> configuring the PL330 DMA controller:
> 
>   | ------------[ cut here ]------------
>   | WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>   | Modules linked in: spi_rockchip(+) fuse
>   | CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>   | Hardware name: Pine64 Rock64 (DT)
>   | pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>   | pc : dma_map_resource+0x68/0xc0
>   | lr : pl330_prep_slave_fifo+0x78/0xd0
> 
> This appears to be because dma_map_resource() is being called for a
> physical address which does not correspond to a memory address yet does
> have a valid 'struct page' due to the way in which the vmemmap is
> constructed.
> 
> Prior to 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), the arm64
> implementation of pfn_valid() called memblock_is_memory() to return
> 'false' for such regions and the DMA mapping request would proceed.
> However, now that we are using the generic implementation where only the
> presence of the memory map entry is considered, we return 'true' and
> erroneously fail with DMA_MAPPING_ERROR because we identify the region
> as DRAM.
> 
> Although fixing this in the DMA mapping code is arguably the right fix,
> it is a risky, cross-architecture change at this stage in the cycle. So
> just revert arm64 back to its old pfn_valid() implementation for v5.14.

TBH the offending warning is only meant to be a quick sanity check, so I 
don't think there should be much impact from changing the DMA code; it's 
just a question of figuring out what change to make. I'm digging in now...
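
For illustration, a minimal sketch of the kind of change under discussion
(assuming memblock data is retained at runtime, i.e. CONFIG_ARCH_KEEP_MEMBLOCK):
the sanity check in dma_map_resource() would test for actual System RAM rather
than for the mere presence of a memmap:

	/* kernel/dma/mapping.c -- sketch only, not the eventual fix */
	/* Don't allow RAM to be mapped */
	if (WARN_ON_ONCE(memblock_is_memory(phys_addr)))
		return DMA_MAPPING_ERROR;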

Thanks,
Robin.

> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Reported-by: Alex Bee <knaerzche@gmail.com>
> Link: https://lore.kernel.org/r/d3a3c828-b777-faf8-e901-904995688437@gmail.com
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>   arch/arm64/Kconfig            |  1 +
>   arch/arm64/include/asm/page.h |  1 +
>   arch/arm64/mm/init.c          | 37 +++++++++++++++++++++++++++++++++++
>   include/linux/mmzone.h        |  9 ---------
>   4 files changed, 39 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index fdcd54d39c1e..62c3c1d2190f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -156,6 +156,7 @@ config ARM64
>   	select HAVE_ARCH_KGDB
>   	select HAVE_ARCH_MMAP_RND_BITS
>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> +	select HAVE_ARCH_PFN_VALID
>   	select HAVE_ARCH_PREL32_RELOCATIONS
>   	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
>   	select HAVE_ARCH_SECCOMP_FILTER
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 993a27ea6f54..f98c91bbd7c1 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -41,6 +41,7 @@ void tag_clear_highpage(struct page *to);
>   
>   typedef struct page *pgtable_t;
>   
> +int pfn_valid(unsigned long pfn);
>   int pfn_is_map_memory(unsigned long pfn);
>   
>   #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 8490ed2917ff..1fdb7bb7c198 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -219,6 +219,43 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>   	free_area_init(max_zone_pfns);
>   }
>   
> +int pfn_valid(unsigned long pfn)
> +{
> +	phys_addr_t addr = PFN_PHYS(pfn);
> +	struct mem_section *ms;
> +
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
> +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +		return 0;
> +
> +	ms = __pfn_to_section(pfn);
> +	if (!valid_section(ms))
> +		return 0;
> +
> +	/*
> +	 * ZONE_DEVICE memory does not have the memblock entries.
> +	 * memblock_is_map_memory() check for ZONE_DEVICE based
> +	 * addresses will always fail. Even the normal hotplugged
> +	 * memory will never have MEMBLOCK_NOMAP flag set in their
> +	 * memblock entries. Skip memblock search for all non early
> +	 * memory sections covering all of hotplug memory including
> +	 * both normal and ZONE_DEVICE based.
> +	 */
> +	if (!early_section(ms))
> +		return pfn_section_valid(ms, pfn);
> +
> +	return memblock_is_memory(addr);
> +}
> +EXPORT_SYMBOL(pfn_valid);
> +
>   int pfn_is_map_memory(unsigned long pfn)
>   {
>   	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fcb535560028..ee70f21a79d5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1463,15 +1463,6 @@ static inline int pfn_valid(unsigned long pfn)
>   {
>   	struct mem_section *ms;
>   
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
> -		return 0;
> -
>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>   		return 0;
>   	ms = __nr_to_section(pfn_to_section_nr(pfn));
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:20           ` Catalin Marinas
@ 2021-08-25 10:38             ` David Hildenbrand
  -1 siblings, 0 replies; 64+ messages in thread
From: David Hildenbrand @ 2021-08-25 10:38 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On 25.08.21 12:20, Catalin Marinas wrote:
> + hch
> 
> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
>> On 24.08.21 20:46, Robin Murphy wrote:
>>> On 2021-08-24 19:28, Mike Rapoport wrote:
>>>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>>>
>>>>>>   ------------[ cut here ]------------
>>>>>>   WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>>>>>   Modules linked in: spi_rockchip(+) fuse
>>>>>>   CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>>>>   Hardware name: Pine64 Rock64 (DT)
>>>>>>   pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>>>>   pc : dma_map_resource+0x68/0xc0
>>>>>>   lr : pl330_prep_slave_fifo+0x78/0xd0
>>>>>>   sp : ffff800012102ae0
>>>>>>   x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>>>>   x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>>>>   x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>>>>   x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>>>>   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>>>>   x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>>>>   x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>>>>   x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>>>>   x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>>>>   x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>>>>   Call trace:
>>>>>>     dma_map_resource+0x68/0xc0
>>>>>>     pl330_prep_slave_sg+0x58/0x220
>>>>>>     rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>>>>     rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>>>> [...]
>>>>>> Note: This does not relate to the spi driver - when disabling this device in
>>>>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>>>>> Commenting out the failing check at [1], however, helps and the mapping
>>>>>> works again.
>>>>
>>>>> Do you know which address dma_map_resource() is trying to map (maybe
>>>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>>>> Random guess, the address is 0xff190800 (based on the x1 above but the
>>>>> regs might as well be mangled).
>>>>
>>>> 0xff190800 will cause this warning for sure. It has a memory map, but it is
>>>> not RAM so old version of pfn_valid() would return 0 and the new one
>>>> returns 1.
>>>
>>> How does that happen, though? It's not a memory address, and it's not
>>> even within the bounds of anywhere there should or could be memory. This
>>> SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
>>> the DRAM controller (which may not all be populated, and may have pieces
>>> carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
>>> do we have pages (or at least the assumption of pages) for somewhere
>>> which by all rights should not have them?
>>
>> Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
>> any such hacks. If there is a memory hole, it gets a memmap as well.
>>
>> Tricking pfn_valid() into returning "false" where we actually have a memmap
>> only makes it look like there is no memmap; but there is one, and
>> it's PG_reserved.
> 
> I can see the documentation for pfn_valid() does not claim anything more
> than the presence of an memmap entry. But I wonder whether the confusion
> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> assumes __va() can be used on pfn_valid(), though I suspect it relies on
> the calling function to check that the resource was RAM. The arm64
> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> standard memcpy on it, which wouldn't work for I/O (should we change
> this check to pfn_is_map_memory() for arm64?).
> 

kern_addr_valid() checks that there is a direct map entry, and that the 
mapped address has a valid mmap. (copied from x86-64)

Would you expect to have a direct map for memory holes and similar (IOW, 
!System RAM)?


>>>>> Either pfn_valid() gets confused in 5.14 or something is wrong with the
>>>>> DT. I have a suspicion it's the former since reverting the above commit
>>>>> makes it disappear.
>>>>
>>>> I think pfn_valid() actually behaves as expected but the caller is wrong
>>>> because pfn_valid != RAM (this applies btw to !arm64 as well).
>>>>
>>>> 	/* Don't allow RAM to be mapped */
>>>> 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>>>> 		return DMA_MAPPING_ERROR;
>>>>
>>>> Alex, can you please try this patch:
>>>
>>> That will certainly paper over the issue, but it's avoiding the question
>>> of what went wrong with the memory map in the first place. The comment
>>> is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
>>> addresses that would be wrong to pass to dma_map_page(), so I believe
>>> pfn_valid() is still the correct check.
>>
>> If we want to check for RAM, pfn_valid() would be wrong. If we want to check
>> for "is there a memmap, for whatever lives or does not live there",
>> pfn_valid() is the right check.
> 
> So what should the DMA code use instead? Last time we needed something
> similar, the recommendation was to use pfn_to_online_page(). Mike is
> suggesting memblock_is_memory().

We use pfn_to_online_page() when we want to know if it's system RAM and 
that the memmap actually contains something sane (-> memmap content has 
a well defined state).

You can have offline memory blocks where pfn_to_online_page() would 
return "false", memblock_is_memory() would return "true". IOW, there is 
a memmap, it's System RAM, but the memmap is stale and not trustworthy.

If you want to make sure no System RAM (online/offline/...) will get 
mapped, memblock_is_memory() should be the right thing to use. I recall 
that x86 traverses the resource tree instead to exclude System RAM 
regions similarly.
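
To make the distinction concrete, a rough sketch (helper names are made up
for illustration):

	/* System RAM according to memblock, whether online or offline */
	static bool pfn_is_system_ram(unsigned long pfn)
	{
		return memblock_is_memory(PFN_PHYS(pfn));
	}

	/* online RAM whose memmap content is in a well-defined state */
	static bool pfn_has_sane_memmap(unsigned long pfn)
	{
		return pfn_to_online_page(pfn) != NULL;
	}

For the MMIO range above, both return false; for an offline memory block,
only the first returns true.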

> 
> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
> assess the implications in 5.15 (the patch doesn't seem to have the
> arm64 maintainers' ack anyway ;)).

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:20           ` Catalin Marinas
@ 2021-08-25 10:52             ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-08-25 10:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Robin Murphy, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 11:20:46AM +0100, Catalin Marinas wrote:
> + hch
> 
> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
> > On 24.08.21 20:46, Robin Murphy wrote:
> > > On 2021-08-24 19:28, Mike Rapoport wrote:
> > > > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > > > 
> > > > > >  ------------[ cut here ]------------
> > > > > >  WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > > >  Modules linked in: spi_rockchip(+) fuse
> > > > > >  CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > > >  Hardware name: Pine64 Rock64 (DT)
> > > > > >  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > > >  pc : dma_map_resource+0x68/0xc0
> > > > > >  lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > > >  sp : ffff800012102ae0
> > > > > >  x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > > >  x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > > >  x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > > >  x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > > >  x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > > >  x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > > >  x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > > >  x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > > >  x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > > >  x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > > >  Call trace:
> > > > > >    dma_map_resource+0x68/0xc0
> > > > > >    pl330_prep_slave_sg+0x58/0x220
> > > > > >    rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > > >    rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > > > [...]
> > > > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > > > works again.
> > > > 
> > > > > Do you know which address dma_map_resource() is trying to map (maybe
> > > > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > > > regs might as well be mangled).
> > > > 
> > > > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > > > not RAM so old version of pfn_valid() would return 0 and the new one
> > > > returns 1.
> > > 
> > > How does that happen, though? It's not a memory address, and it's not
> > > even within the bounds of anywhere there should or could be memory. This
> > > SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> > > the DRAM controller (which may not all be populated, and may have pieces
> > > carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> > > do we have pages (or at least the assumption of pages) for somewhere
> > > which by all rights should not have them?
> > 
> > Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
> > any such hacks. If there is a memory hole, it gets a memmap as well.
> > 
> > Tricking pfn_valid() into returning "false" where we actually have a memmap
> > only makes it look like there is no memmap; but there is one, and
> > it's PG_reserved.
> 
> I can see the documentation for pfn_valid() does not claim anything more
> than the presence of an memmap entry. But I wonder whether the confusion
> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> assumes __va() can be used on pfn_valid(), though I suspect it relies on
> the calling function to check that the resource was RAM. The arm64
> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> standard memcpy on it, which wouldn't work for I/O (should we change
> this check to pfn_is_map_memory() for arm64?).

Since in most cases pfn_valid() would actually mean RAM, the confusion
is really widespread :(

Using pfn_is_map_memory() in kern_addr_valid() seems to me a better choice
than pfn_valid().
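
Concretely, only the final check in the arm64 page-table walk would change,
something like (sketch; assumes the function still ends by checking the leaf
pte's pfn, as mainline does):

	/* arch/arm64/mm/mmu.c, kern_addr_valid() -- sketch only */
	-	return pfn_valid(pte_pfn(pte));
	+	return pfn_is_map_memory(pte_pfn(pte));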

> > > > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > > > DT. I have a suspicion it's the former since reverting the above commit
> > > > > makes it disappear.
> > > > 
> > > > I think pfn_valid() actually behaves as expected but the caller is wrong
> > > > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > > > 
> > > > 	/* Don't allow RAM to be mapped */
> > > > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > > > 		return DMA_MAPPING_ERROR;
> > > > 
> > > > Alex, can you please try this patch:
> > > 
> > > That will certainly paper over the issue, but it's avoiding the question
> > > of what went wrong with the memory map in the first place. The comment
> > > is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> > > addresses that would be wrong to pass to dma_map_page(), so I believe
> > > pfn_valid() is still the correct check.
> > 
> > If we want to check for RAM, pfn_valid() would be wrong. If we want to check
> > for "is there a memmap, for whatever lives or does not live there",
> > pfn_valid() is the right check.
> 
> So what should the DMA code use instead? Last time we needed something
> similar, the recommendation was to use pfn_to_online_page(). Mike is
> suggesting memblock_is_memory().

This was for testing purposes :)
I considered some ugly fix for v5.14 with a __weak check in dma/mapping.c for
backward compatibility and an override in arm64, and then a proper audit for
v5.15.
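
Roughly along these lines (function name hypothetical, a sketch of the idea
only):

	/* kernel/dma/mapping.c: the default keeps the current behaviour */
	bool __weak dma_pfn_is_ram(unsigned long pfn)
	{
		return pfn_valid(pfn);
	}

	/* arm64 override: only true for actual System RAM */
	bool dma_pfn_is_ram(unsigned long pfn)
	{
		return memblock_is_memory(PFN_PHYS(pfn));
	}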

But as Will went for the revert, this is not really relevant.
 
> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
> assess the implications in 5.15 (the patch doesn't seem to have the
> arm64 maintainers' ack anyway ;)).
> 
> Thanks.
> 
> -- 
> Catalin

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:38             ` David Hildenbrand
@ 2021-08-25 10:54               ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-08-25 10:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Catalin Marinas, Robin Murphy, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 12:38:31PM +0200, David Hildenbrand wrote:
> On 25.08.21 12:20, Catalin Marinas wrote:
>
> > I can see the documentation for pfn_valid() does not claim anything more
> > than the presence of an memmap entry. But I wonder whether the confusion
> > is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> > assumes __va() can be used on pfn_valid(), though I suspect it relies on
> > the calling function to check that the resource was RAM. The arm64
> > kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> > standard memcpy on it, which wouldn't work for I/O (should we change
> > this check to pfn_is_map_memory() for arm64?).
> 
> kern_addr_valid() checks that there is a direct map entry, and that the
> mapped address has a valid mmap. (copied from x86-64)
> 
> Would you expect to have a direct map for memory holes and similar (IOW,
> !System RAM)?

I don't see where it will bail out for an IOMEM mapping before doing the
pfn_valid() check...
 
-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:38             ` David Hildenbrand
@ 2021-08-25 10:55               ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2021-08-25 10:55 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 12:38:31PM +0200, David Hildenbrand wrote:
> On 25.08.21 12:20, Catalin Marinas wrote:
> > On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
> > > On 24.08.21 20:46, Robin Murphy wrote:
> > > > On 2021-08-24 19:28, Mike Rapoport wrote:
> > > > > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > > > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > > > > 
> > > > > > >   ------------[ cut here ]------------
> > > > > > >   WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > > > >   Modules linked in: spi_rockchip(+) fuse
> > > > > > >   CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > > > >   Hardware name: Pine64 Rock64 (DT)
> > > > > > >   pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > > > >   pc : dma_map_resource+0x68/0xc0
> > > > > > >   lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > > > >   sp : ffff800012102ae0
> > > > > > >   x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > > > >   x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > > > >   x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > > > >   x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > > > >   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > > > >   x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > > > >   x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > > > >   x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > > > >   x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > > > >   x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > > > >   Call trace:
> > > > > > >     dma_map_resource+0x68/0xc0
> > > > > > >     pl330_prep_slave_sg+0x58/0x220
> > > > > > >     rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > > > >     rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > > > > [...]
> > > > > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > > > > works again.
> > > > > 
> > > > > > Do you know which address dma_map_resource() is trying to map (maybe
> > > > > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > > > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > > > > regs might as well be mangled).
> > > > > 
> > > > > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > > > > not RAM so old version of pfn_valid() would return 0 and the new one
> > > > > returns 1.
> > > > 
> > > > How does that happen, though? It's not a memory address, and it's not
> > > > even within the bounds of anywhere there should or could be memory. This
> > > > SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> > > > the DRAM controller (which may not all be populated, and may have pieces
> > > > carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> > > > do we have pages (or at least the assumption of pages) for somewhere
> > > > which by all rights should not have them?
> > > 
> > > Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
> > > any such hacks. If there is a memory hole, it gets a memmap as well.
> > > 
> > > Tricking pfn_valid() into returning "false" where we actually have a memmap
> > > only makes it look like there is no memmap; but there is one, and
> > > it's PG_reserved.
> > 
> > I can see the documentation for pfn_valid() does not claim anything more
> > than the presence of an memmap entry. But I wonder whether the confusion
> > is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> > assumes __va() can be used on pfn_valid(), though I suspect it relies on
> > the calling function to check that the resource was RAM. The arm64
> > kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> > standard memcpy on it, which wouldn't work for I/O (should we change
> > this check to pfn_is_map_memory() for arm64?).
> 
> kern_addr_valid() checks that there is a direct map entry, and that the
> mapped address has a valid mmap. (copied from x86-64)

It checks that there is a va->pa mapping, not necessarily in the linear
map as it walks the page tables. So for some I/O range that happens to
be mapped but which was in close proximity to RAM so that pfn_valid() is
true, kern_addr_valid() would return true. I don't think that was the
intention.

> Would you expect to have a direct map for memory holes and similar (IOW,
> !System RAM)?

No, but with the generic pfn_valid(), it may return true for mapped
MMIO (with different attributes than the direct map).
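
For reference, pfn_is_map_memory() boils down to a memblock check (sketched
from the fragment quoted earlier in the thread; the exact body may differ):

	/* arch/arm64/mm/init.c -- sketch */
	int pfn_is_map_memory(unsigned long pfn)
	{
		phys_addr_t addr = PFN_PHYS(pfn);

		/* reject pfns whose upper bits don't survive PFN_PHYS() */
		if (PHYS_PFN(addr) != pfn)
			return 0;

		return memblock_is_map_memory(addr);
	}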

> > > > > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > > > > DT. I have a suspicion it's the former since reverting the above commit
> > > > > > makes it disappear.
> > > > > 
> > > > > I think pfn_valid() actually behaves as expected but the caller is wrong
> > > > > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > > > > 
> > > > > 	/* Don't allow RAM to be mapped */
> > > > > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > > > > 		return DMA_MAPPING_ERROR;
> > > > > 
> > > > > Alex, can you please try this patch:
> > > > 
> > > > That will certainly paper over the issue, but it's avoiding the question
> > > > of what went wrong with the memory map in the first place. The comment
> > > > is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> > > > addresses that would be wrong to pass to dma_map_page(), so I believe
> > > > pfn_valid() is still the correct check.
> > > 
> > > If we want to check for RAM, pfn_valid() would be wrong. If we want to check
> > > for "is there a memmap, for whatever lives or does not live there",
> > > pfn_valid() is the right check.
> > 
> > So what should the DMA code use instead? Last time we needed something
> > similar, the recommendation was to use pfn_to_online_page(). Mike is
> > suggesting memblock_is_memory().
> 
> We use pfn_to_online_page() when we want to know if it's system RAM and that
> the memmap actually contains something sane (-> memmap content has a well
> defined state).
> 
> You can have offline memory blocks where pfn_to_online_page() would return
> "false", memblock_is_memory() would return "true". IOW, there is a memmap,
> it's System RAM, but the memmap is stale and not trustworthy.
> 
> If you want to make sure no System RAM (online/offline/...) will get mapped,
> memblock_is_memory() should be the right thing to use. I recall that x86
> traverses the resource tree instead to exclude System RAM regions similarly.

Thanks, this makes sense.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
@ 2021-08-25 10:55               ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2021-08-25 10:55 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 12:38:31PM +0200, David Hildenbrand wrote:
> On 25.08.21 12:20, Catalin Marinas wrote:
> > On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
> > > On 24.08.21 20:46, Robin Murphy wrote:
> > > > On 2021-08-24 19:28, Mike Rapoport wrote:
> > > > > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > > > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > > > > 
> > > > > > >   ------------[ cut here ]------------
> > > > > > >   WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > > > >   Modules linked in: spi_rockchip(+) fuse
> > > > > > >   CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > > > >   Hardware name: Pine64 Rock64 (DT)
> > > > > > >   pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > > > >   pc : dma_map_resource+0x68/0xc0
> > > > > > >   lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > > > >   sp : ffff800012102ae0
> > > > > > >   x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > > > >   x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > > > >   x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > > > >   x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > > > >   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > > > >   x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > > > >   x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > > > >   x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > > > >   x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > > > >   x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > > > >   Call trace:
> > > > > > >     dma_map_resource+0x68/0xc0
> > > > > > >     pl330_prep_slave_sg+0x58/0x220
> > > > > > >     rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > > > >     rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > > > > [...]
> > > > > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > > > > works again.
> > > > > 
> > > > > > Do you know which address dma_map_resource() is trying to map (maybe
> > > > > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > > > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > > > > regs might as well be mangled).
> > > > > 
> > > > > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > > > > not RAM so old version of pfn_valid() would return 0 and the new one
> > > > > returns 1.
> > > > 
> > > > How does that happen, though? It's not a memory address, and it's not
> > > > even within the bounds of anywhere there should or could be memory. This
> > > > SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> > > > the DRAM controller (which may not all be populated, and may have pieces
> > > > carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> > > > do we have pages (or at least the assumption of pages) for somewhere
> > > > which by all rights should not have them?
> > > 
> > > Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
> > > any such hacks. If there is a memory hole, it gets a memmap as well.
> > > 
> > > Tricking pfn_valid() into returning "false" where we actually have a memmap
> > > only makes it look like there is no memmap; but there is one, and
> > > it's PG_reserved.
> > 
> > I can see the documentation for pfn_valid() does not claim anything more
> > than the presence of a memmap entry. But I wonder whether the confusion
> > is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> > assumes __va() can be used on pfn_valid(), though I suspect it relies on
> > the calling function to check that the resource was RAM. The arm64
> > kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> > standard memcpy on it, which wouldn't work for I/O (should we change
> > this check to pfn_is_map_memory() for arm64?).
> 
> kern_addr_valid() checks that there is a direct map entry, and that the
> mapped address has a valid memmap. (copied from x86-64)

It checks that there is a va->pa mapping, not necessarily in the linear
map as it walks the page tables. So for some I/O range that happens to
be mapped but which was in close proximity to RAM so that pfn_valid() is
true, kern_addr_valid() would return true. I don't think that was the
intention.

> Would you expect to have a direct map for memory holes and similar (IOW,
> !System RAM)?

No, but with the generic pfn_valid(), it may return true for mapped
MMIO (with different attributes than the direct map).
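
(For reference, pfn_is_map_memory() answers the narrower question by
asking memblock rather than looking for a memmap. A simplified sketch
along the lines of the arm64 implementation in arch/arm64/mm/init.c,
with details trimmed:

	#include <linux/memblock.h>
	#include <linux/mm.h>

	int pfn_is_map_memory(unsigned long pfn)
	{
		phys_addr_t addr = PFN_PHYS(pfn);

		/* reject bogus pfns that don't survive the round-trip */
		if (PHYS_PFN(addr) != pfn)
			return 0;

		/* true only for RAM present in the linear map, never MMIO */
		return memblock_is_map_memory(addr);
	}

so it stays false for 0xff190800 even where the generic pfn_valid()
now returns true.)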

> > > > > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > > > > DT. I have a suspicion it's the former since reverting the above commit
> > > > > > makes it disappear.
> > > > > 
> > > > > I think pfn_valid() actually behaves as expected but the caller is wrong
> > > > > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > > > > 
> > > > > 	/* Don't allow RAM to be mapped */
> > > > > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > > > > 		return DMA_MAPPING_ERROR;
> > > > > 
> > > > > Alex, can you please try this patch:
> > > > 
> > > > That will certainly paper over the issue, but it's avoiding the question
> > > > of what went wrong with the memory map in the first place. The comment
> > > > is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> > > > addresses that would be wrong to pass to dma_map_page(), so I believe
> > > > pfn_valid() is still the correct check.
> > > 
> > > If we want to check for RAM, pfn_valid() would be wrong. If we want to check
> > > for "is there a memmap, for whatever lives or does not live there",
> > > pfn_valid() is the right check.
> > 
> > So what should the DMA code use instead? Last time we needed something
> > similar, the recommendation was to use pfn_to_online_page(). Mike is
> > suggesting memblock_is_memory().
> 
> We use pfn_to_online_page() when we want to know if it's system RAM and that
> the memmap actually contains something sane (-> memmap content has a well
> defined state).
> 
> You can have offline memory blocks where pfn_to_online_page() would return
> "false", memblock_is_memory() would return "true". IOW, there is a memmap,
> it's System RAM, but the memmap is stale and not trustworthy.
> 
> If you want to make sure no System RAM (online/offline/...) will get mapped,
> memblock_is_memory() should be the right thing to use. I recall that x86
> traverse the resource tree instead to exclude system ram regions similarly.

Thanks, this makes sense.

-- 
Catalin


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:38             ` David Hildenbrand
@ 2021-08-25 10:58               ` Robin Murphy
  -1 siblings, 0 replies; 64+ messages in thread
From: Robin Murphy @ 2021-08-25 10:58 UTC (permalink / raw)
  To: David Hildenbrand, Catalin Marinas
  Cc: Mike Rapoport, Alex Bee, Will Deacon, Andrew Morton,
	Anshuman Khandual, Linux Kernel Mailing List, linux-mm,
	Linux ARM, Christoph Hellwig

On 2021-08-25 11:38, David Hildenbrand wrote:
> On 25.08.21 12:20, Catalin Marinas wrote:
>> + hch
>>
>> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
>>> On 24.08.21 20:46, Robin Murphy wrote:
>>>> On 2021-08-24 19:28, Mike Rapoport wrote:
>>>>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>>>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>>>>> it seems there is a regression in arm64 memory mapping in 5.14, 
>>>>>>> since it
>>>>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>>>>
>>>>>>>   ------------[ cut here ]------------
>>>>>>>   WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 
>>>>>>> dma_map_resource+0x68/0xc0
>>>>>>>   Modules linked in: spi_rockchip(+) fuse
>>>>>>>   CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>>>>>   Hardware name: Pine64 Rock64 (DT)
>>>>>>>   pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>>>>>   pc : dma_map_resource+0x68/0xc0
>>>>>>>   lr : pl330_prep_slave_fifo+0x78/0xd0
>>>>>>>   sp : ffff800012102ae0
>>>>>>>   x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>>>>>   x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>>>>>   x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>>>>>   x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>>>>>   x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>>>>>   x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>>>>>   x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>>>>>   x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>>>>>   x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>>>>>   x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>>>>>   Call trace:
>>>>>>>     dma_map_resource+0x68/0xc0
>>>>>>>     pl330_prep_slave_sg+0x58/0x220
>>>>>>>     rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>>>>>     rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>>>>> [...]
>>>>>>> Note: This does not relate to the spi driver - when disabling 
>>>>>>> this device in
>>>>>>> the device tree it fails for any other (i2s, for instance) which 
>>>>>>> uses dma.
>>>>>>> Commenting out the failing check at [1], however, helps and the 
>>>>>>> mapping
>>>>>>> works again.
>>>>>
>>>>>> Do you know which address dma_map_resource() is trying to map (maybe
>>>>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>>>>> Random guess, the address is 0xff190800 (based on the x1 above but 
>>>>>> the
>>>>>> regs might as well be mangled).
>>>>>
>>>>> 0xff190800 will cause this warning for sure. It has a memory map, 
>>>>> but it is
>>>>> not RAM so old version of pfn_valid() would return 0 and the new one
>>>>> returns 1.
>>>>
>>>> How does that happen, though? It's not a memory address, and it's not
>>>> even within the bounds of anywhere there should or could be memory. 
>>>> This
>>>> SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
>>>> the DRAM controller (which may not all be populated, and may have 
>>>> pieces
>>>> carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. 
>>>> Why
>>>> do we have pages (or at least the assumption of pages) for somewhere
>>>> which by all rights should not have them?
>>>
>>> Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to 
>>> avoid
>>> any such hacks. If there is a memory hole, it gets a memmap as well.

Urgh, apologies for being slow. This case isn't a memory hole as such, 
but I failed to consider the *ends* of memory not being section-aligned 
and leading to an overlap anyway.
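
To put numbers on it (hypothetical, but plausible for a 4 GiB board):
with 128 MiB sections, the section spanning 0xf8000000-0xffffffff
becomes present as soon as RAM ends anywhere inside it, say at
0xff000000. The vmemmap is then allocated for the whole section, so it
also covers the 0xff000000-0xffffffff MMIO window, and
pfn_valid(PHYS_PFN(0xff190800)) suddenly returns true for a device
register.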

>>> Tricking pfn_valid() into returning "false" where we actually have a 
>>> memmap
>>> only makes it look like there is no memmap; but there is one, and
>>> it's PG_reserved.
>>
>> I can see the documentation for pfn_valid() does not claim anything more
>> than the presence of a memmap entry. But I wonder whether the confusion
>> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
>> assumes __va() can be used on pfn_valid(), though I suspect it relies on
>> the calling function to check that the resource was RAM. The arm64
>> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
>> standard memcpy on it, which wouldn't work for I/O (should we change
>> this check to pfn_is_map_memory() for arm64?).
>>
> 
> kern_addr_valid() checks that there is a direct map entry, and that the 
> mapped address has a valid memmap. (copied from x86-64)
> 
> Would you expect to have a direct map for memory holes and similar (IOW, 
> !System RAM)?

Probably - we can have no-map regions for firmware-reserved RAM which I 
believe end up as PG_reserved if they're small enough, for the same 
reasoning as this case.

>>>>>> Either pfn_valid() gets confused in 5.14 or something is wrong 
>>>>>> with the
>>>>>> DT. I have a suspicion it's the former since reverting the above 
>>>>>> commit
>>>>>> makes it disappear.
>>>>>
>>>>> I think pfn_valid() actually behaves as expected but the caller is 
>>>>> wrong
>>>>> because pfn_valid != RAM (this applies btw to !arm64 as well).
>>>>>
>>>>>     /* Don't allow RAM to be mapped */
>>>>>     if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>>>>>         return DMA_MAPPING_ERROR;
>>>>>
>>>>> Alex, can you please try this patch:
>>>>
>>>> That will certainly paper over the issue, but it's avoiding the 
>>>> question
>>>> of what went wrong with the memory map in the first place. The comment
>>>> is indeed a bit inaccurate, but ultimately dma_map_resource() exists 
>>>> for
>>>> addresses that would be wrong to pass to dma_map_page(), so I believe
>>>> pfn_valid() is still the correct check.
>>>
>>> If we want to check for RAM, pfn_valid() would be wrong. If we want 
>>> to check
>>> for "is there a memmap, for whatever lives or does not live there",
>>> pfn_valid() is the right check.
>>
>> So what should the DMA code use instead? Last time we needed something
>> similar, the recommendation was to use pfn_to_online_page(). Mike is
>> suggesting memblock_is_memory().
> 
> We use pfn_to_online_page() when we want to know if it's system RAM and 
> that the memmap actually contains something sane (-> memmap content has 
> a well defined state).
> 
> You can have offline memory blocks where pfn_to_online_page() would 
> return "false", memblock_is_memory() would return "true". IOW, there is 
> a memmap, it's System RAM, but the memmap is stale and not trustworthy.

That's fine - offline memory is doubly-invalid to map as an MMIO resource :)

> If you want to make sure no System RAM (online/offline/...) will get 
> mapped, memblock_is_memory() should be the right thing to use. I recall 
> that x86 traverse the resource tree instead to exclude system ram 
> regions similarly.

I'm thinking that "pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn))" 
might be the closest thing to what I'd like to express - does that seem 
sensible at all? The one thing I'm not quite sure about is the 
interaction with P2P mappings of ZONE_DEVICE, but that's also true of 
the previous behaviour, and I'm not aware that the usage model has been 
fully nailed down yet, so I suspect we have some wiggle room. At worst, 
false negatives in certain situations wouldn't be the end of the world, 
since this is merely a sanity check for something which is expected to 
be a no-op the vast majority of the time, so being unobtrusive is more 
important than being exhaustive.
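
As a sketch (the helper name is made up here, not an existing API),
what I mean is something like:

	#include <linux/mm.h>

	/* "Is this phys addr plausibly RAM?" - i.e. it has a memmap and
	 * isn't one of the PG_reserved entries that only exist because
	 * of section rounding or no-map carve-outs. */
	static bool dma_resource_is_ram(phys_addr_t phys_addr)
	{
		unsigned long pfn = PHYS_PFN(phys_addr);

		return pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn));
	}

with dma_map_resource() continuing to WARN and bail out when that
returns true.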

Thanks,
Robin.

>> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
>> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
>> assess the implications in 5.15 (the patch doesn't seem to have the
>> arm64 maintainers' ack anyway ;)).
> 


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:55               ` Catalin Marinas
@ 2021-08-25 11:12                 ` David Hildenbrand
  -1 siblings, 0 replies; 64+ messages in thread
From: David Hildenbrand @ 2021-08-25 11:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On 25.08.21 12:55, Catalin Marinas wrote:
> On Wed, Aug 25, 2021 at 12:38:31PM +0200, David Hildenbrand wrote:
>> On 25.08.21 12:20, Catalin Marinas wrote:
>>> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
>>>> On 24.08.21 20:46, Robin Murphy wrote:
>>>>> On 2021-08-24 19:28, Mike Rapoport wrote:
>>>>>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>>>>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>>>>>> it seems there is a regression in arm64 memory mapping in 5.14, since it
>>>>>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>>>>>
>>>>>>>>    ------------[ cut here ]------------
>>>>>>>>    WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
>>>>>>>>    Modules linked in: spi_rockchip(+) fuse
>>>>>>>>    CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>>>>>>    Hardware name: Pine64 Rock64 (DT)
>>>>>>>>    pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>>>>>>    pc : dma_map_resource+0x68/0xc0
>>>>>>>>    lr : pl330_prep_slave_fifo+0x78/0xd0
>>>>>>>>    sp : ffff800012102ae0
>>>>>>>>    x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>>>>>>    x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>>>>>>    x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>>>>>>    x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>>>>>>    x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>>>>>>    x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>>>>>>    x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>>>>>>    x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>>>>>>    x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>>>>>>    x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>>>>>>    Call trace:
>>>>>>>>      dma_map_resource+0x68/0xc0
>>>>>>>>      pl330_prep_slave_sg+0x58/0x220
>>>>>>>>      rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>>>>>>      rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>>>>>> [...]
>>>>>>>> Note: This does not relate to the spi driver - when disabling this device in
>>>>>>>> the device tree it fails for any other (i2s, for instance) which uses dma.
>>>>>>>> Commenting out the failing check at [1], however, helps and the mapping
>>>>>>>> works again.
>>>>>>
>>>>>>> Do you know which address dma_map_resource() is trying to map (maybe
>>>>>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>>>>>> Random guess, the address is 0xff190800 (based on the x1 above but the
>>>>>>> regs might as well be mangled).
>>>>>>
>>>>>> 0xff190800 will cause this warning for sure. It has a memory map, but it is
>>>>>> not RAM so old version of pfn_valid() would return 0 and the new one
>>>>>> returns 1.
>>>>>
>>>>> How does that happen, though? It's not a memory address, and it's not
>>>>> even within the bounds of anywhere there should or could be memory. This
>>>>> SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
>>>>> the DRAM controller (which may not all be populated, and may have pieces
>>>>> carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
>>>>> do we have pages (or at least the assumption of pages) for somewhere
>>>>> which by all rights should not have them?
>>>>
>>>> Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
>>>> any such hacks. If there is a memory hole, it gets a memmap as well.
>>>>
>>>> Tricking pfn_valid() into returning "false" where we actually have a memmap
>>>> only makes it look like there is no memmap; but there is one, and
>>>> it's PG_reserved.
>>>
>>> I can see the documentation for pfn_valid() does not claim anything more
>>> than the presence of a memmap entry. But I wonder whether the confusion
>>> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
>>> assumes __va() can be used on pfn_valid(), though I suspect it relies on
>>> the calling function to check that the resource was RAM. The arm64
>>> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
>>> standard memcpy on it, which wouldn't work for I/O (should we change
>>> this check to pfn_is_map_memory() for arm64?).
>>
>> kern_addr_valid() checks that there is a direct map entry, and that the
>> mapped address has a valid memmap. (copied from x86-64)
> 
> It checks that there is a va->pa mapping, not necessarily in the linear
> map as it walks the page tables. So for some I/O range that happens to
> be mapped but which was in close proximity to RAM so that pfn_valid() is
> true, kern_addr_valid() would return true. I don't think that was the
> intention.
> 
>> Would you expect to have a direct map for memory holes and similar (IOW,
>> !System RAM)?
> 
> No, but with the generic pfn_valid(), it may return true for mapped
> MMIO (with different attributes than the direct map).

Ah, right. But can we actually run into that via kcore?

kcore builds the RAM list via walk_system_ram_range(), IOW the resource 
tree. And we end up calling kern_addr_valid() only on KCORE_RAM, 
KCORE_VMEMMAP and KCORE_TEXT.

Not saying that kern_addr_valid() shouldn't be improved.
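
(Sketch of the pattern I mean, with the callback body elided; the
walk_system_ram_range() signature is the one from kernel/resource.c,
and the callback name is invented for illustration:

	#include <linux/ioport.h>
	#include <linux/memblock.h>

	static int kclist_add_ram(unsigned long start_pfn,
				  unsigned long nr_pages, void *arg)
	{
		/* only reached for IORESOURCE_SYSTEM_RAM resources, so a
		 * pfn that is merely "valid" due to section rounding -
		 * like the MMIO window here - never shows up */
		return 0;
	}

	/* somewhere in kcore setup: */
	walk_system_ram_range(0, max_pfn, NULL, kclist_add_ram);
)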

-- 
Thanks,

David / dhildenb



* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:58               ` Robin Murphy
@ 2021-08-25 11:21                 ` David Hildenbrand
  -1 siblings, 0 replies; 64+ messages in thread
From: David Hildenbrand @ 2021-08-25 11:21 UTC (permalink / raw)
  To: Robin Murphy, Catalin Marinas
  Cc: Mike Rapoport, Alex Bee, Will Deacon, Andrew Morton,
	Anshuman Khandual, Linux Kernel Mailing List, linux-mm,
	Linux ARM, Christoph Hellwig

On 25.08.21 12:58, Robin Murphy wrote:
> On 2021-08-25 11:38, David Hildenbrand wrote:
>> On 25.08.21 12:20, Catalin Marinas wrote:
>>> + hch
>>>
>>> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
>>>> On 24.08.21 20:46, Robin Murphy wrote:
>>>>> On 2021-08-24 19:28, Mike Rapoport wrote:
>>>>>> On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
>>>>>>> On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
>>>>>>>> it seems there is a regression in arm64 memory mapping in 5.14,
>>>>>>>> since it
>>>>>>>> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>>>>>>>>
>>>>>>>>    ------------[ cut here ]------------
>>>>>>>>    WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235
>>>>>>>> dma_map_resource+0x68/0xc0
>>>>>>>>    Modules linked in: spi_rockchip(+) fuse
>>>>>>>>    CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
>>>>>>>>    Hardware name: Pine64 Rock64 (DT)
>>>>>>>>    pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>>>>>>>>    pc : dma_map_resource+0x68/0xc0
>>>>>>>>    lr : pl330_prep_slave_fifo+0x78/0xd0
>>>>>>>>    sp : ffff800012102ae0
>>>>>>>>    x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
>>>>>>>>    x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
>>>>>>>>    x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
>>>>>>>>    x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
>>>>>>>>    x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>>>>>>>>    x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
>>>>>>>>    x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
>>>>>>>>    x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
>>>>>>>>    x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
>>>>>>>>    x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
>>>>>>>>    Call trace:
>>>>>>>>      dma_map_resource+0x68/0xc0
>>>>>>>>      pl330_prep_slave_sg+0x58/0x220
>>>>>>>>      rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
>>>>>>>>      rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
>>>>>>> [...]
>>>>>>>> Note: This does not relate to the spi driver - when disabling
>>>>>>>> this device in
>>>>>>>> the device tree it fails for any other (i2s, for instance) which
>>>>>>>> uses dma.
>>>>>>>> Commenting out the failing check at [1], however, helps and the
>>>>>>>> mapping
>>>>>>>> works again.
>>>>>>
>>>>>>> Do you know which address dma_map_resource() is trying to map (maybe
>>>>>>> add some printk())? It's not supposed to map RAM, hence the warning.
>>>>>>> Random guess, the address is 0xff190800 (based on the x1 above but
>>>>>>> the
>>>>>>> regs might as well be mangled).
>>>>>>
>>>>>> 0xff190800 will cause this warning for sure. It has a memory map,
>>>>>> but it is
>>>>>> not RAM so old version of pfn_valid() would return 0 and the new one
>>>>>> returns 1.
>>>>>
>>>>> How does that happen, though? It's not a memory address, and it's not
>>>>> even within the bounds of anywhere there should or could be memory.
>>>>> This
>>>>> SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
>>>>> the DRAM controller (which may not all be populated, and may have
>>>>> pieces
>>>>> carved out by secure firmware), while 0xff000000-0xffffffff is MMIO.
>>>>> Why
>>>>> do we have pages (or at least the assumption of pages) for somewhere
>>>>> which by all rights should not have them?
>>>>
>>>> Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to
>>>> avoid
>>>> any such hacks. If there is a memory hole, it gets a memmap as well.
> 
> Urgh, apologies for being slow. This case isn't a memory hole as such,
> but I failed to consider the *ends* of memory not being section-aligned
> and leading to an overlap anyway.
> 
>>>> Tricking pfn_valid() into returning "false" where we actually have a
>>>> memmap
>>>> only makes it look like there is no memmap; but there is one, and
>>>> it's PG_reserved.
>>>
>>> I can see the documentation for pfn_valid() does not claim anything more
>>> than the presence of a memmap entry. But I wonder whether the confusion
>>> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
>>> assumes __va() can be used on pfn_valid(), though I suspect it relies on
>>> the calling function to check that the resource was RAM. The arm64
>>> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
>>> standard memcpy on it, which wouldn't work for I/O (should we change
>>> this check to pfn_is_map_memory() for arm64?).
>>>
>>
>> kern_addr_valid() checks that there is a direct map entry, and that the
>> mapped address has a valid memmap. (copied from x86-64)
>>
>> Would you expect to have a direct map for memory holes and similar (IOW,
>> !System RAM)?
> 
> Probably - we can have no-map regions for firmware-reserved RAM which I
> believe end up as PG_reserved if they're small enough, for the same
> reasoning as this case.
> 
>>>>>>> Either pfn_valid() gets confused in 5.14 or something is wrong
>>>>>>> with the
>>>>>>> DT. I have a suspicion it's the former since reverting the above
>>>>>>> commit
>>>>>>> makes it disappear.
>>>>>>
>>>>>> I think pfn_valid() actually behaves as expected but the caller is
>>>>>> wrong
>>>>>> because pfn_valid != RAM (this applies btw to !arm64 as well).
>>>>>>
>>>>>>      /* Don't allow RAM to be mapped */
>>>>>>      if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
>>>>>>          return DMA_MAPPING_ERROR;
>>>>>>
>>>>>> Alex, can you please try this patch:
>>>>>
>>>>> That will certainly paper over the issue, but it's avoiding the
>>>>> question
>>>>> of what went wrong with the memory map in the first place. The comment
>>>>> is indeed a bit inaccurate, but ultimately dma_map_resource() exists
>>>>> for
>>>>> addresses that would be wrong to pass to dma_map_page(), so I believe
>>>>> pfn_valid() is still the correct check.
>>>>
>>>> If we want to check for RAM, pfn_valid() would be wrong. If we want
>>>> to check
>>>> for "is there a memmap, for whatever lives or does not live there",
>>>> pfn_valid() is the right check.
>>>
>>> So what should the DMA code use instead? Last time we needed something
>>> similar, the recommendation was to use pfn_to_online_page(). Mike is
>>> suggesting memblock_is_memory().
>>
>> We use pfn_to_online_page() when we want to know if it's system RAM and
>> that the memmap actually contains something sane (-> memmap content has
>> a well defined state).

Sorry, that was only partially right. To be more precise,
pfn_to_online_page() might also succeed on memory holes and similar
(essentially everywhere we don't have CONFIG_HAVE_ARCH_PFN_VALID); it
just means that there is a memmap and that the memmap is in a
well-defined state. If it's system RAM, it's online and either getting
used or lying around as free in the buddy.

For example, pfn_to_online_page() won't succeed on ZONE_DEVICE memory, 
while pfn_valid() will.
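
(To spell the decision tree out - classify_pfn() is purely for
illustration, not kernel API:

	#include <linux/memory_hotplug.h>
	#include <linux/mm.h>

	static const char *classify_pfn(unsigned long pfn)
	{
		if (!pfn_valid(pfn))
			return "no memmap at all";
		if (pfn_to_online_page(pfn))
			return "memmap in a well-defined state";
		/*
		 * Valid but not online: ZONE_DEVICE, an offline block, or
		 * a section-rounding artefact. The memmap may be stale,
		 * so don't even look at the page flags here.
		 */
		return "memmap exists, content not trustworthy";
	}
)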


>>
>> You can have offline memory blocks where pfn_to_online_page() would
>> return "false", memblock_is_memory() would return "true". IOW, there is
>> a memmap, it's System RAM, but the memmap is stale and not trustworthy.
> 
> That's fine - offline memory is doubly-invalid to map as an MMIO resource :)
> 
>> If you want to make sure no System RAM (online/offline/...) will get
>> mapped, memblock_is_memory() should be the right thing to use. I recall
>> that x86 traverse the resource tree instead to exclude system ram
>> regions similarly.
> 
> I'm thinking that "pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn))"

For offline memory blocks or for ZONE_DEVICE memory where pfn_valid() 
succeeds, the memmap might be stale and not contain anything reasonable. 
So the PageReserved() check can be problematic.

Also note that PageReserved() has plenty of different semantics (see 
PG_reserved in include/linux/page-flags.h)

> might be the closest thing to what I'd like to express - does that seem
> sensible at all? The one thing I'm not quite sure about is the
> interaction with P2P mappings of ZONE_DEVICE, but that's also true of
> the previous behaviour, and I'm not aware that the usage model has been
> fully nailed down yet, so I suspect we have some wiggle room. At worst,

Yes.

> false negatives in certain situations wouldn't be the end of the world,
> since this is merely a sanity check for something which is expected to
> be a no-op the vast majority of the time, so being unobtrusive is more
> important than being exhaustive.

Right.

-- 
Thanks,

David / dhildenb



* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 11:12                 ` David Hildenbrand
@ 2021-08-25 17:15                   ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2021-08-25 17:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Robin Murphy, Mike Rapoport, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 01:12:37PM +0200, David Hildenbrand wrote:
> On 25.08.21 12:55, Catalin Marinas wrote:
> > On Wed, Aug 25, 2021 at 12:38:31PM +0200, David Hildenbrand wrote:
> > > On 25.08.21 12:20, Catalin Marinas wrote:
> > > > I can see the documentation for pfn_valid() does not claim anything more
> > > > than the presence of a memmap entry. But I wonder whether the confusion
> > > > is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> > > > assumes __va() can be used on pfn_valid(), though I suspect it relies on
> > > > the calling function to check that the resource was RAM. The arm64
> > > > kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> > > > standard memcpy on it, which wouldn't work for I/O (should we change
> > > > this check to pfn_is_map_memory() for arm64?).
> > > 
> > > kern_addr_valid() checks that there is a direct map entry, and that the
> > > mapped address has a valid memmap. (copied from x86-64)
> > 
> > It checks that there is a va->pa mapping, not necessarily in the linear
> > map as it walks the page tables. So for some I/O range that happens to
> > be mapped but which was in close proximity to RAM so that pfn_valid() is
> > true, kern_addr_valid() would return true. I don't think that was the
> > intention.
> > 
> > > Would you expect to have a direct map for memory holes and similar (IOW,
> > > !System RAM)?
> > 
> > No, but with the generic pfn_valid(), it may return true for mapped
> > MMIO (with different attributes than the direct map).
> 
> Ah, right. But can we actually run into that via kcore?
> 
> kcore builds the RAM list via walk_system_ram_range(), IOW the resource
> tree. And we end up calling kern_addr_valid() only on KCORE_RAM,
> KCORE_VMEMMAP and KCORE_TEXT.

It's probably fine, but I'd rather do some checking of the other call
sites before attempting to move arm64 to the generic pfn_valid() again.

-- 
Catalin


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-08-25 10:20           ` Catalin Marinas
@ 2021-09-17 21:22             ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-09-17 21:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: David Hildenbrand, Robin Murphy, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM, Christoph Hellwig

On Wed, Aug 25, 2021 at 11:20:46AM +0100, Catalin Marinas wrote:
> + hch
> 
> On Tue, Aug 24, 2021 at 08:59:22PM +0200, David Hildenbrand wrote:
> > On 24.08.21 20:46, Robin Murphy wrote:
> > > On 2021-08-24 19:28, Mike Rapoport wrote:
> > > > On Tue, Aug 24, 2021 at 06:37:41PM +0100, Catalin Marinas wrote:
> > > > > On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> > > > > > it seems there is a regression in arm64 memory mapping in 5.14, since it
> > > > > > fails on Rockchip RK3328 when the pl330 dmac tries to map with:
> > > > > > 
> > > > > >  ------------[ cut here ]------------
> > > > > >  WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> > > > > >  Modules linked in: spi_rockchip(+) fuse
> > > > > >  CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> > > > > >  Hardware name: Pine64 Rock64 (DT)
> > > > > >  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > > > > >  pc : dma_map_resource+0x68/0xc0
> > > > > >  lr : pl330_prep_slave_fifo+0x78/0xd0
> > > > > >  sp : ffff800012102ae0
> > > > > >  x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> > > > > >  x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> > > > > >  x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> > > > > >  x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> > > > > >  x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > > > > >  x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> > > > > >  x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> > > > > >  x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> > > > > >  x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> > > > > >  x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> > > > > >  Call trace:
> > > > > >    dma_map_resource+0x68/0xc0
> > > > > >    pl330_prep_slave_sg+0x58/0x220
> > > > > >    rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> > > > > >    rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
> > > > > [...]
> > > > > > Note: This does not relate to the spi driver - when disabling this device in
> > > > > > the device tree it fails for any other (i2s, for instance) which uses dma.
> > > > > > Commenting out the failing check at [1], however, helps and the mapping
> > > > > > works again.
> > > > 
> > > > > Do you know which address dma_map_resource() is trying to map (maybe
> > > > > add some printk())? It's not supposed to map RAM, hence the warning.
> > > > > Random guess, the address is 0xff190800 (based on the x1 above but the
> > > > > regs might as well be mangled).
> > > > 
> > > > 0xff190800 will cause this warning for sure. It has a memory map, but it is
> > > > not RAM, so the old version of pfn_valid() would return 0 and the new one
> > > > returns 1.
> > > 
> > > How does that happen, though? It's not a memory address, and it's not
> > > even within the bounds of anywhere there should or could be memory. This
> > > SoC has a simple memory map - everything from 0 to 0xfeffffff goes to
> > > the DRAM controller (which may not all be populated, and may have pieces
> > > carved out by secure firmware), while 0xff000000-0xffffffff is MMIO. Why
> > > do we have pages (or at least the assumption of pages) for somewhere
> > > which by all rights should not have them?
> > 
> > Simple: we allocate the vmemmap for whole sections (e.g., 128 MiB) to avoid
> > any such hacks. If there is a memory hole, it gets a memmap as well.
> > 
> > Tricking pfn_valid() into returning "false" where we actually have a memmap
> > only makes it look like there is no memmap; but there is one, and
> > it's PG_reserved.
> 
> I can see the documentation for pfn_valid() does not claim anything more
> than the presence of a memmap entry. But I wonder whether the confusion
> is wider-spread than just the DMA code. At a quick grep, try_ram_remap()
> assumes __va() can be used on pfn_valid(), though I suspect it relies on
> the calling function to check that the resource was RAM. The arm64
> kern_addr_valid() returns true based on pfn_valid() and kcore.c uses
> standard memcpy on it, which wouldn't work for I/O (should we change
> this check to pfn_is_map_memory() for arm64?).
> 
> > > > > Either pfn_valid() gets confused in 5.14 or something is wrong with the
> > > > > DT. I have a suspicion it's the former since reverting the above commit
> > > > > makes it disappear.
> > > > 
> > > > I think pfn_valid() actually behaves as expected but the caller is wrong
> > > > because pfn_valid != RAM (this applies btw to !arm64 as well).
> > > > 
> > > > 	/* Don't allow RAM to be mapped */
> > > > 	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
> > > > 		return DMA_MAPPING_ERROR;
> > > > 
> > > > Alex, can you please try this patch:
> > > 
> > > That will certainly paper over the issue, but it's avoiding the question
> > > of what went wrong with the memory map in the first place. The comment
> > > is indeed a bit inaccurate, but ultimately dma_map_resource() exists for
> > > addresses that would be wrong to pass to dma_map_page(), so I believe
> > > pfn_valid() is still the correct check.
> > 
> > If we want to check for RAM, pfn_valid() would be wrong. If we want to check
> > for "is there a memmap, for whatever lives or does not live there",
> > pfn_valid() is the right check.
> 
> So what should the DMA code use instead? Last time we needed something
> similar, the recommendation was to use pfn_to_online_page(). Mike is
> suggesting memblock_is_memory().

I did some digging and it seems that the most "generic" way to check if a
page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
false negatives for architectures that do not register "System RAM", but
those are not using dma_map_resource() anyway and, apparently, never would.
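
For reference, this is roughly what the generic page_is_ram() does -- a
simplified sketch of kernel/resource.c from memory, so don't take the
details as gospel:

	/* any PFN that falls into a "System RAM" resource counts */
	static int __is_ram(unsigned long pfn, unsigned long nr_pages,
			    void *arg)
	{
		return 1;
	}

	int page_is_ram(unsigned long pfn)
	{
		/*
		 * Walks the "System RAM" entries of the iomem resource
		 * tree under resource_lock, which is where the cost
		 * comes from.
		 */
		return walk_system_ram_range(pfn, 1, NULL, __is_ram) == 1;
	}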

Alex, can you test this patch please?

From 35586e9dbbdb47962cb0f3803bc650d77867905a Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Sat, 18 Sep 2021 00:01:08 +0300
Subject: [PATCH] dma-mapping: use page_is_ram to ensure RAM is not mapped

dma_map_resource() uses pfn_valid() to ensure the range is not RAM.
However, pfn_valid() only checks for availability of the memory map for a
PFN but it does not ensure that the PFN is actually backed by RAM.

Replace pfn_valid() with page_is_ram(), which does verify whether a PFN is in
RAM or not.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 kernel/dma/mapping.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 7ee5284bff58..dc6064a213a2 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -296,7 +296,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 		return DMA_MAPPING_ERROR;
 
 	/* Don't allow RAM to be mapped */
-	if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
+	if (WARN_ON_ONCE(page_is_ram(PHYS_PFN(phys_addr))))
 		return DMA_MAPPING_ERROR;
 
 	if (dma_map_direct(dev, ops))
-- 
2.28.0

 
> Given how late we are in the -rc cycle, I suggest we revert Anshuman's
> commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID") and try to
> assess the implications in 5.15 (the patch doesn't seem to have the
> arm64 maintainers' ack anyway ;)).
> 
> Thanks.
> 
> -- 
> Catalin

-- 
Sincerely yours,
Mike.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-17 21:22             ` Mike Rapoport
@ 2021-09-18  5:18               ` Christoph Hellwig
  -1 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-09-18  5:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Catalin Marinas, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM,
	Christoph Hellwig

On Sat, Sep 18, 2021 at 12:22:47AM +0300, Mike Rapoport wrote:
> I did some digging and it seems that the most "generic" way to check if a
> page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
> false negatives for architectures that do not register "System RAM", but
> those are not using dma_map_resource() anyway and, apparently, never would.

The downside of page_is_ram is that it looks really expensive for
something done at dma mapping time.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-18  5:18               ` Christoph Hellwig
@ 2021-09-18  8:37                 ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-09-18  8:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On Sat, Sep 18, 2021 at 07:18:43AM +0200, Christoph Hellwig wrote:
> On Sat, Sep 18, 2021 at 12:22:47AM +0300, Mike Rapoport wrote:
> > I did some digging and it seems that the most "generic" way to check if a
> > page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
> > false negatives for architectures that do not register "System RAM", but
> > those are not using dma_map_resource() anyway and, apparently, never would.
> 
> The downside of page_is_ram is that it looks really expensive for
> something done at dma mapping time.

Indeed :(
But pfn_valid is plain wrong...
I'll keep digging.

-- 
Sincerely yours,
Mike.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-18  8:37                 ` Mike Rapoport
@ 2021-09-18 11:39                   ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-09-18 11:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On Sat, Sep 18, 2021 at 11:37:22AM +0300, Mike Rapoport wrote:
> On Sat, Sep 18, 2021 at 07:18:43AM +0200, Christoph Hellwig wrote:
> > On Sat, Sep 18, 2021 at 12:22:47AM +0300, Mike Rapoport wrote:
> > > I did some digging and it seems that the most "generic" way to check if a
> > > page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
> > > false negatives for architectures that do not register "System RAM", but
> > > those are not using dma_map_resource() anyway and, apparently, never would.
> > 
> > The downside of page_is_ram is that it looks really expensive for
> > something done at dma mapping time.
> 
> Indeed :(
> But pfn_valid is plain wrong...
> I'll keep digging.

I did some more archaeology and it seems that the check for pfn_valid() was
requested by arm folks because their MMU may have trouble with alias
mappings with different attributes, and so they based the check on the
false assumption that pfn_valid() == "RAM".

As this WARN_ON(pfn_valid()) is only present in dma_map_resource() it's
probably safe to drop it entirely. 

Otherwise the simplest way would be to hide it behind something like
ARCH_WANTS_DMA_NOT_RAM and make arm/arm64 select it.
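
Something along these lines (completely untested, placement and wiring
are just for illustration):

	# kernel/dma/Kconfig
	config ARCH_WANTS_DMA_NOT_RAM
		bool

	# arch/arm64/Kconfig (and arm): select ARCH_WANTS_DMA_NOT_RAM

and in dma_map_resource():

	/* Don't allow RAM to be mapped */
	if (IS_ENABLED(CONFIG_ARCH_WANTS_DMA_NOT_RAM) &&
	    WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
		return DMA_MAPPING_ERROR;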

Thoughts?

-- 
Sincerely yours,
Mike.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-18 11:39                   ` Mike Rapoport
@ 2021-09-20 10:57                     ` Catalin Marinas
  -1 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2021-09-20 10:57 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christoph Hellwig, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On Sat, Sep 18, 2021 at 02:39:49PM +0300, Mike Rapoport wrote:
> On Sat, Sep 18, 2021 at 11:37:22AM +0300, Mike Rapoport wrote:
> > On Sat, Sep 18, 2021 at 07:18:43AM +0200, Christoph Hellwig wrote:
> > > On Sat, Sep 18, 2021 at 12:22:47AM +0300, Mike Rapoport wrote:
> > > > I did some digging and it seems that the most "generic" way to check if a
> > > > page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
> > > > false negatives for architectures that do not register "System RAM", but
> > > > those are not using dma_map_resource() anyway and, apparently, never would.
> > > 
> > > The downside of page_is_ram is that it looks really expensive for
> > > something done at dma mapping time.
> > 
> > Indeed :(
> > But pfn_valid is plain wrong...
> > I'll keep digging.
> 
> I did some more archaeology and it seems that the check for pfn_valid() was
> requested by arm folks because their MMU may have trouble with alias
> mappings with different attributes, and so they based the check on the
> false assumption that pfn_valid() == "RAM".
> 
> As this WARN_ON(pfn_valid()) is only present in dma_map_resource() it's
> probably safe to drop it entirely. 

I agree, we should drop it. IIUC dma_map_resource() does not create any
kernel mapping to cause problems with attribute aliasing. You'd need a
prior devm_ioremap_resource() if you want access to that range from the
CPU side. For arm64 at least, the latter ends up with a
pfn_is_map_memory() check.
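
Quoting from memory, __ioremap_caller() on arm64 in 5.14 has roughly:

	/* Don't allow RAM to be mapped. */
	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
		return NULL;

so a CPU-side mapping of anything in the linear map is already refused
there, long before DMA enters the picture.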

-- 
Catalin

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-18  5:18               ` Christoph Hellwig
@ 2021-09-20 11:13                 ` David Hildenbrand
  -1 siblings, 0 replies; 64+ messages in thread
From: David Hildenbrand @ 2021-09-20 11:13 UTC (permalink / raw)
  To: Christoph Hellwig, Mike Rapoport
  Cc: Catalin Marinas, Robin Murphy, Alex Bee, Will Deacon,
	Andrew Morton, Anshuman Khandual, Linux Kernel Mailing List,
	linux-mm, Linux ARM

On 18.09.21 07:18, Christoph Hellwig wrote:
> On Sat, Sep 18, 2021 at 12:22:47AM +0300, Mike Rapoport wrote:
>> I did some digging and it seems that the most "generic" way to check if a
>> page is in RAM is page_is_ram(). It's not 100% bullet proof as it'll give
>> false negatives for architectures that do not register "System RAM", but
>> those are not using dma_map_resource() anyway and, apparently, never would.
> 
> The downside of page_is_ram is that it looks really expensive for
> something done at dma mapping time.
> 

There would be ways to speed it up, similar to

https://lkml.kernel.org/r/20210902160919.25683-2-david@redhat.com

but the end result is still walking a list. The question would be how much
that overhead matters in practice.

-- 
Thanks,

David / dhildenb


* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-20 10:57                     ` Catalin Marinas
@ 2021-09-21  8:20                       ` Christoph Hellwig
  -1 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-09-21  8:20 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mike Rapoport, Christoph Hellwig, David Hildenbrand,
	Robin Murphy, Alex Bee, Will Deacon, Andrew Morton,
	Anshuman Khandual, Linux Kernel Mailing List, linux-mm,
	Linux ARM

On Mon, Sep 20, 2021 at 11:57:58AM +0100, Catalin Marinas wrote:
> > As this WARN_ON(pfn_valid()) is only present in dma_map_resource() it's
> > probably safe to drop it entirely. 
> 
> I agree, we should drop it. IIUC dma_map_resource() does not create any
> kernel mapping to cause problems with attribute aliasing. You'd need a
> prior devm_ioremap_resource() if you want access to that range from the
> CPU side. For arm64 at least, the latter ends up with a
> pfn_is_map_memory() check.

It doesn't create any new mappings.  The only real issue is that it
does the wrong thing for RAM in a way that might not be noticed on
simple (x86/PC) platforms.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-21  8:20                       ` Christoph Hellwig
@ 2021-09-21  9:34                         ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-09-21  9:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On Tue, Sep 21, 2021 at 10:20:07AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 20, 2021 at 11:57:58AM +0100, Catalin Marinas wrote:
> > > As this WARN_ON(pfn_valid()) is only present in dma_map_resource() it's
> > > probably safe to drop it entirely. 
> > 
> > I agree, we should drop it. IIUC dma_map_resource() does not create any
> > kernel mapping to cause problems with attribute aliasing. You'd need a
> > prior devm_ioremap_resource() if you want access to that range from the
> > CPU side. For arm64 at least, the latter ends up with a
> > pfn_is_map_memory() check.
> 
> It doesn't create any new mappings.  The only real issue is that it
> does the wrong thing for RAM in a way that might not be noticed on
> simple (x86/PC) platforms.

But if the mapping request was rejected by devm_ioremap_resource() because
of an attempt to map RAM, why would we get to dma_map_resource() at all?


-- 
Sincerely yours,
Mike.

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-21  9:34                         ` Mike Rapoport
@ 2021-09-21 15:38                           ` Christoph Hellwig
  -1 siblings, 0 replies; 64+ messages in thread
From: Christoph Hellwig @ 2021-09-21 15:38 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christoph Hellwig, Catalin Marinas, David Hildenbrand,
	Robin Murphy, Alex Bee, Will Deacon, Andrew Morton,
	Anshuman Khandual, Linux Kernel Mailing List, linux-mm,
	Linux ARM

On Tue, Sep 21, 2021 at 12:34:10PM +0300, Mike Rapoport wrote:
> > It doesn't create any new mappings.  The only real issue is that it
> > does the wrong thing for RAM in a way that might not be noticed on
> > simple (x86/PC) platforms.
> 
> But if the mapping request was rejected by devm_ioremap_resource() because
> of an attempt to map RAM, why would we get to dma_map_resource() at all?

dma_map_resource takes a phys_addr_t that could come from anywhere.
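
E.g. a slave DMA driver can hand it an MMIO FIFO address taken straight
from its platform data. A made-up fragment (not from any particular
driver) just to show the pattern:

	dma_addr_t fifo_dma;

	/* fifo_phys is a device FIFO address, i.e. not RAM */
	fifo_dma = dma_map_resource(chan->device->dev, fifo_phys,
				    sizeof(u32), DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(chan->device->dev, fifo_dma))
		return -ENOMEM;

Nothing on that path requires a struct page or a prior ioremap().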

* Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
  2021-09-21 15:38                           ` Christoph Hellwig
@ 2021-09-22  7:22                             ` Mike Rapoport
  -1 siblings, 0 replies; 64+ messages in thread
From: Mike Rapoport @ 2021-09-22  7:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, David Hildenbrand, Robin Murphy, Alex Bee,
	Will Deacon, Andrew Morton, Anshuman Khandual,
	Linux Kernel Mailing List, linux-mm, Linux ARM

On Tue, Sep 21, 2021 at 05:38:05PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 21, 2021 at 12:34:10PM +0300, Mike Rapoport wrote:
> > > It doesn't create any new mappings.  The only real issue is that it
> > > does the wrong thing for RAM in a way that might not be noticed on
> > > simple (x86/PC) platforms.
> > 
> > But if the mapping request was rejected by devm_ioremap_resource() because
> > of an attempt to map RAM, why would we get to dma_map_resource() at all?
> 
> dma_map_resource takes a phys_addr_t that could come from anywhere.

Right, but it's no different from, say, dma_map_page_attrs(), which can get
a struct page from anywhere even when there is no actual memory behind that
struct page at all. Do you suggest adding a check that the struct page is
backed by memory in dma_map_page_attrs() as well?

-- 
Sincerely yours,
Mike.

Thread overview: 32 messages
2021-08-24 13:40 [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases) Alex Bee
2021-08-24 17:37 ` Catalin Marinas
2021-08-24 18:06   ` Robin Murphy
2021-08-24 18:28   ` Mike Rapoport
2021-08-24 18:46     ` Robin Murphy
2021-08-24 18:59       ` David Hildenbrand
2021-08-25 10:20         ` Catalin Marinas
2021-08-25 10:28           ` Will Deacon
2021-08-25 10:32             ` Will Deacon
2021-08-25 10:33             ` Robin Murphy
2021-08-25 10:38           ` David Hildenbrand
2021-08-25 10:54             ` Mike Rapoport
2021-08-25 10:55             ` Catalin Marinas
2021-08-25 11:12               ` David Hildenbrand
2021-08-25 17:15                 ` Catalin Marinas
2021-08-25 10:58             ` Robin Murphy
2021-08-25 11:21               ` David Hildenbrand
2021-08-25 10:52           ` Mike Rapoport
2021-09-17 21:22           ` Mike Rapoport
2021-09-18  5:18             ` Christoph Hellwig
2021-09-18  8:37               ` Mike Rapoport
2021-09-18 11:39                 ` Mike Rapoport
2021-09-20 10:57                   ` Catalin Marinas
2021-09-21  8:20                     ` Christoph Hellwig
2021-09-21  9:34                       ` Mike Rapoport
2021-09-21 15:38                         ` Christoph Hellwig
2021-09-22  7:22                           ` Mike Rapoport
2021-09-20 11:13               ` David Hildenbrand
2021-08-24 20:14     ` Alex Bee
2021-08-25  4:39       ` Mike Rapoport
2021-08-25 10:00         ` Alex Bee
2021-08-24 20:07   ` Alex Bee
