* "iommu/amd: Set exclusion range correctly" causes smartpqi offline
@ 2019-04-26 14:52 ` Qian Cai
  0 siblings, 0 replies; 17+ messages in thread
From: Qian Cai @ 2019-04-26 14:52 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel

Applying some memory pressure causes smartpqi to go offline, even in today's
linux-next. This can always be reproduced with an LTP test case [1], and
sometimes just by compiling kernels.

Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the issue.

[  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1000 flags=0x0000]
[  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1800 flags=0x0000]
[  233.362013] smartpqi 0000:23:00.0: controller is offline: status code 0x14803
[  233.369359] smartpqi 0000:23:00.0: controller offline
[  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 2000001
[  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
[  233.389003] Write-error on swap-device (254:1:4474640)
[  233.389015] Write-error on swap-device (254:1:2190776)
[  233.389023] Write-error on swap-device (254:1:8351936)

[1] /opt/ltp/testcases/bin/mtest01 -p80 -w

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
@ 2019-04-26 15:26   ` Joerg Roedel
  0 siblings, 0 replies; 17+ messages in thread
From: Joerg Roedel @ 2019-04-26 15:26 UTC (permalink / raw)
  To: Qian Cai; +Cc: iommu, linux-kernel

On Fri, Apr 26, 2019 at 10:52:28AM -0400, Qian Cai wrote:
> Applying some memory pressure causes smartpqi to go offline, even in today's
> linux-next. This can always be reproduced with an LTP test case [1], and
> sometimes just by compiling kernels.
> 
> Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the issue.
> 
> [  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1000 flags=0x0000]
> [  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1800 flags=0x0000]
> [  233.362013] smartpqi 0000:23:00.0: controller is offline: status code 0x14803
> [  233.369359] smartpqi 0000:23:00.0: controller offline
> [  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 2000001
> [  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> [  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> [  233.389003] Write-error on swap-device (254:1:4474640)
> [  233.389015] Write-error on swap-device (254:1:2190776)
> [  233.389023] Write-error on swap-device (254:1:8351936)
> 
> [1] /opt/ltp/testcases/bin/mtest01 -p80 -w

I can't explain that, can you please boot with 'amd_iommu_dump' on the
kernel command line and send me dmesg after boot?

Thanks,

	Joerg

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
@ 2019-04-26 15:55     ` Qian Cai
  0 siblings, 0 replies; 17+ messages in thread
From: Qian Cai @ 2019-04-26 15:55 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On Fri, 2019-04-26 at 17:26 +0200, Joerg Roedel wrote:
> On Fri, Apr 26, 2019 at 10:52:28AM -0400, Qian Cai wrote:
> > Applying some memory pressure causes smartpqi to go offline, even in
> > today's linux-next. This can always be reproduced with an LTP test case [1],
> > and sometimes just by compiling kernels.
> > 
> > Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the
> > issue.
> > 
> > [  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1000 flags=0x0000]
> > [  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1800 flags=0x0000]
> > [  233.362013] smartpqi 0000:23:00.0: controller is offline: status code 0x14803
> > [  233.369359] smartpqi 0000:23:00.0: controller offline
> > [  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 2000001
> > [  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> > [  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> > [  233.389003] Write-error on swap-device (254:1:4474640)
> > [  233.389015] Write-error on swap-device (254:1:2190776)
> > [  233.389023] Write-error on swap-device (254:1:8351936)
> > 
> > [1] /opt/ltp/testcases/bin/mtest01 -p80 -w
> 
> I can't explain that, can you please boot with 'amd_iommu_dump' on the
> kernel command line and send me dmesg after boot?

https://git.sr.ht/~cai/linux-debug/blob/master/dmesg

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
@ 2019-04-29 14:23       ` Joerg Roedel
  0 siblings, 0 replies; 17+ messages in thread
From: Joerg Roedel @ 2019-04-29 14:23 UTC (permalink / raw)
  To: Qian Cai; +Cc: iommu, linux-kernel

On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote:
> https://git.sr.ht/~cai/linux-debug/blob/master/dmesg

Thanks, I can't see any definitions for unity ranges or exclusion ranges
in the IVRS table dump, which makes it even more weird.

Can you please send me the output of

	for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo "---$f"; cat $f;done

to double-check?

Thanks,

	Joerg
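
For anyone who wants to collect the same listing from a program rather than
from the shell loop above, here is a minimal C sketch, assuming a POSIX system
that provides glob(3). The function name dump_matching_files() is made up for
this example; the sysfs pattern to pass in is the one from the command above,
/sys/kernel/iommu_groups/*/reserved_regions.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/*
 * Print every file matching `pattern` to `out`, each preceded by a
 * "---<path>" header line, mirroring the shell loop.
 * Returns the number of files printed, 0 if nothing matched,
 * or -1 if glob() itself failed.
 */
static int dump_matching_files(const char *pattern, FILE *out)
{
	glob_t g;
	int count = 0;
	int rc;

	assert(pattern != NULL && out != NULL);

	memset(&g, 0, sizeof(g));
	rc = glob(pattern, 0, NULL, &g);
	if (rc != 0) {
		globfree(&g);
		return rc == GLOB_NOMATCH ? 0 : -1;
	}

	for (size_t i = 0; i < g.gl_pathc; i++) {
		char line[256];
		FILE *in = fopen(g.gl_pathv[i], "r");

		if (!in)
			continue; /* unreadable entry: skip it */

		fprintf(out, "---%s\n", g.gl_pathv[i]);
		while (fgets(line, sizeof(line), in))
			fputs(line, out);
		fclose(in);
		count++;
	}

	globfree(&g);
	return count;
}
```

A caller would typically pass stdout as the second argument. On a machine
without IOMMU groups the pattern matches nothing and the function returns 0.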

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
@ 2019-05-01  1:41         ` Qian Cai
  0 siblings, 0 replies; 17+ messages in thread
From: Qian Cai @ 2019-05-01  1:41 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel



On 4/29/19 10:23 AM, Joerg Roedel wrote:
> On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote:
>> https://git.sr.ht/~cai/linux-debug/blob/master/dmesg
> 
> Thanks, I can't see any definitions for unity ranges or exclusion ranges
> in the IVRS table dump, which makes it even more weird.
> 
> Can you please send me the output of
> 
> 	for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo "---$f"; cat $f;done
> 
> to double-check?

It is going to take a while to reserve that system again to gather the
information. BTW, this is only reproducible on linux-next, not on mainline.

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
  2019-04-29 14:23       ` Joerg Roedel
@ 2019-05-03 20:38         ` Qian Cai
  -1 siblings, 0 replies; 17+ messages in thread
From: Qian Cai @ 2019-05-03 20:38 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On Mon, 2019-04-29 at 16:23 +0200, Joerg Roedel wrote:
> On Fri, Apr 26, 2019 at 11:55:12AM -0400, Qian Cai wrote:
> > https://git.sr.ht/~cai/linux-debug/blob/master/dmesg
> 
> Thanks, I can't see any definitions for unity ranges or exclusion ranges
> in the IVRS table dump, which makes it even more weird.
> 
> Can you please send me the output of
> 
> 	for f in `ls -1 /sys/kernel/iommu_groups/*/reserved_regions`; do echo "---$f"; cat $f;done
> 
> to double-check?

https://git.sr.ht/~cai/linux-debug/blob/master/iommu

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
  2019-04-26 14:52 ` Qian Cai
@ 2019-05-06  2:56   ` Qian Cai
  -1 siblings, 0 replies; 17+ messages in thread
From: Qian Cai @ 2019-05-06  2:56 UTC (permalink / raw)
  To: jroedel, hch
  Cc: iommu, linux-kernel, linux-scsi, martin.petersen, jejb,
	don.brace, kevin.barnett, scott.teel, david.carroll

On 4/26/19 10:52 AM, Qian Cai wrote:
> Applying some memory pressure causes smartpqi to go offline, even in today's
> linux-next. This can always be reproduced with an LTP test case [1], and
> sometimes just by compiling kernels.
> 
> Reverting the commit "iommu/amd: Set exclusion range correctly" fixed the issue.
> 
> [  213.437112] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1000 flags=0x0000]
> [  213.447659] smartpqi 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x1800 flags=0x0000]
> [  233.362013] smartpqi 0000:23:00.0: controller is offline: status code 0x14803
> [  233.369359] smartpqi 0000:23:00.0: controller offline
> [  233.388915] print_req_error: I/O error, dev sdb, sector 3317352 flags 2000001
> [  233.388921] sd 0:0:0:0: [sdb] tag#95 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
> [  233.388931] sd 0:0:0:0: [sdb] tag#95 CDB: opcode=0x2a 2a 00 00 55 89 00 00 01 08 00
> [  233.389003] Write-error on swap-device (254:1:4474640)
> [  233.389015] Write-error on swap-device (254:1:2190776)
> [  233.389023] Write-error on swap-device (254:1:8351936)
> 
> [1] /opt/ltp/testcases/bin/mtest01 -p80 -w

It turns out that another linux-next commit is needed to reproduce this, namely
7a5dbf3ab2f0 ("iommu/amd: Remove the leftover of bypass support"), specifically
its hunks for map_sg() and unmap_sg(). This has been reproduced on 3 different
HPE ProLiant DL385 Gen10 systems so far.

Reverting those hunks (map_sg() and unmap_sg()) on top of the latest linux-next
fixes the issue, and applying them on top of mainline v5.1 reproduces it
immediately.

Much of the time it triggers this BUG_ON(!iova) in iova_magazine_free_pfns()
instead of taking smartpqi offline.

    kernel BUG at drivers/iommu/iova.c:813!
    Workqueue: kblockd blk_mq_run_work_fn
    RIP: 0010:iova_magazine_free_pfns+0x7d/0xc0
    Call Trace:
     free_cpu_cached_iovas+0xbd/0x150
     alloc_iova_fast+0x8c/0xba
     dma_ops_alloc_iova.isra.6+0x65/0xa0
     map_sg+0x8c/0x2a0
     scsi_dma_map+0xc6/0x160
     pqi_aio_submit_io+0x1f6/0x440 [smartpqi]
     pqi_scsi_queue_command+0x90c/0xdd0 [smartpqi]
     scsi_queue_rq+0x79c/0x1200
     blk_mq_dispatch_rq_list+0x4dc/0xb70
     blk_mq_sched_dispatch_requests+0x249/0x310
     __blk_mq_run_hw_queue+0x128/0x200
     blk_mq_run_work_fn+0x27/0x30
     process_one_work+0x522/0xa10
     worker_thread+0x63/0x5b0
     kthread+0x1d2/0x1f0
     ret_from_fork+0x22/0x40

* Re: "iommu/amd: Set exclusion range correctly" causes smartpqi offline
  2019-05-06  2:56   ` Qian Cai
  (?)
@ 2019-05-06 12:29     ` Joerg Roedel
  -1 siblings, 0 replies; 17+ messages in thread
From: Joerg Roedel @ 2019-05-06 12:29 UTC (permalink / raw)
  To: Qian Cai
  Cc: hch, iommu, linux-kernel, linux-scsi, martin.petersen, jejb,
	don.brace, kevin.barnett, scott.teel, david.carroll

On Sun, May 05, 2019 at 10:56:28PM -0400, Qian Cai wrote:
> It turns out that another linux-next commit is needed to reproduce this, namely
> 7a5dbf3ab2f0 ("iommu/amd: Remove the leftover of bypass support"), specifically
> its hunks for map_sg() and unmap_sg(). This has been reproduced on 3 different
> HPE ProLiant DL385 Gen10 systems so far.
> 
> Reverting those hunks (map_sg() and unmap_sg()) on top of the latest linux-next
> fixes the issue, and applying them on top of mainline v5.1 reproduces it
> immediately.
> 
> Much of the time it triggers this BUG_ON(!iova) in iova_magazine_free_pfns()
> instead of taking smartpqi offline.

Thanks a lot for tracking this down further. I queued a revert of the
above patch, as it does some questionable things I missed during review.
We should revisit the patch during the next cycle, but for now it is
better to just revert it. Revert attached.

From 89736a0ee81d14439d085c8d4653bc1d86fe64d8 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel@suse.de>
Date: Mon, 6 May 2019 14:24:18 +0200
Subject: [PATCH] Revert "iommu/amd: Remove the leftover of bypass support"

This reverts commit 7a5dbf3ab2f04905cf8468c66fcdbfb643068bcb.

This commit not only removes the leftovers of bypass
support, it also mostly removes the checking of the return
value of the get_domain() function. This can lead to silent
data corruption bugs when a device is not attached to its
dma_ops domain and a DMA-API function is called for that
device.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 80 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 63 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index bc98de5fa867..23c1a7eebb06 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2459,10 +2459,20 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
 			   unsigned long attrs)
 {
 	phys_addr_t paddr = page_to_phys(page) + offset;
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
+	u64 dma_mask;
+
+	domain = get_domain(dev);
+	if (PTR_ERR(domain) == -EINVAL)
+		return (dma_addr_t)paddr;
+	else if (IS_ERR(domain))
+		return DMA_MAPPING_ERROR;
+
+	dma_mask = *dev->dma_mask;
+	dma_dom = to_dma_ops_domain(domain);
 
-	return __map_single(dev, dma_dom, paddr, size, dir, *dev->dma_mask);
+	return __map_single(dev, dma_dom, paddr, size, dir, dma_mask);
 }
 
 /*
@@ -2471,8 +2481,14 @@ static dma_addr_t map_page(struct device *dev, struct page *page,
 static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 		       enum dma_data_direction dir, unsigned long attrs)
 {
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
+
+	domain = get_domain(dev);
+	if (IS_ERR(domain))
+		return;
+
+	dma_dom = to_dma_ops_domain(domain);
 
 	__unmap_single(dma_dom, dma_addr, size, dir);
 }
@@ -2512,13 +2528,20 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 		  unsigned long attrs)
 {
 	int mapped_pages = 0, npages = 0, prot = 0, i;
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
 	struct scatterlist *s;
 	unsigned long address;
-	u64 dma_mask = *dev->dma_mask;
+	u64 dma_mask;
 	int ret;
 
+	domain = get_domain(dev);
+	if (IS_ERR(domain))
+		return 0;
+
+	dma_dom  = to_dma_ops_domain(domain);
+	dma_mask = *dev->dma_mask;
+
 	npages = sg_num_pages(dev, sglist, nelems);
 
 	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
@@ -2592,11 +2615,20 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 		     int nelems, enum dma_data_direction dir,
 		     unsigned long attrs)
 {
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
+	unsigned long startaddr;
+	int npages = 2;
+
+	domain = get_domain(dev);
+	if (IS_ERR(domain))
+		return;
+
+	startaddr = sg_dma_address(sglist) & PAGE_MASK;
+	dma_dom   = to_dma_ops_domain(domain);
+	npages    = sg_num_pages(dev, sglist, nelems);
 
-	__unmap_single(dma_dom, sg_dma_address(sglist) & PAGE_MASK,
-			sg_num_pages(dev, sglist, nelems) << PAGE_SHIFT, dir);
+	__unmap_single(dma_dom, startaddr, npages << PAGE_SHIFT, dir);
 }
 
 /*
@@ -2607,11 +2639,16 @@ static void *alloc_coherent(struct device *dev, size_t size,
 			    unsigned long attrs)
 {
 	u64 dma_mask = dev->coherent_dma_mask;
-	struct protection_domain *domain = get_domain(dev);
+	struct protection_domain *domain;
 	struct dma_ops_domain *dma_dom;
 	struct page *page;
 
-	if (IS_ERR(domain))
+	domain = get_domain(dev);
+	if (PTR_ERR(domain) == -EINVAL) {
+		page = alloc_pages(flag, get_order(size));
+		*dma_addr = page_to_phys(page);
+		return page_address(page);
+	} else if (IS_ERR(domain))
 		return NULL;
 
 	dma_dom   = to_dma_ops_domain(domain);
@@ -2657,13 +2694,22 @@ static void free_coherent(struct device *dev, size_t size,
 			  void *virt_addr, dma_addr_t dma_addr,
 			  unsigned long attrs)
 {
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
-	struct page *page = virt_to_page(virt_addr);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
+	struct page *page;
 
+	page = virt_to_page(virt_addr);
 	size = PAGE_ALIGN(size);
 
+	domain = get_domain(dev);
+	if (IS_ERR(domain))
+		goto free_mem;
+
+	dma_dom = to_dma_ops_domain(domain);
+
 	__unmap_single(dma_dom, dma_addr, size, DMA_BIDIRECTIONAL);
+
+free_mem:
 	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
 		__free_pages(page, get_order(size));
 }
-- 
2.16.4


+	} else if (IS_ERR(domain))
 		return NULL;
 
 	dma_dom   = to_dma_ops_domain(domain);
@@ -2657,13 +2694,22 @@ static void free_coherent(struct device *dev, size_t size,
 			  void *virt_addr, dma_addr_t dma_addr,
 			  unsigned long attrs)
 {
-	struct protection_domain *domain = get_domain(dev);
-	struct dma_ops_domain *dma_dom = to_dma_ops_domain(domain);
-	struct page *page = virt_to_page(virt_addr);
+	struct protection_domain *domain;
+	struct dma_ops_domain *dma_dom;
+	struct page *page;
 
+	page = virt_to_page(virt_addr);
 	size = PAGE_ALIGN(size);
 
+	domain = get_domain(dev);
+	if (IS_ERR(domain))
+		goto free_mem;
+
+	dma_dom = to_dma_ops_domain(domain);
+
 	__unmap_single(dma_dom, dma_addr, size, DMA_BIDIRECTIONAL);
+
+free_mem:
 	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
 		__free_pages(page, get_order(size));
 }
-- 
2.16.4
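
The checks restored by this revert rely on get_domain() reporting failure
through an error value encoded in the returned pointer. The following is a
minimal userspace sketch of that pattern, not the driver code: the ERR_PTR(),
PTR_ERR() and IS_ERR() helpers are simplified stand-ins for the kernel's
<linux/err.h>, and every demo_* name is hypothetical. It mirrors the three-way
decision in the map_page() hunk: -EINVAL means the device is not translated
and the physical address is used directly, any other error fails the mapping,
and only a real pointer may be handed on to to_dma_ops_domain().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct protection_domain. */
struct demo_domain {
	int id;
};

enum demo_result { DEMO_MAPPED, DEMO_PASSTHROUGH, DEMO_ERROR };

/*
 * Hypothetical lookup that mimics the return convention of get_domain():
 * a valid pointer on success, an encoded negative errno otherwise.
 * dev_state is 0 for "attached" or a negative errno value.
 */
static struct demo_domain *demo_get_domain(int dev_state)
{
	static struct demo_domain dom = { .id = 1 };

	assert(dev_state <= 0 && dev_state >= -MAX_ERRNO);

	if (dev_state == 0)
		return &dom;

	return ERR_PTR(dev_state);
}

/* Same branch structure as the restored checks in map_page(). */
static enum demo_result demo_map(int dev_state)
{
	struct demo_domain *domain = demo_get_domain(dev_state);

	if (PTR_ERR(domain) == -EINVAL)
		return DEMO_PASSTHROUGH;	/* caller would use paddr as-is */
	else if (IS_ERR(domain))
		return DEMO_ERROR;		/* caller would report a mapping error */

	/* Only here is it safe to dereference or convert the pointer. */
	return DEMO_MAPPED;
}
```

Without the two checks, the encoded error value would be passed along as if it
were a domain, which is the situation the commit message warns about.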

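The unmap_sg() hunk above unmaps at page granularity: the start address is
rounded down with PAGE_MASK and the length is a page count shifted by
PAGE_SHIFT. The sketch below shows only that arithmetic, assuming 4 KiB pages.
All DEMO_* and demo_* names are hypothetical, and demo_num_pages() is a
single-buffer simplification rather than the kernel's sg_num_pages().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry for this sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Hypothetical result type: the page-aligned range to unmap. */
struct demo_range {
	uint64_t start;	/* page-aligned start address */
	uint64_t len;	/* length in bytes, a multiple of the page size */
};

/* Number of pages touched by one buffer of `size` bytes at `dma_addr`. */
static uint64_t demo_num_pages(uint64_t dma_addr, uint64_t size)
{
	uint64_t first = dma_addr & DEMO_PAGE_MASK;
	uint64_t end = dma_addr + size;

	return (end - first + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* Mirrors `startaddr = addr & PAGE_MASK` and `npages << PAGE_SHIFT`. */
static struct demo_range demo_unmap_range(uint64_t dma_addr, uint64_t npages)
{
	struct demo_range r;

	assert(npages > 0);

	r.start = dma_addr & DEMO_PAGE_MASK;
	r.len = npages << DEMO_PAGE_SHIFT;
	return r;
}
```

For example, a 0x1000-byte buffer at 0x1800 straddles two pages, so the range
to unmap starts at 0x1000 and covers 0x2000 bytes.
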


Thread overview: 17+ messages
2019-04-26 14:52 "iommu/amd: Set exclusion range correctly" causes smartpqi offline Qian Cai
2019-04-26 14:52 ` Qian Cai
2019-04-26 15:26 ` Joerg Roedel
2019-04-26 15:26   ` Joerg Roedel
2019-04-26 15:55   ` Qian Cai
2019-04-26 15:55     ` Qian Cai
2019-04-29 14:23     ` Joerg Roedel
2019-04-29 14:23       ` Joerg Roedel
2019-05-01  1:41       ` Qian Cai
2019-05-01  1:41         ` Qian Cai
2019-05-03 20:38       ` Qian Cai
2019-05-03 20:38         ` Qian Cai
2019-05-06  2:56 ` Qian Cai
2019-05-06  2:56   ` Qian Cai
2019-05-06 12:29   ` Joerg Roedel
2019-05-06 12:29     ` Joerg Roedel
2019-05-06 12:29     ` Joerg Roedel
