iommu.lists.linux-foundation.org archive mirror
* [PATCH v2] iommu/iova: silence warnings under memory pressure
@ 2019-11-22  2:55 Qian Cai
  2019-11-22  4:37 ` Joe Perches
  0 siblings, 1 reply; 6+ messages in thread
From: Qian Cai @ 2019-11-22  2:55 UTC (permalink / raw)
  To: jroedel; +Cc: linux-kernel, iommu, joe, dwmw2

When running heavy memory pressure workloads, this 5+ year-old system is
throwing endless warnings below because disk I/O is too slow to recover
from swapping. Since the volume of failures from alloc_iova_fast() could
be large, once it calls printk(), it will trigger disk I/O (writing to
the log files) and pending softirqs, which could cause an infinite loop
that makes no progress for days under the ongoing memory reclaim. This
is the counterpart for Intel; the AMD part has already been merged in
commit 3d708895325b ("iommu/amd: Silence warnings under memory
pressure"). Since the allocation failure will be reported in
intel_alloc_iova(), just call dev_err_ratelimited() there and silence
the one in alloc_iova_mem() to avoid the expensive warn_alloc().

 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 slab_out_of_memory: 66 callbacks suppressed
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order: 0, min order: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 0: slabs: 1822, objs: 16398, free: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 1: slabs: 2051, objs: 18459, free: 31
   node 0: slabs: 697, objs: 4182, free: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   node 1: slabs: 381, objs: 2286, free: 27
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 warn_alloc: 96 callbacks suppressed
 kworker/11:1H: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-1
 CPU: 11 PID: 1642 Comm: kworker/11:1H Tainted: G    B
 Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 12/27/2015
 Workqueue: kblockd blk_mq_run_work_fn
 Call Trace:
  dump_stack+0xa0/0xea
  warn_alloc.cold.94+0x8a/0x12d
  __alloc_pages_slowpath+0x1750/0x1870
  __alloc_pages_nodemask+0x58a/0x710
  alloc_pages_current+0x9c/0x110
  alloc_slab_page+0xc9/0x760
  allocate_slab+0x48f/0x5d0
  new_slab+0x46/0x70
  ___slab_alloc+0x4ab/0x7b0
  __slab_alloc+0x43/0x70
  kmem_cache_alloc+0x2dd/0x450
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
  alloc_iova+0x33/0x210
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
  alloc_iova_fast+0x62/0x3d1
   node 1: slabs: 381, objs: 2286, free: 27
  intel_alloc_iova+0xce/0xe0
  intel_map_sg+0xed/0x410
  scsi_dma_map+0xd7/0x160
  scsi_queue_rq+0xbf7/0x1310
  blk_mq_dispatch_rq_list+0x4d9/0xbc0
  blk_mq_sched_dispatch_requests+0x24a/0x300
  __blk_mq_run_hw_queue+0x156/0x230
  blk_mq_run_work_fn+0x3b/0x40
  process_one_work+0x579/0xb90
  worker_thread+0x63/0x5b0
  kthread+0x1e6/0x210
  ret_from_fork+0x3a/0x50
 Mem-Info:
 active_anon:2422723 inactive_anon:361971 isolated_anon:34403
  active_file:2285 inactive_file:1838 isolated_file:0
  unevictable:0 dirty:1 writeback:5 unstable:0
  slab_reclaimable:13972 slab_unreclaimable:453879
  mapped:2380 shmem:154 pagetables:6948 bounce:0
  free:19133 free_pcp:7363 free_cma:0

Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: use dev_err_ratelimited() and improve the commit message.

 drivers/iommu/intel-iommu.c | 3 ++-
 drivers/iommu/iova.c        | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6db6d969e31c..c01a7bc99385 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
 				   IOVA_PFN(dma_mask), true);
 	if (unlikely(!iova_pfn)) {
-		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
+		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
+				    nrpages);
 		return 0;
 	}
 
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..aa1a56aaa5ee 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -233,7 +233,7 @@ static DEFINE_MUTEX(iova_cache_mutex);
 
 struct iova *alloc_iova_mem(void)
 {
-	return kmem_cache_alloc(iova_cache, GFP_ATOMIC);
+	return kmem_cache_alloc(iova_cache, GFP_ATOMIC | __GFP_NOWARN);
 }
 EXPORT_SYMBOL(alloc_iova_mem);
 
-- 
2.21.0 (Apple Git-122.2)

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
  2019-11-22  2:55 [PATCH v2] iommu/iova: silence warnings under memory pressure Qian Cai
@ 2019-11-22  4:37 ` Joe Perches
  2019-11-22 14:59   ` Qian Cai
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Perches @ 2019-11-22  4:37 UTC (permalink / raw)
  To: Qian Cai, jroedel; +Cc: iommu, dwmw2, linux-kernel

On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> When running heavy memory pressure workloads, this 5+ year-old system
> is throwing endless warnings below because disk I/O is too slow to
> recover from swapping. Since the volume of failures from
> alloc_iova_fast() could be large, once it calls printk(), it will
> trigger disk I/O (writing to the log files) and pending softirqs,
> which could cause an infinite loop that makes no progress for days
> under the ongoing memory reclaim. This is the counterpart for Intel;
> the AMD part has already been merged in commit 3d708895325b
> ("iommu/amd: Silence warnings under memory pressure"). Since the
> allocation failure will be reported in intel_alloc_iova(), just call
> dev_err_ratelimited() there and silence the one in alloc_iova_mem()
> to avoid the expensive warn_alloc().
[]
> v2: use dev_err_ratelimited() and improve the commit messages.
[]
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
[]
> @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
>  	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
>  				   IOVA_PFN(dma_mask), true);
>  	if (unlikely(!iova_pfn)) {
> -		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> +		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> +				    nrpages);

Trivia:

This should really have a \n termination on the format string

		dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",




* Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
  2019-11-22  4:37 ` Joe Perches
@ 2019-11-22 14:59   ` Qian Cai
  2019-11-22 16:28     ` Joe Perches
  0 siblings, 1 reply; 6+ messages in thread
From: Qian Cai @ 2019-11-22 14:59 UTC (permalink / raw)
  To: Joe Perches, jroedel; +Cc: iommu, dwmw2, linux-kernel

On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > When running heavy memory pressure workloads, this 5+ year-old
> > system is throwing endless warnings below because disk I/O is too
> > slow to recover from swapping. Since the volume of failures from
> > alloc_iova_fast() could be large, once it calls printk(), it will
> > trigger disk I/O (writing to the log files) and pending softirqs,
> > which could cause an infinite loop that makes no progress for days
> > under the ongoing memory reclaim. This is the counterpart for Intel;
> > the AMD part has already been merged in commit 3d708895325b
> > ("iommu/amd: Silence warnings under memory pressure"). Since the
> > allocation failure will be reported in intel_alloc_iova(), just call
> > dev_err_ratelimited() there and silence the one in alloc_iova_mem()
> > to avoid the expensive warn_alloc().
> 
> []
> > v2: use dev_err_ratelimited() and improve the commit messages.
> 
> []
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> 
> []
> > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> >  	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> >  				   IOVA_PFN(dma_mask), true);
> >  	if (unlikely(!iova_pfn)) {
> > -		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > +		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > +				    nrpages);
> 
> Trivia:
> 
> This should really have a \n termination on the format string
> 
> 		dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> 
> 

Why do you say so? It is right now printing with a newline added anyway.

 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed



* Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
  2019-11-22 14:59   ` Qian Cai
@ 2019-11-22 16:28     ` Joe Perches
  2019-11-22 16:46       ` Qian Cai
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Perches @ 2019-11-22 16:28 UTC (permalink / raw)
  To: Qian Cai, jroedel; +Cc: iommu, dwmw2, linux-kernel

On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > When running heavy memory pressure workloads, this 5+ year-old
> > > system is throwing endless warnings below because disk I/O is too
> > > slow to recover from swapping. Since the volume of failures from
> > > alloc_iova_fast() could be large, once it calls printk(), it will
> > > trigger disk I/O (writing to the log files) and pending softirqs,
> > > which could cause an infinite loop that makes no progress for days
> > > under the ongoing memory reclaim. This is the counterpart for
> > > Intel; the AMD part has already been merged in commit 3d708895325b
> > > ("iommu/amd: Silence warnings under memory pressure"). Since the
> > > allocation failure will be reported in intel_alloc_iova(), just
> > > call dev_err_ratelimited() there and silence the one in
> > > alloc_iova_mem() to avoid the expensive warn_alloc().
> > 
> > []
> > > v2: use dev_err_ratelimited() and improve the commit messages.
> > 
> > []
> > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > 
> > []
> > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > >  	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > >  				   IOVA_PFN(dma_mask), true);
> > >  	if (unlikely(!iova_pfn)) {
> > > -		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > +		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > +				    nrpages);
> > 
> > Trivia:
> > 
> > This should really have a \n termination on the format string
> > 
> > 		dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > 
> > 
> 
> Why do you say so? It is right now printing with a newline added anyway.
> 
>  hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed

If another process uses pr_cont at the same time,
it can be interleaved.




* Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
  2019-11-22 16:28     ` Joe Perches
@ 2019-11-22 16:46       ` Qian Cai
  2019-11-22 16:49         ` Joe Perches
  0 siblings, 1 reply; 6+ messages in thread
From: Qian Cai @ 2019-11-22 16:46 UTC (permalink / raw)
  To: Joe Perches, jroedel; +Cc: iommu, dwmw2, linux-kernel

On Fri, 2019-11-22 at 08:28 -0800, Joe Perches wrote:
> On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> > On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > > When running heavy memory pressure workloads, this 5+ year-old
> > > > system is throwing endless warnings below because disk I/O is
> > > > too slow to recover from swapping. Since the volume of failures
> > > > from alloc_iova_fast() could be large, once it calls printk(),
> > > > it will trigger disk I/O (writing to the log files) and pending
> > > > softirqs, which could cause an infinite loop that makes no
> > > > progress for days under the ongoing memory reclaim. This is the
> > > > counterpart for Intel; the AMD part has already been merged in
> > > > commit 3d708895325b ("iommu/amd: Silence warnings under memory
> > > > pressure"). Since the allocation failure will be reported in
> > > > intel_alloc_iova(), just call dev_err_ratelimited() there and
> > > > silence the one in alloc_iova_mem() to avoid the expensive
> > > > warn_alloc().
> > > 
> > > []
> > > > v2: use dev_err_ratelimited() and improve the commit messages.
> > > 
> > > []
> > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > > 
> > > []
> > > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > > >  	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > > >  				   IOVA_PFN(dma_mask), true);
> > > >  	if (unlikely(!iova_pfn)) {
> > > > -		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > > +		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > > +				    nrpages);
> > > 
> > > Trivia:
> > > 
> > > This should really have a \n termination on the format string
> > > 
> > > 		dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > > 
> > > 
> > 
> > Why do you say so? It is right now printing with a newline added anyway.
> > 
> >  hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
> 
> If another process uses pr_cont at the same time,
> it can be interleaved.

I lean towards fixing that in a separate patch if ever needed, as the
original dev_err() has no "\n" either.


* Re: [PATCH v2] iommu/iova: silence warnings under memory pressure
  2019-11-22 16:46       ` Qian Cai
@ 2019-11-22 16:49         ` Joe Perches
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Perches @ 2019-11-22 16:49 UTC (permalink / raw)
  To: Qian Cai, jroedel; +Cc: iommu, dwmw2, linux-kernel

On Fri, 2019-11-22 at 11:46 -0500, Qian Cai wrote:
> On Fri, 2019-11-22 at 08:28 -0800, Joe Perches wrote:
> > On Fri, 2019-11-22 at 09:59 -0500, Qian Cai wrote:
> > > On Thu, 2019-11-21 at 20:37 -0800, Joe Perches wrote:
> > > > On Thu, 2019-11-21 at 21:55 -0500, Qian Cai wrote:
> > > > > When running heavy memory pressure workloads, this 5+
> > > > > year-old system is throwing endless warnings below because
> > > > > disk I/O is too slow to recover from swapping. Since the
> > > > > volume of failures from alloc_iova_fast() could be large, once
> > > > > it calls printk(), it will trigger disk I/O (writing to the
> > > > > log files) and pending softirqs, which could cause an infinite
> > > > > loop that makes no progress for days under the ongoing memory
> > > > > reclaim. This is the counterpart for Intel; the AMD part has
> > > > > already been merged in commit 3d708895325b ("iommu/amd:
> > > > > Silence warnings under memory pressure"). Since the allocation
> > > > > failure will be reported in intel_alloc_iova(), just call
> > > > > dev_err_ratelimited() there and silence the one in
> > > > > alloc_iova_mem() to avoid the expensive warn_alloc().
> > > > 
> > > > []
> > > > > v2: use dev_err_ratelimited() and improve the commit messages.
> > > > 
> > > > []
> > > > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > > > 
> > > > []
> > > > > @@ -3401,7 +3401,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
> > > > >  	iova_pfn = alloc_iova_fast(&domain->iovad, nrpages,
> > > > >  				   IOVA_PFN(dma_mask), true);
> > > > >  	if (unlikely(!iova_pfn)) {
> > > > > -		dev_err(dev, "Allocating %ld-page iova failed", nrpages);
> > > > > +		dev_err_ratelimited(dev, "Allocating %ld-page iova failed",
> > > > > +				    nrpages);
> > > > 
> > > > Trivia:
> > > > 
> > > > This should really have a \n termination on the format string
> > > > 
> > > > 		dev_err_ratelimited(dev, "Allocating %ld-page iova failed\n",
> > > > 
> > > > 
> > > 
> > > Why do you say so? It is right now printing with a newline added anyway.
> > > 
> > >  hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
> > 
> > If another process uses pr_cont at the same time,
> > it can be interleaved.
> 
> I lean towards fixing that in a separate patch if ever needed, as the
> original dev_err() has no "\n" either.

Your choice.

I wrote trivia:, but touching the same line multiple times
is relatively pointless.





end of thread, other threads:[~2019-11-23  2:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-22  2:55 [PATCH v2] iommu/iova: silence warnings under memory pressure Qian Cai
2019-11-22  4:37 ` Joe Perches
2019-11-22 14:59   ` Qian Cai
2019-11-22 16:28     ` Joe Perches
2019-11-22 16:46       ` Qian Cai
2019-11-22 16:49         ` Joe Perches
