linux-kernel.vger.kernel.org archive mirror
From: Robin Murphy <robin.murphy@arm.com>
To: John Garry <john.garry@huawei.com>,
	Feng Tang <feng.tang@intel.com>, Joerg Roedel <joro@8bytes.org>,
	Will Deacon <will@kernel.org>,
	iommu@lists.linux-foundation.org, iommu@lists.linux.dev,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Paul Menzel <pmenzel@molgen.mpg.de>
Subject: Re: [PATCH] iommu/iova: change IOVA_MAG_SIZE to 127 to save memory
Date: Thu, 30 Jun 2022 11:06:23 +0100	[thread overview]
Message-ID: <117b31b5-8d06-0af4-7f1c-231d86becf1d@arm.com> (raw)
In-Reply-To: <7c29d01d-d90c-58d3-a6e0-0b6c404173ac@huawei.com>

On 2022-06-30 10:37, John Garry wrote:
> On 30/06/2022 10:02, Robin Murphy wrote:
>> On 2022-06-30 08:33, Feng Tang wrote:
>>> kmalloc will round up the request size to power of 2, and current
>>> iova_magazine's size is 1032 (1024+8) bytes, so each instance
>>> allocated will get 2048 bytes from kmalloc, causing around 1KB
>>> waste.
>>>
>>> And in some extreme cases, the wasted memory can trigger an OOM, as
>>> reported in 2019 on a crash kernel with 256 MB of memory [1].
>>
>> I don't think it really needs pointing out that excessive memory 
>> consumption can cause OOM. Especially not in the particularly silly 
>> context of a system with only 2MB of RAM per CPU - that's pretty much 
>> guaranteed to be doomed one way or another.
>>
>>>    [    4.319253] iommu: Adding device 0000:06:00.2 to group 5
>>>    [    4.325869] iommu: Adding device 0000:20:01.0 to group 15
>>>    [    4.332648] iommu: Adding device 0000:20:02.0 to group 16
>>>    [    4.338946] swapper/0 invoked oom-killer: 
>>> gfp_mask=0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null), order=0, 
>>> oom_score_adj=0
>>>    [    4.350251] swapper/0 cpuset=/ mems_allowed=0
>>>    [    4.354618] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
>>> 4.19.57.mx64.282 #1
>>>    [    4.355612] Hardware name: Dell Inc. PowerEdge R7425/08V001, 
>>> BIOS 1.9.3 06/25/2019
>>>    [    4.355612] Call Trace:
>>>    [    4.355612]  dump_stack+0x46/0x5b
>>>    [    4.355612]  dump_header+0x6b/0x289
>>>    [    4.355612]  out_of_memory+0x470/0x4c0
>>>    [    4.355612]  __alloc_pages_nodemask+0x970/0x1030
>>>    [    4.355612]  cache_grow_begin+0x7d/0x520
>>>    [    4.355612]  fallback_alloc+0x148/0x200
>>>    [    4.355612]  kmem_cache_alloc_trace+0xac/0x1f0
>>>    [    4.355612]  init_iova_domain+0x112/0x170
>>>    [    4.355612]  amd_iommu_domain_alloc+0x138/0x1a0
>>>    [    4.355612]  iommu_group_get_for_dev+0xc4/0x1a0
>>>    [    4.355612]  amd_iommu_add_device+0x13a/0x610
>>>    [    4.355612]  add_iommu_group+0x20/0x30
>>>    [    4.355612]  bus_for_each_dev+0x76/0xc0
>>>    [    4.355612]  bus_set_iommu+0xb6/0xf0
>>>    [    4.355612]  amd_iommu_init_api+0x112/0x132
>>>    [    4.355612]  state_next+0xfb1/0x1165
>>>    [    4.355612]  amd_iommu_init+0x1f/0x67
>>>    [    4.355612]  pci_iommu_init+0x16/0x3f
>>>    ...
>>>    [    4.670295] Unreclaimable slab info:
>>>    ...
>>>    [    4.857565] kmalloc-2048           59164KB      59164KB
>>>
>>> Change IOVA_MAG_SIZE from 128 to 127 to make size of 'iova_magazine'
>>> 1024 bytes so that no memory will be wasted.
>>>
>>> [1]. https://lkml.org/lkml/2019/8/12/266
>>>
>>> Signed-off-by: Feng Tang <feng.tang@intel.com>
>>> ---
>>>   drivers/iommu/iova.c | 7 ++++++-
>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index db77aa675145b..27634ddd9b904 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -614,7 +614,12 @@ EXPORT_SYMBOL_GPL(reserve_iova);
>>>    * dynamic size tuning described in the paper.
>>>    */
>>> -#define IOVA_MAG_SIZE 128
>>> +/*
>>> + * As kmalloc's buffer size is fixed to power of 2, 127 is chosen to
>>> + * assure size of 'iova_magzine' to be 1024 bytes, so that no memory
>>
>> Typo: iova_magazine
>>
>>> + * will be wasted.
>>> + */
>>> +#define IOVA_MAG_SIZE 127
> 
> I do wonder if we will see some strange new behaviour since IOVA_FQ_SIZE 
> % IOVA_MAG_SIZE != 0 now...

I doubt it - even if a flush queue does happen to be entirely full of 
equal-sized IOVAs, having a CPU's loaded magazines both be perfectly 
empty at the moment a full fq is dumped seems even more unlikely, so in 
practice I don't see this making any appreciable change to the 
likelihood of spilling back to the depot. In fact, the smaller the 
magazines get, the less time is spent flushing the depot back to the 
rbtree - which is where your interesting workload falls off the cliff 
and never catches back up with the fq timer - so at some point it might 
even improve (unless it's also already close to the point where smaller 
caches would bottleneck allocation). It might be interesting to 
experiment with a wider range of magazine sizes if you have the time 
and inclination.

Cheers,
Robin.

> 
>>
>> The change itself seems perfectly reasonable, though.
>>
>> Acked-by: Robin Murphy <robin.murphy@arm.com>
> 


Thread overview: 9 messages
2022-06-30  7:33 [PATCH] iommu/iova: change IOVA_MAG_SIZE to 127 to save memory Feng Tang
2022-06-30  9:02 ` Robin Murphy
2022-06-30  9:33   ` Feng Tang
2022-06-30  9:37   ` John Garry
2022-06-30 10:06     ` Robin Murphy [this message]
2022-06-30 10:52       ` John Garry
2022-07-01  3:56         ` Feng Tang
2022-07-01 11:33           ` John Garry
2022-07-01 12:01             ` Robin Murphy
