From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, almasrymina@google.com,
gthelen@google.com, linux-mm@kvack.org, mike.kravetz@oracle.com,
mm-commits@vger.kernel.org, rientjes@google.com,
shakeelb@google.com, torvalds@linux-foundation.org
Subject: [patch 129/158] hugetlb: region_chg provides only cache entry
Date: Sat, 30 Nov 2019 17:56:54 -0800 [thread overview]
Message-ID: <20191201015654.AlAAC9Y5f%akpm@linux-foundation.org> (raw)
From: Mina Almasry <almasrymina@google.com>
Subject: hugetlb: region_chg provides only cache entry
Current behavior is that region_chg provides both a cache entry in
resv->region_cache, AND a placeholder entry in resv->regions. region_add
first tries to use the placeholder, and if it finds that the placeholder
has been deleted by a racing region_del call, it uses the cache entry.
This behavior is completely unnecessary and is removed in this patch for a
couple of reasons:
1. region_add needs to either find a cached file_region entry in
resv->region_cache, or find an entry in resv->regions to expand. It
does not need both.
2. region_chg adding a placeholder entry in resv->regions opens up
a possible race with region_del, where region_chg adds a placeholder
region in resv->regions, and this region is deleted by a racing call
to region_del during region_chg execution or before region_add is
called. Removing the race makes the code easier to reason about and
maintain.
In addition, a follow up patch in another series that disables region
coalescing, which would be further complicated if the race with region_del
exists.
Link: http://lkml.kernel.org/r/20190919200428.188797-2-almasrymina@google.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/hugetlb.c | 63 ++++++++-----------------------------------------
1 file changed, 11 insertions(+), 52 deletions(-)
--- a/mm/hugetlb.c~hugetlb-region_chg-provides-only-cache-entry
+++ a/mm/hugetlb.c
@@ -246,14 +246,10 @@ struct file_region {
/*
* Add the huge page range represented by [f, t) to the reserve
- * map. In the normal case, existing regions will be expanded
- * to accommodate the specified range. Sufficient regions should
- * exist for expansion due to the previous call to region_chg
- * with the same range. However, it is possible that region_del
- * could have been called after region_chg and modifed the map
- * in such a way that no region exists to be expanded. In this
- * case, pull a region descriptor from the cache associated with
- * the map and use that for the new range.
+ * map. Existing regions will be expanded to accommodate the specified
+ * range, or a region will be taken from the cache. Sufficient regions
+ * must exist in the cache due to the previous call to region_chg with
+ * the same range.
*
* Return the number of new huge pages added to the map. This
* number is greater than or equal to zero.
@@ -272,9 +268,8 @@ static long region_add(struct resv_map *
/*
* If no region exists which can be expanded to include the
- * specified range, the list must have been modified by an
- * interleving call to region_del(). Pull a region descriptor
- * from the cache and use it for this range.
+ * specified range, pull a region descriptor from the cache
+ * and use it for this range.
*/
if (&rg->link == head || t < rg->from) {
VM_BUG_ON(resv->region_cache_count <= 0);
@@ -339,15 +334,9 @@ out_locked:
* call to region_add that will actually modify the reserve
* map to add the specified range [f, t). region_chg does
* not change the number of huge pages represented by the
- * map. However, if the existing regions in the map can not
- * be expanded to represent the new range, a new file_region
- * structure is added to the map as a placeholder. This is
- * so that the subsequent region_add call will have all the
- * regions it needs and will not fail.
- *
- * Upon entry, region_chg will also examine the cache of region descriptors
- * associated with the map. If there are not enough descriptors cached, one
- * will be allocated for the in progress add operation.
+ * map. A new file_region structure is added to the cache
+ * as a placeholder, so that the subsequent region_add
+ * call will have all the regions it needs and will not fail.
*
* Returns the number of huge pages that need to be added to the existing
* reservation map for the range [f, t). This number is greater or equal to
@@ -357,10 +346,9 @@ out_locked:
static long region_chg(struct resv_map *resv, long f, long t)
{
struct list_head *head = &resv->regions;
- struct file_region *rg, *nrg = NULL;
+ struct file_region *rg;
long chg = 0;
-retry:
spin_lock(&resv->lock);
retry_locked:
resv->adds_in_progress++;
@@ -378,10 +366,8 @@ retry_locked:
spin_unlock(&resv->lock);
trg = kmalloc(sizeof(*trg), GFP_KERNEL);
- if (!trg) {
- kfree(nrg);
+ if (!trg)
return -ENOMEM;
- }
spin_lock(&resv->lock);
list_add(&trg->link, &resv->region_cache);
@@ -394,28 +380,6 @@ retry_locked:
if (f <= rg->to)
break;
- /* If we are below the current region then a new region is required.
- * Subtle, allocate a new region at the position but make it zero
- * size such that we can guarantee to record the reservation. */
- if (&rg->link == head || t < rg->from) {
- if (!nrg) {
- resv->adds_in_progress--;
- spin_unlock(&resv->lock);
- nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
- if (!nrg)
- return -ENOMEM;
-
- nrg->from = f;
- nrg->to = f;
- INIT_LIST_HEAD(&nrg->link);
- goto retry;
- }
-
- list_add(&nrg->link, rg->link.prev);
- chg = t - f;
- goto out_nrg;
- }
-
/* Round our left edge to the current segment if it encloses us. */
if (f > rg->from)
f = rg->from;
@@ -440,11 +404,6 @@ retry_locked:
out:
spin_unlock(&resv->lock);
- /* We already know we raced and no longer need the new region */
- kfree(nrg);
- return chg;
-out_nrg:
- spin_unlock(&resv->lock);
return chg;
}
_
reply other threads:[~2019-12-01 1:56 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191201015654.AlAAC9Y5f%akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=almasrymina@google.com \
--cc=gthelen@google.com \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=mm-commits@vger.kernel.org \
--cc=rientjes@google.com \
--cc=shakeelb@google.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).