linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>, Roman Gushchin <guro@fb.com>
Subject: [PATCH rfc 1/4] percpu: implement partial chunk depopulation
Date: Wed, 24 Mar 2021 12:06:23 -0700	[thread overview]
Message-ID: <20210324190626.564297-2-guro@fb.com> (raw)
In-Reply-To: <20210324190626.564297-1-guro@fb.com>

This patch implements partial depopulation of percpu chunks.

As now, a chunk can be depopulated only as a part of the final
destruction, when there are no more outstanding allocations. However
to minimize a memory waste, it might be useful to depopulate a
partially filed chunk, if a small number of outstanding allocations
prevents the chunk from being reclaimed.

This patch implements the following depopulation process: it scans
over the chunk pages, looks for a range of empty and populated pages
and performs the depopulation. To avoid races with new allocations,
the chunk is previously isolated. After the depopulation the chunk is
returned to the original slot (but is appended to the tail of the list
to minimize the chances of population).

Because the pcpu_lock is dropped while calling pcpu_depopulate_chunk(),
the chunk can be concurrently moved to a different slot. So we need
to isolate it again on each step. pcpu_alloc_mutex is held, so the
chunk can't be populated/depopulated asynchronously.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/percpu.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/mm/percpu.c b/mm/percpu.c
index 6596a0a4286e..78c55c73fa28 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2055,6 +2055,96 @@ static void __pcpu_balance_workfn(enum pcpu_chunk_type type)
 	mutex_unlock(&pcpu_alloc_mutex);
 }
 
+/**
+ * pcpu_shrink_populated - scan chunks and release unused pages to the system
+ * @type: chunk type
+ *
+ * Scan over all chunks, find those marked with the depopulate flag and
+ * try to release unused pages to the system. On every attempt clear the
+ * chunk's depopulate flag to avoid wasting CPU by scanning the same
+ * chunk again and again.
+ */
+static void pcpu_shrink_populated(enum pcpu_chunk_type type)
+{
+	struct list_head *pcpu_slot = pcpu_chunk_list(type);
+	struct pcpu_chunk *chunk;
+	int slot, i, off, start;
+
+	spin_lock_irq(&pcpu_lock);
+	for (slot = pcpu_nr_slots - 1; slot >= 0; slot--) {
+restart:
+		list_for_each_entry(chunk, &pcpu_slot[slot], list) {
+			bool isolated = false;
+
+			if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_HIGH)
+				break;
+
+			for (i = 0, start = -1; i < chunk->nr_pages; i++) {
+				if (!chunk->nr_empty_pop_pages)
+					break;
+
+				/*
+				 * If the page is empty and populated, start or
+				 * extend the [start, i) range.
+				 */
+				if (test_bit(i, chunk->populated)) {
+					off = find_first_bit(
+						pcpu_index_alloc_map(chunk, i),
+						PCPU_BITMAP_BLOCK_BITS);
+					if (off >= PCPU_BITMAP_BLOCK_BITS) {
+						if (start == -1)
+							start = i;
+						continue;
+					}
+				}
+
+				/*
+				 * Otherwise check if there is an active range,
+				 * and if yes, depopulate it.
+				 */
+				if (start == -1)
+					continue;
+
+				/*
+				 * Isolate the chunk, so new allocations
+				 * wouldn't be served using this chunk.
+				 * Async releases can still happen.
+				 */
+				if (!list_empty(&chunk->list)) {
+					list_del_init(&chunk->list);
+					isolated = true;
+				}
+
+				spin_unlock_irq(&pcpu_lock);
+				pcpu_depopulate_chunk(chunk, start, i);
+				cond_resched();
+				spin_lock_irq(&pcpu_lock);
+
+				pcpu_chunk_depopulated(chunk, start, i);
+
+				/*
+				 * Reset the range and continue.
+				 */
+				start = -1;
+			}
+
+			if (isolated) {
+				/*
+				 * The chunk could have been moved while
+				 * pcpu_lock wasn't held. Make sure we put
+				 * the chunk back into the slot and restart
+				 * the scanning.
+				 */
+				if (list_empty(&chunk->list))
+					list_add_tail(&chunk->list,
+						      &pcpu_slot[slot]);
+				goto restart;
+			}
+		}
+	}
+	spin_unlock_irq(&pcpu_lock);
+}
+
 /**
  * pcpu_balance_workfn - manage the amount of free chunks and populated pages
  * @work: unused
-- 
2.30.2


  reply	other threads:[~2021-03-24 19:07 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-24 19:06 [PATCH rfc 0/4] percpu: partial chunk depopulation Roman Gushchin
2021-03-24 19:06 ` Roman Gushchin [this message]
2021-03-29 17:20   ` [PATCH rfc 1/4] percpu: implement " Dennis Zhou
2021-03-29 18:29     ` Roman Gushchin
2021-03-29 19:28       ` Dennis Zhou
2021-03-29 19:40         ` Roman Gushchin
2021-03-24 19:06 ` [PATCH rfc 2/4] percpu: split __pcpu_balance_workfn() Roman Gushchin
2021-03-29 17:28   ` Dennis Zhou
2021-03-29 18:20     ` Roman Gushchin
2021-03-24 19:06 ` [PATCH rfc 3/4] percpu: on demand chunk depopulation Roman Gushchin
2021-03-29  8:37   ` [percpu] 28c9dada65: invoked_oom-killer:gfp_mask=0x kernel test robot
2021-03-29 18:19     ` Roman Gushchin
2021-03-29 19:21   ` [PATCH rfc 3/4] percpu: on demand chunk depopulation Dennis Zhou
2021-03-29 20:10     ` Roman Gushchin
2021-03-29 23:12       ` Dennis Zhou
2021-03-30  1:04         ` Roman Gushchin
2021-03-24 19:06 ` [PATCH rfc 4/4] percpu: fix a comment about the chunks ordering Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210324190626.564297-2-guro@fb.com \
    --to=guro@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).