linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dennis Zhou <dennis@kernel.org>
To: Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>
Cc: Vlad Buslov <vladbu@mellanox.com>,
	kernel-team@fb.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 08/12] percpu: remember largest area skipped during allocation
Date: Wed, 27 Feb 2019 21:18:35 -0500	[thread overview]
Message-ID: <20190228021839.55779-9-dennis@kernel.org> (raw)
In-Reply-To: <20190228021839.55779-1-dennis@kernel.org>

Percpu allocations attempt to do first fit by scanning forward from the
first_free of a block. However, fragmentation from allocation requests
can cause holes not seen by block hint update functions. To address
this, create a local version of bitmap_find_next_zero_area_off() that
remembers the largest area skipped over. The caveat is that it only sees
regions skipped over due to not fitting, not regions skipped due to
alignment. Prior to updating the scan_hint, a scan backwards is done to
try and recover free bits skipped due to alignment. While this can cause
scanning to miss earlier possible free areas, smaller allocations will
eventually fill those holes.

Signed-off-by: Dennis Zhou <dennis@kernel.org>
---
 mm/percpu.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 99 insertions(+), 2 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index df1aacf58ac8..dac18968d79f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -716,6 +716,43 @@ static void pcpu_block_update(struct pcpu_block_md *block, int start, int end)
 	}
 }
 
+/*
+ * pcpu_block_update_scan - update a block given a free area from a scan
+ * @chunk: chunk of interest
+ * @bit_off: chunk offset
+ * @bits: size of free area
+ *
+ * Finding the final allocation spot first goes through pcpu_find_block_fit()
+ * to find a block that can hold the allocation and then pcpu_alloc_area()
+ * where a scan is used.  When allocations require specific alignments,
+ * we can inadvertently create holes which will not be seen in the alloc
+ * or free paths.
+ *
+ * This takes a given free area hole and updates a block as it may change the
+ * scan_hint.  We need to scan backwards to ensure we don't miss free bits
+ * from alignment.
+ */
+static void pcpu_block_update_scan(struct pcpu_chunk *chunk, int bit_off,
+				   int bits)
+{
+	int s_off = pcpu_off_to_block_off(bit_off);
+	int e_off = s_off + bits;
+	int s_index, l_bit;
+	struct pcpu_block_md *block;
+
+	if (e_off > PCPU_BITMAP_BLOCK_BITS)
+		return;
+
+	s_index = pcpu_off_to_block_index(bit_off);
+	block = chunk->md_blocks + s_index;
+
+	/* scan backwards in case of alignment skipping free bits */
+	l_bit = find_last_bit(pcpu_index_alloc_map(chunk, s_index), s_off);
+	s_off = (s_off == l_bit) ? 0 : l_bit + 1;
+
+	pcpu_block_update(block, s_off, e_off);
+}
+
 /**
  * pcpu_block_refresh_hint
  * @chunk: chunk of interest
@@ -1064,6 +1101,62 @@ static int pcpu_find_block_fit(struct pcpu_chunk *chunk, int alloc_bits,
 	return bit_off;
 }
 
+/*
+ * pcpu_find_zero_area - modified from bitmap_find_next_zero_area_off
+ * @map: the address to base the search on
+ * @size: the bitmap size in bits
+ * @start: the bitnumber to start searching at
+ * @nr: the number of zeroed bits we're looking for
+ * @align_mask: alignment mask for zero area
+ * @largest_off: offset of the largest area skipped
+ * @largest_bits: size of the largest area skipped
+ *
+ * The @align_mask should be one less than a power of 2.
+ *
+ * This is a modified version of bitmap_find_next_zero_area_off() to remember
+ * the largest area that was skipped.  This is imperfect, but in general is
+ * good enough.  The largest remembered region is the largest failed region
+ * seen.  This does not include anything we possibly skipped due to alignment.
+ * pcpu_block_update_scan() does scan backwards to try and recover what was
+ * lost to alignment.  While this can cause scanning to miss earlier possible
+ * free areas, smaller allocations will eventually fill those holes.
+ */
+static unsigned long pcpu_find_zero_area(unsigned long *map,
+					 unsigned long size,
+					 unsigned long start,
+					 unsigned long nr,
+					 unsigned long align_mask,
+					 unsigned long *largest_off,
+					 unsigned long *largest_bits)
+{
+	unsigned long index, end, i, area_off, area_bits;
+again:
+	index = find_next_zero_bit(map, size, start);
+
+	/* Align allocation */
+	index = __ALIGN_MASK(index, align_mask);
+	area_off = index;
+
+	end = index + nr;
+	if (end > size)
+		return end;
+	i = find_next_bit(map, end, index);
+	if (i < end) {
+		area_bits = i - area_off;
+		/* remember largest unused area with best alignment */
+		if (area_bits > *largest_bits ||
+		    (area_bits == *largest_bits && *largest_off &&
+		     (!area_off || __ffs(area_off) > __ffs(*largest_off)))) {
+			*largest_off = area_off;
+			*largest_bits = area_bits;
+		}
+
+		start = i + 1;
+		goto again;
+	}
+	return index;
+}
+
 /**
  * pcpu_alloc_area - allocates an area from a pcpu_chunk
  * @chunk: chunk of interest
@@ -1087,6 +1180,7 @@ static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
 			   size_t align, int start)
 {
 	size_t align_mask = (align) ? (align - 1) : 0;
+	unsigned long area_off = 0, area_bits = 0;
 	int bit_off, end, oslot;
 
 	lockdep_assert_held(&pcpu_lock);
@@ -1098,11 +1192,14 @@ static int pcpu_alloc_area(struct pcpu_chunk *chunk, int alloc_bits,
 	 */
 	end = min_t(int, start + alloc_bits + PCPU_BITMAP_BLOCK_BITS,
 		    pcpu_chunk_map_bits(chunk));
-	bit_off = bitmap_find_next_zero_area(chunk->alloc_map, end, start,
-					     alloc_bits, align_mask);
+	bit_off = pcpu_find_zero_area(chunk->alloc_map, end, start, alloc_bits,
+				      align_mask, &area_off, &area_bits);
 	if (bit_off >= end)
 		return -1;
 
+	if (area_bits)
+		pcpu_block_update_scan(chunk, area_off, area_bits);
+
 	/* update alloc map */
 	bitmap_set(chunk->alloc_map, bit_off, alloc_bits);
 
-- 
2.17.1


  parent reply	other threads:[~2019-02-28  2:19 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-28  2:18 [PATCH 00/12] introduce percpu block scan_hint Dennis Zhou
2019-02-28  2:18 ` [PATCH 01/12] percpu: update free path with correct new free region Dennis Zhou
2019-03-02 12:56   ` Peng Fan
2019-02-28  2:18 ` [PATCH 02/12] percpu: do not search past bitmap when allocating an area Dennis Zhou
2019-03-02 13:32   ` Peng Fan
2019-03-02 22:23     ` Dennis Zhou
2019-03-03  8:41       ` Peng Fan
2019-02-28  2:18 ` [PATCH 03/12] percpu: introduce helper to determine if two regions overlap Dennis Zhou
2019-03-02 13:37   ` Peng Fan
2019-03-02 22:24     ` Dennis Zhou
2019-02-28  2:18 ` [PATCH 04/12] percpu: manage chunks based on contig_bits instead of free_bytes Dennis Zhou
2019-03-02 13:48   ` Peng Fan
2019-03-02 22:32     ` Dennis Zhou
2019-03-03  8:42       ` Peng Fan
2019-02-28  2:18 ` [PATCH 05/12] percpu: relegate chunks unusable when failing small allocations Dennis Zhou
2019-03-02 13:55   ` Peng Fan
2019-03-02 22:34     ` Dennis Zhou
2019-02-28  2:18 ` [PATCH 06/12] percpu: set PCPU_BITMAP_BLOCK_SIZE to PAGE_SIZE Dennis Zhou
2019-03-03  4:56   ` Peng Fan
2019-02-28  2:18 ` [PATCH 07/12] percpu: add block level scan_hint Dennis Zhou
2019-03-03  6:01   ` Peng Fan
2019-03-03 20:23     ` Dennis Zhou
2019-03-04  9:36       ` Peng Fan
2019-02-28  2:18 ` Dennis Zhou [this message]
2019-02-28  2:18 ` [PATCH 09/12] percpu: use block scan_hint to only scan forward Dennis Zhou
2019-02-28  2:18 ` [PATCH 10/12] percpu: make pcpu_block_md generic Dennis Zhou
2019-03-03  6:35   ` Peng Fan
2019-02-28  2:18 ` [PATCH 11/12] percpu: convert chunk hints to be based on pcpu_block_md Dennis Zhou
2019-03-03  8:18   ` Peng Fan
2019-03-03 20:22     ` Dennis Zhou
2019-03-04  6:36       ` Peng Fan
2019-02-28  2:18 ` [PATCH 12/12] percpu: use chunk scan_hint to skip some scanning Dennis Zhou
2019-03-03  8:38   ` Peng Fan
2019-02-28 14:47 ` [PATCH 00/12] introduce percpu block scan_hint Vlad Buslov
2019-03-13 20:19 ` Dennis Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190228021839.55779-9-dennis@kernel.org \
    --to=dennis@kernel.org \
    --cc=cl@linux.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    --cc=vladbu@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).