From: Roman Pen <r.peniaev@gmail.com>
Cc: Roman Pen <r.peniaev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Dumazet <edumazet@google.com>,
	David Rientjes <rientjes@google.com>,
	WANG Chao <chaowang@redhat.com>,
	Fabian Frederick <fabf@skynet.be>,
	Christoph Lameter <cl@linux.com>, Gioh Kim <gioh.kim@lge.com>,
	Rob Jones <rob.jones@codethink.co.uk>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: [RFC v2 1/3] mm/vmalloc: fix possible exhaustion of vmalloc space caused by vm_map_ram allocator
Date: Thu, 19 Mar 2015 23:04:39 +0900
Message-ID: <1426773881-5757-2-git-send-email-r.peniaev@gmail.com>
In-Reply-To: <1426773881-5757-1-git-send-email-r.peniaev@gmail.com>

If a suitable block can't be found, a new block is allocated and put at the
head of the free list, so on the next iteration this new block will be found
first.

That's bad, because old blocks in the free list will not get a chance to be
fully used, and thus fragmentation will grow.

Let's consider this simple example:

 #1 We have one block in the free list which is partially used, and in which
    only one page is free:

    HEAD |xxxxxxxxx-| TAIL
                   ^
                   free space for 1 page, order 0

 #2 A new allocation request of order 1 (2 pages) comes. A new block is
    allocated, since we do not have enough free space to complete this
    request. The new block is put at the head of the free list:

    HEAD |----------|xxxxxxxxx-| TAIL

 #3 Two pages are occupied in the newly allocated block:

    HEAD |xx--------|xxxxxxxxx-| TAIL
          ^
          two pages mapped here

 #4 A new allocation request of order 0 (1 page) comes.  The block which was
    created in step #2 is located at the beginning of the free list, so it
    will be found first:

  HEAD |xxX-------|xxxxxxxxx-| TAIL
          ^                 ^
          page mapped here, but better to use this hole

Obviously, it is better to complete the request of step #4 using the old
block, where free space is left, because otherwise fragmentation will
increase significantly.
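
The four steps above can be replayed with a minimal userspace sketch. It is
not kernel code: the 10-page block matches the diagrams, and the names
free_list, add_block(), alloc_pages_model() and replay_example() are
hypothetical helpers introduced only for this illustration. Unlike the real
allocator, full blocks simply stay in the modelled list.

```c
#include <assert.h>

#define BLOCK_PAGES 10	/* block size used in the diagrams above */
#define MAX_BLOCKS  8	/* capacity of this model only */

/* Hypothetical model of a free list: index 0 is HEAD, nr - 1 is TAIL. */
struct free_list {
	int used[MAX_BLOCKS];	/* pages occupied in each block */
	int nr;			/* number of blocks in the list */
};

/* Insert an empty block at HEAD (old behaviour) or at TAIL (patched). */
static void add_block(struct free_list *fl, int at_tail)
{
	int i;

	assert(fl->nr < MAX_BLOCKS);
	if (!at_tail)
		for (i = fl->nr; i > 0; i--)
			fl->used[i] = fl->used[i - 1];
	fl->used[at_tail ? fl->nr : 0] = 0;
	fl->nr++;
}

/* First-fit allocation starting from HEAD; adds a block if nothing fits. */
static void alloc_pages_model(struct free_list *fl, int pages, int at_tail)
{
	int i;

	assert(pages > 0 && pages <= BLOCK_PAGES);
	for (;;) {
		for (i = 0; i < fl->nr; i++) {
			if (BLOCK_PAGES - fl->used[i] >= pages) {
				fl->used[i] += pages;
				return;
			}
		}
		add_block(fl, at_tail);
	}
}

/*
 * Replays steps #1..#4.  Returns the number of pages still free in the
 * old block and stores the pages used in the new block in *new_used.
 */
static int replay_example(int at_tail, int *new_used)
{
	struct free_list fl = { .used = { 9 }, .nr = 1 };	/* step #1 */
	int old_idx, new_idx;

	alloc_pages_model(&fl, 2, at_tail);	/* steps #2 and #3 */
	alloc_pages_model(&fl, 1, at_tail);	/* step #4 */

	/* Head insertion pushes the old block to TAIL; tail keeps it at HEAD. */
	old_idx = at_tail ? 0 : fl.nr - 1;
	new_idx = at_tail ? fl.nr - 1 : 0;
	*new_used = fl.used[new_idx];
	return BLOCK_PAGES - fl.used[old_idx];
}
```

In this model replay_example(0, &n) leaves the one-page hole in the old block
unused, while replay_example(1, &n) fills it and the new block holds only the
two pages of the order 1 request.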

But fragmentation is not the only issue.  The worst thing is that I can
easily create a scenario in which the whole vmalloc space is exhausted by
blocks which are not used, but are already dirty and have several free pages.

Let's consider this function, whose execution should be pinned to one CPU:

 ------------------------------------------------------------------------------
static void exhaust_virtual_space(struct page *pages[16], int iters)
{
	/*
	 * First we have to map a big chunk, e.g. 16 pages.
	 * Then we have to occupy the remaining space with smaller
	 * chunks, e.g. 8 pages each. At the end a small hole should
	 * remain, so after our allocation sequence the block looks
	 * like this:
	 *                XX  big chunk
	 * |XXxxxxxxx-|    x  small chunk
	 *                 -  hole, which is enough for a small chunk,
	 *                    but is not enough for a big chunk
	 */
	while (iters--) {
		int i;
		void *vaddr;

		/* Map/unmap big chunk */
		vaddr = vm_map_ram(pages, 16, -1, PAGE_KERNEL);
		vm_unmap_ram(vaddr, 16);

		/*
		 * Map/unmap small chunks.
		 *
		 * -1 for the hole, which should be left at the end of each
		 * block to keep it partially used, with some free space
		 * available.
		 */
		for (i = 0; i < (VMAP_BBMAP_BITS - 16) / 8 - 1; i++) {
			vaddr = vm_map_ram(pages, 8, -1, PAGE_KERNEL);
			vm_unmap_ram(vaddr, 8);
		}
	}
}
 ------------------------------------------------------------------------------

On every iteration a new block (1MB of vm area in my case) will be allocated
and then occupied, without any attempt to resolve the small allocation
requests using previously allocated blocks in the free list.
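
The growth in block count can be reproduced with a simplified userspace
model of the loop above. This is only a sketch under stated assumptions:
256 pages per block (1MB with 4KB pages), a single queue, pages handed out
sequentially and never reused, and a block released only once all of its
pages have been handed out and unmapped. It ignores any purging of
fragmented blocks, and all model_* names are hypothetical.

```c
#include <assert.h>

#define BBMAP_BITS 256	/* assumed pages per block: 1MB with 4KB pages */
#define MAX_LIVE   1024	/* capacity of this model, not a kernel limit */

/*
 * Every mapping is unmapped immediately, so all handed-out pages are
 * dirty and a block is released exactly when its free counter hits 0.
 * Live blocks are therefore the blocks still on the free list.
 */
struct model_queue {
	int free[MAX_LIVE];	/* free[0] is HEAD of the free list */
	int nr;			/* number of live blocks */
};

/* Add an empty block at HEAD (old behaviour) or at TAIL (patched). */
static void model_new_block(struct model_queue *q, int at_tail)
{
	int i;

	assert(q->nr < MAX_LIVE);
	if (!at_tail)
		for (i = q->nr; i > 0; i--)
			q->free[i] = q->free[i - 1];
	q->free[at_tail ? q->nr : 0] = BBMAP_BITS;
	q->nr++;
}

/* Release a fully used (and fully dirty) block. */
static void model_drop_block(struct model_queue *q, int idx)
{
	for (; idx < q->nr - 1; idx++)
		q->free[idx] = q->free[idx + 1];
	q->nr--;
}

/* Map and immediately unmap `pages` pages, first fit from HEAD. */
static void model_map_unmap(struct model_queue *q, int pages, int at_tail)
{
	int i;

	assert(pages > 0 && pages <= BBMAP_BITS);
	for (;;) {
		for (i = 0; i < q->nr; i++) {
			if (q->free[i] < pages)
				continue;
			q->free[i] -= pages;
			if (q->free[i] == 0)
				model_drop_block(q, i);
			return;
		}
		model_new_block(q, at_tail);
	}
}

/* Same request pattern as exhaust_virtual_space(); returns live blocks. */
static int model_exhaust(int iters, int at_tail)
{
	struct model_queue q = { .nr = 0 };
	int i;

	while (iters--) {
		model_map_unmap(&q, 16, at_tail);
		for (i = 0; i < (BBMAP_BITS - 16) / 8 - 1; i++)
			model_map_unmap(&q, 8, at_tail);
	}
	return q.nr;
}
```

In this model model_exhaust(iters, 0) returns iters (for iters up to
MAX_LIVE), one stranded block with an 8-page hole per iteration, while
model_exhaust(iters, 1) stays at a small constant number of blocks because
the holes in older blocks get filled first.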

In the case of random allocation (size randomly taken from the range
[1..64] in the 64-bit case or [1..32] in the 32-bit case) the situation is the
same: new blocks continue to appear whenever the maximum possible allocation
size (32 or 64) is passed to the allocator, because all remaining blocks in
the free list do not have enough free space to complete this allocation
request.

In summary, if new blocks are put at the head of the free list, virtual space
will eventually be exhausted.

In this patch I simply put the newly allocated block at the tail of the free
list, thus reducing fragmentation and giving a chance to resolve allocation
requests using older blocks with possible holes left.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Christoph Lameter <cl@linux.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: Rob Jones <rob.jones@codethink.co.uk>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
---
 mm/vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 39c3388..db6bffb 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -837,7 +837,7 @@ static struct vmap_block *new_vmap_block(gfp_t gfp_mask)
 
 	vbq = &get_cpu_var(vmap_block_queue);
 	spin_lock(&vbq->lock);
-	list_add_rcu(&vb->free_list, &vbq->free);
+	list_add_tail_rcu(&vb->free_list, &vbq->free);
 	spin_unlock(&vbq->lock);
 	put_cpu_var(vmap_block_queue);
 
-- 
2.2.2


Thread overview: 7+ messages
2015-03-19 14:04 [RFC v2 0/3] mm/vmalloc: fix possible exhaustion of vmalloc space Roman Pen
2015-03-19 14:04 ` Roman Pen [this message]
2015-03-24 22:00   ` [RFC v2 1/3] mm/vmalloc: fix possible exhaustion of vmalloc space caused by vm_map_ram allocator Andrew Morton
2015-03-25  6:07     ` Roman Peniaev
2015-03-25  1:01   ` Gioh Kim
2015-03-19 14:04 ` [RFC v2 2/3] mm/vmalloc: occupy newly allocated vmap block just after allocation Roman Pen
2015-03-19 14:04 ` [RFC v2 3/3] mm/vmalloc: get rid of dirty bitmap inside vmap_block structure Roman Pen
