From: "Magnus Damm" <magnus.damm@gmail.com>
To: balbir@in.ibm.com
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, vatsa@in.ibm.com,
	ckrm-tech@lists.sourceforge.net, xemul@sw.ru, linux-mm@kvack.org,
	menage@google.com, svaidy@linux.vnet.ibm.com, devel@openvz.org
Subject: Re: [RFC][PATCH][0/4] Memory controller (RSS Control)
Date: Mon, 19 Feb 2007 20:56:06 +0900	[thread overview]
Message-ID: <aec7e5c30702190356v31e4997pf02e2887264299ce@mail.gmail.com> (raw)
In-Reply-To: <45D97FAD.9070009@in.ibm.com>

On 2/19/07, Balbir Singh <balbir@in.ibm.com> wrote:
> Magnus Damm wrote:
> > On 2/19/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> >> On Mon, 19 Feb 2007 12:20:19 +0530 Balbir Singh <balbir@in.ibm.com>
> >> wrote:
> >>
> >> > This patch applies on top of Paul Menage's container patches (V7)
> >> posted at
> >> >
> >> >       http://lkml.org/lkml/2007/2/12/88
> >> >
> >> > It implements a controller within the containers framework for limiting
> >> > memory usage (RSS usage).
> >
> >> The key part of this patchset is the reclaim algorithm:
> >>
> >> Alas, I fear this might have quite bad worst-case behaviour.  One small
> >> container which is under constant memory pressure will churn the
> >> system-wide LRUs like mad, and will consume rather a lot of system time.
> >> So it's a point at which container A can deleteriously affect things
> >> which
> >> are running in other containers, which is exactly what we're supposed to
> >> not do.
> >
> > Nice with a simple memory controller. The downside seems to be that it
> > doesn't scale very well when it comes to reclaim, but maybe that just
> > comes with being simple. Step by step, and maybe this is a good first
> > step?
> >
>
> Thanks, I totally agree.
>
> > Ideally I'd like to see unmapped pages handled on a per-container LRU
> > with a fallback to the system-wide LRUs. Shared/mapped pages could be
> > handled using PTE ageing/unmapping instead of page ageing, but that
> > may consume too much resources to be practical.
> >
> > / magnus
>
> Keeping unmapped pages per container sounds interesting. I am not quite
> sure what PTE ageing is; I will look it up.

You will most likely have no luck looking it up, so here is what I
mean by PTE ageing:

The most common unit for memory resource control seems to be physical
pages. Keeping track of pages is simple in the case of a single user
per page, but for shared pages tracking the owner becomes more
complex.

I consider unmapped pages to only have a single user at a time, so the
unit for unmapped memory resource control is physical pages. Apart
from implementation details such as fun with struct page and
scalability, handling this case is not so complicated.

Mapped or shared pages should be handled in a different way IMO. PTEs
should be used instead of physical pages as the unit for resource
control and reclaim. For the user this looks pretty much the same as
physical pages, apart from memory overcommit.

So instead of using a global page reclaim policy and reserving
physical pages per container I propose that resource controlled shared
pages should be handled using a PTE replacement policy. This policy is
used to keep the most active PTEs in the container backed by physical
pages. Inactive PTEs get unmapped in favour of newer PTEs.

One way to implement this could be to populate the address space of
resource controlled processes with multiple smaller LRU2Qs. The
compact data structure that I have in mind is basically an array of
256 bytes, one byte per PTE. Associated with this data structure are
start indexes and lengths for two lists. The indexes are used in a
FAT-style chain to form singly linked lists. So we create active and
inactive lists here, and we move PTEs between the lists when we check
the young bits during page reclaim and when we apply memory
pressure. Unmapping is done through the normal page reclaimer, but
using information from the PTE LRUs.

In my mind this should lead to fairer resource control of mapped
pages, but whether it can be implemented with low overhead is
another question. =)

Thanks for listening.

/ magnus

