From: Kirill Korotaev <dev@sw.ru>
To: devel@openvz.org
Cc: Kirill Korotaev <dev@openvz.org>,
containers@lists.osdl.org,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org
Subject: Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
Date: Tue, 13 Mar 2007 18:43:59 +0300 [thread overview]
Message-ID: <45F6C6BF.5000002@sw.ru> (raw)
In-Reply-To: <m17itltrxe.fsf@ebiederm.dsl.xmission.com>
Eric,
>>>And misses every resource sharing opportunity in sight.
>>
>>that was my point too.
>>
>>
>>>Except for
>>>filtering the which pages are eligible for reclaim an RSS limit should
>>>not need to change the existing reclaim logic, and with things like the
>>>memory zones we have had that kind of restriction in the reclaim logic
>>>for a long time. So filtering out ineligible pages isn't anything new.
>>
>>exactly this is implemented in the current patches from Pavel.
>>the only difference is that filtering is not done in general LRU list,
>>which is not effective, but via per-container LRU list.
>>So the pointer on the page structure does 2 things:
>>- fast reclamation
>
> Better than the rmap list?
>
>>- correct uncharging of page from where it was charged
>> (e.g. shared pages can be mapped first in one container, but the last unmap
>> done from another one).
>
> We should charge/uncharge all of them, not just one.
>
>
>>>>We need to work out what the requirements are before we can settle on an
>>>>implementation.
>>>
>>>
>>>If you are talking about RSS limits the term is well defined. The
>>>number of pages you can have mapped into your set of address space at
>>>any given time.
>>>
>>>Unless I'm totally blind that isn't what the patchset implements.
>>
>>Ouch, what makes you think so?
>>The fact that a page mapped into 2 different processes is charged only once?
>>Imho it is much more correct then sum of process' RSS within container, due to:
>>1. it is clear how much container uses physical pages, not abstract items
>>2. shared pages are charged only once, so the sum of containers RSS is still
>> about physical RAM.
>
>
> No the fact that a page mapped into 2 separate mm_structs in two
> separate accounting domains is counted only once. This is very likely
> to happen with things like glibc if you have a read-only shared copy
> of your distro. There appears to be no technical reason for such a
> restriction.
>
> A page should not be owned.
I would be happy to propose OVZ approach then, where a page is tracked
with page_beancounter data structure, which ties together
a page with beancounters which use it like this:
page -> page_beancounter -> list of beanocunters which has the page mapped
This gives a number of advantages:
- the page is accounted to all the VEs which actually use it.
- allows almost accurate tracking of page fractions used by VEs
depending on how many VEs mapped the page.
- allows to track dirty pages, i.e. which VE dirtied the page
and implement correct disk I/O accounting and CFQ write scheduling
based on VE priorities.
> Going further unless the limits are draconian I don't expect users to
> hit the rss limits often or frequently. So in 99% of all cases page
> reclaim should continue to be global. Which makes me question messing
> with the general page reclaim lists.
It is not that rare when containers hit their limits, believe me :/
In trusted environments - probably you are right, in hosting - no.
Thanks,
Kirill
next prev parent reply other threads:[~2007-03-13 15:30 UTC|newest]
Thread overview: 129+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-03-06 14:42 [RFC][PATCH 0/7] Resource controllers based on process containers Pavel Emelianov
2007-03-06 14:49 ` [RFC][PATCH 1/7] Resource counters Pavel Emelianov
2007-03-07 4:03 ` Balbir Singh
2007-03-07 7:19 ` Pavel Emelianov
2007-03-09 16:37 ` Herbert Poetzl
2007-03-11 9:01 ` Pavel Emelianov
2007-03-11 19:00 ` Eric W. Biederman
2007-03-12 1:16 ` Herbert Poetzl
2007-03-13 9:09 ` Eric W. Biederman
2007-03-13 9:27 ` Pavel Emelianov
2007-03-13 9:49 ` [Devel] " Kirill Korotaev
2007-03-13 15:21 ` Herbert Poetzl
2007-03-13 15:41 ` Pavel Emelianov
2007-03-13 16:07 ` Srivatsa Vaddagiri
2007-03-14 7:12 ` Pavel Emelianov
2007-03-15 16:51 ` Eric W. Biederman
2007-03-13 16:32 ` Herbert Poetzl
2007-03-06 14:55 ` [RFC][PATCH 2/7] RSS controller core Pavel Emelianov
2007-03-06 22:00 ` Andrew Morton
2007-03-09 16:48 ` Herbert Poetzl
2007-03-11 9:08 ` Pavel Emelianov
2007-03-11 14:32 ` Herbert Poetzl
2007-03-11 15:04 ` Pavel Emelianov
2007-03-12 0:41 ` Herbert Poetzl
2007-03-12 8:31 ` Pavel Emelianov
2007-03-12 9:55 ` Balbir Singh
2007-03-12 23:43 ` Herbert Poetzl
2007-03-13 1:57 ` Balbir Singh
2007-03-13 2:24 ` Srivatsa Vaddagiri
2007-03-13 16:06 ` Herbert Poetzl
2007-03-11 12:26 ` Kirill Korotaev
2007-03-11 12:51 ` Andrew Morton
2007-03-11 15:51 ` Balbir Singh
2007-03-11 19:34 ` Eric W. Biederman
2007-03-12 9:23 ` [Devel] " Kirill Korotaev
2007-03-13 9:26 ` Eric W. Biederman
2007-03-13 15:43 ` Kirill Korotaev [this message]
2007-03-12 1:00 ` Herbert Poetzl
2007-03-12 9:02 ` Pavel Emelianov
2007-03-12 21:11 ` Herbert Poetzl
2007-03-13 7:17 ` Pavel Emelianov
2007-03-13 15:05 ` Herbert Poetzl
2007-03-13 15:32 ` Pavel Emelianov
2007-03-13 15:10 ` Kirill Korotaev
2007-03-13 15:11 ` Herbert Poetzl
2007-03-13 15:54 ` Kirill Korotaev
2007-03-12 18:42 ` Dave Hansen
2007-03-12 22:41 ` Herbert Poetzl
2007-03-12 23:02 ` Dave Hansen
2007-03-18 16:58 ` Eric W. Biederman
2007-03-13 6:04 ` Andrew Morton
2007-03-13 10:19 ` [Devel] " Kirill Korotaev
2007-03-13 11:48 ` Andrew Morton
2007-03-13 14:59 ` Herbert Poetzl
2007-03-13 17:05 ` Dave Hansen
2007-03-14 15:38 ` Mel Gorman
2007-03-14 20:42 ` Dave Hansen
2007-03-20 18:57 ` Mel Gorman
2007-03-18 22:44 ` [Devel] " Paul Menage
2007-03-19 17:41 ` Eric W. Biederman
2007-03-13 17:26 ` Dave Hansen
2007-03-13 19:09 ` Alan Cox
2007-03-13 20:28 ` Dave Hansen
2007-03-16 0:55 ` Eric W. Biederman
2007-03-16 16:31 ` Dave Hansen
2007-03-16 18:54 ` Eric W. Biederman
2007-03-16 19:46 ` Dave Hansen
2007-03-18 17:42 ` Eric W. Biederman
2007-03-19 15:48 ` Herbert Poetzl
2007-03-20 16:15 ` controlling mmap()'d vs read/write() pages Dave Hansen
2007-03-20 21:19 ` Eric W. Biederman
2007-03-23 0:51 ` Herbert Poetzl
2007-03-23 5:57 ` Nick Piggin
2007-03-23 10:12 ` Eric W. Biederman
2007-03-23 10:47 ` Nick Piggin
2007-03-23 12:21 ` Eric W. Biederman
2007-03-28 7:33 ` Nick Piggin
2007-03-23 16:41 ` Dave Hansen
2007-03-23 18:16 ` Herbert Poetzl
2007-03-28 9:18 ` Balbir Singh
2007-03-14 16:47 ` [RFC][PATCH 2/7] RSS controller core Mel Gorman
2007-03-07 5:37 ` Balbir Singh
2007-03-07 7:27 ` Pavel Emelianov
2007-03-06 14:58 ` [RFC][PATCH 3/7] Data structures changes for RSS accounting Pavel Emelianov
2007-03-11 19:13 ` Eric W. Biederman
2007-03-12 16:16 ` Kirill Korotaev
2007-03-12 16:48 ` Dave Hansen
2007-03-12 17:19 ` Pavel Emelianov
2007-03-12 17:27 ` Dave Hansen
2007-03-13 7:10 ` Pavel Emelianov
2007-03-12 17:21 ` Balbir Singh
2007-03-06 15:00 ` [RFC][PATCH 4/7] RSS accounting hooks over the code Pavel Emelianov
2007-03-11 19:14 ` Eric W. Biederman
2007-03-12 16:23 ` Kirill Korotaev
2007-03-12 16:50 ` Dave Hansen
2007-03-12 17:07 ` Kirill Korotaev
2007-03-12 17:33 ` Dave Hansen
2007-03-13 9:43 ` Eric W. Biederman
2007-03-12 23:54 ` Herbert Poetzl
2007-03-13 9:58 ` Eric W. Biederman
2007-03-13 10:25 ` Nick Piggin
2007-03-13 16:01 ` Eric W. Biederman
2007-03-14 3:51 ` Nick Piggin
2007-03-14 6:42 ` Balbir Singh
2007-03-14 6:57 ` Nick Piggin
2007-03-14 7:48 ` Balbir Singh
2007-03-14 13:25 ` Vaidyanathan Srinivasan
2007-03-14 13:49 ` Nick Piggin
2007-03-14 14:43 ` Vaidyanathan Srinivasan
2007-03-14 16:16 ` Kirill Korotaev
2007-03-15 5:01 ` Nick Piggin
2007-03-15 5:44 ` Balbir Singh
2007-03-28 20:15 ` Ethan Solomita
2007-03-14 15:37 ` Cedric Le Goater
2007-03-14 15:45 ` Pavel Emelianov
2007-03-06 15:03 ` [RFC][PATCH 5/7] Per-container OOM killer and page reclamation Pavel Emelianov
2007-03-09 21:21 ` Balbir Singh
2007-03-11 8:41 ` Pavel Emelianov
2007-03-06 15:04 ` [RFC][PATCH 6/7] Account for the number of tasks within container Pavel Emelianov
2007-03-07 2:00 ` Paul Menage
2007-03-07 7:13 ` Pavel Emelianov
2007-03-08 13:49 ` Paul Menage
2007-03-11 8:36 ` Pavel Emelianov
2007-03-06 15:07 ` [RFC][PATCH 7/7] Account for the number of files opened " Pavel Emelianov
2007-03-07 2:02 ` [RFC][PATCH 0/7] Resource controllers based on process containers Paul Menage
2007-03-07 7:30 ` Pavel Emelianov
2007-03-07 6:52 ` Balbir Singh
2007-03-07 7:32 ` Pavel Emelianov
2007-03-07 9:43 ` Kirill Korotaev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45F6C6BF.5000002@sw.ru \
--to=dev@sw.ru \
--cc=akpm@linux-foundation.org \
--cc=containers@lists.osdl.org \
--cc=dev@openvz.org \
--cc=devel@openvz.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).