From: Kyle Moffett <mrmacman_g4@mac.com>
To: "David S. Miller" <davem@davemloft.net>
Cc: sri@us.ibm.com, mpm@selenic.com, ak@suse.de,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: [RFC] Fine-grained memory priorities and PI
Date: Thu, 15 Dec 2005 03:55:53 -0500
Message-ID: <9E6D85FF-E546-4057-80EF-7479021AFAA1@mac.com>
In-Reply-To: <20051215.002120.133621586.davem@davemloft.net>
On Dec 15, 2005, at 03:21, David S. Miller wrote:
> Not when we run out, but rather when we reach some low water mark,
> the "critical sockets" would still use GFP_ATOMIC memory but only
> "critical sockets" would be allowed to do so.
>
> But even this has faults, consider the IPSEC scenario I mentioned,
> and this applies to any kind of encapsulation actually, even simple
> tunneling examples can be concocted which make the "critical
> socket" idea fail.
>
> The knee jerk reaction is "mark IPSEC's sockets critical, and mark
> the tunneling allocations critical, and... and..." well you have
> GFP_ATOMIC then my friend.
>
> In short, these "separate page pool" and "critical socket" ideas do
> not work and we need a different solution, I'm sorry folks spent so
> much time on them, but they are heavily flawed.
What we really need in the kernel is a more fine-grained memory
priority system with PI, similar in concept to what's being done to
the scheduler in some of the RT patchsets. Currently we have a very
black-and-white memory subsystem; when we go OOM, we just start
killing processes until we are no longer OOM. Perhaps we should have
some way to pass memory allocation priorities throughout the kernel,
including "this request has X priority", "this request will help
free up X pages of RAM", and "this memory may be dropped while dirty
under certain OOM conditions to free X pages using this method".
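
To make the idea of per-request priorities concrete, here is a minimal userspace sketch in C. Everything in it is hypothetical and exists only for illustration: the names (mem_prio, mem_request, mem_request_allowed), the reserve values, and the choice to credit promised frees up front are assumptions, not an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical priority levels; a higher value means more important. */
enum mem_prio {
    MEM_PRIO_LOW = 0,
    MEM_PRIO_NORMAL = 1,
    MEM_PRIO_CRITICAL = 2,
    MEM_PRIO_COUNT
};

/* Hypothetical descriptor that would travel with each allocation request. */
struct mem_request {
    enum mem_prio prio;   /* "this request has X priority" */
    size_t pages_wanted;  /* pages being asked for */
    size_t pages_freed;   /* "this request will help free up X pages" */
};

/* Assumed policy: free pages that must remain after serving each priority. */
static const size_t reserve_after[MEM_PRIO_COUNT] = { 64, 16, 0 };

/* Decide whether a request may proceed given the current free page count. */
static bool mem_request_allowed(const struct mem_request *req,
                                size_t free_pages)
{
    assert(req != NULL && req->prio < MEM_PRIO_COUNT);

    /* Credit the pages the request promises to release (an assumption). */
    size_t available = free_pages + req->pages_freed;

    if (req->pages_wanted > available)
        return false;
    return available - req->pages_wanted >= reserve_after[req->prio];
}
```

With these assumed reserves, a low-priority request for 8 pages is refused when only 70 pages are free, while a critical request for 8 pages still succeeds when exactly 8 pages are free.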
The initial benefit would be that OOM handling would become more
reliable and less of a special case. When we start to run low on
free pages, it might be OK to kill the SETI@home process long before
we OOM if such action might prevent the OOM. Likewise, you might be
able to flag certain file pages as being "less critical", such that
the kernel can kill a process and drop its dirty pages for files in
/tmp. Or the kernel might do a variety of other things just by
failing new allocations with low priority and forcing existing
allocations with low priority to go away using preregistered handlers.
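
The "preregistered handlers" idea could look roughly like the following userspace sketch in C. It is an illustration under stated assumptions, not kernel code: mem_holder, register_holder, reclaim_below, and drop_cache are made-up names, and the policy of releasing the lowest-priority holder first is one possible choice among several.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOLDERS 8

/* Hypothetical callback: give the memory back, return pages freed. */
typedef size_t (*release_fn)(void *ctx);

struct mem_holder {
    int prio;            /* lower value = less important */
    release_fn release;  /* preregistered way to release the memory */
    void *ctx;           /* opaque state handed back to the handler */
    int released;        /* nonzero once the handler has run */
};

static struct mem_holder holders[MAX_HOLDERS];
static size_t nr_holders;

/* Register a holder; returns 0 on success, -1 if full or invalid. */
static int register_holder(int prio, release_fn release, void *ctx)
{
    if (nr_holders == MAX_HOLDERS || release == NULL)
        return -1;
    holders[nr_holders++] = (struct mem_holder){ prio, release, ctx, 0 };
    return 0;
}

/*
 * Run handlers of holders with priority below `prio`, least important
 * first, until at least `needed` pages are freed or no candidate is left.
 * Returns the number of pages actually freed.
 */
static size_t reclaim_below(int prio, size_t needed)
{
    size_t freed = 0;

    while (freed < needed) {
        struct mem_holder *victim = NULL;

        for (size_t i = 0; i < nr_holders; i++) {
            struct mem_holder *h = &holders[i];

            if (h->released || h->prio >= prio)
                continue;
            if (victim == NULL || h->prio < victim->prio)
                victim = h;
        }
        if (victim == NULL)
            break;  /* nothing less important is left to release */

        assert(victim->release != NULL);
        freed += victim->release(victim->ctx);
        victim->released = 1;
    }
    return freed;
}

/* Example handler: ctx points at the page count held by some cache. */
static size_t drop_cache(void *ctx)
{
    size_t *pages = ctx;
    size_t n = *pages;

    *pages = 0;
    return n;
}
```

A caller at priority 5 that needs 110 pages would, in this sketch, trigger the handlers of holders at priorities 0 and 1 in that order and leave holders at priority 5 or above untouched.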
When processes request memory through any subsystem, their memory
priority would be passed through the kernel layers to the allocator,
along with any associated information about how to free the memory in
a low-memory condition. As a result, I could configure my database
to have a much higher priority than SETI@home (or boinc or whatever),
so that when the database server wants to fill memory with clean DB
cache pages, the kernel will kill SETI@home for its memory, even if
we could just leave some DB cache pages unfaulted.
Questions? Comments? "This is a terrible idea that should never have
seen the light of day"? Both constructive and destructive criticism
welcomed! (Just please keep the language clean! :-D)
Cheers,
Kyle Moffett
--
Q: Why do programmers confuse Halloween and Christmas?
A: Because OCT 31 == DEC 25.