* Re: [RFC] Parallelize IO for e2fsck
       [not found]           ` <9Orda-3ub-45@gated-at.bofh.it>
@ 2008-01-24 17:32             ` Bodo Eggert
  2008-01-24 22:07               ` Andreas Dilger
  2008-01-24 23:08               ` Adrian Bunk
  0 siblings, 2 replies; 13+ messages in thread
From: Bodo Eggert @ 2008-01-24 17:32 UTC (permalink / raw)
  To: Alan Cox, Andreas Dilger, Valdis.Kletnieks, David Chinner,
	Valerie Henson, linux-fsdevel, linux-ext4, linux-kernel,
	Theodore Ts'o, Andreas Dilger, Ric Wheeler

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

>> I'd tried to advocate SIGDANGER some years ago as well, but none of
>> the kernel maintainers were interested.  It definitely makes sense
>> to have some sort of mechanism like this.  At the time I first brought
>> it up it was in conjunction with Netscape using too much cache on some
>> system, but it would be just as useful for all kinds of other memory-
>> hungry applications.
> 
> There is an early thread for a /proc file which you can add to your
> poll() set and it will wake people when memory is low. Very elegant and
> if async support is added it will also give you the signal variant for
> free.

IMO you'll need a userspace daemon. The kernel only knows about the
amount of memory available / recommended for a system (or container),
while the user knows which program's cache is most precious today.

(Of course the userspace daemon will in turn need the /proc file.)

I think a single, system-wide signal is the second-worst solution: All
applications (or the wrong one, if you select one) would free their caches
and start to crawl, and either stay in this state or slowly increase their
caches again until they get signaled again. And the signal would either
come too early or too late. The userspace daemon could collect the weighted
demand for memory from all applications and tell them how much to use.
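
A minimal sketch of the bookkeeping such a daemon might do, assuming each
application reports how much cache it wants and the admin assigns it a
weight. The struct app_demand, the function distribute_budget() and the
proportional-share policy are illustrative assumptions; nothing like them
is specified in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record the daemon keeps per application. */
struct app_demand {
    size_t   wanted;   /* bytes of cache the application asks for */
    unsigned weight;   /* how precious its cache is; higher wins */
    size_t   granted;  /* output: bytes the daemon tells it to use */
};

/*
 * Split `budget` bytes in proportion to weight * wanted, never granting
 * more than an application asked for. Bytes left over by that cap are
 * simply not handed out in this sketch.
 */
void distribute_budget(struct app_demand *apps, size_t n, size_t budget)
{
    double total = 0.0;
    size_t i;

    assert(apps != NULL || n == 0);

    for (i = 0; i < n; i++)
        total += (double)apps[i].weight * (double)apps[i].wanted;

    for (i = 0; i < n; i++) {
        double score = (double)apps[i].weight * (double)apps[i].wanted;
        double share = total > 0.0 ? (double)budget * (score / total) : 0.0;

        apps[i].granted = share >= (double)apps[i].wanted
                        ? apps[i].wanted
                        : (size_t)share;
    }
}
```

With two applications that both want 100 bytes, weights 1 and 3, and a
budget of 100 bytes, this grants 25 and 75 bytes respectively.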



* Re: [RFC] Parallelize IO for e2fsck
  2008-01-24 17:32             ` [RFC] Parallelize IO for e2fsck Bodo Eggert
@ 2008-01-24 22:07               ` Andreas Dilger
  2008-01-24 23:08               ` Adrian Bunk
  1 sibling, 0 replies; 13+ messages in thread
From: Andreas Dilger @ 2008-01-24 22:07 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Alan Cox, Valdis.Kletnieks, David Chinner, Valerie Henson,
	linux-fsdevel, linux-ext4, linux-kernel, Theodore Ts'o,
	Andreas Dilger, Ric Wheeler

On Jan 24, 2008  18:32 +0100, Bodo Eggert wrote:
> I think a single, system-wide signal is the second-worst solution: All
> applications (or the wrong one, if you select one) would free their caches
> and start to crawl, and either stay in this state or slowly increase their
> caches again until they get signaled again. And the signal would either
> come too early or too late. The userspace daemon could collect the weighted
> demand for memory from all applications and tell them how much to use.

Well, sending a few signals (maybe to the top 5 processes in the OOM killer
list) is still a LOT better than OOM-killing them without warning...  That
way important system processes could be taught to understand SIGDANGER and
maybe do something about it instead of being killed, and if Firefox and
other memory hungry processes flush some of their cache it is not fatal.

I wouldn't think that SIGDANGER means "free all of your cache", since the
memory usage clearly wasn't a problem a few seconds previously, so as
an application writer I'd code it as "flush the oldest 10% of my cache"
or similar, and the kernel could send SIGDANGER again (or kill the real
offender) if the memory usage again becomes an issue.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: [RFC] Parallelize IO for e2fsck
  2008-01-24 17:32             ` [RFC] Parallelize IO for e2fsck Bodo Eggert
  2008-01-24 22:07               ` Andreas Dilger
@ 2008-01-24 23:08               ` Adrian Bunk
  2008-01-24 23:40                 ` Theodore Tso
  1 sibling, 1 reply; 13+ messages in thread
From: Adrian Bunk @ 2008-01-24 23:08 UTC (permalink / raw)
  To: Bodo Eggert
  Cc: Alan Cox, Andreas Dilger, Valdis.Kletnieks, David Chinner,
	Valerie Henson, linux-fsdevel, linux-ext4, linux-kernel,
	Theodore Ts'o, Andreas Dilger, Ric Wheeler

On Thu, Jan 24, 2008 at 06:32:15PM +0100, Bodo Eggert wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> >> I'd tried to advocate SIGDANGER some years ago as well, but none of
> >> the kernel maintainers were interested.  It definitely makes sense
> >> to have some sort of mechanism like this.  At the time I first brought
> >> it up it was in conjunction with Netscape using too much cache on some
> >> system, but it would be just as useful for all kinds of other memory-
> >> hungry applications.
> > 
> > There is an early thread for a /proc file which you can add to your
> > poll() set and it will wake people when memory is low. Very elegant and
> > if async support is added it will also give you the signal variant for
> > free.
> 
> IMO you'll need a userspace daemon. The kernel only knows about the
> amount of memory available / recommended for a system (or container),
> while the user knows which program's cache is most precious today.
> 
> (Of course the userspace daemon will in turn need the /proc file.)
> 
> I think a single, system-wide signal is the second-worst solution: All
> applications (or the wrong one, if you select one) would free their caches
> and start to crawl, and either stay in this state or slowly increase their
> caches again until they get signaled again. And the signal would either
> come too early or too late. The userspace daemon could collect the weighted
> demand for memory from all applications and tell them how much to use.

I don't think that's something that would require fine-tuning on a
per-application basis - the kernel should tell all applications once to
reduce memory consumption and write a fat warning to the logs (which
on well-maintained systems will be mailed to the admin).

Your "and tell them how much to use" wouldn't work for most applications
- e.g. for the last few weeks I've worked with a computer with 512 MB RAM
and no swap, which usually means only 200 MB of free RAM. I've gotten quite
used to git aborting with "fatal: Out of memory, malloc failed" when
200 MB wasn't enough for git, and I don't think there is any reasonable
way for git to reduce its memory usage while continuing to run.

In practice, there is a small number of programs that are both the
common memory hogs and should be able to reduce their memory consumption
by 10% or 20% without big problems when requested (e.g. Java VMs,
Firefox and databases come to mind).

And from a performance point of view letting applications voluntarily 
free some memory is better even than starting to swap.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: [RFC] Parallelize IO for e2fsck
  2008-01-24 23:08               ` Adrian Bunk
@ 2008-01-24 23:40                 ` Theodore Tso
  2008-01-25  0:25                   ` Zan Lynx
  2008-01-26 12:32                   ` KOSAKI Motohiro
  0 siblings, 2 replies; 13+ messages in thread
From: Theodore Tso @ 2008-01-24 23:40 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Bodo Eggert, Alan Cox, Andreas Dilger, Valdis.Kletnieks,
	David Chinner, Valerie Henson, linux-fsdevel, linux-ext4,
	linux-kernel, Andreas Dilger, Ric Wheeler

On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
> In practice, there is a small number of programs that are both the
> common memory hogs and should be able to reduce their memory consumption
> by 10% or 20% without big problems when requested (e.g. Java VMs,
> Firefox and databases come to mind).

I agree, it's only a few processes where this makes sense.  But for
those that do, it would be useful if they could register with the
kernel when they would like to be notified (just before the system
starts ejecting cached data, just before swapping, etc.) and at what
frequency.  And presumably, if the kernel notices that a process is
responding to such requests with memory actually getting released back
to the system, that process could get "rewarded" by having the OOM
killer less likely to target that particular thread.

AIX basically did this with SIGDANGER (the signal is ignored by
default), except there wasn't the ability for the process to tell the
kernel at what level of memory pressure it should start getting
notified, and there was no way for the kernel to tell how bad the
memory pressure actually was.  On the other hand, it was a relatively
simple design.

In practice very few processes would indeed pay attention to
SIGDANGER, so I think you're quite right there.

> And from a performance point of view letting applications voluntarily 
> free some memory is better even than starting to swap.

Absolutely.

						- Ted


* Re: [RFC] Parallelize IO for e2fsck
  2008-01-24 23:40                 ` Theodore Tso
@ 2008-01-25  0:25                   ` Zan Lynx
  2008-01-25 11:09                     ` Andreas Dilger
  2008-01-26 12:32                   ` KOSAKI Motohiro
  1 sibling, 1 reply; 13+ messages in thread
From: Zan Lynx @ 2008-01-25  0:25 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Adrian Bunk, Bodo Eggert, Alan Cox, Andreas Dilger,
	Valdis.Kletnieks, David Chinner, Valerie Henson, linux-fsdevel,
	linux-ext4, linux-kernel, Andreas Dilger, Ric Wheeler

On Thu, 2008-01-24 at 18:40 -0500, Theodore Tso wrote:
> On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
> > In practice, there is a small number of programs that are both the
> > common memory hogs and should be able to reduce their memory consumption
> > by 10% or 20% without big problems when requested (e.g. Java VMs,
> > Firefox and databases come to mind).
> 
> I agree, it's only a few processes where this makes sense.  But for
> those that do, it would be useful if they could register with the
> kernel when they would like to be notified (just before the system
> starts ejecting cached data, just before swapping, etc.) and at what
> frequency.  And presumably, if the kernel notices that a process is
> responding to such requests with memory actually getting released back
> to the system, that process could get "rewarded" by having the OOM
> killer less likely to target that particular thread.

Have y'all been following the /dev/mem_notify patches?
http://article.gmane.org/gmane.linux.kernel/628653

-- 
Zan Lynx <zlynx@acm.org>


* Re: [RFC] Parallelize IO for e2fsck
  2008-01-25  0:25                   ` Zan Lynx
@ 2008-01-25 11:09                     ` Andreas Dilger
  2008-01-26  0:55                       ` Zan Lynx
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Dilger @ 2008-01-25 11:09 UTC (permalink / raw)
  To: Zan Lynx
  Cc: Theodore Tso, Adrian Bunk, Bodo Eggert, Alan Cox,
	Valdis.Kletnieks, David Chinner, Valerie Henson, linux-fsdevel,
	linux-ext4, linux-kernel, Ric Wheeler

On Jan 24, 2008  17:25 -0700, Zan Lynx wrote:
> Have y'all been following the /dev/mem_notify patches?
> http://article.gmane.org/gmane.linux.kernel/628653

Having the notification be via poll() is a very restrictive processing
model.  Having the notification be via a signal means that any kind of
process (and not just those that are event loop driven) can register
a callback at some arbitrary point in the code and be notified.  I
don't object to the poll() interface, but it would be good to have a
signal mechanism also.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
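
For comparison, a minimal sketch of the poll()-based model discussed
above: an event-loop driven program waits on the notification descriptor
like on any other file descriptor. The function name
wait_for_notification() is hypothetical, and the sketch only assumes that
the device becomes readable when memory is low; the exact semantics of
/dev/mem_notify depend on the patch set and are not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until `notify_fd` becomes readable.
 * Returns 1 on notification, 0 on timeout, -1 on error.
 */
int wait_for_notification(int notify_fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(notify_fd >= 0);
    pfd.fd = notify_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupted us; this restarts the full timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;                        /* 0 = timeout, -1 = error */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A program without an event loop would have to dedicate a thread to this
call or check it periodically with a zero timeout, which is the
restriction being pointed out.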



* Re: [RFC] Parallelize IO for e2fsck
  2008-01-25 11:09                     ` Andreas Dilger
@ 2008-01-26  0:55                       ` Zan Lynx
  2008-01-26 11:56                         ` KOSAKI Motohiro
  0 siblings, 1 reply; 13+ messages in thread
From: Zan Lynx @ 2008-01-26  0:55 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Theodore Tso, Adrian Bunk, Bodo Eggert, Alan Cox,
	Valdis.Kletnieks, David Chinner, Valerie Henson, linux-fsdevel,
	linux-ext4, linux-kernel, Ric Wheeler

On Fri, 2008-01-25 at 04:09 -0700, Andreas Dilger wrote:
> On Jan 24, 2008  17:25 -0700, Zan Lynx wrote:
> > Have y'all been following the /dev/mem_notify patches?
> > http://article.gmane.org/gmane.linux.kernel/628653
> 
> Having the notification be via poll() is a very restrictive processing
> model.  Having the notification be via a signal means that any kind of
> process (and not just those that are event loop driven) can register
> a callback at some arbitrary point in the code and be notified.  I
> don't object to the poll() interface, but it would be good to have a
> signal mechanism also.

The commentary on the mem_notify threads claimed that the signal is
easily provided by setting up the file handle for SIGIO.

Yeah.  Here it is...copied from email written by KOSAKI Motohiro:

implement FASYNC capability to /dev/mem_notify.

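<usage example>
        /* needs <fcntl.h> and <unistd.h>; FASYNC is the BSD-compatible name for O_ASYNC */
        int fd = open("/dev/mem_notify", O_RDONLY);

        fcntl(fd, F_SETOWN, getpid());     /* send the signal to this process */

        int flags = fcntl(fd, F_GETFL);
        fcntl(fd, F_SETFL, flags|FASYNC);  /* when low memory, receive SIGIO */
</usage example>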
-- 
Zan Lynx <zlynx@acm.org>


* Re: [RFC] Parallelize IO for e2fsck
  2008-01-26  0:55                       ` Zan Lynx
@ 2008-01-26 11:56                         ` KOSAKI Motohiro
  0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2008-01-26 11:56 UTC (permalink / raw)
  To: Zan Lynx, Andreas Dilger
  Cc: Theodore Tso, Adrian Bunk, Bodo Eggert, Alan Cox,
	Valdis.Kletnieks, David Chinner, Valerie Henson, linux-fsdevel,
	linux-ext4, linux-kernel, Ric Wheeler

> The commentary on the mem_notify threads claimed that the signal is
> easily provided by setting up the file handle for SIGIO.

BTW:
Of course, you can receive any signal instead of SIGIO by using fcntl(F_SETSIG)  :-)
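
A minimal sketch of the F_SETSIG variant, assuming Linux and glibc
(F_SETSIG is a Linux extension and needs _GNU_SOURCE). The function name
arm_custom_signal() is hypothetical. With F_SETSIG and an SA_SIGINFO
handler, the handler also learns which descriptor raised the signal via
si_fd, so one handler can serve several descriptors.

```c
#define _GNU_SOURCE          /* F_SETSIG is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor reported by the most recent notification, or -1. */
static volatile sig_atomic_t notified_fd = -1;

static void on_notify(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    notified_fd = info->si_fd;   /* filled in because F_SETSIG was used */
}

/* Deliver `signo` instead of SIGIO when `fd` has an event. */
int arm_custom_signal(int fd, int signo)
{
    struct sigaction sa;
    int flags;

    assert(fd >= 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_notify;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    if (sigaction(signo, &sa, NULL) < 0)
        return -1;

    if (fcntl(fd, F_SETOWN, getpid()) < 0)
        return -1;
    if (fcntl(fd, F_SETSIG, signo) < 0)      /* choose the signal */
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) < 0)
        return -1;
    return 0;
}
```

A real-time signal such as SIGRTMIN + 1 is a common choice here because
real-time signals are queued rather than merged.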


* Re: [RFC] Parallelize IO for e2fsck
  2008-01-24 23:40                 ` Theodore Tso
  2008-01-25  0:25                   ` Zan Lynx
@ 2008-01-26 12:32                   ` KOSAKI Motohiro
  2008-01-26 13:55                     ` Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck) Al Boldi
  1 sibling, 1 reply; 13+ messages in thread
From: KOSAKI Motohiro @ 2008-01-26 12:32 UTC (permalink / raw)
  To: Theodore Tso, Adrian Bunk, Bodo Eggert, Alan Cox, Andreas Dilger,
	Valdis.Kletnieks, David Chinner, Valerie Henson, linux-fsdevel,
	linux-ext4, linux-kernel, Andreas Dilger, Ric Wheeler

> > And from a performance point of view letting applications voluntarily
> > free some memory is better even than starting to swap.
>
> Absolutely.

the mem_notify patch can realize "just before starting swapping" notification :)

To be honest, I don't know the fs guys' requirements.
If a feature the fs guys need is missing, I will implement it with pleasure
if you tell me about it.


* Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)
  2008-01-26 12:32                   ` KOSAKI Motohiro
@ 2008-01-26 13:55                     ` Al Boldi
  2008-01-26 16:01                       ` KOSAKI Motohiro
  2008-01-28 23:28                       ` Jon Masters
  0 siblings, 2 replies; 13+ messages in thread
From: Al Boldi @ 2008-01-26 13:55 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-fsdevel, linux-kernel

KOSAKI Motohiro wrote:
> > > And from a performance point of view letting applications voluntarily
> > > free some memory is better even than starting to swap.
> >
> > Absolutely.
>
> the mem_notify patch can realize "just before starting swapping"
> notification :)
>
> To be honest, I don't know the fs guys' requirements.
> If a feature the fs guys need is missing, I will implement it with pleasure
> if you tell me about it.

These notifications are really useful, but it may be much wiser to pipe them 
thru some kevent-notification sub-system, instead of introducing kernel 
notifier-chain end-points left, right, and center.


Thanks!

--
Al



* Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)
  2008-01-26 13:55                     ` Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck) Al Boldi
@ 2008-01-26 16:01                       ` KOSAKI Motohiro
  2008-01-28 23:28                       ` Jon Masters
  1 sibling, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2008-01-26 16:01 UTC (permalink / raw)
  To: Al Boldi; +Cc: linux-fsdevel, linux-kernel

Hi Al

> > the mem_notify patch can realize "just before starting swapping"
> > notification :)
> >
> > To be honest, I don't know the fs guys' requirements.
> > If a feature the fs guys need is missing, I will implement it with pleasure
> > if you tell me about it.
>
> These notifications are really useful, but it may be much wiser to pipe them
> thru some kevent-notification sub-system, instead of introducing kernel
> notifier-chain end-points left, right, and center.

Aaahh
I understand how you feel,
but the current design was decided through discussion among many people.

If anybody needs kevent notification, I will add it to the current
implementation rather than replace it.

thanks.


* Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)
  2008-01-26 13:55                     ` Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck) Al Boldi
  2008-01-26 16:01                       ` KOSAKI Motohiro
@ 2008-01-28 23:28                       ` Jon Masters
  2008-02-03 13:38                         ` KOSAKI Motohiro
  1 sibling, 1 reply; 13+ messages in thread
From: Jon Masters @ 2008-01-28 23:28 UTC (permalink / raw)
  To: linux-kernel

On Sat, 2008-01-26 at 16:55 +0300, Al Boldi wrote:
> KOSAKI Motohiro wrote:
> > > > And from a performance point of view letting applications voluntarily
> > > > free some memory is better even than starting to swap.
> > >
> > > Absolutely.
> >
> > the mem_notify patch can realize "just before starting swapping"
> > notification :)

I looked at this a year or two back, then ran out of time. But the thing
I wanted to do was have libc's memory allocation routines extended to
handle these through reservations - the kernel should send a userspace
notification and then there should be some kind of concept of returning
memory that's been used for "opportunistic" userspace caching, e.g. in
firefox to cache the last 10 web pages. Let us know how you get on :)

Jon.





* Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)
  2008-01-28 23:28                       ` Jon Masters
@ 2008-02-03 13:38                         ` KOSAKI Motohiro
  0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2008-02-03 13:38 UTC (permalink / raw)
  To: Jon Masters; +Cc: Al Boldi, linux-fsdevel, linux-kernel, kosaki.motohiro

Hi Jon

> I looked at this a year or two back, then ran out of time. But the thing
> I wanted to do was have libc's memory allocation routines extended to
> handle these through reservations - the kernel should send a userspace
> notification and then there should be some kind of concept of returning
> memory that's been used for "opportunistic" userspace caching, e.g. in
> firefox to cache the last 10 web pages. Let us know how you get on :)

Sorry for the late response.
(I didn't notice your mail ;-)

You are right...
Stupid userspace caching is a very important problem.

But I think this is not a libc problem.
glibc malloc hardly caches any memory.
(By default it caches only 128K.)

But some applications use a large amount of memory for overly
opportunistic caching.
I understand that we need to promote mem_notify to application developers
after it is merged into mainline.

I have no idea how to solve this easily.
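
A minimal sketch of what an application could do when notified, assuming
glibc: free its own opportunistic cache, then call malloc_trim() so that
free heap pages actually go back to the kernel. The "128K" above
presumably refers to glibc's default trim threshold (M_TRIM_THRESHOLD).
The function name drop_cache_and_trim() and the array-of-buffers cache
are hypothetical.

```c
#include <assert.h>
#include <malloc.h>    /* glibc-specific: malloc_trim() */
#include <stddef.h>
#include <stdlib.h>

/*
 * Free every buffer of a hypothetical application cache, then ask glibc
 * to return unused heap pages to the kernel. Returns malloc_trim()'s
 * result: 1 if memory was released to the system, 0 if not.
 */
int drop_cache_and_trim(void **bufs, size_t n)
{
    size_t i;

    assert(bufs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        free(bufs[i]);
        bufs[i] = NULL;      /* guard against accidental reuse */
    }
    return malloc_trim(0);   /* 0 = keep no extra padding at the heap top */
}
```

This only helps if the application decides which cached data to give up;
the allocator cannot know which live allocations are merely a cache.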


Thread overview: 13+ messages
     [not found] <9Mo9w-7Ws-25@gated-at.bofh.it>
     [not found] ` <9Mo9w-7Ws-23@gated-at.bofh.it>
     [not found]   ` <9OdWm-7uN-25@gated-at.bofh.it>
     [not found]     ` <9Oi9A-5EJ-3@gated-at.bofh.it>
     [not found]       ` <9OiMg-6IC-1@gated-at.bofh.it>
     [not found]         ` <9OlqL-2xG-3@gated-at.bofh.it>
     [not found]           ` <9Orda-3ub-45@gated-at.bofh.it>
2008-01-24 17:32             ` [RFC] Parallelize IO for e2fsck Bodo Eggert
2008-01-24 22:07               ` Andreas Dilger
2008-01-24 23:08               ` Adrian Bunk
2008-01-24 23:40                 ` Theodore Tso
2008-01-25  0:25                   ` Zan Lynx
2008-01-25 11:09                     ` Andreas Dilger
2008-01-26  0:55                       ` Zan Lynx
2008-01-26 11:56                         ` KOSAKI Motohiro
2008-01-26 12:32                   ` KOSAKI Motohiro
2008-01-26 13:55                     ` Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck) Al Boldi
2008-01-26 16:01                       ` KOSAKI Motohiro
2008-01-28 23:28                       ` Jon Masters
2008-02-03 13:38                         ` KOSAKI Motohiro
