* OK to set PF_MEMDIE on cleanup tasks?
@ 2003-10-14 15:17 Paul E. McKenney
2003-10-15 0:12 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Paul E. McKenney @ 2003-10-14 15:17 UTC (permalink / raw)
To: linux-kernel
Hello!

We have tasks that actively return memory to the system, which we
would like to exempt from the OOM killer, as killing such tasks under
low-memory conditions would indeed be counterproductive. It looks like
the "official" way to do this is to catch/ignore signal 15, which results
in PF_MEMDIE being set (in the 2.6 kernel), thus preventing the OOM killer
from killing the task again. I don't see where PF_MEMDIE is cleared,
though there are a number of subtle ways one might do this that I would
have missed.

So... Is it considered legit to simply set PF_MEMDIE when creating
the cleanup task? Or is there some reason that one should deal with
signal 15?

All enlightenment much appreciated!

						Thanx, Paul
* Re: OK to set PF_MEMDIE on cleanup tasks?
2003-10-14 15:17 OK to set PF_MEMDIE on cleanup tasks? Paul E. McKenney
@ 2003-10-15 0:12 ` Andrew Morton
2003-10-15 1:34 ` Gerrit Huizenga
2003-10-16 17:10 ` Paul E. McKenney
0 siblings, 2 replies; 4+ messages in thread
From: Andrew Morton @ 2003-10-15 0:12 UTC (permalink / raw)
To: paulmck; +Cc: linux-kernel
"Paul E. McKenney" <paulmck@us.ibm.com> wrote:
>
> Hello!
>
> We have tasks that actively return memory to the system, which we
> would like to exempt from the OOM killer, as killing such tasks under
> low-memory conditions would indeed be counterproductive. It looks like
> the "official" way to do this is to catch/ignore signal 15, which results
> in PF_MEMDIE being set (in the 2.6 kernel), thus preventing the OOM killer
> from killing the task again. I don't see where PF_MEMDIE is cleared,
> though there are a number of subtle ways one might do this that I would
> have missed.
The PF_MEMDIE flag is there so the oom killer doesn't just sit there
hitting the same task over and over again.

We leave PF_MEMDIE set because we expect the task to exit, or to not want
any more oomkiller attention.

The SIGTERM behaviour is there because the CAP_SYS_RAWIO process may need
to release critical resources.

So as long as your process has CAP_SYS_RAWIO, everything happens to work as
you want it. I don't think it was really designed that way though.
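
[Editorial sketch, not part of the original message.] The behaviour described above can be modelled in a few lines of userspace C. This is only a toy model under stated assumptions: `struct model_task`, `model_select_victim()`, `model_oom_kill()`, the `badness` score, and the flag values are all hypothetical stand-ins, not the kernel's actual code. It shows the three points made above: an OOM-killed task is marked PF_MEMDIE, a marked task is not picked again, and a task holding CAP_SYS_RAWIO gets SIGTERM rather than SIGKILL.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits: the names mirror the kernel's, the values are assumptions. */
#define PF_MEMALLOC 0x0800u
#define PF_MEMDIE   0x1000u

/* Hypothetical stand-in for the few task fields this model needs. */
struct model_task {
    unsigned int flags;
    bool has_cap_sys_rawio;  /* stands in for the CAP_SYS_RAWIO check */
    int badness;             /* hypothetical score: higher is a better victim */
    int last_signal;         /* signal the model "sent", 0 if none */
};

/* Pick the highest-scoring task that has not already been OOM-killed. */
struct model_task *model_select_victim(struct model_task *tasks, size_t n)
{
    struct model_task *chosen = NULL;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].flags & PF_MEMDIE)
            continue;  /* already hit once: leave it alone */
        if (chosen == NULL || tasks[i].badness > chosen->badness)
            chosen = &tasks[i];
    }
    return chosen;
}

/* Mark the victim and choose the signal as described in the message above. */
void model_oom_kill(struct model_task *t)
{
    t->flags |= PF_MEMALLOC | PF_MEMDIE;
    /* A CAP_SYS_RAWIO task may need to release critical resources first. */
    t->last_signal = t->has_cap_sys_rawio ? SIGTERM : SIGKILL;
}
```

In this model a task that has CAP_SYS_RAWIO and ignores SIGTERM survives the first OOM kill and is skipped on every later pass, which is the side effect the original question relies on.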
> So... Is it considered legit to simply set PF_MEMDIE when creating
> the cleanup task? Or is there some reason that one should deal with
> signal 15?
Well it's all very unconventional. Catching SIGTERM seems like a suitable
way to do what you want to do.

Possibly your special process should also run as PF_MEMALLOC. I've seen
that done before, with success. There is no existing API with which this
can be set.
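
[Editorial sketch, not part of the original message.] To illustrate why running as PF_MEMALLOC helps a task whose job is to free memory, here is a deliberately simplified userspace model. It assumes that an ordinary allocation must leave a per-zone reserve untouched while a PF_MEMALLOC (or PF_MEMDIE) task may dip into it. `struct model_zone`, `model_alloc_page()`, and the flag values are hypothetical, and the real page allocator is considerably more involved.

```c
#include <stdbool.h>

/* Illustrative flag bits: the names mirror the kernel's, the values are assumptions. */
#define PF_MEMALLOC 0x0800u
#define PF_MEMDIE   0x1000u

/* Hypothetical stand-in for one memory zone. */
struct model_zone {
    long free_pages;
    long pages_min;  /* reserve that ordinary allocations must leave intact */
};

/* Try to take one page; returns true on success. */
bool model_alloc_page(struct model_zone *z, unsigned int task_flags)
{
    /* Tasks marked PF_MEMALLOC or PF_MEMDIE may ignore the reserve. */
    bool may_dip = (task_flags & (PF_MEMALLOC | PF_MEMDIE)) != 0;
    long floor = may_dip ? 0 : z->pages_min;

    if (z->free_pages <= floor)
        return false;
    z->free_pages--;
    return true;
}
```

The reserve is finite, so in this model a PF_MEMALLOC task that allocates without freeing more than it takes simply exhausts it.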
* Re: OK to set PF_MEMDIE on cleanup tasks?
2003-10-15 0:12 ` Andrew Morton
@ 2003-10-15 1:34 ` Gerrit Huizenga
2003-10-16 17:10 ` Paul E. McKenney
1 sibling, 0 replies; 4+ messages in thread
From: Gerrit Huizenga @ 2003-10-15 1:34 UTC (permalink / raw)
To: Andrew Morton; +Cc: paulmck, linux-kernel
* Re: OK to set PF_MEMDIE on cleanup tasks?
2003-10-15 0:12 ` Andrew Morton
2003-10-15 1:34 ` Gerrit Huizenga
@ 2003-10-16 17:10 ` Paul E. McKenney
1 sibling, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2003-10-16 17:10 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Hello, Andrew,
On Tue, Oct 14, 2003 at 05:12:27PM -0700, Andrew Morton wrote:
> > So... Is it considered legit to simply set PF_MEMDIE when creating
> > the cleanup task? Or is there some reason that one should deal with
> > signal 15?
>
> Well it's all very unconventional. Catching SIGTERM seems like a suitable
> way to do what you want to do.
OK, since this particular case is a strictly in-kernel task, SIGTERM
should be a no-op anyway. Unless I am missing something in the signal
delivery code, which is quite probable. ;-)
So the magic code would then be:

	cap_raise(current->cap_effective, CAP_SYS_RAWIO);

perhaps with:

	cap_raise(current->cap_effective, CAP_SYS_ADMIN);

thrown in for good measure.
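
[Editorial sketch, not part of the original message.] The effect of the two cap_raise() calls can be pictured as setting bits in the task's effective capability mask. The userspace model below assumes a single 32-bit mask. `model_cap_raise()`, `model_cap_raised()`, and `MODEL_CAP_TO_MASK()` are hypothetical names; only the capability numbers are taken from <linux/capability.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h>. */
#define CAP_SYS_RAWIO 17
#define CAP_SYS_ADMIN 21

/* Hypothetical helper: one bit per capability in a 32-bit mask. */
#define MODEL_CAP_TO_MASK(cap) ((uint32_t)1 << (cap))

/* Set one capability bit in an effective-capability mask. */
void model_cap_raise(uint32_t *effective, int cap)
{
    *effective |= MODEL_CAP_TO_MASK(cap);
}

/* Test whether one capability bit is set. */
bool model_cap_raised(uint32_t effective, int cap)
{
    return (effective & MODEL_CAP_TO_MASK(cap)) != 0;
}
```

As far as this thread describes it, the SIGTERM-versus-SIGKILL decision looks only at CAP_SYS_RAWIO, so in this model raising CAP_SYS_ADMIN as well does not change that decision.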
> Possibly your special process should also run as PF_MEMALLOC. I've seen
> that done before, with success. There is no existing API with which this
> can be set.
This would certainly head off at least some OOM deadlock situations.
On the (perhaps unlikely) chance that this was an invitation, here
is a patch to create an API.
Thanx, Paul
diff -urN -X /home/linux/2.5/dontdiff linux-2.6.0-test7-mm1/include/linux/sched.h linux-2.6.0-test7-mm1-PF_MEMALLOC/include/linux/sched.h
--- linux-2.6.0-test7-mm1/include/linux/sched.h 2003-10-16 07:16:05.000000000 -0700
+++ linux-2.6.0-test7-mm1-PF_MEMALLOC/include/linux/sched.h 2003-10-16 09:20:34.000000000 -0700
@@ -508,6 +508,27 @@
 #define PF_LESS_THROTTLE 0x00100000	/* Throttle me less: I clean memory */
 #define PF_SYNCWRITE	0x00200000	/* I am doing a sync write */
 
+/**
+ * mark_task_memalloc - mark the specified task as deserving of preferential
+ * access to free memory. Note that with great power comes great
+ * responsibility.
+ * @p: the task structure to be granted preferential access.
+ */
+static inline void mark_task_memalloc(task_t *p)
+{
+	p->flags |= PF_MEMALLOC;
+}
+
+/**
+ * unmark_task_memalloc - mark the specified task as no longer deserving
+ * of preferential access to free memory.
+ * @p: the task structure to have its preferential access revoked.
+ */
+static inline void unmark_task_memalloc(task_t *p)
+{
+	p->flags &= ~PF_MEMALLOC;
+}
+
 #ifdef CONFIG_SMP
 extern int set_cpus_allowed(task_t *p, cpumask_t new_mask);
 #else
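
[Editorial sketch, not part of the original message.] One property of the proposed pair is worth seeing in isolation: unmark_task_memalloc() clears the flag unconditionally, so a caller that marks a task which already had PF_MEMALLOC set will strip it on the way out. The userspace model below reproduces the two helpers from the patch against a stand-in `struct model_task` and adds a save/restore variant for comparison. `model_memalloc_save()` and `model_memalloc_restore()` are hypothetical and not part of the patch, and the flag value is an assumption.

```c
/* Illustrative flag bit: the name mirrors the kernel's, the value is an assumption. */
#define PF_MEMALLOC 0x0800u

/* Hypothetical stand-in for the kernel's task structure. */
struct model_task {
    unsigned int flags;
};

/* Same logic as the two helpers in the patch above. */
void mark_task_memalloc(struct model_task *p)
{
    p->flags |= PF_MEMALLOC;
}

void unmark_task_memalloc(struct model_task *p)
{
    p->flags &= ~PF_MEMALLOC;
}

/* Hypothetical variant: set the flag and return its previous state. */
unsigned int model_memalloc_save(struct model_task *p)
{
    unsigned int old = p->flags & PF_MEMALLOC;

    p->flags |= PF_MEMALLOC;
    return old;
}

/* Hypothetical variant: put the flag back the way the caller found it. */
void model_memalloc_restore(struct model_task *p, unsigned int old)
{
    p->flags = (p->flags & ~PF_MEMALLOC) | old;
}
```

For a dedicated cleanup task that is marked once at creation and never unmarked, the simple pair is sufficient; the save/restore form only matters when marked sections can nest.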