linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] [request for inclusion] Realtime LSM
@ 2004-12-30  2:43 Lee Revell
  2005-01-03 14:03 ` Christoph Hellwig
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2004-12-30  2:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Ingo Molnar, Jack O'Quin

The realtime LSM has been previously explained on this list.  Its
function is to allow selected nonroot users to run RT tasks.  The most
common application is low latency audio with JACK, http://jackit.sf.net.

Several people have reported that 2.6.10 is the best kernel yet for
audio latency, see
http://ccrma-mail.stanford.edu/pipermail/planetccrma/2004-December/007341.html.    If the realtime LSM were merged, then this would be the last step to making low latency audio work well with the stock kernel.

We (the authors and the Linux audio community) would like to request its
inclusion in the next -mm release, with the eventual goal of having it
in mainline.

This is identical to the last version Jack O'Quin posted (but didn't cc:
Andrew, or make clear that we would like this added to -mm), so I
preserved his Signed-Off-By.

http://lkml.org/lkml/2004/11/24/242

Signed-Off-By: Jack O'Quin <joq@joq.us>

diff -ruN -X /home/joq/bin/kdiff.exclude linux-2.6.10-rc2-mm3/Documentation/realtime-lsm.txt linux-2.6.10-rc2-mm3-rt2/Documentation/realtime-lsm.txt
--- linux-2.6.10-rc2-mm3/Documentation/realtime-lsm.txt	Wed Dec 31 18:00:00 1969
+++ linux-2.6.10-rc2-mm3-rt2/Documentation/realtime-lsm.txt	Wed Nov 24 09:58:29 2004
@@ -0,0 +1,39 @@
+
+		    Realtime Linux Security Module
+
+
+This Linux Security Module (LSM) enables realtime capabilities.  It
+was written by Torben Hohn and Jack O'Quin, under the provisions of
+the GPL (see the COPYING file).  We make no warranty concerning the
+safety, security or even stability of your system when using it.  But,
+we will fix problems if you report them.
+
+Once the LSM has been installed and the kernel for which it was built
+is running, the root user can load it and pass parameters as follows:
+
+  # modprobe realtime any=1
+
+  Any program can request realtime privileges.  This allows any local
+  user to crash the system by hogging the CPU in a tight loop or
+  locking down too much memory.  But, it is simple to administer.  :-)
+
+  # modprobe realtime gid=29
+
+  All users belonging to group 29 and programs that are setgid to that
+  group have realtime privileges.  Use any group number you like.  A
+  `gid' of -1 disables group access.
+
+  # modprobe realtime mlock=0
+
+  Grants realtime scheduling privileges without the ability to lock
+  memory using mlock() or mlockall() system calls.  This option can be
+  used in conjunction with any of the other options.
+
+After the module is loaded, its parameters can be changed dynamically
+via sysfs.
+
+  # echo 1  > /sys/module/realtime/parameters/any
+  # echo 29 > /sys/module/realtime/parameters/gid
+  # echo 1  > /sys/module/realtime/parameters/mlock
+
+Jack O'Quin, joq@joq.us
diff -ruN -X /home/joq/bin/kdiff.exclude linux-2.6.10-rc2-mm3/security/Kconfig linux-2.6.10-rc2-mm3-rt2/security/Kconfig
--- linux-2.6.10-rc2-mm3/security/Kconfig	Wed Nov 24 09:35:44 2004
+++ linux-2.6.10-rc2-mm3-rt2/security/Kconfig	Wed Nov 24 09:58:29 2004
@@ -84,6 +84,17 @@
 
 	  If you are unsure how to answer this question, answer N.
 
+config SECURITY_REALTIME
+	tristate "Realtime Capabilities"
+	depends on SECURITY && SECURITY_CAPABILITIES!=y
+	default n
+	help
+	  This module selectively grants realtime privileges
+	  controlled by parameters set at load time or via files in
+	  /sys/module/realtime/parameters.
+
+	  If you are unsure how to answer this question, answer N.
+
 source security/selinux/Kconfig
 
 endmenu
diff -ruN -X /home/joq/bin/kdiff.exclude linux-2.6.10-rc2-mm3/security/Makefile linux-2.6.10-rc2-mm3-rt2/security/Makefile
--- linux-2.6.10-rc2-mm3/security/Makefile	Wed Nov 24 09:35:44 2004
+++ linux-2.6.10-rc2-mm3-rt2/security/Makefile	Wed Nov 24 09:58:29 2004
@@ -17,3 +17,4 @@
 obj-$(CONFIG_SECURITY_CAPABILITIES)	+= commoncap.o capability.o
 obj-$(CONFIG_SECURITY_ROOTPLUG)		+= commoncap.o root_plug.o
 obj-$(CONFIG_SECURITY_SECLVL)		+= seclvl.o
+obj-$(CONFIG_SECURITY_REALTIME)		+= commoncap.o realtime.o
diff -ruN -X /home/joq/bin/kdiff.exclude linux-2.6.10-rc2-mm3/security/realtime.c linux-2.6.10-rc2-mm3-rt2/security/realtime.c
--- linux-2.6.10-rc2-mm3/security/realtime.c	Wed Dec 31 18:00:00 1969
+++ linux-2.6.10-rc2-mm3-rt2/security/realtime.c	Wed Nov 24 09:59:01 2004
@@ -0,0 +1,147 @@
+/*
+ * Realtime Capabilities Linux Security Module
+ *
+ *  Copyright (C) 2003 Torben Hohn
+ *  Copyright (C) 2003, 2004 Jack O'Quin
+ *
+ *	This program is free software; you can redistribute it and/or modify
+ *	it under the terms of the GNU General Public License as published by
+ *	the Free Software Foundation; either version 2 of the License, or
+ *	(at your option) any later version.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/security.h>
+
+#define RT_LSM "Realtime LSM "		/* syslog module name prefix */
+#define RT_ERR "Realtime: "		/* syslog error message prefix */
+
+#include <linux/vermagic.h>
+MODULE_INFO(vermagic,VERMAGIC_STRING);
+
+/* module parameters
+ *
+ *  These values could change at any time due to some process writing
+ *  a new value in /sys/module/realtime/parameters.  This is OK,
+ *  because each is referenced only once in each function call.
+ *  Nothing depends on parameters having the same value every time.
+ */
+
+/* if TRUE, any process is realtime */
+static int rt_any;
+module_param_named(any, rt_any, int, 0644);
+MODULE_PARM_DESC(any, " grant realtime privileges to any process.");
+
+/* realtime group id, or NO_GROUP */
+static int rt_gid = -1;
+module_param_named(gid, rt_gid, int, 0644);
+MODULE_PARM_DESC(gid, " the group ID with access to realtime privileges.");
+
+/* enable mlock() privileges */
+static int rt_mlock = 1;
+module_param_named(mlock, rt_mlock, int, 0644);
+MODULE_PARM_DESC(mlock, " enable memory locking privileges.");
+
+/* helper function for testing group membership */
+static inline int gid_ok(int gid)
+{
+	if (gid == -1)
+		return 0;
+
+	if (gid == current->gid)
+		return 1;
+
+	return in_egroup_p(gid);
+}
+
+static void realtime_bprm_apply_creds(struct linux_binprm *bprm, int unsafe)
+{
+	cap_bprm_apply_creds(bprm, unsafe);
+
+	/*  If a non-zero `any' parameter was specified, we grant
+	 *  realtime privileges to every process.  If the `gid'
+	 *  parameter was specified and it matches the group id of the
+	 *  executable, of the current process or any supplementary
+	 *  groups, we grant realtime capabilites.
+	 */
+
+	if (rt_any || gid_ok(rt_gid)) {
+		cap_raise(current->cap_effective, CAP_SYS_NICE);
+		if (rt_mlock) {
+			cap_raise(current->cap_effective, CAP_IPC_LOCK);
+			cap_raise(current->cap_effective, CAP_SYS_RESOURCE);
+		}
+	}
+}
+
+static struct security_operations capability_ops = {
+	.ptrace =			cap_ptrace,
+	.capget =			cap_capget,
+	.capset_check =			cap_capset_check,
+	.capset_set =			cap_capset_set,
+	.capable =			cap_capable,
+	.netlink_send =			cap_netlink_send,
+	.netlink_recv =			cap_netlink_recv,
+	.bprm_apply_creds =		realtime_bprm_apply_creds,
+	.bprm_set_security =		cap_bprm_set_security,
+	.bprm_secureexec =		cap_bprm_secureexec,
+	.task_post_setuid =		cap_task_post_setuid,
+	.task_reparent_to_init =	cap_task_reparent_to_init,
+	.syslog =                       cap_syslog,
+	.vm_enough_memory =             cap_vm_enough_memory,
+};
+
+#define MY_NAME __stringify(KBUILD_MODNAME)
+
+static int secondary;	/* flag to keep track of how we were registered */
+
+static int __init realtime_init(void)
+{
+	/* register ourselves with the security framework */
+	if (register_security(&capability_ops)) {
+
+		/* try registering with primary module */
+		if (mod_reg_security(MY_NAME, &capability_ops)) {
+			printk(KERN_INFO RT_ERR "Failure registering "
+			       "capabilities with primary security module.\n");
+			printk(KERN_INFO RT_ERR "Is kernel configured "
+			       "with CONFIG_SECURITY_CAPABILITIES=m?\n");
+			return -EINVAL;
+		}
+		secondary = 1;
+	}
+
+	if (rt_any)
+		printk(KERN_INFO RT_LSM
+		       "initialized (all groups, mlock=%d)\n", rt_mlock);
+	else if (rt_gid == -1)
+		printk(KERN_INFO RT_LSM
+		       "initialized (no groups, mlock=%d)\n", rt_mlock);
+	else
+		printk(KERN_INFO RT_LSM
+		       "initialized (group %d, mlock=%d)\n", rt_gid, rt_mlock);
+		
+	return 0;
+}
+
+static void __exit realtime_exit(void)
+{
+	/* remove ourselves from the security framework */
+	if (secondary) {
+		if (mod_unreg_security(MY_NAME, &capability_ops))
+			printk(KERN_INFO RT_ERR "Failure unregistering "
+				"capabilities with primary module.\n");
+
+	} else if (unregister_security(&capability_ops)) {
+		printk(KERN_INFO RT_ERR
+		       "Failure unregistering capabilities with the kernel\n");
+	}
+	printk(KERN_INFO "Realtime Capability LSM exiting\n");
+}
+
+late_initcall(realtime_init);
+module_exit(realtime_exit);
+
+MODULE_DESCRIPTION("Realtime Capabilities Security Module");
+MODULE_LICENSE("GPL");

-- 
  joq




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2004-12-30  2:43 [PATCH] [request for inclusion] Realtime LSM Lee Revell
@ 2005-01-03 14:03 ` Christoph Hellwig
  2005-01-03 14:15   ` Arjan van de Ven
  2005-01-04 18:16   ` Lee Revell
  0 siblings, 2 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-03 14:03 UTC (permalink / raw)
  To: Lee Revell; +Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Wed, Dec 29, 2004 at 09:43:22PM -0500, Lee Revell wrote:
> The realtime LSM has been previously explained on this list.  Its
> function is to allow selected nonroot users to run RT tasks.  The most
> common application is low latency audio with JACK, http://jackit.sf.net.
> 
> Several people have reported that 2.6.10 is the best kernel yet for
> audio latency, see
> http://ccrma-mail.stanford.edu/pipermail/planetccrma/2004-December/007341.html.    If the realtime LSM were merged, then this would be the last step to making low latency audio work well with the stock kernel.
> 
> We (the authors and the Linux audio community) would like to request its
> inclusion in the next -mm release, with the eventual goal of having it
> in mainline.
> 
> This is identical to the last version Jack O'Quin posted (but didn't cc:
> Andrew, or make clear that we would like this added to -mm), so I
> preserved his Signed-Off-By.

This is far too specialized.  And option to the capability LSM to grant 
capabilities to certain uids/gids sounds like the better choise - and
would also allow to get rid of the magic hugetlb uid horrors.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-03 14:03 ` Christoph Hellwig
@ 2005-01-03 14:15   ` Arjan van de Ven
  2005-01-07 16:40     ` Lee Revell
  2005-01-04 18:16   ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-03 14:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Lee Revell, linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Mon, 2005-01-03 at 14:03 +0000, Christoph Hellwig wrote:
> On Wed, Dec 29, 2004 at 09:43:22PM -0500, Lee Revell wrote:
> > The realtime LSM has been previously explained on this list.  Its
> > function is to allow selected nonroot users to run RT tasks.  The most
> > common application is low latency audio with JACK, http://jackit.sf.net.
> > 
> > Several people have reported that 2.6.10 is the best kernel yet for
> > audio latency, see
> > http://ccrma-mail.stanford.edu/pipermail/planetccrma/2004-December/007341.html.    If the realtime LSM were merged, then this would be the last step to making low latency audio work well with the stock kernel.
> > 
> > We (the authors and the Linux audio community) would like to request its
> > inclusion in the next -mm release, with the eventual goal of having it
> > in mainline.
> > 
> > This is identical to the last version Jack O'Quin posted (but didn't cc:
> > Andrew, or make clear that we would like this added to -mm), so I
> > preserved his Signed-Off-By.
> 
> This is far too specialized.  And option to the capability LSM to grant 
> capabilities to certain uids/gids sounds like the better choise - and
> would also allow to get rid of the magic hugetlb uid horrors.
those can go away anyway now that there is an rlimit to achieve the
exact same thing.....

I can see the point of making an rlimit like thing instead for both the
nice levels allowed and maybe the "can do rt" bit



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-03 14:03 ` Christoph Hellwig
  2005-01-03 14:15   ` Arjan van de Ven
@ 2005-01-04 18:16   ` Lee Revell
  2005-01-04 18:20     ` Christoph Hellwig
  1 sibling, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-04 18:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Mon, 2005-01-03 at 14:03 +0000, Christoph Hellwig wrote:
> On Wed, Dec 29, 2004 at 09:43:22PM -0500, Lee Revell wrote:
> > The realtime LSM has been previously explained on this list.  Its
> > function is to allow selected nonroot users to run RT tasks.  The most
> > common application is low latency audio with JACK, http://jackit.sf.net.
> > 
> > Several people have reported that 2.6.10 is the best kernel yet for
> > audio latency, see
> > http://ccrma-mail.stanford.edu/pipermail/planetccrma/2004-December/007341.html.    If the realtime LSM were merged, then this would be the last step to making low latency audio work well with the stock kernel.
> > 
> > We (the authors and the Linux audio community) would like to request its
> > inclusion in the next -mm release, with the eventual goal of having it
> > in mainline.
> > 
> > This is identical to the last version Jack O'Quin posted (but didn't cc:
> > Andrew, or make clear that we would like this added to -mm), so I
> > preserved his Signed-Off-By.
> 
> This is far too specialized.  And option to the capability LSM to grant 
> capabilities to certain uids/gids sounds like the better choise - and
> would also allow to get rid of the magic hugetlb uid horrors.
> 

Got a patch?  Code talks, BS walks.  This is working perfectly, right
now, and is being used by thousands of Linux ausio users.

Lee  


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:16   ` Lee Revell
@ 2005-01-04 18:20     ` Christoph Hellwig
  2005-01-04 18:55       ` Jack O'Quin
  2005-01-04 18:57       ` Lee Revell
  0 siblings, 2 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-04 18:20 UTC (permalink / raw)
  To: Lee Revell; +Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Tue, Jan 04, 2005 at 01:16:54PM -0500, Lee Revell wrote:
> Got a patch?  Code talks, BS walks.  This is working perfectly, right
> now, and is being used by thousands of Linux ausio users.

Which still doesn't mean it's the right design.  And no, I don't need the
feature so I won't write it.  If you want a certain feature it's up to
you to implement it in a way that's considered mergeable.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:20     ` Christoph Hellwig
@ 2005-01-04 18:55       ` Jack O'Quin
  2005-01-04 18:59         ` Lee Revell
  2005-01-05 11:20         ` Christoph Hellwig
  2005-01-04 18:57       ` Lee Revell
  1 sibling, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-04 18:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Lee Revell, linux-kernel, Andrew Morton, Ingo Molnar

Christoph Hellwig <hch@infradead.org> writes:

> Which still doesn't mean it's the right design.  And no, I don't
> need the feature so I won't write it.  If you want a certain feature
> it's up to you to implement it in a way that's considered mergeable.

Which is what I have done.  I worked on it because no "real" kernel
developer seemed willing to solve it.  Having worked on other kernels
in an "earlier lifetime", I have *no* desire to do that any more.  I
would much rather write audio software.

But, the lack of this feature has been a continual impediment for
years now.  It affects not just me, but most other serious Linux audio
developers and many of our users.  We need a simple way for users to
configure a Digital Audio Workstation without having to run large,
complex, insecure audio applications as `root'.  Our competition runs
on Windows and Mac systems where no such configuration is needed.

Statements of the form "had I cared enough to do something about this
problem, I would have implemented it differently" are not much help.
This patch is small and clean.  It meshes with existing kernel LSM
mechanisms.  It solves a real problem affecting many Linux desktop
users.

I respectfully request that it be accepted for inclusion in 2.6.11.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:20     ` Christoph Hellwig
  2005-01-04 18:55       ` Jack O'Quin
@ 2005-01-04 18:57       ` Lee Revell
  2005-01-05  1:35         ` Andreas Steinmetz
  2005-01-05 11:24         ` Christoph Hellwig
  1 sibling, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-04 18:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Tue, 2005-01-04 at 18:20 +0000, Christoph Hellwig wrote:
> On Tue, Jan 04, 2005 at 01:16:54PM -0500, Lee Revell wrote:
> > Got a patch?  Code talks, BS walks.  This is working perfectly, right
> > now, and is being used by thousands of Linux ausio users.
> 
> Which still doesn't mean it's the right design.  And no, I don't need the
> feature so I won't write it.  If you want a certain feature it's up to
> you to implement it in a way that's considered mergeable.
> 

Please specify what's wrong with it.  So far all your objection amounts
to is "I don't like it".

If you do have anything other that your opinion to back up your
assertion that it's a bad design, you should have raised it months ago
when this was first posted.  Now that we have it to a mergeable state
(as far as the people who worked on it are concerned), you want to pop
up and say "Nope, bad design"?

Sorry but last time I checked you were not the ultimate arbiter of good
design on LKML.  If you want to shitcan the _only known good, field
tested, working solution_ then you have to have overwhelming technical
arguments.  So far I've seen zero.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:55       ` Jack O'Quin
@ 2005-01-04 18:59         ` Lee Revell
  2005-01-05  0:01           ` Alan Cox
  2005-01-05 11:25           ` Christoph Hellwig
  2005-01-05 11:20         ` Christoph Hellwig
  1 sibling, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-04 18:59 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Christoph Hellwig, linux-kernel, Andrew Morton, Ingo Molnar

On Tue, 2005-01-04 at 12:55 -0600, Jack O'Quin wrote:
> But, the lack of this feature has been a continual impediment for
> years now.  It affects not just me, but most other serious Linux audio
> developers and many of our users.  We need a simple way for users to
> configure a Digital Audio Workstation without having to run large,
> complex, insecure audio applications as `root'.  Our competition runs
> on Windows and Mac systems where no such configuration is needed.

We could do it the was OSX (our real competition) does if that would
make people happy.  They just let any user run RT tasks.  Oh wait, but
that's a "broken design", everyone knows that OSX is a joke, no one
would use *that* OS to mix a CD or score a movie.  :-)

Lee

 


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:59         ` Lee Revell
@ 2005-01-05  0:01           ` Alan Cox
  2005-01-05  1:28             ` Lee Revell
                               ` (3 more replies)
  2005-01-05 11:25           ` Christoph Hellwig
  1 sibling, 4 replies; 266+ messages in thread
From: Alan Cox @ 2005-01-05  0:01 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jack O'Quin, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar

On Maw, 2005-01-04 at 18:59, Lee Revell wrote:
> We could do it the was OSX (our real competition) does if that would
> make people happy.  They just let any user run RT tasks.  Oh wait, but
> that's a "broken design", everyone knows that OSX is a joke, no one
> would use *that* OS to mix a CD or score a movie.  :-)

You can do that already, just make everyone root

The problem with uid/gid based hacks is that they get really ugly to
administer really fast. Especially once you have users who need realtime
and hugetlb, and users who need one only.

It would be far cleaner to split CAP_SYS_NICE capability down - which
should cover the real time OS functions nicely. Right now it gives a few
too many rights but that could be fixed easily.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  0:01           ` Alan Cox
@ 2005-01-05  1:28             ` Lee Revell
  2005-01-05  1:30             ` Lee Revell
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-05  1:28 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jack O'Quin, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar, Chris Wright

On Wed, 2005-01-05 at 00:01 +0000, Alan Cox wrote:
> The problem with uid/gid based hacks is that they get really ugly to
> administer really fast. Especially once you have users who need realtime
> and hugetlb, and users who need one only.
> 

Sorry, how does hugetlb relate to this?

> It would be far cleaner to split CAP_SYS_NICE capability down - which
> should cover the real time OS functions nicely. Right now it gives a few
> too many rights but that could be fixed easily.
> 

We need selected nonroot users to be able to run SCHED_FIFO tasks and
mlock().  It has to be easy to administer.  That's it.

As Jack mentioned, the developers of this patch are not kernel hackers
by trade, they wrote this to solve a real problem.  In other words, a
patch is worth a thousand words.

It seems distro vendors would be interested in solving this problem.
The linux audio market is smaller than the general desktop of course but
many of the users are professionals who would gladly pay for support.
Look how many people pay for OSX.  Wouldn't Red Hat and SuSE like some
of those customers?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  0:01           ` Alan Cox
  2005-01-05  1:28             ` Lee Revell
@ 2005-01-05  1:30             ` Lee Revell
  2005-01-05  1:50             ` Chris Wright
  2005-01-05  4:04             ` Jack O'Quin
  3 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-05  1:30 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jack O'Quin, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar

On Wed, 2005-01-05 at 00:01 +0000, Alan Cox wrote:
> The problem with uid/gid based hacks is that they get really ugly to
> administer really fast. Especially once you have users who need realtime
> and hugetlb, and users who need one only.

Why?  Just make a realtime group and a hugetlb group and add users to
one, the other, or both.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:57       ` Lee Revell
@ 2005-01-05  1:35         ` Andreas Steinmetz
  2005-01-05  4:18           ` Alan Cox
  2005-01-05 11:39           ` Christoph Hellwig
  2005-01-05 11:24         ` Christoph Hellwig
  1 sibling, 2 replies; 266+ messages in thread
From: Andreas Steinmetz @ 2005-01-05  1:35 UTC (permalink / raw)
  To: Lee Revell; +Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

Lee Revell wrote:
> On Tue, 2005-01-04 at 18:20 +0000, Christoph Hellwig wrote:
> 
>>On Tue, Jan 04, 2005 at 01:16:54PM -0500, Lee Revell wrote:
>>
>>>Got a patch?  Code talks, BS walks.  This is working perfectly, right
>>>now, and is being used by thousands of Linux ausio users.
>>
>>Which still doesn't mean it's the right design.  And no, I don't need the
>>feature so I won't write it.  If you want a certain feature it's up to
>>you to implement it in a way that's considered mergeable.
>>
> 
> 
> Please specify what's wrong with it.  So far all your objection amounts
> to is "I don't like it".
> 
> If you do have anything other that your opinion to back up your
> assertion that it's a bad design, you should have raised it months ago
> when this was first posted.  Now that we have it to a mergeable state
> (as far as the people who worked on it are concerned), you want to pop
> up and say "Nope, bad design"?

Let me remind you all that according to lkml history hch has always been 
biased and objecting to anything related to lsm. Nobody can take hch's 
opinion here as objective. I would even go so far that when things are 
related to lsm(s) he's just tro...
-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  0:01           ` Alan Cox
  2005-01-05  1:28             ` Lee Revell
  2005-01-05  1:30             ` Lee Revell
@ 2005-01-05  1:50             ` Chris Wright
  2005-01-05  1:55               ` Lee Revell
  2005-01-05  4:04             ` Jack O'Quin
  3 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-05  1:50 UTC (permalink / raw)
  To: Alan Cox
  Cc: Lee Revell, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> On Maw, 2005-01-04 at 18:59, Lee Revell wrote:
> > We could do it the was OSX (our real competition) does if that would
> > make people happy.  They just let any user run RT tasks.  Oh wait, but
> > that's a "broken design", everyone knows that OSX is a joke, no one
> > would use *that* OS to mix a CD or score a movie.  :-)
> 
> You can do that already, just make everyone root
> 
> The problem with uid/gid based hacks is that they get really ugly to
> administer really fast. Especially once you have users who need realtime
> and hugetlb, and users who need one only.

I don't believe the hugetlb gid stuff is useful anymore.  It should be
handled nicely via rlimits.

> It would be far cleaner to split CAP_SYS_NICE capability down - which
> should cover the real time OS functions nicely. Right now it gives a few
> too many rights but that could be fixed easily.

Hmm, how do we do this w/out breaking things?  Maybe I'm misunderstanding
your idea.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  1:50             ` Chris Wright
@ 2005-01-05  1:55               ` Lee Revell
  2005-01-05  2:05                 ` Chris Wright
  2005-01-05 11:52                 ` Ingo Molnar
  0 siblings, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-05  1:55 UTC (permalink / raw)
  To: Chris Wright
  Cc: Alan Cox, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar

On Tue, 2005-01-04 at 17:50 -0800, Chris Wright wrote:
> * Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> > 
> > The problem with uid/gid based hacks is that they get really ugly to
> > administer really fast. Especially once you have users who need realtime
> > and hugetlb, and users who need one only.
> 
> I don't believe the hugetlb gid stuff is useful anymore.  It should be
> handled nicely via rlimits.

The last time I checked users could belong to more than one group.  Am I
missing something?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  1:55               ` Lee Revell
@ 2005-01-05  2:05                 ` Chris Wright
  2005-01-05  2:58                   ` Kyle Moffett
  2005-01-05  4:06                   ` Jack O'Quin
  2005-01-05 11:52                 ` Ingo Molnar
  1 sibling, 2 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-05  2:05 UTC (permalink / raw)
  To: Lee Revell
  Cc: Chris Wright, Alan Cox, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar

* Lee Revell (rlrevell@joe-job.com) wrote:
> The last time I checked users could belong to more than one group.  Am I
> missing something?

No, you're not.  I think Alan's just saying the gid based checks
are suboptimal if there's a cleaner way to do it (to which I agree).
Personally, I don't have a big problem with the Realtime LSM.  I've helped
you with it, and suggested a few times that I'd prefer it to be generic;
but never stepped up to deliver code of that sort.  Since it's your itch,
you've scratched it, and it's quite simple and contained, I consider
it acceptable.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  2:05                 ` Chris Wright
@ 2005-01-05  2:58                   ` Kyle Moffett
  2005-01-05  3:45                     ` Chris Wright
  2005-01-05  4:06                   ` Jack O'Quin
  1 sibling, 1 reply; 266+ messages in thread
From: Kyle Moffett @ 2005-01-05  2:58 UTC (permalink / raw)
  To: Chris Wright
  Cc: Ingo Molnar, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Christoph Hellwig, Lee Revell,
	Andrew Morton

On Jan 04, 2005, at 21:05, Chris Wright wrote:
> No, you're not.  I think Alan's just saying the gid based checks
> are suboptimal if there's a cleaner way to do it (to which I agree).
> Personally, I don't have a big problem with the Realtime LSM.  I've 
> helped
> you with it, and suggested a few times that I'd prefer it to be 
> generic;
> but never stepped up to deliver code of that sort.  Since it's your 
> itch,
> you've scratched it, and it's quite simple and contained, I consider
> it acceptable.

Here's a relatively simple idea: Why not make the "Realtime LSM"
just check for a certain "Realtime" credential in the new credential
store (Patch is in 2.6.10, see [1] for control program).  You would
mark it as a system credential and give access to that credential via
the appropriate capability with a small utility program.

Of course, I _do_ respect that I am not providing a patch which they
have done.  I think this serves a useful place and should probably be
included as-is, for now.  A later update to make it use a better
mechanism would be nice, though. :-)

[1] http://people.redhat.com/~dhowells/keys/keyctl.c

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r  
!y?(-)
------END GEEK CODE BLOCK------



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  2:58                   ` Kyle Moffett
@ 2005-01-05  3:45                     ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-05  3:45 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Chris Wright, Ingo Molnar, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Christoph Hellwig, Lee Revell,
	Andrew Morton

* Kyle Moffett (mrmacman_g4@mac.com) wrote:
> Here's a relatively simple idea: Why not make the "Realtime LSM"
> just check for a certain "Realtime" credential in the new credential
> store (Patch is in 2.6.10, see [1] for control program).  You would
> mark it as a system credential and give access to that credential via
> the appropriate capability with a small utility program.

Well, that's basically what the gid is in this case.  It's the credential
that's set at login time and has all the proper sharing and inheritance
rules.  So, I'm not yet convinced that this would buy us much.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  0:01           ` Alan Cox
                               ` (2 preceding siblings ...)
  2005-01-05  1:50             ` Chris Wright
@ 2005-01-05  4:04             ` Jack O'Quin
  3 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-05  4:04 UTC (permalink / raw)
  To: Alan Cox
  Cc: Lee Revell, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Maw, 2005-01-04 at 18:59, Lee Revell wrote:
>> We could do it the was OSX (our real competition) does if that would
>> make people happy.  They just let any user run RT tasks.  Oh wait, but
>> that's a "broken design", everyone knows that OSX is a joke, no one
>> would use *that* OS to mix a CD or score a movie.  :-)
>
> You can do that already, just make everyone root

Surely you're joking.  Is this actually a serious proposal?

> The problem with uid/gid based hacks is that they get really ugly to
> administer really fast. Especially once you have users who need realtime
> and hugetlb, and users who need one only.

This is why POSIX requires supplementary groups.

All I had to do on my system was...

  # adduser joq audio

That is considerably easier than hacking rlimits values via PAM.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  2:05                 ` Chris Wright
  2005-01-05  2:58                   ` Kyle Moffett
@ 2005-01-05  4:06                   ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-05  4:06 UTC (permalink / raw)
  To: Chris Wright
  Cc: Lee Revell, Alan Cox, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar

Chris Wright <chrisw@osdl.org> writes:

> * Lee Revell (rlrevell@joe-job.com) wrote:
>> The last time I checked users could belong to more than one group.  Am I
>> missing something?
>
> No, you're not.  I think Alan's just saying the gid based checks
> are suboptimal if there's a cleaner way to do it (to which I agree).
> Personally, I don't have a big problem with the Realtime LSM.  I've helped
> you with it, and suggested a few times that I'd prefer it to be generic;
> but never stepped up to deliver code of that sort.  Since it's your itch,
> you've scratched it, and it's quite simple and contained, I consider
> it acceptable.

We appreciate the help, Chris.  The patch is considerably smaller and
cleaner thanks to your efforts.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  1:35         ` Andreas Steinmetz
@ 2005-01-05  4:18           ` Alan Cox
  2005-01-05  5:50             ` Andrew Morton
  2005-01-07  1:18             ` Matt Mackall
  2005-01-05 11:39           ` Christoph Hellwig
  1 sibling, 2 replies; 266+ messages in thread
From: Alan Cox @ 2005-01-05  4:18 UTC (permalink / raw)
  To: Andreas Steinmetz
  Cc: Lee Revell, Linux Kernel Mailing List, Andrew Morton,
	Ingo Molnar, Jack O'Quin

On Mer, 2005-01-05 at 01:35, Andreas Steinmetz wrote:
> Let me remind you all that according to lkml history hch has always been 
> biased and objecting to anything related to lsm. Nobody can take hch's 
> opinion here as objective. I would even go so far that when things are 
> related to lsm(s) he's just tro...

Oh I don't think so. Everyone thinks Christoph has it in for their
project (me included quite often). He's just blessed with a lot of taste
and determination to enforce it, and cursed (or perhaps blessed) with
the ability to explain bluntly and clearly his opinion.

gid hacks are not a good long term plan.

Can we use capabilities, if not - why not and how do we fix it so we can
do the job right. Do we need some more capability bits that are
implicitly inherited and not touched by setuidness ?



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  4:18           ` Alan Cox
@ 2005-01-05  5:50             ` Andrew Morton
  2005-01-05 12:06               ` Herbert Poetzl
  2005-01-05 20:09               ` Olaf Dietsche
  2005-01-07  1:18             ` Matt Mackall
  1 sibling, 2 replies; 266+ messages in thread
From: Andrew Morton @ 2005-01-05  5:50 UTC (permalink / raw)
  To: Alan Cox; +Cc: ast, rlrevell, linux-kernel, mingo, joq

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
>  Can we use capabilities

capabilities don't work :(

	http://www.uwsg.iu.edu/hypermail/linux/kernel/0404.0/0502.html



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:55       ` Jack O'Quin
  2005-01-04 18:59         ` Lee Revell
@ 2005-01-05 11:20         ` Christoph Hellwig
  1 sibling, 0 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 11:20 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Christoph Hellwig, Lee Revell, linux-kernel, Andrew Morton, Ingo Molnar

On Tue, Jan 04, 2005 at 12:55:15PM -0600, Jack O'Quin wrote:
> Statements of the form "had I cared enough to do something about this
> problem, I would have implemented it differently" are not much help.
> This patch is small and clean.  It meshes with existing kernel LSM
> mechanisms.  It solves a real problem affecting many Linux desktop
> users.

It solves problems - most kernel patches do that.  But it does solve
this problems in a way that doesn't fit very well in the grand design.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:57       ` Lee Revell
  2005-01-05  1:35         ` Andreas Steinmetz
@ 2005-01-05 11:24         ` Christoph Hellwig
  1 sibling, 0 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 11:24 UTC (permalink / raw)
  To: Lee Revell; +Cc: linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

On Tue, Jan 04, 2005 at 01:57:13PM -0500, Lee Revell wrote:
> On Tue, 2005-01-04 at 18:20 +0000, Christoph Hellwig wrote:
> > On Tue, Jan 04, 2005 at 01:16:54PM -0500, Lee Revell wrote:
> > > Got a patch?  Code talks, BS walks.  This is working perfectly, right
> > > now, and is being used by thousands of Linux ausio users.
> > 
> > Which still doesn't mean it's the right design.  And no, I don't need the
> > feature so I won't write it.  If you want a certain feature it's up to
> > you to implement it in a way that's considered mergeable.
> > 
> 
> Please specify what's wrong with it.  So far all your objection amounts
> to is "I don't like it".

It's tying privilegues to uids/gids, and it does so in an overcomplicated
way and just for an extremly tiny, specialized subset of available
privilegues.

In short it's a very specialized hack.

> If you do have anything other that your opinion to back up your
> assertion that it's a bad design, you should have raised it months ago
> when this was first posted.  Now that we have it to a mergeable state
> (as far as the people who worked on it are concerned), you want to pop
> up and say "Nope, bad design"?

I'm very sorry but I don't have the time to comment on every single patch
posted somewhere.  All the review and core kernel work I do on lkml is in my
unpaid spare time.  If you want me to review specific things in a deadline
or want me to implement features in a way that fits the kernel grand plan
(which doesn't equal to it actually beeing accepted by other kernel
developers), you're free to contract me.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-04 18:59         ` Lee Revell
  2005-01-05  0:01           ` Alan Cox
@ 2005-01-05 11:25           ` Christoph Hellwig
  2005-01-05 17:32             ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 11:25 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jack O'Quin, Christoph Hellwig, linux-kernel, Andrew Morton,
	Ingo Molnar

On Tue, Jan 04, 2005 at 01:59:57PM -0500, Lee Revell wrote:
> We could do it the was OSX (our real competition) does if that would
> make people happy.  They just let any user run RT tasks.  Oh wait, but
> that's a "broken design", everyone knows that OSX is a joke, no one
> would use *that* OS to mix a CD or score a movie.  :-)

No one sane (well, no one sane with a background in Operating Systems)
would use OS X at all.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  1:35         ` Andreas Steinmetz
  2005-01-05  4:18           ` Alan Cox
@ 2005-01-05 11:39           ` Christoph Hellwig
  2005-01-05 17:35             ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 11:39 UTC (permalink / raw)
  To: Andreas Steinmetz
  Cc: Lee Revell, linux-kernel, Andrew Morton, Ingo Molnar, Jack O'Quin

> Let me remind you all that according to lkml history hch has always been 
> biased and objecting to anything related to lsm. Nobody can take hch's 
> opinion here as objective. I would even go so far that when things are 
> related to lsm(s) he's just tro...

I'm not a big fan of LSM, and I've explained the rationale why multiple
times.  The doesn't mean everything done using LSM is bad  -  in practice
most things are bad though (from the things I've seen everything but lsm)

btw, any reason you drop me from the Cc list once you start the personal
attacks?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  1:55               ` Lee Revell
  2005-01-05  2:05                 ` Chris Wright
@ 2005-01-05 11:52                 ` Ingo Molnar
  2005-01-05 15:19                   ` Lee Revell
                                     ` (2 more replies)
  1 sibling, 3 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-05 11:52 UTC (permalink / raw)
  To: Lee Revell
  Cc: Chris Wright, Alan Cox, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Arjan van de Ven


the RT-LSM thing is a bit dangerous because it doesnt really protect
against a runaway, buggy app. So i think the right way to approach this
problem is to not apply RT-LSM for the time being, but to provide an
'advanced latency needs' scheduling class that is _still_ safe even if
the task is runaway, but behaves with near-RT priorities if the task is
'nice' (i.e. doesnt use up large amount of CPU time.)

incidentally, there is such a scheduling class already: negative nice
levels. Please skip any preconceptions you might have about nice levels,
nice levels have been improved in 2.6.10, the timeslices are now given
out exponentially, giving nice -20 tasks far more weight and priority
than they used to have. (They are obviously still preemptable if they
keep looping burning CPU - but that we can consider a feature.) (Also,
in 2.6 the negative nice levels have a much more agressive interactivity
setting, allowing them to preempt everything lower-prio.)

so, could you try vanilla 2.6.10 (without LSM and without jackd running
with RT priorities), with jackd set to nice -20? Make sure the
jack-client process gets this priority too. Best to achieve this is to
renice a shell to -20 and start up everything from there - the nice
settings will be inherited. How does such an audio test compare to a
test done with jackd running at SCHED_FIFO with RT priority 1?

if this works out well then we could achieve something comparable to
RT-LSM, via nice levels alone.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  5:50             ` Andrew Morton
@ 2005-01-05 12:06               ` Herbert Poetzl
  2005-01-07  1:13                 ` Matt Mackall
  2005-01-05 20:09               ` Olaf Dietsche
  1 sibling, 1 reply; 266+ messages in thread
From: Herbert Poetzl @ 2005-01-05 12:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, ast, rlrevell, linux-kernel, mingo, joq

On Tue, Jan 04, 2005 at 09:50:10PM -0800, Andrew Morton wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >
> >  Can we use capabilities
> 
> capabilities don't work :(
> 
> 	http://www.uwsg.iu.edu/hypermail/linux/kernel/0404.0/0502.html

well, maybe it is time to fix them ..

I already proposed some methods to extend them,
and I'm also willing to dig into the various things
required to allow to use the capability system for
what it was intended.

best,
Herbert

> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 11:52                 ` Ingo Molnar
@ 2005-01-05 15:19                   ` Lee Revell
  2005-01-05 15:21                   ` Lee Revell
  2005-01-05 18:18                   ` Jack O'Quin
  2 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-05 15:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Alan Cox, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Arjan van de Ven

On Wed, 2005-01-05 at 12:52 +0100, Ingo Molnar wrote:
> the RT-LSM thing is a bit dangerous because it doesnt really protect
> against a runaway, buggy app. So i think the right way to approach this
> problem is to not apply RT-LSM for the time being, but to provide an
> 'advanced latency needs' scheduling class that is _still_ safe even if
> the task is runaway, but behaves with near-RT priorities if the task is
> 'nice' (i.e. doesnt use up large amount of CPU time.)
> 
> incidentally, there is such a scheduling class already: negative nice
> levels. Please skip any preconceptions you might have about nice levels,
> nice levels have been improved in 2.6.10, the timeslices are now given
> out exponentially, giving nice -20 tasks far more weight and priority
> than they used to have. (They are obviously still preemptable if they
> keep looping burning CPU - but that we can consider a feature.) (Also,
> in 2.6 the negative nice levels have a much more agressive interactivity
> setting, allowing them to preempt everything lower-prio.)
> 
> so, could you try vanilla 2.6.10 (without LSM and without jackd running
> with RT priorities), with jackd set to nice -20? Make sure the
> jack-client process gets this priority too. Best to achieve this is to
> renice a shell to -20 and start up everything from there - the nice
> settings will be inherited. How does such an audio test compare to a
> test done with jackd running at SCHED_FIFO with RT priority 1?
> 
> if this works out well then we could achieve something comparable to
> RT-LSM, via nice levels alone.
> 

Adding Paul Davis to the cc:, as he has expressed very strong opinions
on this in the past.

Of course this does not address the problem as you still need to be root
to run at a negative nice value.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 11:52                 ` Ingo Molnar
  2005-01-05 15:19                   ` Lee Revell
@ 2005-01-05 15:21                   ` Lee Revell
  2005-01-07 12:56                     ` Paul Davis
  2005-01-05 18:18                   ` Jack O'Quin
  2 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-05 15:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Alan Cox, Jack O'Quin, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Arjan van de Ven,
	Paul Davis

On Wed, 2005-01-05 at 12:52 +0100, Ingo Molnar wrote:
> the RT-LSM thing is a bit dangerous because it doesnt really protect
> against a runaway, buggy app. So i think the right way to approach this
> problem is to not apply RT-LSM for the time being, but to provide an
> 'advanced latency needs' scheduling class that is _still_ safe even if
> the task is runaway, but behaves with near-RT priorities if the task is
> 'nice' (i.e. doesnt use up large amount of CPU time.)
> 
> incidentally, there is such a scheduling class already: negative nice
> levels. Please skip any preconceptions you might have about nice levels,
> nice levels have been improved in 2.6.10, the timeslices are now given
> out exponentially, giving nice -20 tasks far more weight and priority
> than they used to have. (They are obviously still preemptable if they
> keep looping burning CPU - but that we can consider a feature.) (Also,
> in 2.6 the negative nice levels have a much more agressive interactivity
> setting, allowing them to preempt everything lower-prio.)
> 
> so, could you try vanilla 2.6.10 (without LSM and without jackd running
> with RT priorities), with jackd set to nice -20? Make sure the
> jack-client process gets this priority too. Best to achieve this is to
> renice a shell to -20 and start up everything from there - the nice
> settings will be inherited. How does such an audio test compare to a
> test done with jackd running at SCHED_FIFO with RT priority 1?
> 
> if this works out well then we could achieve something comparable to
> RT-LSM, via nice levels alone.

Ugh, screwed up the cc: list.  Sorry for the WOB.

Paul, care to comment on the above?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 11:25           ` Christoph Hellwig
@ 2005-01-05 17:32             ` Lee Revell
  2005-01-05 19:11               ` Christoph Hellwig
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-05 17:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jack O'Quin, linux-kernel, Andrew Morton, Ingo Molnar

On Wed, 2005-01-05 at 11:25 +0000, Christoph Hellwig wrote:
> On Tue, Jan 04, 2005 at 01:59:57PM -0500, Lee Revell wrote:
> > We could do it the was OSX (our real competition) does if that would
> > make people happy.  They just let any user run RT tasks.  Oh wait, but
> > that's a "broken design", everyone knows that OSX is a joke, no one
> > would use *that* OS to mix a CD or score a movie.  :-)
> 
> No one sane (well, no one sane with a background in Operating Systems)
> would use OS X at all.
> 

Really?  I would expect any sane engineer to use the best tool for the
job.  If you actually think it's Linux, I suggest you try it sometime.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 11:39           ` Christoph Hellwig
@ 2005-01-05 17:35             ` Lee Revell
  2005-01-05 19:11               ` Christoph Hellwig
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-05 17:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andreas Steinmetz, linux-kernel, Andrew Morton, Ingo Molnar,
	Jack O'Quin

On Wed, 2005-01-05 at 11:39 +0000, Christoph Hellwig wrote:
> I'm not a big fan of LSM, and I've explained the rationale why multiple
> times.  The doesn't mean everything done using LSM is bad  -  in practice
> most things are bad though (from the things I've seen everything but lsm)
                                                                       ^^^

Is this a typo?  Maybe you mean SELinux?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 11:52                 ` Ingo Molnar
  2005-01-05 15:19                   ` Lee Revell
  2005-01-05 15:21                   ` Lee Revell
@ 2005-01-05 18:18                   ` Jack O'Quin
  2 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-05 18:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Lee Revell, Chris Wright, Alan Cox, Christoph Hellwig,
	Linux Kernel Mailing List, Andrew Morton, Arjan van de Ven

Ingo Molnar <mingo@elte.hu> writes:

> the RT-LSM thing is a bit dangerous because it doesnt really protect
> against a runaway, buggy app. So i think the right way to approach this
> problem is to not apply RT-LSM for the time being, but to provide an
> 'advanced latency needs' scheduling class that is _still_ safe even if
> the task is runaway, but behaves with near-RT priorities if the task is
> 'nice' (i.e. doesnt use up large amount of CPU time.)

You are right that a runaway SCHED_FIFO application can freeze the
system.  But, this really has nothing to do with the permissions
problem addressed by the realtime-lsm.  In fact, it is needed by
non-root users for running `nice -20', just as for SCHED_FIFO.

I have no objection to creating a "better" RT scheduling class than
SCHED_FIFO.  The "much-maligned" Mac OS X has a deadline scheduler
that works quite well for running JACK and its applications.

> so, could you try vanilla 2.6.10 (without LSM and without jackd running
> with RT priorities), with jackd set to nice -20? Make sure the
> jack-client process gets this priority too. Best to achieve this is to
> renice a shell to -20 and start up everything from there - the nice
> settings will be inherited. How does such an audio test compare to a
> test done with jackd running at SCHED_FIFO with RT priority 1?

For a quick comparison, I used a slightly modified version of the
jack_test3.2 script, that runs jackd without the -R (--realtime)
option...

                                 With -R        Without -R
                               (SCHED_FIFO)     (nice -20)

************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    1)          (    1)         
XRUN Count  . . . . . . . . . :     2             2837          
Delay Count (>spare time) . . :     0                0          
Delay Count (>1000 usecs) . . :     0                0          
Delay Maximum . . . . . . . . :  3130   usecs    5038044   usecs
Cycle Maximum . . . . . . . . :   960   usecs    18802   usecs
Average DSP Load. . . . . . . :    34.3 %           44.1 %    
Average CPU System Load . . . :     8.7 %            7.5 %    
Average CPU User Load . . . . :    29.8 %            5.2 %    
Average CPU Nice Load . . . . :     0.0 %           20.3 %    
Average CPU I/O Wait Load . . :     3.2 %            5.2 %    
Average CPU IRQ Load  . . . . :     0.7 %            0.7 %    
Average CPU Soft-IRQ Load . . :     0.0 %            0.2 %    
Average Interrupt Rate  . . . :  1707.6 /sec      1677.3 /sec 
Average Context-Switch Rate . : 11914.9 /sec     11197.6 /sec 
*********************************************

This was not exactly the test you requested.  The LSM is still
present.  But, it makes no difference.  In fact, I used it to grant
nice privileges, since I didn't feel like running it as root.

But this is otherwise vanilla 2.6.10, and the two scheduling
algorithms are fairly represented.  Try it yourself, I think you'll
see similarly dramatic differences.

Note that 2.6.10 has by far the best realtime performance of any
vanilla Linux kernel I have ever tried.  Although, much better results
can be obtained with your Realtime Preemption patches, this is still a
very creditable result, quite usable for many relatively low-latency
applications.  Kudos to you and the many others who contributed to
this achievement.

> if this works out well then we could achieve something comparable to
> RT-LSM, via nice levels alone.

As you see, it does not work at all.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 17:32             ` Lee Revell
@ 2005-01-05 19:11               ` Christoph Hellwig
  0 siblings, 0 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 19:11 UTC (permalink / raw)
  To: Lee Revell; +Cc: Jack O'Quin, linux-kernel, Andrew Morton, Ingo Molnar

On Wed, Jan 05, 2005 at 12:32:47PM -0500, Lee Revell wrote:
> Really?  I would expect any sane engineer to use the best tool for the
> job.

Sure.

> If you actually think it's Linux, I suggest you try it sometime.

You don't want to run Darwin, trust me.  If you don't read through their
sources..


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 17:35             ` Lee Revell
@ 2005-01-05 19:11               ` Christoph Hellwig
  0 siblings, 0 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-05 19:11 UTC (permalink / raw)
  To: Lee Revell
  Cc: Andreas Steinmetz, linux-kernel, Andrew Morton, Ingo Molnar,
	Jack O'Quin

On Wed, Jan 05, 2005 at 12:35:56PM -0500, Lee Revell wrote:
> On Wed, 2005-01-05 at 11:39 +0000, Christoph Hellwig wrote:
> > I'm not a big fan of LSM, and I've explained the rationale why multiple
> > times.  The doesn't mean everything done using LSM is bad  -  in practice
> > most things are bad though (from the things I've seen everything but lsm)
>                                                                        ^^^
> 
> Is this a typo?  Maybe you mean SELinux?

Yes.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  5:50             ` Andrew Morton
  2005-01-05 12:06               ` Herbert Poetzl
@ 2005-01-05 20:09               ` Olaf Dietsche
  1 sibling, 0 replies; 266+ messages in thread
From: Olaf Dietsche @ 2005-01-05 20:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, ast, rlrevell, linux-kernel, mingo, joq

Andrew Morton <akpm@osdl.org> writes:

> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>>
>>  Can we use capabilities
>
> capabilities don't work :(
>
> 	http://www.uwsg.iu.edu/hypermail/linux/kernel/0404.0/0502.html

Capabilities don't work, because of missing filesystem
capabilities. If you have them, it's a question of setting the
appropriate permitted, inheritable and effective capability sets.

I didn't follow the whole thread. But if you want to grant
capabilities on a per user/group basis, may I suggest accessfs user
based capabilities, for example? :-)

Regards, Olaf.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 12:06               ` Herbert Poetzl
@ 2005-01-07  1:13                 ` Matt Mackall
  2005-01-07  1:55                   ` Alan Cox
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-07  1:13 UTC (permalink / raw)
  To: Andrew Morton, Alan Cox, ast, rlrevell, linux-kernel, mingo, joq

On Wed, Jan 05, 2005 at 01:06:02PM +0100, Herbert Poetzl wrote:
> On Tue, Jan 04, 2005 at 09:50:10PM -0800, Andrew Morton wrote:
> > Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > >
> > >  Can we use capabilities
> > 
> > capabilities don't work :(
> > 
> > 	http://www.uwsg.iu.edu/hypermail/linux/kernel/0404.0/0502.html
> 
> well, maybe it is time to fix them ..
> 
> I already proposed some methods to extend them,
> and I'm also willing to dig into the various things
> required to allow to use the capability system for
> what it was intended.

You can't fix them without changing the semantics for existing users
in ways they didn't expect. It could be done with a new personality flag,
but..

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05  4:18           ` Alan Cox
  2005-01-05  5:50             ` Andrew Morton
@ 2005-01-07  1:18             ` Matt Mackall
  2005-01-07  2:36               ` Lee Revell
  2005-01-07  5:54               ` Jack O'Quin
  1 sibling, 2 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-07  1:18 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andreas Steinmetz, Lee Revell, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar, Jack O'Quin

On Wed, Jan 05, 2005 at 04:18:15AM +0000, Alan Cox wrote:
> On Mer, 2005-01-05 at 01:35, Andreas Steinmetz wrote:
> > Let me remind you all that according to lkml history hch has always been 
> > biased and objecting to anything related to lsm. Nobody can take hch's 
> > opinion here as objective. I would even go so far that when things are 
> > related to lsm(s) he's just tro...
> 
> Oh I don't think so. Everyone thinks Christoph has it in for their
> project (me included quite often). He's just blessed with a lot of taste
> and determination to enforce it, and cursed (or perhaps blessed) with
> the ability to explain bluntly and clearly his opinion.
> 
> gid hacks are not a good long term plan.
> 
> Can we use capabilities, if not - why not and how do we fix it so we can
> do the job right. Do we need some more capability bits that are
> implicitly inherited and not touched by setuidness ?

Why can't this be done with a simple SUID helper to promote given
tasks to RT with sched_setschedule, doing essentially all the checks
this LSM is doing? 

Objections of "because it requires dangerous root or suid" don't fly,
an RT app under user control can DoS the box trivially. Never mind you
need root to configure the LSM anyway..

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07  1:13                 ` Matt Mackall
@ 2005-01-07  1:55                   ` Alan Cox
  2005-01-07 20:05                     ` Matt Mackall
  0 siblings, 1 reply; 266+ messages in thread
From: Alan Cox @ 2005-01-07  1:55 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, ast, rlrevell, Linux Kernel Mailing List, mingo, joq

On Gwe, 2005-01-07 at 01:13, Matt Mackall wrote:
> You can't fix them without changing the semantics for existing users
> in ways they didn't expect. It could be done with a new personality flag,
> but..

I disagree. At the most trivial you could just add another 32bits of
sticky capability that are never touched by setuid/non-setuidness and
represent additional "user" (or more rightly session) abilities to do
limited overrides


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07  1:18             ` Matt Mackall
@ 2005-01-07  2:36               ` Lee Revell
  2005-01-07  5:54               ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-07  2:36 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Alan Cox, Andreas Steinmetz, Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar, Jack O'Quin

On Thu, 2005-01-06 at 17:18 -0800, Matt Mackall wrote:
> Why can't this be done with a simple SUID helper to promote given
> tasks to RT with sched_setschedule, doing essentially all the checks
> this LSM is doing? 
> 
> Objections of "because it requires dangerous root or suid" don't fly,
> an RT app under user control can DoS the box trivially. Never mind you
> need root to configure the LSM anyway..

Yes but a bug in an app running as root can trash the filesystem.  The
worst you can do with RT privileges is lock up the machine.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07  1:18             ` Matt Mackall
  2005-01-07  2:36               ` Lee Revell
@ 2005-01-07  5:54               ` Jack O'Quin
  2005-01-07 20:02                 ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-07  5:54 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Alan Cox, Andreas Steinmetz, Lee Revell, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list


[Adding linux-audio-dev to the CC list]

Matt Mackall <mpm@selenic.com> writes:

> On Wed, Jan 05, 2005 at 04:18:15AM +0000, Alan Cox wrote:
>> gid hacks are not a good long term plan.
>> 
>> Can we use capabilities, if not - why not and how do we fix it so
>> we can do the job right. Do we need some more capability bits that
>> are implicitly inherited and not touched by setuidness ?
>
> Why can't this be done with a simple SUID helper to promote given
> tasks to RT with sched_setschedule, doing essentially all the checks
> this LSM is doing?

The answer to your simple question is a long, sad story.  :-(

There is clearly no practical way to write large audio applications
(many with elaborate graphical interfaces) securly enough to run them
as root.  So, we have used capabilities with linux-2.4 systems for
several years.  It was never a satisfactory solution, but was all we
could do at the time.  

There is a small setuid program called `jackstart' that exec()s the
JACK server (`jackd') with appropriate privileges so it can pass
realtime privileges to its applications.  Each client needs to create
a realtime thread and mlock() its storage to do its part of the
realtime audio cycle.  Note that sched_setschedule() provides no way
to handle the mlock() requirement, which cannot be done from another
process.  Clients may come and go at any time, so dropping the
privilege after initialization is not an option.

Unfortunately, all this heavyweight mechanism only helps with JACK and
its many clients.  Lots of other audio or video oriented applications
also have realtime needs.

The biggest problem was CAP_SETPCAP, which for good reasons[1] is
disabled in distributed kernels.  This forced every user to patch and
build a custom kernel.  Worse, it opened all our systems up to the
problems reported by this sendmail security advisory.

 [1] http://www.securiteam.com/unixfocus/5KQ040A1RI.html

While stumbling along with this very unsatisfactory state of affairs,
many on the Linux Audio Developers mailing list were shocked[2] to
hear about an LKML discussion[3] suggesting a significant lack of
developer committment to addressing these issues...

> Quoting Albert Cahalan[3]: "The authors of our code seem to have
> given up and moved on. Nobody cleaned up the mess. Is it any wonder
> the POSIX draft didn't ever make it beyond the draft state?"

 [2] http://www.music.columbia.edu/pipermail/linux-audio-dev/2003-November/005332.html
 [3] http://www.kerneltraffic.org/kernel-traffic/kt20031101_239.html#3

So, all our work, frustration and user confusion while trying to "do
the right thing" seemed doomed to failure.  Since the Linux kernel
developers continued to show little interest in our needs, we started
a discussion about how to meet them ourselves[4].

 [4] http://www.music.columbia.edu/pipermail/linux-audio-dev/2003-November/005345.html

Looking at our security requirements in a practical manner, we quickly
concluded that CAP_SETPCAP is the work of the devil.  A true
filesystem-based privilege vector solution might be adequate, but is
clearly beyond the scope of what we audio programmers could hope to
accomplish.  Even then, it would be difficult to administer.

A simple group ID test is far more secure than CAP_SETPCAP, and
perfectly adequate for us.  When configuring a Digital Audio
Workstation, one is not terribly concerned about local Denial of
Service attacks or runaway realtime threads.  That would be
unacceptable for many other systems, but not ours.  Yet, we want to
avoid system integrity holes in network daemons like sendmail[1].  In
other words: we can tolerate the bad guys crashing the system, but we
don't want them turning it into an open spam relay or corrupting the
filesystem.

So, we needed to provide a simple way for an unskilled system admin
(aka "musician") to configure a personal workstation to run realtime
applications without opening egregious security holes.  Equally
important, it must be easy for other system admins to ensure that
these privileges are *not* available on their server systems.  It soon
became apparent that the then-new LSM framework provided a good
solution.  Because LSM's can be built outside the kernel source tree,
we were no longer forced to wait for some kernel developer to take an
interest.

The realtime-lsm is the solution we evolved.  It has been actively
used by thousands of Linux audio users for over a year now[5].  The
first supported SourceForge release was in April of 2004[6].  It is
now used by many popular audio-oriented distributions, including
Planet CCRMA[7] from Stanford University and the Debian Music
Distribution[8] from the AGNULA project.

 [5] http://www.music.columbia.edu/pipermail/linux-audio-dev/2003-December/005745.html
 [6] http://eca.cx/laa/2004/04/0028.html
 [7] http://ccrma.stanford.edu/planetccrma/software/
 [8] http://www.agnula.org/

I understand that kernel developers are busy and have other problems
they consider more important than ours.  But, you ought to at least
understand that this is really important to us.  We needed a clean
solution two or three years ago.  Now we finally have one.

Distributing it with the kernel sources would be a great convenience
for our users and would significantly simplify maintenance.  It would
also (IMHO) close a significant security and usability deficiency in
the standard kernel.  Any of the NSA and DoD experts will tell you: a
security solution that is difficult to administer is not secure.

It is no surprise that kernel developers should consider our solution
technically inferior to their own ideas on the subject.  I would have
been delighted to have some kernel developer step in and provide a
clean, well-thought out solution several years ago.  This is a kernel
deficiency, not an audio problem.  I don't want to work on kernels.

But, I am feeling quite discouraged that so many kernel developers
still seem to consider this problem unimportant.  I sense a distinct
unwillingness to move forward on this issue.  I really hope I am wrong
about that.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-05 15:21                   ` Lee Revell
@ 2005-01-07 12:56                     ` Paul Davis
  2005-01-07 13:04                       ` Christoph Hellwig
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 12:56 UTC (permalink / raw)
  To: Lee Revell
  Cc: Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Christoph Hellwig, Linux Kernel Mailing List, Andrew Morton,
	Arjan van de Ven

I just read the thread of messages about this, and I am just
dumbfounded. Jack O'Quin has very politely explained the whole thing,
and it appears that almost nobody actually paid attention to what 
he was saying.

1) capabilities: it has been explained by several people that
capabilities do not work, and in the past there has been an utter lack
of interest on the part of the kernel crowd to fix them, sometimes
even going as far as "it can't be fixed".

2) this is *not* only about scheduling. Realtime tasks need
mlockall() and/or mlock as well. even the man page for mlock
recognizes this, yet almost all the discussion here has focused on
scheduling. 

3) christoph claims that using uid/gid to define priviledge scope
is a bad idea. but that is the *desired* method. uid/gid corresponds exactly
to what the users of these systems want. they don't want priviledge
accorded to specific applications - its the *users* not the
applications that have the right to get RT scheduling, lock down
memory and so on. these applications will run without RT priviledges,
just not very well (in general, so badly that they are unusable for
their intended purpose).

4) christoph's claims about OS X are nothing but ridiculous. whatever
the internals of Darwin may or may not be (and they certainly include
some of the best ideas about media-friendly kernels from the last 20
years, unlike our favorite OS), professional people are using OS X
(like they used OS 9 and OS 8 before) to get serious, paid work done
in a way that they cannot on Linux. and if attitudes like christoph's
prevail, in a way that they will never get to do on Linux without
going through steps that they will consider absurd. Alan jokes (i
presume) "oh, thats easy, make everyone root", but thats not what OS X
does. OS X says "we know that running realtime applications matters
for a broad class of our likely users, and so anyone can do it, not
just root". And note: "realtime applications" does not mean just
"rt-scheduled", as noted above.

5) setuid wrappers don't work for this, because even though you can
change the scheduling class of another process, you cannot "grant" it
the ability to use mlock. at least not without capabilities, so back
to (1) above ...

So, what do we have here? The two most successful media-friendly OS's
(BeOS and OS X) demonstrate clearly the way things need to be from the
user experience perspective, a development community within the Linux
world evolves a solution using the very nice new security modules in
2.6, and then people who don't appear to understand anything about
what is required or what the use cases are say "i don't like and
because nobody pays me i don't have to tell you why".

I've spent probably burnt through to $250,000 supporting myself and my
family over the last 5 years while I develop pro-level audio software
for Linux. I don't expect to see any of that back. So when Christoph
chimes in with the "I'm not paid, I don't have to tell you why I don't
like it, I just don't" ... that really, really, really irritates me in
a way that few other comments do.

We (Jack, Lee and now myself) have tried to explain what the problem
with the kernel is, how LSM makes a solution possible, acknowledged
issues and attempted to address them, and finally have offered up a
working patch that makes life easier for a bunch of people who don't
want to run webservers or compile kernels all day. If you're going to
publically argue that what the "realtime" LSM does should not be part
of the kernel, at least do us the favor of showing us enough respect
to provide technical or policy based reasons for why its such a bad
solution. 

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 12:56                     ` Paul Davis
@ 2005-01-07 13:04                       ` Christoph Hellwig
  2005-01-07 14:16                         ` Paul Davis
  0 siblings, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-07 13:04 UTC (permalink / raw)
  To: Paul Davis
  Cc: Lee Revell, Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Christoph Hellwig, Linux Kernel Mailing List, Andrew Morton,
	Arjan van de Ven

On Fri, Jan 07, 2005 at 07:56:02AM -0500, Paul Davis wrote:
> 2) this is *not* only about scheduling. Realtime tasks need
> mlockall() and/or mlock as well. even the man page for mlock
> recognizes this, yet almost all the discussion here has focused on
> scheduling. 

RLIMIT_MEMLOCK is your friend.

> 3) christoph claims that using uid/gid to define priviledge scope
> is a bad idea. but that is the *desired* method. uid/gid corresponds exactly
> to what the users of these systems want. they don't want priviledge
> accorded to specific applications - its the *users* not the
> applications that have the right to get RT scheduling, lock down
> memory and so on. these applications will run without RT priviledges,
> just not very well (in general, so badly that they are unusable for
> their intended purpose).

it doesn't really matter what you want, but how we can implement
something that fits in the kernel design.

> 4) christoph's claims about OS X are nothing but ridiculous. whatever
> the internals of Darwin may or may not be (and they certainly include
> some of the best ideas about media-friendly kernels from the last 20
> years, unlike our favorite OS), professional people are using OS X

professional people are also using Windows or Solaris.  That doesn't
mean we have to copy every bad idea from them.

> 5) setuid wrappers don't work for this, because even though you can
> change the scheduling class of another process, you cannot "grant" it
> the ability to use mlock. at least not without capabilities, so back
> to (1) above ...

See above (RLIMIT_MEMLOCK).

> I've spent probably burnt through to $250,000 supporting myself and my
> family over the last 5 years while I develop pro-level audio software
> for Linux. I don't expect to see any of that back. So when Christoph
> chimes in with the "I'm not paid, I don't have to tell you why I don't
> like it, I just don't" ... that really, really, really irritates me in
> a way that few other comments do.

I think you're taking things totally out of context here.  Lee complained
I didn't review his patch earlier.  I only have a limited time available
so I'll select patches that I'm gonna review - and that means thet have
to either be very interesting or be proposed for inclusion.  If you want
me to review other things you'll have to either pay me or ask me really
nicely offlist.

> We (Jack, Lee and now myself) have tried to explain what the problem
> with the kernel is, how LSM makes a solution possible, acknowledged
> issues and attempted to address them, and finally have offered up a
> working patch that makes life easier for a bunch of people who don't
> want to run webservers or compile kernels all day.

And we have told you that this solution is not okay.  You can spend
more time whining which won't do anything or you could help brainstorming
how to implement a workable solution.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 13:04                       ` Christoph Hellwig
@ 2005-01-07 14:16                         ` Paul Davis
  2005-01-07 14:26                           ` Arjan van de Ven
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 14:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Lee Revell, Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton, Arjan van de Ven

>On Fri, Jan 07, 2005 at 07:56:02AM -0500, Paul Davis wrote:
>> 2) this is *not* only about scheduling. Realtime tasks need
>> mlockall() and/or mlock as well. even the man page for mlock
>> recognizes this, yet almost all the discussion here has focused on
>> scheduling. 
>
>RLIMIT_MEMLOCK is your friend.

rlimit_memlock limits the *amount* of memory that mlock() can be used
on, not whether mlock can be used. at least, thats my understanding of
the POSIX design for this. the man page and the source code for mlock
support make that reasonably clear.

moreover, AFAIK all the issues that existed for granting capabilities
exist for rlimit-based priviledges. if they are not granted to all
users/processes, how are they granted, and can they controlled by a
non-root process? last time i looked, the hard limit used by rlimits is
system-wide. you want to copy that idea from OSX or not?

>it doesn't really matter what you want, but how we can implement
>something that fits in the kernel design.

"realtime" LSM does fit into the kernel, quite demonstrably so. it
doesn't, it appears, fit into *your* idea of kernel design.

>> 4) christoph's claims about OS X are nothing but ridiculous. whatever
>> the internals of Darwin may or may not be (and they certainly include
>> some of the best ideas about media-friendly kernels from the last 20
>> years, unlike our favorite OS), professional people are using OS X
>
>professional people are also using Windows or Solaris.  That doesn't
>mean we have to copy every bad idea from them.

I didn't say "copy every idea from them". The point of "realtime" LSM
is precisely *not* to copy every idea from them - instead of every
user being able to run RT apps, only specifically root-administered
uids and/or gids can.

>And we have told you that this solution is not okay.  You can spend

You, Christoph, have told us that. There is no "we" here. You provided
no rationale other than "uid/gid based privildge control is the wrong
method". 

>more time whining which won't do anything or you could help brainstorming
>how to implement a workable solution.

We (Jack, Torben and others on LAD) did brainstorm. We were told on
lkml that LSM was the right way to do this kind of things these days,
because capabilities were broken. But you don't like LSM, so now,
totally post-facto you're telling us that this is not a "workable
solution."

Newsflash: its a totally workable and working solution, and its one
that distributions will adopt whether you get paid or i suck up and
ask you nicely offline. The question was whether we could make
distributions' and users' lives a little easier by not requiring them
to download additional stuff first. Apparently, your unexplained
convictions about the right and wrong way to grant priviledges,
(something that no OS has ever really gotten its head around except
VMS (maybe)), is more important.

Fine, we'll continue to tell people to use "realtime" LSM for audio
work. The people this really affects probably won't use vanilla
kernels anyway. 

--p



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:16                         ` Paul Davis
@ 2005-01-07 14:26                           ` Arjan van de Ven
  2005-01-07 14:38                             ` Paul Davis
  2005-01-07 18:01                             ` Chris Wright
  0 siblings, 2 replies; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-07 14:26 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

On Fri, Jan 07, 2005 at 09:16:50AM -0500, Paul Davis wrote:
> >On Fri, Jan 07, 2005 at 07:56:02AM -0500, Paul Davis wrote:
> >> 2) this is *not* only about scheduling. Realtime tasks need
> >> mlockall() and/or mlock as well. even the man page for mlock
> >> recognizes this, yet almost all the discussion here has focused on
> >> scheduling. 
> >
> >RLIMIT_MEMLOCK is your friend.
> 
> rlimit_memlock limits the *amount* of memory that mlock() can be used
> on, not whether mlock can be used. at least, thats my understanding of
> the POSIX design for this. the man page and the source code for mlock
> support make that reasonably clear.

eh no. It defaults to zero, but if you increase it for a specific user, that
user is allowed to mlock more.

> 
> Fine, we'll continue to tell people to use "realtime" LSM for audio
> work. The people this really affects probably won't use vanilla
> kernels anyway. 

that is so not a constructive way to make progress. 
The realtime LSM is the wrong concept. It's a hack to work around other
design issues with linux. *THAT* is what makes it wrong. Not the fact that
it wouldn't work (I believe it works, I don't think anyone doubts that
much). If you are unwilling to even discuss fixing the underlying design
issues then I'm scared that this issue will never come to any workable
solution.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:26                           ` Arjan van de Ven
@ 2005-01-07 14:38                             ` Paul Davis
  2005-01-07 14:42                               ` Arjan van de Ven
  2005-01-07 14:47                               ` Christoph Hellwig
  2005-01-07 18:01                             ` Chris Wright
  1 sibling, 2 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-07 14:38 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

>> rlimit_memlock limits the *amount* of memory that mlock() can be used
>> on, not whether mlock can be used. at least, thats my understanding of
>> the POSIX design for this. the man page and the source code for mlock
>> support make that reasonably clear.
>
>eh no. It defaults to zero, but if you increase it for a specific user, that
>user is allowed to mlock more.

from mm/mlock.c:do_mlock() in 2.6.8:

	if (on && !capable(CAP_IPC_LOCK))
		return -EPERM;

i.e. only root or capabilities can make mlock() usable.

>much). If you are unwilling to even discuss fixing the underlying design
>issues then I'm scared that this issue will never come to any workable
>solution.

Lee, Jack and I have been very willing to discuss the issue. Christoph
isn't willing to discuss it, he's just told us "its the wrong design,
and I'm not telling you why or what's better". If there is a better
design that will end up in the mainstream kernel, we'd love to see it
implemented, and will likely be involved in doing it, because its
really important to us.

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:38                             ` Paul Davis
@ 2005-01-07 14:42                               ` Arjan van de Ven
  2005-01-07 15:27                                 ` Paul Davis
  2005-01-07 14:47                               ` Christoph Hellwig
  1 sibling, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-07 14:42 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

On Fri, Jan 07, 2005 at 09:38:38AM -0500, Paul Davis wrote:
> >> rlimit_memlock limits the *amount* of memory that mlock() can be used
> >> on, not whether mlock can be used. at least, thats my understanding of
> >> the POSIX design for this. the man page and the source code for mlock
> >> support make that reasonably clear.
> >
> >eh no. It defaults to zero, but if you increase it for a specific user, that
> >user is allowed to mlock more.
> 
> from mm/mlock.c:do_mlock() in 2.6.8:
> 
> 	if (on && !capable(CAP_IPC_LOCK))
> 		return -EPERM;

now try 2.6.9 ;)
this deficiency got already fixed

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:38                             ` Paul Davis
  2005-01-07 14:42                               ` Arjan van de Ven
@ 2005-01-07 14:47                               ` Christoph Hellwig
  2005-01-07 15:26                                 ` Paul Davis
  1 sibling, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-07 14:47 UTC (permalink / raw)
  To: Paul Davis
  Cc: Arjan van de Ven, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

On Fri, Jan 07, 2005 at 09:38:38AM -0500, Paul Davis wrote:
> Lee, Jack and I have been very willing to discuss the issue. Christoph
> isn't willing to discuss it, he's just told us "its the wrong design,
> and I'm not telling you why or what's better". If there is a better
> design that will end up in the mainstream kernel, we'd love to see it
> implemented, and will likely be involved in doing it, because its
> really important to us.

Calm down and read through the thread again.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:47                               ` Christoph Hellwig
@ 2005-01-07 15:26                                 ` Paul Davis
  2005-01-07 16:08                                   ` Martin Mares
  2005-01-07 17:53                                   ` Chris Wright
  0 siblings, 2 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-07 15:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arjan van de Ven, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

>On Fri, Jan 07, 2005 at 09:38:38AM -0500, Paul Davis wrote:
>> Lee, Jack and I have been very willing to discuss the issue. Christoph
>> isn't willing to discuss it, he's just told us "its the wrong design,
>> and I'm not telling you why or what's better". If there is a better
>> design that will end up in the mainstream kernel, we'd love to see it
>> implemented, and will likely be involved in doing it, because its
>> really important to us.
>
>Calm down and read through the thread again.

Sure, lets. Distilling out the responses from kernel developers:

======================================================================

Christoph:
---------
This is far too specialized.  And option to the capability LSM to grant 
capabilities to certain uids/gids sounds like the better choise - and
would also allow to get rid of the magic hugetlb uid horrors.

Which still doesn't mean it's the right design.  And no, I don't need the
feature so I won't write it.  If you want a certain feature it's up to
you to implement it in a way that's considered mergeable.

Alan: 
-----
The problem with uid/gid based hacks is that they get really ugly to
administer really fast. Especially once you have users who need realtime
and hugetlb, and users who need one only.

It would be far cleaner to split CAP_SYS_NICE capability down - which
should cover the real time OS functions nicely. Right now it gives a few
too many rights but that could be fixed easily.

gid hacks are not a good long term plan.

Can we use capabilities, if not - why not and how do we fix it so we can
do the job right. Do we need some more capability bits that are
implicitly inherited and not touched by setuidness ?

Andrew:
-------

capabilities don't work :(

Herbert:
--------

well, maybe it is time to fix them ..

I already proposed some methods to extend them,
and I'm also willing to dig into the various things
required to allow to use the capability system for
what it was intended.

Matt:
-----

You can't fix them without changing the semantics for existing users
in ways they didn't expect. It could be done with a new personality flag,
but..

Alan:
-----
I disagree. At the most trivial you could just add another 32bits of
sticky capability that are never touched by setuid/non-setuidness and
represent additional "user" (or more rightly session) abilities to do
limited overrides

Olaf:
-----
Capabilities don't work, because of missing filesystem
capabilities. If you have them, it's a question of setting the
appropriate permitted, inheritable and effective capability sets.

I didn't follow the whole thread. But if you want to grant
capabilities on a per user/group basis, may I suggest accessfs user
based capabilities, for example? :-)

======================================================================

So, we have a few responses, some references to various potential
solutions all of which have problems just as deep if not deeper than
the uid/gid-based model that this particular LSM adopts. No proposal
for any system that would actually work and address anyone's real
needs in a useful way. Please recall that we developed a
capability-based solution for 2.4, but it was cumbersome because the
vanilla kernel doesn't have capabilities enabled and there are lots of
reasons to not enable them given their current status.

Meanwhile, Jack already provided a very detailed, cross-referenced and
clear explanatin of why various other ideas won't work very well from
a user-space perspective. And in this thread, both Lee and Jack have
attempted to deal with issues that have been raised about the uid/gid
approach. 

In summary, on the one hand, we have a working, defensible solution,
and on the other some misgivings and suggestions to try again at
implementing some more generic priviledge-granting system, something
that lkml has been arguing about for years, along with the rest of the
OS design community. Something that I suspect will never be properly 
resolved, merely "muddled towards". There is no right way to grant
priviledges - there are many ways, and the benefits and downfalls of
each depends on what you are trying to achieve. For years, POSIX based
systems have relied on uid/gid solutions and they continue to do
so. People understand how to manage them (as best as can be done), and
what the issues are. Capabilities were supposed to be solution to
this, and instead have essentially been a dead-end. So I trust that
you'll be understanding of any scepticism that I might have of the
suggestion that we go away and work on some other "more generic"
system. 

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:42                               ` Arjan van de Ven
@ 2005-01-07 15:27                                 ` Paul Davis
  2005-01-07 15:33                                   ` Arjan van de Ven
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 15:27 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

>now try 2.6.9 ;)
>this deficiency got already fixed

well thats good, i hope someone updated the man page too :) 

but is there actually any way to grant specific users a reasonable
rlimit, or are you proposing that we adopt another "bad idea" from OS
X and let everybody do this?

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:27                                 ` Paul Davis
@ 2005-01-07 15:33                                   ` Arjan van de Ven
  2005-01-07 15:41                                     ` Paul Davis
  0 siblings, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-07 15:33 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton


On Fri, Jan 07, 2005 at 10:27:33AM -0500, Paul Davis wrote:
> >now try 2.6.9 ;)
> >this deficiency got already fixed
> 
> well thats good, i hope someone updated the man page too :) 
> 
> but is there actually any way to grant specific users a reasonable
> rlimit, 

yes; most distributions will use pam for this, you can set per user or per
gorup limits there.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:33                                   ` Arjan van de Ven
@ 2005-01-07 15:41                                     ` Paul Davis
  2005-01-07 16:03                                       ` Arjan van de Ven
  2005-01-07 16:03                                       ` Martin Mares
  0 siblings, 2 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-07 15:41 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

>> well thats good, i hope someone updated the man page too :) 
>> 
>> but is there actually any way to grant specific users a reasonable
>> rlimit, 
>
>yes; most distributions will use pam for this, you can set per user or per
>gorup limits there.

isn't that a uid/gid based system? ok, i'm being a little snide :)

fine, so the mlock situation may have improved enough post-2.6.9 that
it can be considered fixed. that leaves the scheduler issue. but
apparently, a uid/gid solution is OK for mlock, and not for the
scheduler. am i missing something?

--p



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:41                                     ` Paul Davis
@ 2005-01-07 16:03                                       ` Arjan van de Ven
  2005-01-07 16:20                                         ` Takashi Iwai
  2005-01-07 16:20                                         ` Paul Davis
  2005-01-07 16:03                                       ` Martin Mares
  1 sibling, 2 replies; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-07 16:03 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

On Fri, Jan 07, 2005 at 10:41:40AM -0500, Paul Davis wrote:
> 
> fine, so the mlock situation may have improved enough post-2.6.9 that
> it can be considered fixed. that leaves the scheduler issue. but
> apparently, a uid/gid solution is OK for mlock, and not for the
> scheduler. am i missing something?

I think you skipped a step. You don't have a scheduler requirement, you have
a latency requirement. You currently *solve* that latency requirement via a
scheduler "hack", yet is quite clear that the "hard" realtime solution is
most likely not the right approach. Note that I'm not saying that you
shouldn't get the latency that that currently provides, but the downsides
(can hang the machine) are bad; a solution that solves that would be far
preferable
something like a soft realtime flag that acts as if it's the hard realtime
one unless the app shows "misbehavior" (eg eats its timeslice for X times in
a row) might for example be such a solution. And with the anti abuse
protection it can run with far lighter privilegs.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:41                                     ` Paul Davis
  2005-01-07 16:03                                       ` Arjan van de Ven
@ 2005-01-07 16:03                                       ` Martin Mares
  2005-01-07 16:22                                         ` Paul Davis
  1 sibling, 1 reply; 266+ messages in thread
From: Martin Mares @ 2005-01-07 16:03 UTC (permalink / raw)
  To: Paul Davis
  Cc: Arjan van de Ven, Christoph Hellwig, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> >yes; most distributions will use pam for this, you can set per user or per
> >gorup limits there.
> 
> isn't that a uid/gid based system? ok, i'm being a little snide :)

:)  The big difference between this and a pure uid/gid based system is that
pam_limits is not the only place where you can change the ulimits. If your
system is simple enough that deciding on uid/gid is enough, you can use
pam_limits; if not and you for example want to make the limits depend
on the phase of the moon, it's easy to do so -- just write a simple user space
program which will set the limits accordingly. Also, if the user wishes to
restrict his abilities, because he's going to do some experiment and he
doesn't want to lock up the machine, he can easily do so.

Except for filesystem permissions, I think that it's exactly the usual UNIX
way of controlling access -- the kernel takes care of access checks based
on some trivial attributes like ulimits and capabilities, and user space
decides who should get which. I don't see any reason why the right to use
realtime scheduling should be treated differently. Do you?

It's quite probable that the current system of capabilities is not well
suited for this, but I think that although it's tempting to work around it
by introducing a new security module, in the long term it's much better
to extend and/or fix the capabilities -- I don't see any fundamental reason
for capabilities being unusable for this goal, it's much more likely to be
just minor details in the implementation.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Always remember that you are absolutely unique ... just like everyone else.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:26                                 ` Paul Davis
@ 2005-01-07 16:08                                   ` Martin Mares
  2005-01-07 16:14                                     ` Paul Davis
  2005-01-07 17:53                                   ` Chris Wright
  1 sibling, 1 reply; 266+ messages in thread
From: Martin Mares @ 2005-01-07 16:08 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> Olaf:
> -----
> Capabilities don't work, because of missing filesystem
> capabilities. If you have them, it's a question of setting the
> appropriate permitted, inheritable and effective capability sets.

Sure, filesystem capabilities would be nice, but for the stuff Paul
mentions they aren't needed -- what you need is to grant capabilities
to the user's session, which can be easily done by a PAM module.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"C++: an octopus made by nailing extra legs onto a dog." -- Steve Taylor

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:08                                   ` Martin Mares
@ 2005-01-07 16:14                                     ` Paul Davis
  2005-01-07 16:29                                       ` Martin Mares
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 16:14 UTC (permalink / raw)
  To: Martin Mares
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

>Sure, filesystem capabilities would be nice, but for the stuff Paul
>mentions they aren't needed -- what you need is to grant capabilities
>to the user's session, which can be easily done by a PAM module.

i think this is true only if the kernel comes with capabilities
enabled.

various media-centric distributions (CCRMA, demudi, dyne:bolic and
others) enabled them for their 2.4 kernels, but not the major
desktop-centric ones. then the impression began to be received that in
2.6, capabilities were even more questionable of a mechanism to use.
In addition, the LSM system appeared, and seemed to offer a much
better solution entirely: no need to patch the kernel at all, or at
least it appeared to be so in the beginning. Hence the "realtime" LSM.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:03                                       ` Arjan van de Ven
@ 2005-01-07 16:20                                         ` Takashi Iwai
  2005-01-08  5:36                                           ` Con Kolivas
  2005-01-07 16:20                                         ` Paul Davis
  1 sibling, 1 reply; 266+ messages in thread
From: Takashi Iwai @ 2005-01-07 16:20 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Paul Davis, Christoph Hellwig, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

At Fri, 7 Jan 2005 17:03:51 +0100,
Arjan van de Ven wrote:
> 
> On Fri, Jan 07, 2005 at 10:41:40AM -0500, Paul Davis wrote:
> > 
> > fine, so the mlock situation may have improved enough post-2.6.9 that
> > it can be considered fixed. that leaves the scheduler issue. but
> > apparently, a uid/gid solution is OK for mlock, and not for the
> > scheduler. am i missing something?
> 
> I think you skipped a step. You don't have a scheduler requirement, you have
> a latency requirement. You currently *solve* that latency requirement via a
> scheduler "hack", yet is quite clear that the "hard" realtime solution is
> most likely not the right approach. Note that I'm not saying that you
> shouldn't get the latency that that currently provides, but the downsides
> (can hang the machine) are bad; a solution that solves that would be far
> preferable
> something like a soft realtime flag that acts as if it's the hard realtime
> one unless the app shows "misbehavior" (eg eats its timeslice for X times in
> a row) might for example be such a solution. And with the anti abuse
> protection it can run with far lighter privilegs.

This reminds me about the soft-RT patch posted quite sometime ago.
I feel such a handy psuedo-RT scheduler class would be useful for
other systems than JACK, too...


Takashi

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:03                                       ` Arjan van de Ven
  2005-01-07 16:20                                         ` Takashi Iwai
@ 2005-01-07 16:20                                         ` Paul Davis
  2005-01-07 21:12                                           ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 16:20 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Christoph Hellwig, Lee Revell, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

>On Fri, Jan 07, 2005 at 10:41:40AM -0500, Paul Davis wrote:
>> 
>> fine, so the mlock situation may have improved enough post-2.6.9 that
>> it can be considered fixed. that leaves the scheduler issue. but
>> apparently, a uid/gid solution is OK for mlock, and not for the
>> scheduler. am i missing something?
>
>I think you skipped a step. You don't have a scheduler requirement, you have
>a latency requirement. You currently *solve* that latency requirement via a
>scheduler "hack", yet is quite clear that the "hard" realtime solution is
>most likely not the right approach. Note that I'm not saying that you

Why is that clear? In just about every respect, realtime audio has the
same characteristics as hard realtime, except that nobody gets hurt
when a deadline is missed :) We have an IRQ source, and a deadline
(sometimes on the sub-msec range, but more typically 1-5msec) for the
work that has to be done. This deadline is tight enough that the task
essentially *has* to run with SCHED_FIFO scheduling, because doing
almost anything else instead will cause the deadline to be missed. 

>shouldn't get the latency that that currently provides, but the downsides
>(can hang the machine) are bad; a solution that solves that would be far
>preferable

OS X's deadline scheduler is arguably better, though I don't believe
it can actually offer the guarantees it claims to with 100%
reliability. But they are essentially do hard realtime via deadline
scheduling, combined with a task killer for any RT task that exceeds
its stated cycle consumption.

To do that in Linux would be great, but its really an addition to the
current scheduling mechanisms, not a replacement. The OS X realtime
task (its actually a Mach RT thread, to be more precise) can still
theoretically cause DOS *if* the kernel task killer was not present,
so its just the task killer that would be needed, presumably driven by
the timer interrupt.

>something like a soft realtime flag that acts as if it's the hard realtime
>one unless the app shows "misbehavior" (eg eats its timeslice for X times in
>a row) might for example be such a solution. And with the anti abuse
>protection it can run with far lighter privilegs.

i guess we're suggesting almost the same thing, except that i consider
this to be hard realtime plus a task killer, not "soft realtime
pretending to be hard realtime" :)

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:03                                       ` Martin Mares
@ 2005-01-07 16:22                                         ` Paul Davis
  2005-01-08 13:04                                           ` Paul Jakma
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 16:22 UTC (permalink / raw)
  To: Martin Mares
  Cc: Arjan van de Ven, Christoph Hellwig, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

>It's quite probable that the current system of capabilities is not well
>suited for this, but I think that although it's tempting to work around it
>by introducing a new security module, in the long term it's much better
>to extend and/or fix the capabilities -- I don't see any fundamental reason
>for capabilities being unusable for this goal, it's much more likely to be
>just minor details in the implementation.

capabilities work - we use them in 2.4 where a helper suid application
gets the ball rolling, and then its child grants capabilities to new
clients. 

the problem we have with capabilities is that capabilities are not
enabled by default in the vanilla kernel, and there seems to be
considerable advice suggesting that they should not be enabled.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:14                                     ` Paul Davis
@ 2005-01-07 16:29                                       ` Martin Mares
  2005-01-07 16:36                                         ` Paul Davis
  2005-01-07 16:37                                         ` Takashi Iwai
  0 siblings, 2 replies; 266+ messages in thread
From: Martin Mares @ 2005-01-07 16:29 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> i think this is true only if the kernel comes with capabilities
> enabled.
> 
> various media-centric distributions (CCRMA, demudi, dyne:bolic and
> others) enabled them for their 2.4 kernels, but not the major
> desktop-centric ones. then the impression began to be received that in
> 2.6, capabilities were even more questionable of a mechanism to use.
> In addition, the LSM system appeared, and seemed to offer a much
> better solution entirely: no need to patch the kernel at all, or at
> least it appeared to be so in the beginning. Hence the "realtime" LSM.

Yes, but is there really some difference between people having to enable
LSM and add a new LSM module, and people recompiling the kernel to include
capabilities?

Also, is somebody really shipping 2.4 kernels without capabilities?
I'm unable to find any such config switch in 2.4.28 -- maybe it's because
I'm almost sleeping now, but it doesn't seem to be there.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
return(ECRAY); /* Program exited before being run */

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:29                                       ` Martin Mares
@ 2005-01-07 16:36                                         ` Paul Davis
  2005-01-07 17:06                                           ` Martin Mares
  2005-01-07 16:37                                         ` Takashi Iwai
  1 sibling, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-07 16:36 UTC (permalink / raw)
  To: Martin Mares
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

>Yes, but is there really some difference between people having to enable
>LSM and add a new LSM module, and people recompiling the kernel to include
>capabilities?

Well, one is configuration issue, the other involves hacking the
kernel headers before recompiling. Maybe you and I might not seem much
difference, but many people would. One of them says "the kernel gang
think this is OK to use if you want to", the other one says "err, you
can do this but don't call me if it goes wrong".

>Also, is somebody really shipping 2.4 kernels without capabilities?
>I'm unable to find any such config switch in 2.4.28 -- maybe it's because
>I'm almost sleeping now, but it doesn't seem to be there.

They are present but disabled by default. You have to hack the initial
values of CAP_INIT_EFF_SET and CAP_INIT_IHN_SET.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:29                                       ` Martin Mares
  2005-01-07 16:36                                         ` Paul Davis
@ 2005-01-07 16:37                                         ` Takashi Iwai
  2005-01-07 16:41                                           ` Martin Mares
  1 sibling, 1 reply; 266+ messages in thread
From: Takashi Iwai @ 2005-01-07 16:37 UTC (permalink / raw)
  To: Martin Mares
  Cc: Paul Davis, Christoph Hellwig, Arjan van de Ven, Lee Revell,
	Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

At Fri, 7 Jan 2005 17:29:02 +0100,
Martin Mares wrote:
> 
> Hello!
> 
> > i think this is true only if the kernel comes with capabilities
> > enabled.
> > 
> > various media-centric distributions (CCRMA, demudi, dyne:bolic and
> > others) enabled them for their 2.4 kernels, but not the major
> > desktop-centric ones. then the impression began to be received that in
> > 2.6, capabilities were even more questionable of a mechanism to use.
> > In addition, the LSM system appeared, and seemed to offer a much
> > better solution entirely: no need to patch the kernel at all, or at
> > least it appeared to be so in the beginning. Hence the "realtime" LSM.
> 
> Yes, but is there really some difference between people having to enable
> LSM and add a new LSM module, and people recompiling the kernel to include
> capabilities?

For distributors, it's much easier to provide an additional module
than to let people recompile kernels.


Takashi

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-03 14:15   ` Arjan van de Ven
@ 2005-01-07 16:40     ` Lee Revell
  0 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-07 16:40 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Christoph Hellwig, linux-kernel, Andrew Morton, Ingo Molnar,
	Jack O'Quin, Paul Davis

[added Paul to cc:]

On Mon, 2005-01-03 at 15:15 +0100, Arjan van de Ven wrote:
> On Mon, 2005-01-03 at 14:03 +0000, Christoph Hellwig wrote:
> > On Wed, Dec 29, 2004 at 09:43:22PM -0500, Lee Revell wrote:
> > > The realtime LSM has been previously explained on this list.  Its
> > > function is to allow selected nonroot users to run RT tasks.  The most
> > > common application is low latency audio with JACK, http://jackit.sf.net.
> > > 
> > 
> > This is far too specialized.  And option to the capability LSM to grant 
> > capabilities to certain uids/gids sounds like the better choise - and
> > would also allow to get rid of the magic hugetlb uid horrors.
> those can go away anyway now that there is an rlimit to achieve the
> exact same thing.....
> 
> I can see the point of making an rlimit like thing instead for both the
> nice levels allowed and maybe the "can do rt" bit
> 

How about a "max RT prio" rlimit, that defaults to -1 (can't do RT).
Set it to 90 or something for audio users so you can still run a higher
prio watchdog thread.

Lee



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:37                                         ` Takashi Iwai
@ 2005-01-07 16:41                                           ` Martin Mares
  0 siblings, 0 replies; 266+ messages in thread
From: Martin Mares @ 2005-01-07 16:41 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Paul Davis, Christoph Hellwig, Arjan van de Ven, Lee Revell,
	Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> > Yes, but is there really some difference between people having to enable
> > LSM and add a new LSM module, and people recompiling the kernel to include
> > capabilities?
> 
> For distributors, it's much easier to provide an additional module
> than to let people recompile kernels.

Well, if LSM is enabled in the kernel, enabling capabilities should be
a single insmod, shouldn't it?

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
The better the better, the better the bet.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:36                                         ` Paul Davis
@ 2005-01-07 17:06                                           ` Martin Mares
  2005-01-07 17:29                                             ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Martin Mares @ 2005-01-07 17:06 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> They are present but disabled by default. You have to hack the initial
> values of CAP_INIT_EFF_SET and CAP_INIT_IHN_SET.

Oops. Does anybody know why this has been done?

Also, it seems that it has a relatively easy work-around: boot with
init=/sbin/simple-wrapper and let the wrapper set the cap_bset and exec real
init. (I agree that it's a hack, but a temporarily usable one.)

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"When I was a boy I was told that anybody could become President; I'm beginning to believe it." -- C. Darrow

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 17:06                                           ` Martin Mares
@ 2005-01-07 17:29                                             ` Chris Wright
  2005-01-07 17:32                                               ` Martin Mares
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-07 17:29 UTC (permalink / raw)
  To: Martin Mares
  Cc: Paul Davis, Christoph Hellwig, Arjan van de Ven, Lee Revell,
	Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

* Martin Mares (mj@ucw.cz) wrote:
> Hello!
> 
> > They are present but disabled by default. You have to hack the initial
> > values of CAP_INIT_EFF_SET and CAP_INIT_IHN_SET.
> 
> Oops. Does anybody know why this has been done?

Yes, SETPCAP became a gaping security hole.  Recall the sendmail hole.

> Also, it seems that it has a relatively easy work-around: boot with
> init=/sbin/simple-wrapper and let the wrapper set the cap_bset and exec real
> init. (I agree that it's a hack, but a temporarily usable one.)

This won't work, you can't increase the bset, which is hardcoded to
leave out SETPCAP.  Also, init is hard coded to start without SETPCAP.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 17:29                                             ` Chris Wright
@ 2005-01-07 17:32                                               ` Martin Mares
  2005-01-07 17:38                                                 ` Chris Wright
  2005-01-07 19:55                                                 ` Jack O'Quin
  0 siblings, 2 replies; 266+ messages in thread
From: Martin Mares @ 2005-01-07 17:32 UTC (permalink / raw)
  To: Chris Wright
  Cc: Paul Davis, Christoph Hellwig, Arjan van de Ven, Lee Revell,
	Ingo Molnar, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

Hello!

> Yes, SETPCAP became a gaping security hole.  Recall the sendmail hole.

Hmmm, I don't remember now, could you give me some pointer, please?

> This won't work, you can't increase the bset, which is hardcoded to
> leave out SETPCAP.  Also, init is hard coded to start without SETPCAP.

If I read the source correctly, init is allowed to increase the bset,
the other processes aren't.

				Have a nice fortnight
-- 
Martin `MJ' Mares   <mj@ucw.cz>   http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
American patent law: two monkeys, fourteen days.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 17:32                                               ` Martin Mares
@ 2005-01-07 17:38                                                 ` Chris Wright
  2005-01-07 19:55                                                 ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-07 17:38 UTC (permalink / raw)
  To: Martin Mares
  Cc: Chris Wright, Paul Davis, Christoph Hellwig, Arjan van de Ven,
	Lee Revell, Ingo Molnar, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

* Martin Mares (mj@ucw.cz) wrote:
> Hello!
> 
> > Yes, SETPCAP became a gaping security hole.  Recall the sendmail hole.
> 
> Hmmm, I don't remember now, could you give me some pointer, please?

Sure, the Wagner/Chen paper on setuid demystified has some references to
it IIRC.  http://www.cs.ucdavis.edu/~hchen/paper/usenix02.ps

> > This won't work, you can't increase the bset, which is hardcoded to
> > leave out SETPCAP.  Also, init is hard coded to start without SETPCAP.
> 
> If I read the source correctly, init is allowed to increase the bset,
> the other processes aren't.

Yes, you're right I forgot about that.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 15:26                                 ` Paul Davis
  2005-01-07 16:08                                   ` Martin Mares
@ 2005-01-07 17:53                                   ` Chris Wright
  1 sibling, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-07 17:53 UTC (permalink / raw)
  To: Paul Davis
  Cc: Christoph Hellwig, Arjan van de Ven, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

* Paul Davis (paul@linuxaudiosystems.com) wrote:
> So, we have a few responses, some references to various potential
> solutions all of which have problems just as deep if not deeper than
> the uid/gid-based model that this particular LSM adopts. No proposal
> for any system that would actually work and address anyone's real
> needs in a useful way.

I don't think that's quite true.  One repeated recommendation was to
simply generalize the idea so that it applies to all capabilities.
Another, which at this point appears quite workable, was Arjan's
recommendation to make scheduling policy/priority protected by an rlimit
(complicated only by representing the combinations sanely in a single
number).

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 14:26                           ` Arjan van de Ven
  2005-01-07 14:38                             ` Paul Davis
@ 2005-01-07 18:01                             ` Chris Wright
  1 sibling, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-07 18:01 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Paul Davis, Christoph Hellwig, Lee Revell, Ingo Molnar,
	Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

* Arjan van de Ven (arjanv@redhat.com) wrote:
> eh no. It defaults to zero, but if you increase it for a specific user, that
> user is allowed to mlock more.

Actually, I think it defaults to 32k to keep gpg happy (at least in
mainline) ;-)

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 17:32                                               ` Martin Mares
  2005-01-07 17:38                                                 ` Chris Wright
@ 2005-01-07 19:55                                                 ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-07 19:55 UTC (permalink / raw)
  To: Martin Mares
  Cc: Chris Wright, Paul Davis, Christoph Hellwig, Arjan van de Ven,
	Lee Revell, Ingo Molnar, Alan Cox, Linux Kernel Mailing List,
	Andrew Morton

Martin Mares <mj@ucw.cz> writes:

>> Yes, SETPCAP became a gaping security hole.  Recall the sendmail hole.
>
> Hmmm, I don't remember now, could you give me some pointer, please?

I already did that...

> Jack O'Quin wrote:
> > The biggest problem was CAP_SETPCAP, which for good reasons[1] is
> > disabled in distributed kernels.  This forced every user to patch and
> > build a custom kernel.  Worse, it opened all our systems up to the
> > problems reported by this sendmail security advisory.

 [1] http://www.securiteam.com/unixfocus/5KQ040A1RI.html

-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07  5:54               ` Jack O'Quin
@ 2005-01-07 20:02                 ` Matt Mackall
  2005-01-07 20:21                   ` Chris Wright
                                     ` (2 more replies)
  0 siblings, 3 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-07 20:02 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Alan Cox, Andreas Steinmetz, Lee Revell, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

On Thu, Jan 06, 2005 at 11:54:05PM -0600, Jack O'Quin wrote:
> Note that sched_setschedule() provides no way to handle the mlock()
> requirement, which cannot be done from another process.

I'm pretty sure that part can be done by a privileged server handing
out mlocked shared memory segments.

The trouble with introducing something into the kernel is that once
done, it can't be undone. So you're absolutely going to meet
resistance to anything that can be a) done sufficiently in userspace
or b) can reasonably be done in a more generic manner so as to meet
the needs of a wider future audience. The onus is on the submitter to
meet these requirements because we can't easily kick out a broken API
after we accept it.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07  1:55                   ` Alan Cox
@ 2005-01-07 20:05                     ` Matt Mackall
  0 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-07 20:05 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, ast, rlrevell, Linux Kernel Mailing List, mingo, joq

On Fri, Jan 07, 2005 at 01:55:09AM +0000, Alan Cox wrote:
> On Gwe, 2005-01-07 at 01:13, Matt Mackall wrote:
> > You can't fix them without changing the semantics for existing users
> > in ways they didn't expect. It could be done with a new personality flag,
> > but..
> 
> I disagree. At the most trivial you could just add another 32bits of
> sticky capability that are never touched by setuid/non-setuidness and
> represent additional "user" (or more rightly session) abilities to do
> limited overrides

I think we're referring to different brokenness. The problems I see
are with the semantics of inheritance of capabilities which make
wrapping applications painful. Those can't be changed without creating
holes in existing apps so the general utility of caps is limited.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:02                 ` Matt Mackall
@ 2005-01-07 20:21                   ` Chris Wright
  2005-01-07 20:27                   ` Jack O'Quin
  2005-01-07 20:45                   ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-07 20:21 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Alan Cox, Andreas Steinmetz, Lee Revell,
	Chris Wright, Linux Kernel Mailing List, Andrew Morton,
	Ingo Molnar, LAD mailing list

* Matt Mackall (mpm@selenic.com) wrote:
> On Thu, Jan 06, 2005 at 11:54:05PM -0600, Jack O'Quin wrote:
> > Note that sched_setschedule() provides no way to handle the mlock()
> > requirement, which cannot be done from another process.
> 
> I'm pretty sure that part can be done by a privileged server handing
> out mlocked shared memory segments.

It can actually be done with plain ol' rlimits (RLIMIT_MEMLOCK).

> The trouble with introducing something into the kernel is that once
> done, it can't be undone. So you're absolutely going to meet
> resistance to anything that can be a) done sufficiently in userspace
> or b) can reasonably be done in a more generic manner so as to meet
> the needs of a wider future audience. The onus is on the submitter to
> meet these requirements because we can't easily kick out a broken API
> after we accept it.

Indeed (although in this case it's not adding an API as much as using an
existing one).

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:02                 ` Matt Mackall
  2005-01-07 20:21                   ` Chris Wright
@ 2005-01-07 20:27                   ` Jack O'Quin
  2005-01-07 20:46                     ` Matt Mackall
  2005-01-07 20:45                   ` Lee Revell
  2 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-07 20:27 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Alan Cox, Andreas Steinmetz, Lee Revell, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

Matt Mackall <mpm@selenic.com> writes:

> On Thu, Jan 06, 2005 at 11:54:05PM -0600, Jack O'Quin wrote:
>> Note that sched_setschedule() provides no way to handle the mlock()
>> requirement, which cannot be done from another process.
>
> I'm pretty sure that part can be done by a privileged server handing
> out mlocked shared memory segments.

If you're "pretty sure", please explain how locking a shared memory
segment prevents the code and stack of the client's realtime thread
from page faulting.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:02                 ` Matt Mackall
  2005-01-07 20:21                   ` Chris Wright
  2005-01-07 20:27                   ` Jack O'Quin
@ 2005-01-07 20:45                   ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-07 20:45 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Alan Cox, Andreas Steinmetz, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

On Fri, 2005-01-07 at 12:02 -0800, Matt Mackall wrote:
> The trouble with introducing something into the kernel is that once
> done, it can't be undone. So you're absolutely going to meet
> resistance to anything that can be a) done sufficiently in userspace
> or b) can reasonably be done in a more generic manner so as to meet
> the needs of a wider future audience. The onus is on the submitter to
> meet these requirements because we can't easily kick out a broken API
> after we accept it.

For a big subsystem that exposes an API, you would be right.  But this
is a *really* simple problem, all you need is a way to tell it who gets
RT privileges, which means uid or gid.  So any future solution will be
orthogonal to this one, and when users upgrade even a not very smart
Perl script will be able to migrate the configuration.  How many
different ways are there to say "these are the non-root users who have
realtime prvileges", anyway?

Unless, of course, the solution that's eventually merged is *really*
overcomplicated by comparison, in which case users will (rightly) reject
it, and the system will have worked.

Lee 




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:27                   ` Jack O'Quin
@ 2005-01-07 20:46                     ` Matt Mackall
  2005-01-07 20:55                       ` Lee Revell
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-07 20:46 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Alan Cox, Andreas Steinmetz, Lee Revell, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

On Fri, Jan 07, 2005 at 02:27:26PM -0600, Jack O'Quin wrote:
> Matt Mackall <mpm@selenic.com> writes:
> 
> > On Thu, Jan 06, 2005 at 11:54:05PM -0600, Jack O'Quin wrote:
> >> Note that sched_setschedule() provides no way to handle the mlock()
> >> requirement, which cannot be done from another process.
> >
> > I'm pretty sure that part can be done by a privileged server handing
> > out mlocked shared memory segments.
> 
> If you're "pretty sure", please explain how locking a shared memory
> segment prevents the code and stack of the client's realtime thread
> from page faulting.

You just map your RT-dependent routine (PIC, of course) into the
segment and move your stack pointer into a second segment. I didn't
say it was easy, but it's all just bits. There's also the rlimit
issue.

Or, going the other way, the client app can pass map handles to the
server to bless. Some juggling might be involved but it's obviously
doable.

As has been pointed out, an rlimit solution exists now as well.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:46                     ` Matt Mackall
@ 2005-01-07 20:55                       ` Lee Revell
  2005-01-07 21:20                         ` Matt Mackall
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-07 20:55 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Alan Cox, Andreas Steinmetz, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

On Fri, 2005-01-07 at 12:46 -0800, Matt Mackall wrote:
> You just map your RT-dependent routine (PIC, of course) into the
> segment and move your stack pointer into a second segment. I didn't
> say it was easy, but it's all just bits. There's also the rlimit
> issue.
> 
> Or, going the other way, the client app can pass map handles to the
> server to bless. Some juggling might be involved but it's obviously
> doable.
> 

Christ, what a nightmare!  Since when does "obviously doable" mean it's
a good idea?  Please, reread your above statements, then go back and
look at the realtime LSM patch (it's less than 200 lines), and tell me
again that your way is more secure.

Please keep in mind that there are already 1000s of users using the
realtime LSM to do audio work.  Sorry, but I will take a known good,
well understood, PROVEN solution over "it's obviously doable, it's all
bits anyway".  Get back to me when you have some code, or at least some
reasonable suggestions as Alan, Christoph and others have made.

> As has been pointed out, an rlimit solution exists now as well.

Wrong, as was said repeatedly, rlimits only help with mlock!  Have you
even been reading the thread?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:20                                         ` Paul Davis
@ 2005-01-07 21:12                                           ` Lee Revell
  2005-01-07 21:49                                             ` Andrew Morton
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-07 21:12 UTC (permalink / raw)
  To: Paul Davis
  Cc: Arjan van de Ven, Christoph Hellwig, Ingo Molnar, Chris Wright,
	Alan Cox, Jack O'Quin, Linux Kernel Mailing List,
	Andrew Morton

On Fri, 2005-01-07 at 11:20 -0500, Paul Davis wrote:
> >On Fri, Jan 07, 2005 at 10:41:40AM -0500, Paul Davis wrote:
> >> 
> >> fine, so the mlock situation may have improved enough post-2.6.9 that
> >> it can be considered fixed. that leaves the scheduler issue. but
> >> apparently, a uid/gid solution is OK for mlock, and not for the
> >> scheduler. am i missing something?
> >
> >I think you skipped a step. You don't have a scheduler requirement, you have
> >a latency requirement. You currently *solve* that latency requirement via a
> >scheduler "hack", yet is quite clear that the "hard" realtime solution is
> >most likely not the right approach. Note that I'm not saying that you
> 
> Why is that clear? In just about every respect, realtime audio has the
> same characteristics as hard realtime, except that nobody gets hurt
> when a deadline is missed :) We have an IRQ source, and a deadline
> (sometimes on the sub-msec range, but more typically 1-5msec) for the
> work that has to be done. This deadline is tight enough that the task
> essentially *has* to run with SCHED_FIFO scheduling, because doing
> almost anything else instead will cause the deadline to be missed. 
> 

It's not like hard realtime, it is.  All that makes a hard RT system is
that missing a deadline means the system has utterly failed.  How is
this any different than an xrun causing a loud pop or click in a live
performance?

Really, I think Linux has owned the server space for so long that some
folks on this list are getting hubristic.  Just because you have the
best server OS does not mean it's the best at everything.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 20:55                       ` Lee Revell
@ 2005-01-07 21:20                         ` Matt Mackall
  2005-01-07 21:29                           ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-07 21:20 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jack O'Quin, Alan Cox, Andreas Steinmetz, Chris Wright,
	Linux Kernel Mailing List, Andrew Morton, Ingo Molnar,
	LAD mailing list

On Fri, Jan 07, 2005 at 03:55:12PM -0500, Lee Revell wrote:
> On Fri, 2005-01-07 at 12:46 -0800, Matt Mackall wrote:
> > You just map your RT-dependent routine (PIC, of course) into the
> > segment and move your stack pointer into a second segment. I didn't
> > say it was easy, but it's all just bits. There's also the rlimit
> > issue.
> > 
> > Or, going the other way, the client app can pass map handles to the
> > server to bless. Some juggling might be involved but it's obviously
> > doable.
> > 
> 
> Christ, what a nightmare!  Since when does "obviously doable" mean it's
> a good idea?  Please, reread your above statements, then go back and
> look at the realtime LSM patch (it's less than 200 lines), and tell me
> again that your way is more secure.

My way simply proves that existing userspace methods have not been
exhausted. It's not impossible as was claimed and cleaner methods or
nicely wrapped variants of the above probably exist. And yes, doing
ugly things in userspace is preferable to adding application-specific
baggage to the kernel.

> > As has been pointed out, an rlimit solution exists now as well.
> 
> Wrong, as was said repeatedly, rlimits only help with mlock!  Have you
> even been reading the thread?

Feh. The RT scheduling class issue is orthogonal. Addressing mlock and
scheduling class at once (and nothing else) is actually an ugliness of
your LSM approach as there are folks who want mlock and not RT.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:20                         ` Matt Mackall
@ 2005-01-07 21:29                           ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-07 21:29 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Lee Revell, Jack O'Quin, Alan Cox, Andreas Steinmetz,
	Chris Wright, Linux Kernel Mailing List, Andrew Morton,
	Ingo Molnar, LAD mailing list

* Matt Mackall (mpm@selenic.com) wrote:
> Feh. The RT scheduling class issue is orthogonal. Addressing mlock and
> scheduling class at once (and nothing else) is actually an ugliness of
> your LSM approach as there are folks who want mlock and not RT.

Last I checked they could be controlled separately in that module.  It
has been suggested (by me and others) that one possible solution would
be to expand it to be generic for all caps.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:12                                           ` Lee Revell
@ 2005-01-07 21:49                                             ` Andrew Morton
  2005-01-07 22:07                                               ` Valdis.Kletnieks
                                                                 ` (3 more replies)
  0 siblings, 4 replies; 266+ messages in thread
From: Andrew Morton @ 2005-01-07 21:49 UTC (permalink / raw)
  To: Lee Revell; +Cc: paul, arjanv, hch, mingo, chrisw, alan, joq, linux-kernel

Lee Revell <rlrevell@joe-job.com> wrote:
>
> Really, I think Linux has owned the server space for so long that some
> folks on this list are getting hubristic.  Just because you have the
> best server OS does not mean it's the best at everything.

nah, the requirement is clearly valid, and longstanding.  We need to
satisfy it.  It's just a matter of working out the best way.

Chris Wright <chrisw@osdl.org> wrote:
>
> ...
> Last I checked they could be controlled separately in that module.  It
> has been suggested (by me and others) that one possible solution would
> be to expand it to be generic for all caps.

Maybe this is the way?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:49                                             ` Andrew Morton
@ 2005-01-07 22:07                                               ` Valdis.Kletnieks
  2005-01-07 22:36                                                 ` Chris Wright
  2005-01-07 22:10                                               ` Christoph Hellwig
                                                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 266+ messages in thread
From: Valdis.Kletnieks @ 2005-01-07 22:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Lee Revell, paul, arjanv, hch, mingo, chrisw, alan, joq, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 927 bytes --]

On Fri, 07 Jan 2005 13:49:41 PST, Andrew Morton said:

> Chris Wright <chrisw@osdl.org> wrote:

> > Last I checked they could be controlled separately in that module.  It
> > has been suggested (by me and others) that one possible solution would
> > be to expand it to be generic for all caps.
> 
> Maybe this is the way?

We already *know* how to (in principle) fix the capabilities system to make
it useful.  We should probably investigate doing that and at the same time
fixing the current CAP_SYS_ADMIN mess (which we also have at least some ideas
on fixing). The remaining problem is possible breakage of software that's doing
capability things The Old Way (as the inheritance rules are incompatible).

Linus at one time said that a 2.7 might open if there was some issue that
caused enough disruption to require a fork - could this be it, or does somebody
have a better way to address the backward-combatability problem?

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:49                                             ` Andrew Morton
  2005-01-07 22:07                                               ` Valdis.Kletnieks
@ 2005-01-07 22:10                                               ` Christoph Hellwig
  2005-01-07 22:26                                                 ` Paul Davis
                                                                   ` (2 more replies)
  2005-01-07 22:22                                               ` Paul Davis
  2005-01-07 22:44                                               ` Andreas Steinmetz
  3 siblings, 3 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-07 22:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Lee Revell, paul, arjanv, hch, mingo, chrisw, alan, joq, linux-kernel

On Fri, Jan 07, 2005 at 01:49:41PM -0800, Andrew Morton wrote:
> Chris Wright <chrisw@osdl.org> wrote:
> >
> > ...
> > Last I checked they could be controlled separately in that module.  It
> > has been suggested (by me and others) that one possible solution would
> > be to expand it to be generic for all caps.
> 
> Maybe this is the way?

It's at least not as bad as the current hack (when properly done in
the capabilities modules instead of adding one ontop).

I must say I'm not exactly happy with that idea still.  It ties the
privilegues we have been separating from a special uid (0) to filesystem
permissions again.  It's not nessecarily a bad idea per, but it doesn't
really fit into the model we've been working to.  I'd expect quite a few
unpleasant devices when a user detects that the distibution had been
binding various capabilities to uids/gids behinds his back.

So to make forward progress I'd like the audio people to confirm whether
the mlock bits in 2.6.9+ do help that half of their requirement first
(and if not find a way to fix it) and then tackle the scheduling part.
For that one I really wonder whether the combination of the now actually
working nicelevels (see Mingo's post) and a simple wrapper for the really
high requirements cases doesn't work.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:49                                             ` Andrew Morton
  2005-01-07 22:07                                               ` Valdis.Kletnieks
  2005-01-07 22:10                                               ` Christoph Hellwig
@ 2005-01-07 22:22                                               ` Paul Davis
  2005-01-07 22:44                                               ` Andreas Steinmetz
  3 siblings, 0 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-07 22:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Lee Revell, arjanv, hch, mingo, chrisw, alan, joq, linux-kernel

>> Last I checked they could be controlled separately in that module.  It
>> has been suggested (by me and others) that one possible solution would
>> be to expand it to be generic for all caps.
>
>Maybe this is the way?

that would make a much more complex LSM, and thus opens the doors to
some inadvertent security hazard that doesn't arise in the simpler
tool we have now. 

other than that, its not a terrible suggestion at all, just a lot, lot
more work.

--p



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:10                                               ` Christoph Hellwig
@ 2005-01-07 22:26                                                 ` Paul Davis
  2005-01-07 22:29                                                 ` Chris Wright
  2005-01-07 23:00                                                 ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-07 22:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Lee Revell, arjanv, mingo, chrisw, alan, joq,
	linux-kernel

>So to make forward progress I'd like the audio people to confirm whether
>the mlock bits in 2.6.9+ do help that half of their requirement first

it does, although it would be nicer to not have two separate
components to administering the usability of realtime applications.

>(and if not find a way to fix it) and then tackle the scheduling part.
>For that one I really wonder whether the combination of the now actually
>working nicelevels (see Mingo's post) and a simple wrapper for the really
>high requirements cases doesn't work.

Jack already posted results: the nice levels are massively inferior as
they currently stand.

The wrapper is incredibly inconvenient for applications: when you use
JACK, start clients would require a different command depending on
whether JACK is using RT mode or not. That is extremely inelegant, and
its why we've developed these solutions (caps+jackstart for 2.4,
"realtime" LSM for 2.6).

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:10                                               ` Christoph Hellwig
  2005-01-07 22:26                                                 ` Paul Davis
@ 2005-01-07 22:29                                                 ` Chris Wright
  2005-01-08  6:12                                                   ` Jack O'Quin
  2005-01-07 23:00                                                 ` Lee Revell
  2 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-07 22:29 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, Lee Revell, paul, arjanv,
	mingo, chrisw, alan, joq, linux-kernel

* Christoph Hellwig (hch@infradead.org) wrote:
> On Fri, Jan 07, 2005 at 01:49:41PM -0800, Andrew Morton wrote:
> > Chris Wright <chrisw@osdl.org> wrote:
> > >
> > > ...
> > > Last I checked they could be controlled separately in that module.  It
> > > has been suggested (by me and others) that one possible solution would
> > > be to expand it to be generic for all caps.
> > 
> > Maybe this is the way?
> 
> It's at least not as bad as the current hack (when properly done in
> the capabilities modules instead of adding one ontop).
> 
> I must say I'm not exactly happy with that idea still.  It ties the
> privilegues we have been separating from a special uid (0) to filesystem
> permissions again.  It's not nessecarily a bad idea per, but it doesn't
> really fit into the model we've been working to.  I'd expect quite a few
> unpleasant devices when a user detects that the distibution had been
> binding various capabilities to uids/gids behinds his back.

I agree, it's still a hack, just a generic and complete hack ;-)

> So to make forward progress I'd like the audio people to confirm whether
> the mlock bits in 2.6.9+ do help that half of their requirement first

It sure should, but I guess they can reply on that.

> (and if not find a way to fix it) and then tackle the scheduling part.
> For that one I really wonder whether the combination of the now actually
> working nicelevels (see Mingo's post) and a simple wrapper for the really
> high requirements cases doesn't work.

I saw Jack (I think) post some numbers showing that it wasn't enough.
What about making priority level protected via rlimit?

Here's an uncompiled, untested patch doing that (probably has some math
error or logic hole in it, but idea seems sound enough).  I think it has
at least one problem, where nice 19 process, could renice itself back to
0.  And it doesn't really handle the different scheduling policies,
other than implicit 40 - 139 being used for SCHED_FIFO/SCHED_RR.

It takes the 140 priority levels (0-139), inverts their priority
order, and then uses that number as the basis for the rlimit (so that a
larger rlimit means higher priority, to fall inline with normal rlimit
semantics).  Defaults to 19 (which should be niceval of 0).  And allows
CAP_SYS_NICE to continue to override if the rlimit is too low.

===== kernel/sched.c 1.386 vs edited =====
--- 1.386/kernel/sched.c	2005-01-04 18:48:21 -08:00
+++ edited/kernel/sched.c	2005-01-07 14:23:32 -08:00
@@ -3009,12 +3009,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3024,6 +3020,11 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if ((MAX_PRIO-1) - NICE_TO_PRIO(nice) > 
+	    current->signal->rlim[RLIMIT_PRIO].rlim_cur &&
+	    !capable(CAP_SYS_NICE))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3057,6 +3058,15 @@ int task_nice(const task_t *p)
 }
 
 /**
+ * nice_to_prio - return priority of give nice value
+ * @nice: nice value
+ */
+int nice_to_prio(const int nice)
+{
+	return NICE_TO_PRIO(nice);
+}
+
+/**
  * idle_cpu - is a given cpu idle currently?
  * @cpu: the processor in question.
  */
@@ -3140,6 +3150,7 @@ recheck:
 
 	retval = -EPERM;
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    lp.sched_priority+40 > p->signal->rlim[RLIMIT_PRIO].rlim_cur && 
 	    !capable(CAP_SYS_NICE))
 		goto out_unlock;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
===== kernel/sys.c 1.102 vs edited =====
--- 1.102/kernel/sys.c	2005-01-06 23:25:46 -08:00
+++ edited/kernel/sys.c	2005-01-07 14:13:37 -08:00
@@ -225,7 +225,9 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if ((MAX_PRIO-1) - nice_to_prio(niceval) > 
+	    p->signal->rlim[RLIMIT_PRIO].rlim_cur &&
+	    !capable(CAP_SYS_NICE)) {
 		error = -EACCES;
 		goto out;
 	}
===== include/asm-i386/resource.h 1.5 vs edited =====
--- 1.5/include/asm-i386/resource.h	2004-08-23 01:15:26 -07:00
+++ edited/include/asm-i386/resource.h	2005-01-07 13:55:37 -08:00
@@ -18,8 +18,9 @@
 #define RLIMIT_LOCKS	10		/* maximum file locks held */
 #define RLIMIT_SIGPENDING 11		/* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12		/* maximum bytes in POSIX mqueues */
+#define RLIMIT_PRIO	13		/* maximum scheduling priority */
 
-#define RLIM_NLIMITS	13
+#define RLIM_NLIMITS	14
 
 
 /*
@@ -45,6 +46,7 @@
 	{ RLIM_INFINITY, RLIM_INFINITY },		\
 	{ MAX_SIGPENDING, MAX_SIGPENDING },		\
 	{ MQ_BYTES_MAX, MQ_BYTES_MAX },			\
+	{           19,	            19 },		\
 }
 
 #endif /* __KERNEL__ */
===== include/linux/sched.h 1.280 vs edited =====
--- 1.280/include/linux/sched.h	2005-01-04 18:48:20 -08:00
+++ edited/include/linux/sched.h	2005-01-07 14:14:16 -08:00
@@ -760,6 +760,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern int nice_to_prio(const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:07                                               ` Valdis.Kletnieks
@ 2005-01-07 22:36                                                 ` Chris Wright
  2005-01-07 23:01                                                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-07 22:36 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Andrew Morton, Lee Revell, paul, arjanv, hch, mingo, chrisw,
	alan, joq, linux-kernel

* Valdis.Kletnieks@vt.edu (Valdis.Kletnieks@vt.edu) wrote:
> On Fri, 07 Jan 2005 13:49:41 PST, Andrew Morton said:
> 
> > Chris Wright <chrisw@osdl.org> wrote:
> 
> > > Last I checked they could be controlled separately in that module.  It
> > > has been suggested (by me and others) that one possible solution would
> > > be to expand it to be generic for all caps.
> > 
> > Maybe this is the way?
> 
> We already *know* how to (in principle) fix the capabilities system to make
> it useful.  We should probably investigate doing that and at the same time
> fixing the current CAP_SYS_ADMIN mess (which we also have at least some ideas
> on fixing). The remaining problem is possible breakage of software that's doing
> capability things The Old Way (as the inheritance rules are incompatible).

Fixing CAP_SYS_ADMIN whole other can o' worms.  No point in tangling the
two.

> Linus at one time said that a 2.7 might open if there was some issue that
> caused enough disruption to require a fork - could this be it, or does somebody
> have a better way to address the backward-combatability problem?

There's at least two ways.  Introduce a new capability module or introduce
a PF flag to opt in.  Neither are great

-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 21:49                                             ` Andrew Morton
                                                                 ` (2 preceding siblings ...)
  2005-01-07 22:22                                               ` Paul Davis
@ 2005-01-07 22:44                                               ` Andreas Steinmetz
  3 siblings, 0 replies; 266+ messages in thread
From: Andreas Steinmetz @ 2005-01-07 22:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Lee Revell, paul, arjanv, hch, mingo, chrisw, alan, joq, linux-kernel

Andrew Morton wrote:
> Lee Revell <rlrevell@joe-job.com> wrote:
> 
>>Really, I think Linux has owned the server space for so long that some
>>folks on this list are getting hubristic.  Just because you have the
>>best server OS does not mean it's the best at everything.
> 
> 
> nah, the requirement is clearly valid, and longstanding.  We need to
> satisfy it.  It's just a matter of working out the best way.
> 
> Chris Wright <chrisw@osdl.org> wrote:
> 
>>...
>>Last I checked they could be controlled separately in that module.  It
>>has been suggested (by me and others) that one possible solution would
>>be to expand it to be generic for all caps.
> 
> 
> Maybe this is the way?

This could give an advantage for e.g. networked daemons, too. No more 
root privilege necessary for applications just to bind to a privileged 
port which does make life easier (CAP_NET_BIND_SERVICE). Other ideas for 
e.g. CAP_NET_RAW or CAP_SYS_RAWIO come to mind. Using the current 
capabilties in this design as all incuding supersets that can be defined 
more fine grained in a later step I guess should suit others, too. The 
remaining problem would then be the design of an extensible interface 
that is backwards compatible.

-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:10                                               ` Christoph Hellwig
  2005-01-07 22:26                                                 ` Paul Davis
  2005-01-07 22:29                                                 ` Chris Wright
@ 2005-01-07 23:00                                                 ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-07 23:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, paul, arjanv, mingo, Chris Wright, alan, joq,
	linux-kernel

On Fri, 2005-01-07 at 22:10 +0000, Christoph Hellwig wrote:
> It's not nessecarily a bad idea per, but it doesn't
> really fit into the model we've been working to.  I'd expect quite a few
> unpleasant devices when a user detects that the distibution had been
> binding various capabilities to uids/gids behinds his back.
> 

Point taken, but do keep in mind that this will *certainly* be disabled
by default, unless you run an audio oriented distro, and we assume those
people know what they're doing ;-)

> For that one I really wonder whether the combination of the now actually
> working nicelevels (see Mingo's post)

Ingo said "it should work".  It currently doesn't, as you can see from
Jack's post.  My concern here is, the semantics of SCHED_FIFO are well
defined and stable.  The highest priority runnable SCHED_FIFO process
*always* runs.  The semantics of "nice -20" apparently change from
release to release, as you can see.  We can't have the scheduler
deciding to run something else when jackd needs to run because it
decides jackd is hogging the CPU or whatever.  Everyone knows that when
dealing with realtime constraints the important case is not the average
but the worst.

In a live audio situation an xrun storm and a complete system lockup are
both catastrophic failures.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:36                                                 ` Chris Wright
@ 2005-01-07 23:01                                                   ` Valdis.Kletnieks
  2005-01-07 23:20                                                     ` Andrew Morton
  0 siblings, 1 reply; 266+ messages in thread
From: Valdis.Kletnieks @ 2005-01-07 23:01 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andrew Morton, Lee Revell, paul, arjanv, hch, mingo, alan, joq,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]

On Fri, 07 Jan 2005 14:36:38 PST, Chris Wright said:

> > We already *know* how to (in principle) fix the capabilities system to make
> > it useful.  We should probably investigate doing that and at the same time
> > fixing the current CAP_SYS_ADMIN mess (which we also have at least some ideas
> > on fixing). The remaining problem is possible breakage of software that's doing
> > capability things The Old Way (as the inheritance rules are incompatible).
> 
> Fixing CAP_SYS_ADMIN whole other can o' worms.  No point in tangling the
> two.

Yes, it's two entire cans.  The problem is that in *both* cases, we're probably
going to have to do an API change.  It may be preferable to only require changes
on the userspace side once, rather than change it once to fix the inheritance
problems in 2.7/2.6.N+10 or whatever it will be, and then again in 2.9/2.6.N+20
or whatever....

> > Linus at one time said that a 2.7 might open if there was some issue that
> > caused enough disruption to require a fork - could this be it, or does somebody
> > have a better way to address the backward-combatability problem?
> 
> There's at least two ways.  Introduce a new capability module or introduce
> a PF flag to opt in.  Neither are great

A new PF flag strikes me as marginally better, especially if we have a way to
propogate from Elf headers in a way similar to Execshield's use of elf_ex.e_phnum
to set the executable-stack...

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 23:01                                                   ` Valdis.Kletnieks
@ 2005-01-07 23:20                                                     ` Andrew Morton
  2005-01-07 23:34                                                       ` Valdis.Kletnieks
  2005-01-10 21:05                                                       ` Matt Mackall
  0 siblings, 2 replies; 266+ messages in thread
From: Andrew Morton @ 2005-01-07 23:20 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: chrisw, rlrevell, paul, arjanv, hch, mingo, alan, joq, linux-kernel

Valdis.Kletnieks@vt.edu wrote:
>
> fix the inheritance problems

Does anyone actually have a handle on what's involved in fixing the
inheritance problem?

It's risky, but it is something which we should do.

<grumpytroll> We really shouldn't have merged all that new fancy security
stuff when the existing security framework was known-badly-broken. 
Especially as the new stuff seems incapable of doing simple things which
unbroken inherited caps would do perfectly.</grumpytroll>

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 23:20                                                     ` Andrew Morton
@ 2005-01-07 23:34                                                       ` Valdis.Kletnieks
  2005-01-10 21:05                                                       ` Matt Mackall
  1 sibling, 0 replies; 266+ messages in thread
From: Valdis.Kletnieks @ 2005-01-07 23:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: chrisw, rlrevell, paul, arjanv, hch, mingo, alan, joq, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 360 bytes --]

On Fri, 07 Jan 2005 15:20:04 PST, Andrew Morton said:

> Does anyone actually have a handle on what's involved in fixing the
> inheritance problem?

Andy Lutomirski was looking at that, and it's actually a very small but
incompatible change that allows filesystem support for set-capability files to
be actually usable.  He posted some patches back in May....

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:20                                         ` Takashi Iwai
@ 2005-01-08  5:36                                           ` Con Kolivas
  2005-01-08  6:21                                             ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Con Kolivas @ 2005-01-08  5:36 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Arjan van de Ven, Paul Davis, Christoph Hellwig, Lee Revell,
	Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 1256 bytes --]

Takashi Iwai wrote:
> At Fri, 7 Jan 2005 17:03:51 +0100,
> Arjan van de Ven wrote:
>>something like a soft realtime flag that acts as if it's the hard realtime
>>one unless the app shows "misbehavior" (eg eats its timeslice for X times in
>>a row) might for example be such a solution. And with the anti abuse
>>protection it can run with far lighter privilegs.
> 
> 
> This reminds me about the soft-RT patch posted quite sometime ago.
> I feel such a handy psuedo-RT scheduler class would be useful for
> other systems than JACK, too...

You've already proven that soft RT does not suit your requirements. The 
current scheduler running a task at nice -20 has extremely long periods 
of cpu availability at the expense of lower priority tasks and is close 
to the behaviour you would get with a soft RT patch. Your concern is 
exactly the scenario where nice -20 fails, and would be the same 
scenario where a soft RT policy would fail. Doing this with a scheduling 
policy, you want cpu time long after there is any hope for fairness or 
safety of hanging. From experimentation with such soft RT policies, we 
find average latencies can be reduced but the maximum ones, which are 
the ones that concern professional audio, remain the same.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 22:29                                                 ` Chris Wright
@ 2005-01-08  6:12                                                   ` Jack O'Quin
  2005-01-08 16:56                                                     ` ross
                                                                       ` (3 more replies)
  0 siblings, 4 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-08  6:12 UTC (permalink / raw)
  To: Chris Wright
  Cc: Christoph Hellwig, Andrew Morton, Lee Revell, paul, arjanv,
	mingo, alan, linux-kernel

Chris Wright <chrisw@osdl.org> writes:

> * Christoph Hellwig (hch@infradead.org) wrote:
>> So to make forward progress I'd like the audio people to confirm whether
>> the mlock bits in 2.6.9+ do help that half of their requirement first
>
> It sure should, but I guess they can reply on that.

That does seem to work now (finally).  It looks like that longstanding
CAP_IPC_LOCK bug is finally fixed, too.

I find it hard to understand why some of you think PAM is an adequate
solution.  As currently deployed, it is poorly documented and nearly
impossible for non-experts to administer securely.  On my Debian woody
system, when I login from the console I get one fairly sensible set of
ulimit values, but from gdm I get a much more permissive set (with
ulimited mlocking, BTW).  Apparently, this is because the `gdm' PAM
config includes `session required pam_limits.so' but the system comes
with an empty /etc/security/limits.conf.  I'm just guessing about that
because I can't find any decent documentation for any of this crap.

Remember, if something is difficult to administer, it's *not* secure.

>> (and if not find a way to fix it) and then tackle the scheduling part.
>> For that one I really wonder whether the combination of the now actually
>> working nicelevels (see Mingo's post) and a simple wrapper for the really
>> high requirements cases doesn't work.
>
> I saw Jack (I think) post some numbers showing that it wasn't enough.
> What about making priority level protected via rlimit?

The numbers I reported yesterday were so bad I couldn't figure out why
anyone even thought it was worth trying.  Now I realize why.  

When Ingo said to try "nice -20", I took him literally, forgetting
that the stupid command to achieve a nice value of -20 is `nice --20'.
So I was actually testing with a nice value of 19.  Bah!  No wonder it
sucked.

Running `nice --20' is still significantly worse than SCHED_FIFO, but
not the unmitigated disaster shown in the middle column.  But, this
improved performance is still not adequate for audio work.  The worst
delay was absurdly long (~1/2 sec).

Here are the corrected results...

                                 With -R        Without -R      Without -R
                               (SCHED_FIFO)     (nice -20)      (nice --20)

************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    1)          (    1)          (    1)
XRUN Count  . . . . . . . . . :     2             2837               43
Delay Count (>spare time) . . :     0                0                0
Delay Count (>1000 usecs) . . :     0                0                0
Delay Maximum . . . . . . . . :  3130 usecs    5038044 usecs   501374 usecs
Cycle Maximum . . . . . . . . :   960 usecs      18802 usecs     1036 usecs
Average DSP Load. . . . . . . :    34.3 %           44.1 %         34.3 %    
Average CPU System Load . . . :     8.7 %            7.5 %          7.8 %    
Average CPU User Load . . . . :    29.8 %            5.2 %         25.3 %    
Average CPU Nice Load . . . . :     0.0 %           20.3 %          0.0 %    
Average CPU I/O Wait Load . . :     3.2 %            5.2 %          0.1 %    
Average CPU IRQ Load  . . . . :     0.7 %            0.7 %          0.7 %    
Average CPU Soft-IRQ Load . . :     0.0 %            0.2 %          0.0 %    
Average Interrupt Rate  . . . :  1707.6 /sec      1677.3 /sec    1692.9 /sec 
Average Context-Switch Rate . : 11914.9 /sec     11197.6 /sec   11611.2 /sec 
*********************************************

> Here's an uncompiled, untested patch doing that (probably has some math
> error or logic hole in it, but idea seems sound enough).  I think it has
> at least one problem, where nice 19 process, could renice itself back to
> 0.  And it doesn't really handle the different scheduling policies,
> other than implicit 40 - 139 being used for SCHED_FIFO/SCHED_RR.
>
> It takes the 140 priority levels (0-139), inverts their priority
> order, and then uses that number as the basis for the rlimit (so that a
> larger rlimit means higher priority, to fall inline with normal rlimit
> semantics).  Defaults to 19 (which should be niceval of 0).  And allows
> CAP_SYS_NICE to continue to override if the rlimit is too low.

If you really want to use PAM for everything, then this idea makes a
lot of sense.

But, what about all the other programs that would need updating to
make it useful?  We'd need at least a new pam_limits.so module and a
new shell (since ulimit is built-in).  I expect I will need to
maintain the realtime-lsm for at least another year before all that
can trickle down to actual end users.

-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08  5:36                                           ` Con Kolivas
@ 2005-01-08  6:21                                             ` Jack O'Quin
  0 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-08  6:21 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Takashi Iwai, Arjan van de Ven, Paul Davis, Christoph Hellwig,
	Lee Revell, Ingo Molnar, Chris Wright, Alan Cox,
	Linux Kernel Mailing List, Andrew Morton

Con Kolivas <kernel@kolivas.org> writes:

> Takashi Iwai wrote:
>> At Fri, 7 Jan 2005 17:03:51 +0100,
>> Arjan van de Ven wrote:
>>>something like a soft realtime flag that acts as if it's the hard realtime
>>>one unless the app shows "misbehavior" (eg eats its timeslice for X times in
>>>a row) might for example be such a solution. And with the anti abuse
>>>protection it can run with far lighter privilegs.

>> This reminds me about the soft-RT patch posted quite sometime ago.
>> I feel such a handy psuedo-RT scheduler class would be useful for
>> other systems than JACK, too...
>
> You've already proven that soft RT does not suit your
> requirements. The current scheduler running a task at nice -20 has
> extremely long periods of cpu availability at the expense of lower
> priority tasks and is close to the behaviour you would get with a soft
> RT patch. Your concern is exactly the scenario where nice -20 fails,
> and would be the same scenario where a soft RT policy would
> fail. Doing this with a scheduling policy, you want cpu time long
> after there is any hope for fairness or safety of hanging. From
> experimentation with such soft RT policies, we find average latencies
> can be reduced but the maximum ones, which are the ones that concern
> professional audio, remain the same.

Yes, this is exactly right.  The corrected test results I just posted
support your contention.  

For realtime, most of the OS tricks we all know and love are
counter-productive.  It's the worst case that matters, not the
average.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 16:22                                         ` Paul Davis
@ 2005-01-08 13:04                                           ` Paul Jakma
  0 siblings, 0 replies; 266+ messages in thread
From: Paul Jakma @ 2005-01-08 13:04 UTC (permalink / raw)
  To: Paul Davis
  Cc: Martin Mares, Arjan van de Ven, Christoph Hellwig, Lee Revell,
	Ingo Molnar, Chris Wright, Alan Cox, Jack O'Quin,
	Linux Kernel Mailing List, Andrew Morton

On Fri, 7 Jan 2005, Paul Davis wrote:

> capabilities work - we use them in 2.4 where a helper suid application
> gets the ball rolling, and then its child grants capabilities to new
> clients.

We use them too in Quagga. Reasonably happy with them.

Not a panacae, but far better to retain just a few capabilities, than 
retaining ruid 0 (as we must on other systems).

Only issue really is "graininess" of capabilities, which i'd guess is 
a double-edged sword.

regards,
-- 
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
Kill Ugly Radio
- Frank Zappa

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08  6:12                                                   ` Jack O'Quin
@ 2005-01-08 16:56                                                     ` ross
  2005-01-08 18:25                                                       ` Christoph Hellwig
  2005-01-08 22:20                                                       ` Lee Revell
  2005-01-08 22:14                                                     ` Lee Revell
                                                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 266+ messages in thread
From: ross @ 2005-01-08 16:56 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

On Sat, Jan 08, 2005 at 12:12:59AM -0600, Jack O'Quin wrote:
> I find it hard to understand why some of you think PAM is an adequate
> solution.  As currently deployed, it is poorly documented and nearly
> impossible for non-experts to administer securely.  On my Debian woody
> system, when I login from the console I get one fairly sensible set of
> ulimit values, but from gdm I get a much more permissive set (with
> ulimited mlocking, BTW).  Apparently, this is because the `gdm' PAM
> config includes `session required pam_limits.so' but the system comes
> with an empty /etc/security/limits.conf.  I'm just guessing about that
> because I can't find any decent documentation for any of this crap.
> 
> Remember, if something is difficult to administer, it's *not* secure.

Not to mention that not everyone chooses to use PAM for precisely this
reason.  Slackware has never included PAM and probably never will.
My audio workstation has worked swell with the 2.4+caps solution and
the 2.6+LSM solution.  PAM would break me ::-(

-- 
Ross Vandegrift
ross@lug.udel.edu

"The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell."
	--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08 16:56                                                     ` ross
@ 2005-01-08 18:25                                                       ` Christoph Hellwig
  2005-01-08 22:20                                                       ` Lee Revell
  1 sibling, 0 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-01-08 18:25 UTC (permalink / raw)
  To: ross
  Cc: Jack O'Quin, Chris Wright, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

On Sat, Jan 08, 2005 at 11:56:57AM -0500, ross@lug.udel.edu wrote:
> On Sat, Jan 08, 2005 at 12:12:59AM -0600, Jack O'Quin wrote:
> > I find it hard to understand why some of you think PAM is an adequate
> > solution.  As currently deployed, it is poorly documented and nearly
> > impossible for non-experts to administer securely.  On my Debian woody
> > system, when I login from the console I get one fairly sensible set of
> > ulimit values, but from gdm I get a much more permissive set (with
> > ulimited mlocking, BTW).  Apparently, this is because the `gdm' PAM
> > config includes `session required pam_limits.so' but the system comes
> > with an empty /etc/security/limits.conf.  I'm just guessing about that
> > because I can't find any decent documentation for any of this crap.
> > 
> > Remember, if something is difficult to administer, it's *not* secure.
> 
> Not to mention that not everyone chooses to use PAM for precisely this
> reason.  Slackware has never included PAM and probably never will.
> My audio workstation has worked swell with the 2.4+caps solution and
> the 2.6+LSM solution.  PAM would break me ::-(

you can set rmlimits as well without pam.  it's just more complicated.
But hey, it was you who didn't want to use it :)

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08  6:12                                                   ` Jack O'Quin
  2005-01-08 16:56                                                     ` ross
@ 2005-01-08 22:14                                                     ` Lee Revell
  2005-01-10 21:20                                                     ` Matt Mackall
  2005-01-11 21:21                                                     ` Ingo Molnar
  3 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-08 22:14 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, paul, arjanv,
	mingo, alan, linux-kernel

On Sat, 2005-01-08 at 00:12 -0600, Jack O'Quin wrote:
> I find it hard to understand why some of you think PAM is an adequate
> solution.  As currently deployed, it is poorly documented and nearly
> impossible for non-experts to administer securely.  On my Debian woody
> system, when I login from the console I get one fairly sensible set of
> ulimit values, but from gdm I get a much more permissive set (with
> ulimited mlocking, BTW).  Apparently, this is because the `gdm' PAM
> config includes `session required pam_limits.so' but the system comes
> with an empty /etc/security/limits.conf.  I'm just guessing about that
> because I can't find any decent documentation for any of this crap.

Eh, PAM is a perfectly fine solution.  Documentation is lacking, but
it's easy to find examples.  On my system /etc/security/limits.conf has
this sample config, commented out:

#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0

So add your audio users (or cdrecord users, or whoever) to group
realtime and add:

realtime	hard	memlock	100000
realtime	soft	prio	100

Problem solved.

Lee



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08 16:56                                                     ` ross
  2005-01-08 18:25                                                       ` Christoph Hellwig
@ 2005-01-08 22:20                                                       ` Lee Revell
  2005-01-08 22:27                                                         ` Andreas Steinmetz
  1 sibling, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-08 22:20 UTC (permalink / raw)
  To: ross
  Cc: Jack O'Quin, Chris Wright, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Sat, 2005-01-08 at 11:56 -0500, ross@lug.udel.edu wrote:
> Not to mention that not everyone chooses to use PAM for precisely this
> reason.  Slackware has never included PAM and probably never will.
> My audio workstation has worked swell with the 2.4+caps solution and
> the 2.6+LSM solution.  PAM would break me ::-(

Hmm.  How could you (for example) configure all your machines to
authenticate against an LDAP server without PAM?

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08 22:20                                                       ` Lee Revell
@ 2005-01-08 22:27                                                         ` Andreas Steinmetz
  0 siblings, 0 replies; 266+ messages in thread
From: Andreas Steinmetz @ 2005-01-08 22:27 UTC (permalink / raw)
  To: Lee Revell
  Cc: ross, Jack O'Quin, Chris Wright, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, linux-kernel

Lee Revell wrote:
> On Sat, 2005-01-08 at 11:56 -0500, ross@lug.udel.edu wrote:
> 
>>Not to mention that not everyone chooses to use PAM for precisely this
>>reason.  Slackware has never included PAM and probably never will.
>>My audio workstation has worked swell with the 2.4+caps solution and
>>the 2.6+LSM solution.  PAM would break me ::-(
> 
> 
> Hmm.  How could you (for example) configure all your machines to
> authenticate against an LDAP server without PAM?

nss_ldap :-)

-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-07 23:20                                                     ` Andrew Morton
  2005-01-07 23:34                                                       ` Valdis.Kletnieks
@ 2005-01-10 21:05                                                       ` Matt Mackall
  1 sibling, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-10 21:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Valdis.Kletnieks, chrisw, rlrevell, paul, arjanv, hch, mingo,
	alan, joq, linux-kernel

On Fri, Jan 07, 2005 at 03:20:04PM -0800, Andrew Morton wrote:
> Valdis.Kletnieks@vt.edu wrote:
> >
> > fix the inheritance problems
> 
> Does anyone actually have a handle on what's involved in fixing the
> inheritance problem?

Probably not, in the sense that it's a complex enough problem that
something will likely to be found to be fatally flawed a year down the
road. Just like the situation we're in now.
 
> It's risky, but it is something which we should do.
> 
> <grumpytroll> We really shouldn't have merged all that new fancy security
> stuff when the existing security framework was known-badly-broken. 
> Especially as the new stuff seems incapable of doing simple things which
> unbroken inherited caps would do perfectly.</grumpytroll>

It's taken some decades to ferret out all the gotchas of the standard
UNIX permission model. None of this fancy new stuff is "simpler" by
any stretch, so expect it to be quite some time before all the
implications of any of them are completely understood.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08  6:12                                                   ` Jack O'Quin
  2005-01-08 16:56                                                     ` ross
  2005-01-08 22:14                                                     ` Lee Revell
@ 2005-01-10 21:20                                                     ` Matt Mackall
  2005-01-11 13:05                                                       ` Paul Davis
  2005-01-11 14:30                                                       ` Jack O'Quin
  2005-01-11 21:21                                                     ` Ingo Molnar
  3 siblings, 2 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-10 21:20 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

On Sat, Jan 08, 2005 at 12:12:59AM -0600, Jack O'Quin wrote:
> Chris Wright <chrisw@osdl.org> writes:
> 
> > * Christoph Hellwig (hch@infradead.org) wrote:
> >> So to make forward progress I'd like the audio people to confirm whether
> >> the mlock bits in 2.6.9+ do help that half of their requirement first
> >
> > It sure should, but I guess they can reply on that.
> 
> That does seem to work now (finally).  It looks like that longstanding
> CAP_IPC_LOCK bug is finally fixed, too.
> 
> I find it hard to understand why some of you think PAM is an adequate
> solution.

The best we can do _here_ is present something that userspace can use
sensibly. We can't make userspace actually use it that way though. 

Rlimits are neither UID/GID or PAM-specific. They fit well within
the general model of UNIX security, extending an existing mechanism
rather than adding a completely new one. That PAM happens to be the
way rlimits are usually administered may be unfortunate, yes, but it
doesn't mean that rlimits is the wrong way.

> Running `nice --20' is still significantly worse than SCHED_FIFO, but
> not the unmitigated disaster shown in the middle column.  But, this
> improved performance is still not adequate for audio work.  The worst
> delay was absurdly long (~1/2 sec).

Let's work on that. It'd be _far_ better to have unprivileged near-RT
capability everywhere without potential scheduling DoS.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-10 21:20                                                     ` Matt Mackall
@ 2005-01-11 13:05                                                       ` Paul Davis
  2005-01-11 16:28                                                         ` Jack O'Quin
  2005-01-11 19:17                                                         ` Matt Mackall
  2005-01-11 14:30                                                       ` Jack O'Quin
  1 sibling, 2 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-11 13:05 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, linux-kernel

>Rlimits are neither UID/GID or PAM-specific. They fit well within
>the general model of UNIX security, extending an existing mechanism
>rather than adding a completely new one. That PAM happens to be the
>way rlimits are usually administered may be unfortunate, yes, but it
>doesn't mean that rlimits is the wrong way.

agreed, although i note with interest the flap over RLIMIT_MEMLOCK
being made accessible to unprivileged users by people working on
grsecurity. 

>> Running `nice --20' is still significantly worse than SCHED_FIFO, but
>> not the unmitigated disaster shown in the middle column.  But, this
>> improved performance is still not adequate for audio work.  The worst
>> delay was absurdly long (~1/2 sec).
>
>Let's work on that. It'd be _far_ better to have unprivileged near-RT
>capability everywhere without potential scheduling DoS.

I am not sure what you mean here. I think we've established that
SCHED_OTHER cannot be made adequate for realtime audio work. Its
intended purpose (timesharing the machine in ways that should
generally benefit tasks that don't do a lot and/or are dominated by
user interaction, thus rendering the machine apparently responsive) is
really at odds with what we need.

Con has discussed the idea of a new scheduling class, one that has no
internal priority, runs like SCHED_RR but is subject to cpu
utilization limits, and is accessible to unprivileged users. I think
this makes a lot of sense. It can be controlled using sysctl's and/or
rlimit. 

But please note: in any sane world, adding stuff like this could only
take place in an unstable tree. It seems really odd to me that anyone
can be talking about adding any of these *mechanisms* to 2.6. That was
the whole reason we (well, Jack, Torben and others) worked with LSM:
LSM appeared to be the "blessed" method in 2.6 of allowing changes to
security policy to be made. We are now finding out that even if Linus
"blessed" it by inclusion, there is enough vocal opposition to
actually using it for something useful that something else has to be
done. I wouldn't want to run an important machine on 2.6 if adding,
say SCHED_ISO or even RLIMIT_RT_CPU is part of 2.6's "maintainance".

Meanwhile, as I mentioned before, every realtime audio user of 2.6 is
*still* going to use "realtime" LSM because its really the only
effective way to get the privilege needed to do what they want to get
done. 

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-10 21:20                                                     ` Matt Mackall
  2005-01-11 13:05                                                       ` Paul Davis
@ 2005-01-11 14:30                                                       ` Jack O'Quin
  2005-01-11 19:50                                                         ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-11 14:30 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

> On Sat, Jan 08, 2005 at 12:12:59AM -0600, Jack O'Quin wrote:
>> I find it hard to understand why some of you think PAM is an adequate
>> solution.

Matt Mackall <mpm@selenic.com> writes:
> The best we can do _here_ is present something that userspace can use
> sensibly. We can't make userspace actually use it that way though. 

"O'Quin's law" states that "every system reflects the structure of the
organization creating it".  (Probably not original, I "discovered"
this about 25 years ago, while doing OS development at IBM.)  Compared
to most other operating systems, GNU/Linux has a much larger
organizational gap between kernel development and the rest of the OS.
Like anything else, this is both a strength and a weakness.

> Rlimits are neither UID/GID or PAM-specific. They fit well within
> the general model of UNIX security, extending an existing mechanism
> rather than adding a completely new one. That PAM happens to be the
> way rlimits are usually administered may be unfortunate, yes, but it
> doesn't mean that rlimits is the wrong way.

This whole RLIMITS_MEMLOCK situation with PAM is a good example of how
that disconnect causes systemic troubles.  AFAICT, fixing the
longstanding bug in mlock() introduced a Denial of Service bug in
Debian (and perhaps other distributions) when running 2.6.10.

Clearly, this is not a kernel bug.  In fact, the kernel was broken
before.  But, it is an excellent example of how depending on a
Byzantine mechanism like PAM *harms* system security.  The Debian
developers are very careful about things like this.  If they can't get
the default install right, something is badly amiss, damaging
complexity in the overall system.  The kernel is not solely
responsible for that, but ignoring its contribution would be a
mistake.

>> Running `nice --20' is still significantly worse than SCHED_FIFO, but
>> not the unmitigated disaster shown in the middle column.  But, this
>> improved performance is still not adequate for audio work.  The worst
>> delay was absurdly long (~1/2 sec).
>
> Let's work on that. It'd be _far_ better to have unprivileged near-RT
> capability everywhere without potential scheduling DoS.

"Near-RT" is about the most useless concept I've heard of in a long
time.  It sounds like the answer to a question nobody asked.  ;-)

Linux can and should develop a better unprivileged realtime scheduling
algorithm.  But, this is not an "escape hatch" to avoid confronting
mainline 2.6 security problems.  We still have 2005 problems to solve.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 13:05                                                       ` Paul Davis
@ 2005-01-11 16:28                                                         ` Jack O'Quin
  2005-01-11 18:59                                                           ` Matt Mackall
                                                                             ` (2 more replies)
  2005-01-11 19:17                                                         ` Matt Mackall
  1 sibling, 3 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-11 16:28 UTC (permalink / raw)
  To: Paul Davis
  Cc: Matt Mackall, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, Con Kolivas, linux-kernel

Paul Davis <paul@linuxaudiosystems.com> writes:

>>Rlimits are neither UID/GID or PAM-specific. They fit well within
>>the general model of UNIX security, extending an existing mechanism
>>rather than adding a completely new one. That PAM happens to be the
>>way rlimits are usually administered may be unfortunate, yes, but it
>>doesn't mean that rlimits is the wrong way.

PAM is how most GNU/Linux systems manage rlimits.  It is very UID/GID
oriented.  So from the sysadmin perspective, claiming that rlimits is
"better" or "easier to manage" than "GID hacks" is bogus.

> agreed, although i note with interest the flap over RLIMIT_MEMLOCK
> being made accessible to unprivileged users by people working on
> grsecurity. 

:-)

>>Let's work on that. It'd be _far_ better to have unprivileged near-RT
>>capability everywhere without potential scheduling DoS.
>
> I am not sure what you mean here. I think we've established that
> SCHED_OTHER cannot be made adequate for realtime audio work. Its
> intended purpose (timesharing the machine in ways that should
> generally benefit tasks that don't do a lot and/or are dominated by
> user interaction, thus rendering the machine apparently responsive) is
> really at odds with what we need.
>
> Con has discussed the idea of a new scheduling class, one that has no
> internal priority, runs like SCHED_RR but is subject to cpu
> utilization limits, and is accessible to unprivileged users. I think
> this makes a lot of sense. It can be controlled using sysctl's and/or
> rlimit. 

A good isochronous scheduler in 2.8 would be great.  We can experiment
with it this year in patch form.  

Meanwhile...

> But please note: in any sane world, adding stuff like this could only
> take place in an unstable tree. It seems really odd to me that anyone
> can be talking about adding any of these *mechanisms* to 2.6. That was
> the whole reason we (well, Jack, Torben and others) worked with LSM:
> LSM appeared to be the "blessed" method in 2.6 of allowing changes to
> security policy to be made. We are now finding out that even if Linus
> "blessed" it by inclusion, there is enough vocal opposition to
> actually using it for something useful that something else has to be
> done. I wouldn't want to run an important machine on 2.6 if adding,
> say SCHED_ISO or even RLIMIT_RT_CPU is part of 2.6's "maintainance".
>
> Meanwhile, as I mentioned before, every realtime audio user of 2.6 is
> *still* going to use "realtime" LSM because its really the only
> effective way to get the privilege needed to do what they want to get
> done. 

I am surprised and dismayed by the ignorance of realtime programming
expressed by some (not all) messages in this thread.  Worse, many
developers seem unaware of how much they don't know.  This stuff is
difficult, even for smart people.  Maybe even "especially for smart
people".

I am very conscious of my own matching ignorance of Linux kernel
internals.  If possible, I'd like to keep it that way.    ;-) 

Kernel developers really don't have the equivalent luxury of ignoring
realtime design issues.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 16:28                                                         ` Jack O'Quin
@ 2005-01-11 18:59                                                           ` Matt Mackall
  2005-01-11 20:47                                                           ` utz lehmann
  2005-01-11 21:07                                                           ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 18:59 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Paul Davis, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, Con Kolivas, linux-kernel

On Tue, Jan 11, 2005 at 10:28:13AM -0600, Jack O'Quin wrote:
> Paul Davis <paul@linuxaudiosystems.com> writes:
> 
> >>Rlimits are neither UID/GID or PAM-specific. They fit well within
> >>the general model of UNIX security, extending an existing mechanism
> >>rather than adding a completely new one. That PAM happens to be the
> >>way rlimits are usually administered may be unfortunate, yes, but it
> >>doesn't mean that rlimits is the wrong way.
> 
> PAM is how most GNU/Linux systems manage rlimits.  It is very UID/GID
> oriented.  So from the sysadmin perspective, claiming that rlimits is
> "better" or "easier to manage" than "GID hacks" is bogus.

Yes, you're right, so let's invent something completely new and
inherently much less flexible so that the problem is made worse on
both fronts.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 13:05                                                       ` Paul Davis
  2005-01-11 16:28                                                         ` Jack O'Quin
@ 2005-01-11 19:17                                                         ` Matt Mackall
  2005-01-11 19:42                                                           ` Jack O'Quin
  2005-01-11 20:50                                                           ` Chris Wright
  1 sibling, 2 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 19:17 UTC (permalink / raw)
  To: Paul Davis
  Cc: Jack O'Quin, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 08:05:08AM -0500, Paul Davis wrote:
> >> Running `nice --20' is still significantly worse than SCHED_FIFO, but
> >> not the unmitigated disaster shown in the middle column.  But, this
> >> improved performance is still not adequate for audio work.  The worst
> >> delay was absurdly long (~1/2 sec).
> >
> >Let's work on that. It'd be _far_ better to have unprivileged near-RT
> >capability everywhere without potential scheduling DoS.
> 
> I am not sure what you mean here. I think we've established that
> SCHED_OTHER cannot be made adequate for realtime audio work. Its
> intended purpose (timesharing the machine in ways that should
> generally benefit tasks that don't do a lot and/or are dominated by
> user interaction, thus rendering the machine apparently responsive) is
> really at odds with what we need.

We have not established that at all. In principle, because SCHED_OTHER
tasks running at full priority lie on the boundary between SCHED_OTHER
and SCHED_FIFO, they can be made to run arbitrarily close to the
performance of tasks in SCHED_FIFO. With the upside that they won't be
able to deadlock the machine.

And I mean arbitrarily close in the strict delta-epsilon sense.
It's not perfect, but neither is SCHED_FIFO, in principle or in
practice. 

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:17                                                         ` Matt Mackall
@ 2005-01-11 19:42                                                           ` Jack O'Quin
  2005-01-11 20:50                                                           ` Chris Wright
  1 sibling, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-11 19:42 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Paul Davis, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, linux-kernel

> On Tue, Jan 11, 2005 at 08:05:08AM -0500, Paul Davis wrote:
>> I am not sure what you mean here. I think we've established that
>> SCHED_OTHER cannot be made adequate for realtime audio work. Its
>> intended purpose (timesharing the machine in ways that should
>> generally benefit tasks that don't do a lot and/or are dominated by
>> user interaction, thus rendering the machine apparently responsive) is
>> really at odds with what we need.

Matt Mackall <mpm@selenic.com> writes:
> We have not established that at all. In principle, because SCHED_OTHER
> tasks running at full priority lie on the boundary between SCHED_OTHER
> and SCHED_FIFO, they can be made to run arbitrarily close to the
> performance of tasks in SCHED_FIFO. With the upside that they won't be
> able to deadlock the machine.
>
> And I mean arbitrarily close in the strict delta-epsilon sense.
> It's not perfect, but neither is SCHED_FIFO, in principle or in
> practice. 

Though inelegant in theory, SCHED_FIFO *has* been shown to work in
practice.  The POSIX 1003.4 committee were not all a bunch of idiots.
That stuff *is* useful and *does* work (given appropriate privileges).

Your assertions have not been reduced to practice.  This is a
significant difference.  Write some code, then we can discuss whether
it solves any problems or not.  I doubt it, but prove me wrong and
next year you can be the proud author of a scheduler used for hundreds
of audio applications.

Meanwhile, what about 2005?  It's "almost upon us".  :-/
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 14:30                                                       ` Jack O'Quin
@ 2005-01-11 19:50                                                         ` Matt Mackall
  2005-01-11 19:57                                                           ` Jack O'Quin
  2005-01-11 22:45                                                           ` Paul Davis
  0 siblings, 2 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 19:50 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 08:30:50AM -0600, Jack O'Quin wrote:
> > Let's work on that. It'd be _far_ better to have unprivileged near-RT
> > capability everywhere without potential scheduling DoS.
> 
> "Near-RT" is about the most useless concept I've heard of in a long
> time.  It sounds like the answer to a question nobody asked.  ;-)

To my way of thinking, it's a pretty good description of Ingo's work
or anything you're ever going to see on a PC. If you think you're
going to get real hard RT performance on your off-the-shelf x86 box
running a conventional OS, you are fooling yourself.

Thankfully a buffer underrun is no more fatal for pro audio than a
broken guitar string. CDs skip, DATs glitch, XLR cables flake out,
circuit breakers trip, amps clip, Powerbooks crash, and the show goes
on. I've done more than enough stage tech to know it's a huge pain in
the ass, but let's stop pretending we require absolute perfection,
please.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:50                                                         ` Matt Mackall
@ 2005-01-11 19:57                                                           ` Jack O'Quin
  2005-01-11 20:05                                                             ` Matt Mackall
  2005-01-11 20:19                                                             ` Chris Friesen
  2005-01-11 22:45                                                           ` Paul Davis
  1 sibling, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-11 19:57 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

Matt Mackall <mpm@selenic.com> writes:

> On Tue, Jan 11, 2005 at 08:30:50AM -0600, Jack O'Quin wrote:
>> "Near-RT" is about the most useless concept I've heard of in a long
>> time.  It sounds like the answer to a question nobody asked.  ;-)
>
> To my way of thinking, it's a pretty good description of Ingo's work
> or anything you're ever going to see on a PC. If you think you're
> going to get real hard RT performance on your off-the-shelf x86 box
> running a conventional OS, you are fooling yourself.
>
> Thankfully a buffer underrun is no more fatal for pro audio than a
> broken guitar string. CDs skip, DATs glitch, XLR cables flake out,
> circuit breakers trip, amps clip, Powerbooks crash, and the show goes
> on. I've done more than enough stage tech to know it's a huge pain in
> the ass, but let's stop pretending we require absolute perfection,
> please.

In _practice_, Ingo's patches are considerably better than what you
seem to consider "good enough for mere audio work".
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:57                                                           ` Jack O'Quin
@ 2005-01-11 20:05                                                             ` Matt Mackall
  2005-01-11 20:29                                                               ` Lee Revell
  2005-01-11 20:19                                                             ` Chris Friesen
  1 sibling, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 20:05 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 01:57:11PM -0600, Jack O'Quin wrote:
> Matt Mackall <mpm@selenic.com> writes:
> 
> > On Tue, Jan 11, 2005 at 08:30:50AM -0600, Jack O'Quin wrote:
> >> "Near-RT" is about the most useless concept I've heard of in a long
> >> time.  It sounds like the answer to a question nobody asked.  ;-)
> >
> > To my way of thinking, it's a pretty good description of Ingo's work
> > or anything you're ever going to see on a PC. If you think you're
> > going to get real hard RT performance on your off-the-shelf x86 box
> > running a conventional OS, you are fooling yourself.
> >
> > Thankfully a buffer underrun is no more fatal for pro audio than a
> > broken guitar string. CDs skip, DATs glitch, XLR cables flake out,
> > circuit breakers trip, amps clip, Powerbooks crash, and the show goes
> > on. I've done more than enough stage tech to know it's a huge pain in
> > the ass, but let's stop pretending we require absolute perfection,
> > please.
> 
> In _practice_, Ingo's patches are considerably better than what you
> seem to consider "good enough for mere audio work".

Eh? I never implied mainstream was good enough.

What I said was that high priority SCHED_OTHER could be made good
enough and that that would be preferable to SCHED_FIFO in many cases.

Anyway, *plonk*.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:57                                                           ` Jack O'Quin
  2005-01-11 20:05                                                             ` Matt Mackall
@ 2005-01-11 20:19                                                             ` Chris Friesen
  1 sibling, 0 replies; 266+ messages in thread
From: Chris Friesen @ 2005-01-11 20:19 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Matt Mackall, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, paul, arjanv, mingo, alan, linux-kernel

Jack O'Quin wrote:
> Matt Mackall <mpm@selenic.com> writes:

>>Thankfully a buffer underrun is no more fatal for pro audio than a
>>broken guitar string. CDs skip, DATs glitch, XLR cables flake out,
>>circuit breakers trip, amps clip, Powerbooks crash, and the show goes
>>on. I've done more than enough stage tech to know it's a huge pain in
>>the ass, but let's stop pretending we require absolute perfection,
>>please.

> In _practice_, Ingo's patches are considerably better than what you
> seem to consider "good enough for mere audio work".

I don't see anywere that Matt was criticising Ingo's work.  He just said 
that it wasn't hard realtime--which is true.

A hard realtime system will *guarantee* that the deadlines will be met, 
*no matter what*.  It makes all kinds of other sacrifices to do it, and 
it makes additional demands on the application designer as well.

I don't think Ingo would claim that his patches make Linux a hard RT 
operating system.

Chris

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:05                                                             ` Matt Mackall
@ 2005-01-11 20:29                                                               ` Lee Revell
  2005-01-11 20:47                                                                 ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-11 20:29 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Chris Wright, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, 2005-01-11 at 12:05 -0800, Matt Mackall wrote:
> Anyway, *plonk*.
> 

Plonk?  WTF?  Jack comes up with what many people think is a reasonable
solution to a real problem, that affects thousands of users, and in the
middle of what seems to me a civilized discussion, you killfile him
because he disagrees with you?

Plonk to you too, asshole.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:29                                                               ` Lee Revell
@ 2005-01-11 20:47                                                                 ` Chris Wright
  2005-01-11 21:10                                                                   ` Lee Revell
  2005-01-11 21:28                                                                   ` Matt Mackall
  0 siblings, 2 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-11 20:47 UTC (permalink / raw)
  To: Lee Revell
  Cc: Matt Mackall, Jack O'Quin, Chris Wright, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, linux-kernel

* Lee Revell (rlrevell@joe-job.com) wrote:
> On Tue, 2005-01-11 at 12:05 -0800, Matt Mackall wrote:
> > Anyway, *plonk*.
> 
> Plonk?  WTF?  Jack comes up with what many people think is a reasonable
> solution to a real problem, that affects thousands of users, and in the
> middle of what seems to me a civilized discussion, you killfile him
> because he disagrees with you?
> 
> Plonk to you too, asshole.

Guys, could we please bring this back to a useful discussion.  None of
you have commented on whether the rlimits for priority are useful.  As I
said before, I've no real problem with the module as it stands since it's
tiny, quite contained, and does something people need.  But I agree it'd
be better to find something that's workable as long term solution.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 16:28                                                         ` Jack O'Quin
  2005-01-11 18:59                                                           ` Matt Mackall
@ 2005-01-11 20:47                                                           ` utz lehmann
  2005-01-11 21:07                                                           ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: utz lehmann @ 2005-01-11 20:47 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Paul Davis, Matt Mackall, Chris Wright, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, mingo, alan, Con Kolivas,
	LKML

On Tue, 2005-01-11 at 10:28 -0600, Jack O'Quin wrote:
> Paul Davis <paul@linuxaudiosystems.com> writes:
> 
> >>Rlimits are neither UID/GID or PAM-specific. They fit well within
> >>the general model of UNIX security, extending an existing mechanism
> >>rather than adding a completely new one. That PAM happens to be the
> >>way rlimits are usually administered may be unfortunate, yes, but it
> >>doesn't mean that rlimits is the wrong way.
> 
> PAM is how most GNU/Linux systems manage rlimits.  It is very UID/GID
> oriented.  So from the sysadmin perspective, claiming that rlimits is
> "better" or "easier to manage" than "GID hacks" is bogus.

Why do you have such a problem with a rlimit base approach?
IMHO it's not a hack like realtime LSM, usable for other things beside
pro audio (see "scheduling priorities with rlimit" thread), securer and
more user friendly.

With realtime LSM a user in the realtime group can change the nice
values and RT priorities of other users processes, incl. owned by root
and kernel threads. This has to be fixed. I think this means a rewrite
(not using CAP_SYS_NICE).

It can't be used with distro kernels which have common-caps complied in,
eg. fedora.

IMHO for a possible mainline inclusion the mlock part have to taken away
because RLIMIT_MLOCK is a better solution. A pro audio user have to deal
with rlimits for mlock and realtime LSM for the RT priority part.
Doing both with rlimits is more user friendly. Most of them have only to
put something like this in limits.conf:

me	hard	memlock		500000
me	soft	memlock		500000
me	hard	realtime	60
me	soft	realtime	60

And with rlimits you can drop privileges on process basis. Just set the
hard RLIMIT_RT to 0 (ulimit). You can't do this with realtime LSM.
 
With realtime rlimit you can even think about to give users realtime
prios on a multi user machine. Limit the RT prio for users to 10 and
have a rt-watchdog process with a higher priority which kills runaway
user RT processes.
With realtime LSM you can't limit the RT prio. It's all or nothing.



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:17                                                         ` Matt Mackall
  2005-01-11 19:42                                                           ` Jack O'Quin
@ 2005-01-11 20:50                                                           ` Chris Wright
  2005-01-11 20:58                                                             ` Ingo Molnar
  1 sibling, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 20:50 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Paul Davis, Jack O'Quin, Chris Wright, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, mingo, alan, linux-kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Tue, Jan 11, 2005 at 08:05:08AM -0500, Paul Davis wrote:
> > I am not sure what you mean here. I think we've established that
> > SCHED_OTHER cannot be made adequate for realtime audio work. Its
> > intended purpose (timesharing the machine in ways that should
> > generally benefit tasks that don't do a lot and/or are dominated by
> > user interaction, thus rendering the machine apparently responsive) is
> > really at odds with what we need.
> 
> We have not established that at all. In principle, because SCHED_OTHER
> tasks running at full priority lie on the boundary between SCHED_OTHER
> and SCHED_FIFO, they can be made to run arbitrarily close to the
> performance of tasks in SCHED_FIFO. With the upside that they won't be
> able to deadlock the machine.

I don't think they lie quite so neatly on this boundary.  There's one
fundamental difference which is how the dynamic priority is adjusted
which alters the basic preemptibility rules.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:50                                                           ` Chris Wright
@ 2005-01-11 20:58                                                             ` Ingo Molnar
  2005-01-11 21:14                                                               ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-11 20:58 UTC (permalink / raw)
  To: Chris Wright
  Cc: Matt Mackall, Paul Davis, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Chris Wright <chrisw@osdl.org> wrote:

> > We have not established that at all. In principle, because SCHED_OTHER
> > tasks running at full priority lie on the boundary between SCHED_OTHER
> > and SCHED_FIFO, they can be made to run arbitrarily close to the
> > performance of tasks in SCHED_FIFO. With the upside that they won't be
> > able to deadlock the machine.
> 
> I don't think they lie quite so neatly on this boundary.  There's one
> fundamental difference which is how the dynamic priority is adjusted
> which alters the basic preemptibility rules.

but at nice level -20 this adjustment is at most +5 priority levels -
i.e. down to an equivalent of nice -15. Consider that a nice 0 task can
at most get a -5 priority boost gives a nice -5 task worst-case - so the
nice -20 task still preempts the lower prio task.

so this could work in theory. But practice shows it doesnt work at the
moment, and nobody has analyzed why, yet.

(There are some other differences in scheduling like starvation
prevention adding potential delays, but those should in theory not
affect the basic tests that were done so far. There are also some
differences in timeslice management, but with the huge timeslices that
nice -20 tasks get this shouldnt be causing problems either. So my
current thinking is that there's an unknown scheduling effect causing
latency regression of nice -20 tasks, compared to the latencies of
RT-prio-1 tasks.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 16:28                                                         ` Jack O'Quin
  2005-01-11 18:59                                                           ` Matt Mackall
  2005-01-11 20:47                                                           ` utz lehmann
@ 2005-01-11 21:07                                                           ` Lee Revell
  2 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-11 21:07 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Paul Davis, Matt Mackall, Chris Wright, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, Con Kolivas, linux-kernel

On Tue, 2005-01-11 at 10:28 -0600, Jack O'Quin wrote:
> Paul Davis <paul@linuxaudiosystems.com> writes:
> 
> >>Rlimits are neither UID/GID or PAM-specific. They fit well within
> >>the general model of UNIX security, extending an existing mechanism
> >>rather than adding a completely new one. That PAM happens to be the
> >>way rlimits are usually administered may be unfortunate, yes, but it
> >>doesn't mean that rlimits is the wrong way.
> 
> PAM is how most GNU/Linux systems manage rlimits.  It is very UID/GID
> oriented.  So from the sysadmin perspective, claiming that rlimits is
> "better" or "easier to manage" than "GID hacks" is bogus.
> 

Sorry, I have to agree with Matt, let's just use PAM.  Maybe I have been
a Linux admin for too long but I don't think PAM is so bad.  Yes it
could be better documented but if this was a showstopper then no one
would use Linux at all.  It's not like every naive user will have to
figure out PAM now, the audio oriented distributions will just set it up
right by default.  And if people want to use the mainstream distros to
do audio work OOTB, they'll just have to bug their vendor about it.

> > agreed, although i note with interest the flap over RLIMIT_MEMLOCK
> > being made accessible to unprivileged users by people working on
> > grsecurity. 
> 
> :-)

But we are not talking about unprivileged users.  Do not take
"unprivileged" to mean "nonroot".  We need an easy mechanism for root to
tell the kernel 'the following users get to do things that could
potentially lock up the system'.  No general purpose Linux distro would
ship with this enabled by default for everyone.  But, to quote another
LKML thread 'you can't prevent root from doing stupid things because
that would also keep him from doing clever things'.

It's a fine line between stupid and clever.

Lee



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:47                                                                 ` Chris Wright
@ 2005-01-11 21:10                                                                   ` Lee Revell
  2005-01-11 21:20                                                                     ` Chris Wright
  2005-01-11 21:28                                                                   ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-11 21:10 UTC (permalink / raw)
  To: Chris Wright
  Cc: Matt Mackall, Jack O'Quin, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, 2005-01-11 at 12:47 -0800, Chris Wright wrote:
> * Lee Revell (rlrevell@joe-job.com) wrote:
> > On Tue, 2005-01-11 at 12:05 -0800, Matt Mackall wrote:
> > > Anyway, *plonk*.
> > 
> > Plonk?  WTF?  Jack comes up with what many people think is a reasonable
> > solution to a real problem, that affects thousands of users, and in the
> > middle of what seems to me a civilized discussion, you killfile him
> > because he disagrees with you?
> > 
> > Plonk to you too, asshole.
> 
> Guys, could we please bring this back to a useful discussion.  None of
> you have commented on whether the rlimits for priority are useful.  As I
> said before, I've no real problem with the module as it stands since it's
> tiny, quite contained, and does something people need.  But I agree it'd
> be better to find something that's workable as long term solution.

Chris, I did comment on it, see
1105222442.24592.126.camel@krustophenia.net from around 5:15 on
Saturday.

from the above message:

Eh, PAM is a perfectly fine solution.  Documentation is lacking, but
it's easy to find examples.  On my system /etc/security/limits.conf has
this sample config, commented out:

#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0

So add your audio users (or cdrecord users, or whoever) to group
realtime and add:

realtime        hard    memlock 100000
realtime        soft    prio    100

Problem solved.

Lee



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:58                                                             ` Ingo Molnar
@ 2005-01-11 21:14                                                               ` Chris Wright
  2005-01-11 21:27                                                                 ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 21:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Jack O'Quin,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

* Ingo Molnar (mingo@elte.hu) wrote:
> * Chris Wright <chrisw@osdl.org> wrote:
> > I don't think they lie quite so neatly on this boundary.  There's one
> > fundamental difference which is how the dynamic priority is adjusted
> > which alters the basic preemptibility rules.
> 
> but at nice level -20 this adjustment is at most +5 priority levels -
> i.e. down to an equivalent of nice -15. Consider that a nice 0 task can
> at most get a -5 priority boost gives a nice -5 task worst-case - so the
> nice -20 task still preempts the lower prio task.

Yeah, I realize it provides some safety, I just wanted to point out
the fundamental difference.  And one point being made is that it's
the occasional worst case latencies which are the problem.  Dynamic
adjustments could be one culprit for this.

Hmm, I wonder if this could have anything to do with it.  These are
within striking range:

  PID COMMAND          NI PRI
    9 events/1        -10  34
  931 kcryptd/1       -10  33
  930 kcryptd/0       -10  34
    8 events/0        -10  34
  892 ata/1           -10  34
  891 ata/0           -10  34
 3747 udevd           -10  33
   26 kacpid          -10  31
  238 aio/1           -10  34
  237 aio/0           -10  31
  117 kblockd/1       -10  34
  116 kblockd/0       -10  34
   10 khelper         -10  34

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:10                                                                   ` Lee Revell
@ 2005-01-11 21:20                                                                     ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-11 21:20 UTC (permalink / raw)
  To: Lee Revell
  Cc: Chris Wright, Matt Mackall, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, linux-kernel

* Lee Revell (rlrevell@joe-job.com) wrote:
> On Tue, 2005-01-11 at 12:47 -0800, Chris Wright wrote:
> > Guys, could we please bring this back to a useful discussion.  None of
> > you have commented on whether the rlimits for priority are useful.  As I
> > said before, I've no real problem with the module as it stands since it's
> > tiny, quite contained, and does something people need.  But I agree it'd
> > be better to find something that's workable as long term solution.
> 
> Chris, I did comment on it, see
> 1105222442.24592.126.camel@krustophenia.net from around 5:15 on
> Saturday.

Eeek, I missed/forgot (let me guess, I replied too? ;-)

Thanks Lee.
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-08  6:12                                                   ` Jack O'Quin
                                                                       ` (2 preceding siblings ...)
  2005-01-10 21:20                                                     ` Matt Mackall
@ 2005-01-11 21:21                                                     ` Ingo Molnar
  2005-01-12  2:10                                                       ` Jack O'Quin
  2005-01-15  4:56                                                       ` Jack O'Quin
  3 siblings, 2 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-11 21:21 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> The numbers I reported yesterday were so bad I couldn't figure out why
> anyone even thought it was worth trying.  Now I realize why. 
> 
> When Ingo said to try "nice -20", I took him literally, forgetting
> that the stupid command to achieve a nice value of -20 is `nice --20'.
> So I was actually testing with a nice value of 19.  Bah!  No wonder it
> sucked.
> 
> Running `nice --20' is still significantly worse than SCHED_FIFO, but
> not the unmitigated disaster shown in the middle column.  But, this
> improved performance is still not adequate for audio work.  The worst
> delay was absurdly long (~1/2 sec).
> 
> Here are the corrected results...
> 
>                                  With -R        Without -R      Without -R
>                                (SCHED_FIFO)     (nice -20)      (nice --20)
> 
> ************* SUMMARY RESULT ****************
> Total seconds ran . . . . . . :   300
> Number of clients . . . . . . :    20
> Ports per client  . . . . . . :     4
> Frames per buffer . . . . . . :    64
> *********************************************
> Timeout Count . . . . . . . . :(    1)          (    1)          (    1)
> XRUN Count  . . . . . . . . . :     2             2837               43
> Delay Count (>spare time) . . :     0                0                0
> Delay Count (>1000 usecs) . . :     0                0                0
> Delay Maximum . . . . . . . . :  3130 usecs    5038044 usecs   501374 usecs
> Cycle Maximum . . . . . . . . :   960 usecs      18802 usecs     1036 usecs
> Average DSP Load. . . . . . . :    34.3 %           44.1 %         34.3 %    

what kind of non-audio workload was there during this test? 43 xruns
arent nice but arent that bad either.

plus, is it 100% sure that all audio threads inherited the nice --20
priority - including the client threads? Nornally jackd does a
setscheduler for the client threads so that they get boosted to
SCHED_FIFO, but there is no parallel to that in the nice --20 case, did
you do that manually (or did you start the clients up from the nice --20
shell too?))

If the nice --20 priority setup is perfect and there are still xruns
then could you try the following hack, change this line in
kernel/sched.c:

 #define STARVATION_LIMIT        (MAX_SLEEP_AVG)

to:

 #define STARVATION_LIMIT        0

this will turn off starvation checking, for testing purposes. (to see
whether there's anything else but anti-starvation causing xruns.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:14                                                               ` Chris Wright
@ 2005-01-11 21:27                                                                 ` Ingo Molnar
  2005-01-11 22:13                                                                   ` Chris Wright
                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-11 21:27 UTC (permalink / raw)
  To: Chris Wright
  Cc: Matt Mackall, Paul Davis, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Chris Wright <chrisw@osdl.org> wrote:

> Hmm, I wonder if this could have anything to do with it.  These are
> within striking range:
> 
>   PID COMMAND          NI PRI
>     9 events/1        -10  34
>   931 kcryptd/1       -10  33
>   930 kcryptd/0       -10  34
>     8 events/0        -10  34
>   892 ata/1           -10  34
>   891 ata/0           -10  34
>  3747 udevd           -10  33
>    26 kacpid          -10  31
>   238 aio/1           -10  34
>   237 aio/0           -10  31
>   117 kblockd/1       -10  34
>   116 kblockd/0       -10  34
>    10 khelper         -10  34

you are right, i forgot about kernel threads. If they are nice -10 on
Jack's system too then they are within striking range indeed, especially
since they are typically idle and if then they are active for short
bursts of time and get the maximum boost. Jack, could you renice these
to -5, to make sure they dont interfere?

btw., why are these at nice -10? workqueue.c sets nice value to -5
normally.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 20:47                                                                 ` Chris Wright
  2005-01-11 21:10                                                                   ` Lee Revell
@ 2005-01-11 21:28                                                                   ` Matt Mackall
  2005-01-11 21:38                                                                     ` Lee Revell
                                                                                       ` (3 more replies)
  1 sibling, 4 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 21:28 UTC (permalink / raw)
  To: Chris Wright
  Cc: Lee Revell, Jack O'Quin, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 12:47:07PM -0800, Chris Wright wrote:
> Guys, could we please bring this back to a useful discussion.  None of
> you have commented on whether the rlimits for priority are useful.  As I
> said before, I've no real problem with the module as it stands since it's
> tiny, quite contained, and does something people need.  But I agree it'd
> be better to find something that's workable as long term solution.

I almost like it. I don't like that it exposes the internal scheduler
priorities directly (-tiny in fact has options to change these!). So
perhaps some thought could be given to either stratifying it a bit
more (>2000 for SCHED_FIFO, >1000 for SCHED_RR, then SCHED_OTHER) or
separate limits for the different scheduling disciplines. 

Right now, you can make a good argument that SCHED_FIFO > SCHED_RR >
SCHED_OTHER from a privilege point of view, but that could change if
we add a pseudo-RT scheduling class of some sort. Similarly, adding a
discipline means adding an rlimit with the split approach, so that's
not very friendly either.

Another way:

0-20: normal nice values (inverted)
>20: privilege to set any RT priority

Limiting to below normal nice is a little weird and the offset to make
everything positive is weird as well. Above 20, any RT app can starve
SCHED_OTHER and it's less important to dole out fine-grained levels
here as these apps must be engineered to cooperate to some degree
anyway.

But I'm also still not convinced this policy can't be most flexibly
handled by a setuid helper together with the mlock rlimit.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:28                                                                   ` Matt Mackall
@ 2005-01-11 21:38                                                                     ` Lee Revell
  2005-01-11 21:41                                                                       ` Arjan van de Ven
  2005-01-11 22:05                                                                       ` Matt Mackall
  2005-01-11 21:42                                                                     ` Chris Wright
                                                                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-11 21:38 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Jack O'Quin, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, 2005-01-11 at 13:28 -0800, Matt Mackall wrote:
> But I'm also still not convinced this policy can't be most flexibly
> handled by a setuid helper together with the mlock rlimit.
> 

Quoting my message from a few days ago:

On Thu, 2005-01-06 at 17:18 -0800, Matt Mackall wrote:
> Why can't this be done with a simple SUID helper to promote given
> tasks to RT with sched_setschedule, doing essentially all the checks
> this LSM is doing? 
> 
> Objections of "because it requires dangerous root or suid" don't fly,
> an RT app under user control can DoS the box trivially. Never mind you
> need root to configure the LSM anyway..

Yes but a bug in an app running as root can trash the filesystem.  The
worst you can do with RT privileges is lock up the machine.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:38                                                                     ` Lee Revell
@ 2005-01-11 21:41                                                                       ` Arjan van de Ven
  2005-01-11 22:51                                                                         ` Paul Davis
  2005-01-11 22:05                                                                       ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-11 21:41 UTC (permalink / raw)
  To: Lee Revell
  Cc: Matt Mackall, Chris Wright, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 04:38:14PM -0500, Lee Revell wrote:
> Yes but a bug in an app running as root can trash the filesystem.  The
> worst you can do with RT privileges is lock up the machine.

several filesystem and IO threads run at prio -10 but not RT.
That makes me a bit less sure of your statement....

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:28                                                                   ` Matt Mackall
  2005-01-11 21:38                                                                     ` Lee Revell
@ 2005-01-11 21:42                                                                     ` Chris Wright
  2005-01-11 22:16                                                                       ` Matt Mackall
  2005-01-11 22:17                                                                     ` utz
  2005-01-11 22:48                                                                     ` Paul Davis
  3 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 21:42 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, linux-kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Tue, Jan 11, 2005 at 12:47:07PM -0800, Chris Wright wrote:
> > Guys, could we please bring this back to a useful discussion.  None of
> > you have commented on whether the rlimits for priority are useful.  As I
> > said before, I've no real problem with the module as it stands since it's
> > tiny, quite contained, and does something people need.  But I agree it'd
> > be better to find something that's workable as long term solution.
> 
> I almost like it. I don't like that it exposes the internal scheduler
> priorities directly (-tiny in fact has options to change these!). So
> perhaps some thought could be given to either stratifying it a bit
> more (>2000 for SCHED_FIFO, >1000 for SCHED_RR, then SCHED_OTHER) or
> separate limits for the different scheduling disciplines. 

Yeah, I don't like that either (thought I mentioned it in earliest
patch).  I thought about the method you mentioned, but didn't like it
much better.  I also suggested using 0 == default, 1 == can nice down,
2 == can set RT prio.  Utz suggests just splitting nice limit from rt
limit.

> But I'm also still not convinced this policy can't be most flexibly
> handled by a setuid helper together with the mlock rlimit.

Wait, why can't it be done with (to date fictitious) pam_prio, which
simply calls sched_setscheduler?  It's already privileged while it's
doing these things...

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:38                                                                     ` Lee Revell
  2005-01-11 21:41                                                                       ` Arjan van de Ven
@ 2005-01-11 22:05                                                                       ` Matt Mackall
  1 sibling, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 22:05 UTC (permalink / raw)
  To: Lee Revell
  Cc: Chris Wright, Jack O'Quin, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 04:38:14PM -0500, Lee Revell wrote:
> On Tue, 2005-01-11 at 13:28 -0800, Matt Mackall wrote:
> > But I'm also still not convinced this policy can't be most flexibly
> > handled by a setuid helper together with the mlock rlimit.
> 
> Quoting my message from a few days ago:
> 
> On Thu, 2005-01-06 at 17:18 -0800, Matt Mackall wrote:
> > Why can't this be done with a simple SUID helper to promote given
> > tasks to RT with sched_setschedule, doing essentially all the checks
> > this LSM is doing? 
> > 
> > Objections of "because it requires dangerous root or suid" don't fly,
> > an RT app under user control can DoS the box trivially. Never mind you
> > need root to configure the LSM anyway..
> 
> Yes but a bug in an app running as root can trash the filesystem.  The
> worst you can do with RT privileges is lock up the machine.

Yes. So can a bug in an LSM or in new rlimits code.

But bugs can be fixed. Poorly designed APIs cannot. That's why the
best API from the kernel perspective is no API: do it in userspace
wherever possible. Bring the kernel in only when the kernel can do it
better, more cleanly, and more generally. The rlimits-on-priorities
approach may qualify in that it might solve problems for other folks
(games on the desktop, CD burning, and the like) and isn't a bad fit
into the rest of the standard security model, but it's still got a
wart or two.

I suppose I ought to spell out my personal LSM bias while I'm at it:

- it invites ad-hoc extensions like this
- we have enough security issues without supporting a proliferation of
  incompatible security models

So while I think it's perfectly fine for people to kludge up things
like this, I don't think they belong in the tree unless they're _very_
generally applicable and _very_ well thought out. LSMs should not be
treated like drivers.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:27                                                                 ` Ingo Molnar
@ 2005-01-11 22:13                                                                   ` Chris Wright
  2005-01-11 22:26                                                                     ` Con Kolivas
  2005-01-12  3:21                                                                   ` Jack O'Quin
  2005-01-13  5:44                                                                   ` Jack O'Quin
  2 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 22:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Jack O'Quin,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

* Ingo Molnar (mingo@elte.hu) wrote:
> you are right, i forgot about kernel threads. If they are nice -10 on
> Jack's system too then they are within striking range indeed, especially
> since they are typically idle and if then they are active for short
> bursts of time and get the maximum boost. Jack, could you renice these
> to -5, to make sure they dont interfere?

Yup, their bursty nature makes them seem a likely culprit.

> btw., why are these at nice -10? workqueue.c sets nice value to -5
> normally.

Heh, I was just wondering the same thing.

BTW, grepping set_user_nice shows a few more possible culprits.
One more reason that there may be value in promoting the audio app to
rt scheduling.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:42                                                                     ` Chris Wright
@ 2005-01-11 22:16                                                                       ` Matt Mackall
  2005-01-11 22:21                                                                         ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 22:16 UTC (permalink / raw)
  To: Chris Wright
  Cc: Lee Revell, Jack O'Quin, Christoph Hellwig, Andrew Morton,
	paul, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 01:42:51PM -0800, Chris Wright wrote:
> > But I'm also still not convinced this policy can't be most flexibly
> > handled by a setuid helper together with the mlock rlimit.
> 
> Wait, why can't it be done with (to date fictitious) pam_prio, which
> simply calls sched_setscheduler?  It's already privileged while it's
> doing these things...

You certainly do not want to run everything at RT from login on.
That'd be bad.

Also, tying to UIDs rather than (UID, executable) is worrisome as
random_game_with_audio in Gnome might decide it needs RT, much to the
admin's surprise.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:28                                                                   ` Matt Mackall
  2005-01-11 21:38                                                                     ` Lee Revell
  2005-01-11 21:42                                                                     ` Chris Wright
@ 2005-01-11 22:17                                                                     ` utz
  2005-01-11 22:48                                                                     ` Paul Davis
  3 siblings, 0 replies; 266+ messages in thread
From: utz @ 2005-01-11 22:17 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, LKML

On Tue, 2005-01-11 at 13:28 -0800, Matt Mackall wrote:
> On Tue, Jan 11, 2005 at 12:47:07PM -0800, Chris Wright wrote:
> > Guys, could we please bring this back to a useful discussion.  None of
> > you have commented on whether the rlimits for priority are useful.  As I
> > said before, I've no real problem with the module as it stands since it's
> > tiny, quite contained, and does something people need.  But I agree it'd
> > be better to find something that's workable as long term solution.
> 
> I almost like it. I don't like that it exposes the internal scheduler
> priorities directly (-tiny in fact has options to change these!). So
> perhaps some thought could be given to either stratifying it a bit
> more (>2000 for SCHED_FIFO, >1000 for SCHED_RR, then SCHED_OTHER) or
> separate limits for the different scheduling disciplines. 
> 
> Right now, you can make a good argument that SCHED_FIFO > SCHED_RR >
> SCHED_OTHER from a privilege point of view, but that could change if
> we add a pseudo-RT scheduling class of some sort. Similarly, adding a
> discipline means adding an rlimit with the split approach, so that's
> not very friendly either.
> 
> Another way:
> 
> 0-20: normal nice values (inverted)
> >20: privilege to set any RT priority
> 
> Limiting to below normal nice is a little weird and the offset to make
> everything positive is weird as well. Above 20, any RT app can starve
> SCHED_OTHER and it's less important to dole out fine-grained levels
> here as these apps must be engineered to cooperate to some degree
> anyway.

Limiting to positive nice values are needed too. At leased i need such
thing. Normal users are only allowed to increase the nice value (lower
prio). If a user job runs at nice 15 they can't renice it to 5. I get
about 3 calls a week to do this as root.

And the presentation of the usual nice values can be done in userspace.
pamlimits and ulimit already converts values (min -> s, KiB -> Bytes).

And separating the nice and RT part is useful to prevent confusion in
userspace tools and for the admin.

I patched PAM which allows the setting of nice and realtime rlimits in
limits.conf:

nice goes from 19 to -20 (internally converted to 0-39).
realtime from 0 - 99.




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:16                                                                       ` Matt Mackall
@ 2005-01-11 22:21                                                                         ` Chris Wright
  2005-01-11 22:36                                                                           ` utz lehmann
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 22:21 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, linux-kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Tue, Jan 11, 2005 at 01:42:51PM -0800, Chris Wright wrote:
> > > But I'm also still not convinced this policy can't be most flexibly
> > > handled by a setuid helper together with the mlock rlimit.
> > 
> > Wait, why can't it be done with (to date fictitious) pam_prio, which
> > simply calls sched_setscheduler?  It's already privileged while it's
> > doing these things...
> 
> You certainly do not want to run everything at RT from login on.
> That'd be bad.

Yup, true.

> Also, tying to UIDs rather than (UID, executable) is worrisome as
> random_game_with_audio in Gnome might decide it needs RT, much to the
> admin's surprise.

Hmm, well, the pam_limit approach has that problem.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:13                                                                   ` Chris Wright
@ 2005-01-11 22:26                                                                     ` Con Kolivas
  0 siblings, 0 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-11 22:26 UTC (permalink / raw)
  To: Chris Wright
  Cc: Ingo Molnar, Matt Mackall, Paul Davis, Jack O'Quin,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

Chris Wright wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> 
>>you are right, i forgot about kernel threads. If they are nice -10 on
>>Jack's system too then they are within striking range indeed, especially
>>since they are typically idle and if then they are active for short
>>bursts of time and get the maximum boost. Jack, could you renice these
>>to -5, to make sure they dont interfere?
> 
> 
> Yup, their bursty nature makes them seem a likely culprit.
> 
> 
>>btw., why are these at nice -10? workqueue.c sets nice value to -5
>>normally.
> 
> 
> Heh, I was just wondering the same thing.
> 
> BTW, grepping set_user_nice shows a few more possible culprits.
> One more reason that there may be value in promoting the audio app to
> rt scheduling.

They were nice -10. I changed them to nice -5 recently in -mm and that 
just got commited to -bk post 2.6.10

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:21                                                                         ` Chris Wright
@ 2005-01-11 22:36                                                                           ` utz lehmann
  2005-01-11 22:41                                                                             ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: utz lehmann @ 2005-01-11 22:36 UTC (permalink / raw)
  To: Chris Wright
  Cc: Matt Mackall, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, paul, arjanv, mingo, alan, LKML

On Tue, 2005-01-11 at 14:21 -0800, Chris Wright wrote:
> * Matt Mackall (mpm@selenic.com) wrote:
> > On Tue, Jan 11, 2005 at 01:42:51PM -0800, Chris Wright wrote:

> > Also, tying to UIDs rather than (UID, executable) is worrisome as
> > random_game_with_audio in Gnome might decide it needs RT, much to the
> > admin's surprise.
> 
> Hmm, well, the pam_limit approach has that problem.

selinux?



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:36                                                                           ` utz lehmann
@ 2005-01-11 22:41                                                                             ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-11 22:41 UTC (permalink / raw)
  To: utz lehmann
  Cc: Chris Wright, Matt Mackall, Lee Revell, Jack O'Quin,
	Christoph Hellwig, Andrew Morton, paul, arjanv, mingo, alan,
	LKML

* utz lehmann (lkml@s2y4n2c.de) wrote:
> On Tue, 2005-01-11 at 14:21 -0800, Chris Wright wrote:
> > * Matt Mackall (mpm@selenic.com) wrote:
> > > On Tue, Jan 11, 2005 at 01:42:51PM -0800, Chris Wright wrote:
> 
> > > Also, tying to UIDs rather than (UID, executable) is worrisome as
> > > random_game_with_audio in Gnome might decide it needs RT, much to the
> > > admin's surprise.
> > 
> > Hmm, well, the pam_limit approach has that problem.
> 
> selinux?

Won't work (at least not now).  LIDS should do be able to do it.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 19:50                                                         ` Matt Mackall
  2005-01-11 19:57                                                           ` Jack O'Quin
@ 2005-01-11 22:45                                                           ` Paul Davis
  1 sibling, 0 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-11 22:45 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Chris Wright, Christoph Hellwig, Andrew Morton,
	Lee Revell, arjanv, mingo, alan, linux-kernel

>Thankfully a buffer underrun is no more fatal for pro audio than a
>broken guitar string. CDs skip, DATs glitch, XLR cables flake out,
>circuit breakers trip, amps clip, Powerbooks crash, and the show goes
>on. I've done more than enough stage tech to know it's a huge pain in
>the ass, but let's stop pretending we require absolute perfection,
>please.

Are you really serious? Nobody said anything about absolute
perfection. We've got 2 kernels (2.4+lowlat, and 2.6
+realtime_preempt) whose performance *far* exceeds that of any vanilla
kernel, and in the latter case, probably any other desktop OS. We've
even got a kernel (2.6.9 or maybe .10) whose performance is getting
closer to par with OSX. We want people to be able to access this
performance relatively hassle free. Right now, people who want this
have to jump through a lot of hoops to access something they can, and
should, be able to do quite easily.

*That* is what this is all about, nothing more. From the looks of
things, the performance of vanilla 2.6 in this area is going to
continue to improve, but users' ability to actually use it will remain
in the same primitive condition its in now.

--p





^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:28                                                                   ` Matt Mackall
                                                                                       ` (2 preceding siblings ...)
  2005-01-11 22:17                                                                     ` utz
@ 2005-01-11 22:48                                                                     ` Paul Davis
  2005-01-11 23:06                                                                       ` Matt Mackall
  3 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-11 22:48 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, linux-kernel

>But I'm also still not convinced this policy can't be most flexibly
>handled by a setuid helper together with the mlock rlimit.

This has been explained several times already.

When you run a JACK client, the user should not be required to use a
different command sequence depending on whether or not JACK is running
with RT scheduling or not. That's almost more arcane than the current
situation and is a step backwards from even 2.4, where we use
capabilities to allow JACK itself to pass on the ability to use RT
scheduling and memlock to its clients.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:41                                                                       ` Arjan van de Ven
@ 2005-01-11 22:51                                                                         ` Paul Davis
  2005-01-11 23:05                                                                           ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-11 22:51 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Lee Revell, Matt Mackall, Chris Wright, Jack O'Quin,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

>On Tue, Jan 11, 2005 at 04:38:14PM -0500, Lee Revell wrote:
>> Yes but a bug in an app running as root can trash the filesystem.  The
>> worst you can do with RT privileges is lock up the machine.
>
>several filesystem and IO threads run at prio -10 but not RT.
>That makes me a bit less sure of your statement....

Its completely orthogonal. Lee didn't say "tasks running without RT
can't mess up filesystems". He said "tasks running as root can trash
the filesystem" and "tasks running as RT can lock up the
machine". obviously, the intersection point (a root, RT task) is
double trouble.

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:51                                                                         ` Paul Davis
@ 2005-01-11 23:05                                                                           ` Chris Wright
  2005-01-12  1:43                                                                             ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-11 23:05 UTC (permalink / raw)
  To: Paul Davis
  Cc: Arjan van de Ven, Lee Revell, Matt Mackall, Chris Wright,
	Jack O'Quin, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel

* Paul Davis (paul@linuxaudiosystems.com) wrote:
> >On Tue, Jan 11, 2005 at 04:38:14PM -0500, Lee Revell wrote:
> >> Yes but a bug in an app running as root can trash the filesystem.  The
> >> worst you can do with RT privileges is lock up the machine.
> >
> >several filesystem and IO threads run at prio -10 but not RT.
> >That makes me a bit less sure of your statement....
> 
> Its completely orthogonal. Lee didn't say "tasks running without RT
> can't mess up filesystems". He said "tasks running as root can trash
> the filesystem" and "tasks running as RT can lock up the
> machine". obviously, the intersection point (a root, RT task) is
> double trouble.

This is straying from the core issue...  But, Arjan's saying that an RT
(non-root) task could trash the filesystem if it deadlocks the machine
(because those important fs and IO threads don't run).

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 22:48                                                                     ` Paul Davis
@ 2005-01-11 23:06                                                                       ` Matt Mackall
  2005-01-12  2:13                                                                         ` Paul Davis
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-11 23:06 UTC (permalink / raw)
  To: Paul Davis
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 05:48:43PM -0500, Paul Davis wrote:
> >But I'm also still not convinced this policy can't be most flexibly
> >handled by a setuid helper together with the mlock rlimit.
> 
> This has been explained several times already.
> 
> When you run a JACK client, the user should not be required to use a
> different command sequence depending on whether or not JACK is running
> with RT scheduling or not. That's almost more arcane than the current
> situation and is a step backwards from even 2.4, where we use
> capabilities to allow JACK itself to pass on the ability to use RT
> scheduling and memlock to its clients.

And that is a failure of imagination on the part of the JACK
developers. Simply add a library function to libjack or whatever:

 jack_make_me_important(...); /* pretty please */

A client starts at normal priority, asks jack nicely to promote it to
RT, then jackd, if so configured/enabled, calls the wrapper with a PID
and a priority level. The wrapper checks the UID/priority/executable
name against its permission table and does sched_set{scheduler,param}
if allowed.

This is nice because Jack gets to make the decisions about what the
appropriate priorities for its clients are (eg they can't be higher
priority than jackd, etc.) and it all gracefully falls back if the
helper isn't enabled.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 23:05                                                                           ` Chris Wright
@ 2005-01-12  1:43                                                                             ` Jack O'Quin
  2005-01-12  7:49                                                                               ` Arjan van de Ven
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-12  1:43 UTC (permalink / raw)
  To: Chris Wright
  Cc: Paul Davis, Arjan van de Ven, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

Chris Wright <chrisw@osdl.org> writes:

> * Paul Davis (paul@linuxaudiosystems.com) wrote:
>> >On Tue, Jan 11, 2005 at 04:38:14PM -0500, Lee Revell wrote:
>> >> Yes but a bug in an app running as root can trash the filesystem.  The
>> >> worst you can do with RT privileges is lock up the machine.
>> >
>> >several filesystem and IO threads run at prio -10 but not RT.
>> >That makes me a bit less sure of your statement....
>> 
>> Its completely orthogonal. Lee didn't say "tasks running without RT
>> can't mess up filesystems". He said "tasks running as root can trash
>> the filesystem" and "tasks running as RT can lock up the
>> machine". obviously, the intersection point (a root, RT task) is
>> double trouble.
>
> This is straying from the core issue...  But, Arjan's saying that an RT
> (non-root) task could trash the filesystem if it deadlocks the machine
> (because those important fs and IO threads don't run).

Lexicographic ambiguity: Lee and Paul are using "trash" for things
like installing a hidden suid root shell or co-opting sendmail into an
open spam relay.  Arjan just means crashing the system which forces
reboot to run fsck.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:21                                                     ` Ingo Molnar
@ 2005-01-12  2:10                                                       ` Jack O'Quin
  2005-01-15  4:56                                                       ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-12  2:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Jack O'Quin <joq@io.com> wrote:
>
>> Here are the corrected results...
>> 
>>                                  With -R        Without -R      Without -R
>>                                (SCHED_FIFO)     (nice -20)      (nice --20)
>> 
>> XRUN Count  . . . . . . . . . :     2             2837               43
>> Delay Maximum . . . . . . . . :  3130 usecs    5038044 usecs   501374 usecs
>> Cycle Maximum . . . . . . . . :   960 usecs      18802 usecs     1036 usecs
>
> what kind of non-audio workload was there during this test? 43 xruns
> arent nice but arent that bad either.

Nothing heavy, but I was reading mail, and switching GNOME workspaces.
Workspace switching often caused trouble in the past, but I had
already hacked my X server not to run nice -10 (which is the Debian
default).

> plus, is it 100% sure that all audio threads inherited the nice --20
> priority - including the client threads? Nornally jackd does a
> setscheduler for the client threads so that they get boosted to
> SCHED_FIFO, but there is no parallel to that in the nice --20 case, did
> you do that manually (or did you start the clients up from the nice --20
> shell too?))

Having totally screwed up the test once already, I hesitate to claim
100% surety about anything.  :-)

The script starts all the clients.  I ran it with nice --20.  I just
started it again so I could check the nice values with GNOME system
monitor.  They all have -20, AFAICS.  There are a bunch of them at
-20, and I don't see any process that looks relevant without -20.

> If the nice --20 priority setup is perfect and there are still xruns
> then could you try the following hack, change this line in
> kernel/sched.c:
>
>  #define STARVATION_LIMIT        (MAX_SLEEP_AVG)
>
> to:
>
>  #define STARVATION_LIMIT        0
>
> this will turn off starvation checking, for testing purposes. (to see
> whether there's anything else but anti-starvation causing xruns.)

No problem (it might be Thursday before I have time to try it).
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 23:06                                                                       ` Matt Mackall
@ 2005-01-12  2:13                                                                         ` Paul Davis
  2005-01-12 19:09                                                                           ` Matt Mackall
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-12  2:13 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, linux-kernel

>And that is a failure of imagination on the part of the JACK

Please be careful with your words. Based on your comments below, it
appears that you've never read any of the technical docs on it, and
almost certainly never read the source code.

>developers. Simply add a library function to libjack or whatever:
>
> jack_make_me_important(...); /* pretty please */

like:

  int jack_set_client_capabilities (jack_engine_t *engine, jack_client_id_t id);

along with various other things that will ultimately get the client to
call functions like:

   int jack_drop_real_time_scheduling (pthread_t thread);
   int jack_acquire_real_time_scheduling (pthread_t thread, int priority);

these functions are exported to clients, because some clients have
other threads that require RT scheduling.

>A client starts at normal priority, asks jack nicely to promote it to
>RT, then jackd, if so configured/enabled, calls the wrapper with a PID

a PID? clients are multithreaded, and only specific threads run with
RT scheduling (normally just the one created for them by
libjack). So you presumably mean a TID, which in turn creates a
problem for any system (e.g. 2.4) where all threads share the PID, and
sched_setscheduler() really does use the PID as a PID, not a TID.

but its gets worse. JACK clients need to drop RT scheduling under
certain, well-defined circumstances. how do they get it back under
this scheme?

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:27                                                                 ` Ingo Molnar
  2005-01-11 22:13                                                                   ` Chris Wright
@ 2005-01-12  3:21                                                                   ` Jack O'Quin
  2005-01-12  4:29                                                                     ` Chris Wright
  2005-01-13  5:44                                                                   ` Jack O'Quin
  2 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-12  3:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Chris Wright <chrisw@osdl.org> wrote:
>
>> Hmm, I wonder if this could have anything to do with it.  These are
>> within striking range:
>> 
>>   PID COMMAND          NI PRI
>>     9 events/1        -10  34
>>   931 kcryptd/1       -10  33
>>   930 kcryptd/0       -10  34
>>     8 events/0        -10  34
>>   892 ata/1           -10  34
>>   891 ata/0           -10  34
>>  3747 udevd           -10  33
>>    26 kacpid          -10  31
>>   238 aio/1           -10  34
>>   237 aio/0           -10  31
>>   117 kblockd/1       -10  34
>>   116 kblockd/0       -10  34
>>    10 khelper         -10  34
>
> you are right, i forgot about kernel threads. If they are nice -10 on
> Jack's system too then they are within striking range indeed, especially
> since they are typically idle and if then they are active for short
> bursts of time and get the maximum boost. Jack, could you renice these
> to -5, to make sure they dont interfere?

Sure.  My system does have some of these running at nice -10.  Where
(how) do I change them?

BTW, let's not lose sight of the fact that `nice --20 foo' requires
CAP_SYS_NICE just like SCHED_FIFO does.  From a privilege perspective,
this recurses to the same (still unsolved) problem.  

Chris's rlimits proposal was the only workable suggestion I've seen
for that.  Is there any hope of doing something like that in the 2.6.x
timeframe?  

At this point, I no longer even care that PAM will probably start
randomly assigning users unlimited scheduling rights like it recently
did for mlock.  Eventually, that will get fixed.  :-(
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12  3:21                                                                   ` Jack O'Quin
@ 2005-01-12  4:29                                                                     ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-12  4:29 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Ingo Molnar, Chris Wright, Matt Mackall, Paul Davis,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

* Jack O'Quin (joq@io.com) wrote:
> Ingo Molnar <mingo@elte.hu> writes:
> > you are right, i forgot about kernel threads. If they are nice -10 on
> > Jack's system too then they are within striking range indeed, especially
> > since they are typically idle and if then they are active for short
> > bursts of time and get the maximum boost. Jack, could you renice these
> > to -5, to make sure they dont interfere?
> 
> Sure.  My system does have some of these running at nice -10.  Where
> (how) do I change them?

For a one off test you can brute force it with the plain old renice(8).
Or (depending on which kernel you're using -- Con changed this post
2.6.10) you can apply a patch like:

diff -Nru a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c	2005-01-11 20:26:26 -08:00
+++ b/kernel/workqueue.c	2005-01-11 20:26:26 -08:00
@@ -188,7 +188,7 @@
 
 	current->flags |= PF_NOFREEZE;
 
-	set_user_nice(current, -10);
+	set_user_nice(current, -5);
 
 	/* Block and flush all signals */
 	sigfillset(&blocked);

> BTW, let's not lose sight of the fact that `nice --20 foo' requires
> CAP_SYS_NICE just like SCHED_FIFO does.  From a privilege perspective,
> this recurses to the same (still unsolved) problem.  

Yup, not forgotten ;-)

> Chris's rlimits proposal was the only workable suggestion I've seen
> for that.  Is there any hope of doing something like that in the 2.6.x
> timeframe?  

Yes there is.  We've made other rlimits changes in 2.6, and this one isn't
that invasive.  The main issues are:  getting semantics right, making sure
it actually solves the problem, making sure it keeps sane defaults (not
creating some new ugly hole), and making sure it's in step with the Grand
Plan (TM).  None of these issues are showstoppers, all quite workable.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12  1:43                                                                             ` Jack O'Quin
@ 2005-01-12  7:49                                                                               ` Arjan van de Ven
  2005-01-12 21:12                                                                                 ` Lee Revell
  2005-01-13  0:44                                                                                 ` Jack O'Quin
  0 siblings, 2 replies; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-12  7:49 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Paul Davis, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 07:43:29PM -0600, Jack O'Quin wrote:
> > This is straying from the core issue...  But, Arjan's saying that an RT
> > (non-root) task could trash the filesystem if it deadlocks the machine
> > (because those important fs and IO threads don't run).
> 
> Lexicographic ambiguity: Lee and Paul are using "trash" for things
> like installing a hidden suid root shell or co-opting sendmail into an
> open spam relay.  Arjan just means crashing the system which forces
> reboot to run fsck.

I actually meant data corruption.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12  2:13                                                                         ` Paul Davis
@ 2005-01-12 19:09                                                                           ` Matt Mackall
  2005-01-12 21:25                                                                             ` Lee Revell
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-12 19:09 UTC (permalink / raw)
  To: Paul Davis
  Cc: Chris Wright, Lee Revell, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, linux-kernel

On Tue, Jan 11, 2005 at 09:13:44PM -0500, Paul Davis wrote:
> >And that is a failure of imagination on the part of the JACK
> 
> Please be careful with your words. Based on your comments below, it
> appears that you've never read any of the technical docs on it, and
> almost certainly never read the source code.

I thought I made it clear that I didn't even know the name of library.
And I thought I understood from you that you had to do different
start-up per client depending on whether RT was available. Have I
misunderstood you?

> >A client starts at normal priority, asks jack nicely to promote it to
> >RT, then jackd, if so configured/enabled, calls the wrapper with a PID
> 
> a PID? clients are multithreaded, and only specific threads run with
> RT scheduling (normally just the one created for them by
> libjack). So you presumably mean a TID, which in turn creates a
> problem for any system (e.g. 2.4) where all threads share the PID, and
> sched_setscheduler() really does use the PID as a PID, not a TID.

That actually sounds like an independent API problem.

> but its gets worse. JACK clients need to drop RT scheduling under
> certain, well-defined circumstances. how do they get it back under
> this scheme?

Assuming a more thread-aware API, they just ask for privileges again.
But with the non-thread-aware API, my first reaction would be the thread in
question clones, and the clone drops privileges.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12  7:49                                                                               ` Arjan van de Ven
@ 2005-01-12 21:12                                                                                 ` Lee Revell
  2005-01-13  0:44                                                                                 ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-12 21:12 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Jack O'Quin, Chris Wright, Paul Davis, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

On Wed, 2005-01-12 at 08:49 +0100, Arjan van de Ven wrote: 
> On Tue, Jan 11, 2005 at 07:43:29PM -0600, Jack O'Quin wrote:
> > > This is straying from the core issue...  But, Arjan's saying that an RT
> > > (non-root) task could trash the filesystem if it deadlocks the machine
> > > (because those important fs and IO threads don't run).
> > 
> > Lexicographic ambiguity: Lee and Paul are using "trash" for things
> > like installing a hidden suid root shell or co-opting sendmail into an
> > open spam relay.  Arjan just means crashing the system which forces
> > reboot to run fsck.
> 
> I actually meant data corruption.

OK, so the ability to run RT tasks implies the ability to possibly
corrupt data.  It appears that this can't be fixed until we have a real
isochronous scheduling class; for the forseeable future RT tasks will
need SCHED_FIFO and nonroot users will need to run them.

Anyway it's good to see the problem finally being taken seriously.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12 19:09                                                                           ` Matt Mackall
@ 2005-01-12 21:25                                                                             ` Lee Revell
  0 siblings, 0 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-12 21:25 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Paul Davis, Chris Wright, Jack O'Quin, Christoph Hellwig,
	Andrew Morton, arjanv, mingo, alan, linux-kernel

On Wed, 2005-01-12 at 11:09 -0800, Matt Mackall wrote:
> On Tue, Jan 11, 2005 at 09:13:44PM -0500, Paul Davis wrote:
> > >A client starts at normal priority, asks jack nicely to promote it to
> > >RT, then jackd, if so configured/enabled, calls the wrapper with a PID
> > 
> > a PID? clients are multithreaded, and only specific threads run with
> > RT scheduling (normally just the one created for them by
> > libjack). So you presumably mean a TID, which in turn creates a
> > problem for any system (e.g. 2.4) where all threads share the PID, and
> > sched_setscheduler() really does use the PID as a PID, not a TID.
> 
> That actually sounds like an independent API problem.
> 

What's your point?  It has to work on 2.4, so this is not a feasible
solution.

> > but its gets worse. JACK clients need to drop RT scheduling under
> > certain, well-defined circumstances. how do they get it back under
> > this scheme?
> 
> Assuming a more thread-aware API, they just ask for privileges again.
> But with the non-thread-aware API, my first reaction would be the thread in
> question clones, and the clone drops privileges.
> 

Clones?  Seems pretty inefficient compared to having a simple mechanism
for root to grant users the ability to run RT tasks.  We have such a
system now, and it works perfectly, so any solution that makes people
jump through hoops will be rejected.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12  7:49                                                                               ` Arjan van de Ven
  2005-01-12 21:12                                                                                 ` Lee Revell
@ 2005-01-13  0:44                                                                                 ` Jack O'Quin
  2005-01-13  7:28                                                                                   ` Arjan van de Ven
  1 sibling, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-13  0:44 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Chris Wright, Paul Davis, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

Arjan van de Ven <arjanv@redhat.com> writes:

> On Tue, Jan 11, 2005 at 07:43:29PM -0600, Jack O'Quin wrote:
>> Lexicographic ambiguity: Lee and Paul are using "trash" for things
>> like installing a hidden suid root shell or co-opting sendmail into an
>> open spam relay.  Arjan just means crashing the system which forces
>> reboot to run fsck.
>
> I actually meant data corruption.

Are you concerned about something different from the "normal" risk of
data corruption when the kernel panics or someone trips over the power
cord?
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:27                                                                 ` Ingo Molnar
  2005-01-11 22:13                                                                   ` Chris Wright
  2005-01-12  3:21                                                                   ` Jack O'Quin
@ 2005-01-13  5:44                                                                   ` Jack O'Quin
  2005-01-13  6:34                                                                     ` Matt Mackall
  2005-01-15 13:49                                                                     ` Ingo Molnar
  2 siblings, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-13  5:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Chris Wright <chrisw@osdl.org> wrote:
>
>> Hmm, I wonder if this could have anything to do with it.  These are
>> within striking range:
>> 
>>   PID COMMAND          NI PRI
>>     9 events/1        -10  34
>>   931 kcryptd/1       -10  33
>>   930 kcryptd/0       -10  34
>>     8 events/0        -10  34
>>   892 ata/1           -10  34
>>   891 ata/0           -10  34
>>  3747 udevd           -10  33
>>    26 kacpid          -10  31
>>   238 aio/1           -10  34
>>   237 aio/0           -10  31
>>   117 kblockd/1       -10  34
>>   116 kblockd/0       -10  34
>>    10 khelper         -10  34
>
> you are right, i forgot about kernel threads. If they are nice -10 on
> Jack's system too then they are within striking range indeed, especially
> since they are typically idle and if then they are active for short
> bursts of time and get the maximum boost. Jack, could you renice these
> to -5, to make sure they dont interfere?

OK, I reran with just 5 processes reniced from -10 to -5.  On my
system they were: events, khelper, kblockd, aio and reiserfs.  In
addition, I reniced loop0 from -20 to -5.

I made no changes to the kernel, yet.  It's still vanilla 2.6.10 with
realtime-lsm built-in.

A whole bunch of jackd, sh and jack_test3_client processes are ran at
nice -20.

It didn't make any significant difference...

                                 With -R        Without -R        Without -R
                               (SCHED_FIFO)     (nice --20)    (kprocs reniced)

************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    1)           (    1)        (    1)	       
XRUN Count  . . . . . . . . . :     2                43             49	       
Delay Count (>spare time) . . :     0                 0              0	       
Delay Count (>1000 usecs) . . :     0                 0              0	       
Delay Maximum . . . . . . . . :  3130 usecs    501374 usecs   501415 usecs
Cycle Maximum . . . . . . . . :   960 usecs      1036 usecs      902 usecs 
Average DSP Load. . . . . . . :    34.3 %          34.3 %         34.7 %     
Average CPU System Load . . . :     8.7 %           7.8 %          8.5 %     
Average CPU User Load . . . . :    29.8 %          25.3 %         23.9 %     
Average CPU Nice Load . . . . :     0.0 %           0.0 %          0.0 %     
Average CPU I/O Wait Load . . :     3.2 %           0.1 %          0.0 %     
Average CPU IRQ Load  . . . . :     0.7 %           0.7 %          0.7 %     
Average CPU Soft-IRQ Load . . :     0.0 %           0.0 %          0.0 %     
Average Interrupt Rate  . . . :  1707.6 /sec     1692.9 /sec    1695.7 /sec  
Average Context-Switch Rate . : 11914.9 /sec    11611.2 /sec   11603.6 /sec  
*********************************************


One major problem: this `nice --20' hack affects every thread, not
just the critical realtime ones.  That's not what we want.  Audio
applications make very conscious choices which threads run with high
priority and which do not.

That JACK scheduling test doesn't have any graphical component, so it
cannot detect the problems of audio applications with GTK or Qt
threads running at nice -20 which will interfere with their own signal
processing loop.  I expect that to cause a horrible mess.

Plus, we maintain JACK for several platforms including GNU/Linux,
FreeBSD, and Mac OS X.  IRIX support is planned soon, possibly Solaris
some day.  I would really prefer for Linux to support genuine POSIX
realtime with SCHED_FIFO scheduling.  Since that is our primary
development platform, it makes our code a lot more portable.

And, this is not just about JACK.  We could change to call nice()
instead of pthread_setschedparam() on Linux, but that about all the
other audio applications?  I don't think this is a reasonable thing to
ask of people.  It would take a year just to get them all changed,
like herding cats.

This whole approach seems like a "dry well" to me.

Tomorrow, I'll try the test again after making a new kernel with
STARVATION_LIMIT set to zero.  

Anything else I should try?
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13  5:44                                                                   ` Jack O'Quin
@ 2005-01-13  6:34                                                                     ` Matt Mackall
  2005-01-13 19:17                                                                       ` Jack O'Quin
  2005-01-15 13:49                                                                     ` Ingo Molnar
  1 sibling, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-13  6:34 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Ingo Molnar, Chris Wright, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

On Wed, Jan 12, 2005 at 11:44:34PM -0600, Jack O'Quin wrote:
> 
> One major problem: this `nice --20' hack affects every thread, not
> just the critical realtime ones.  That's not what we want.  Audio
> applications make very conscious choices which threads run with high
> priority and which do not.

I don't think it was intended as a final solution but rather as a
feasibility experiment.

> Plus, we maintain JACK for several platforms including GNU/Linux,
> FreeBSD, and Mac OS X.  IRIX support is planned soon, possibly Solaris
> some day.  I would really prefer for Linux to support genuine POSIX
> realtime with SCHED_FIFO scheduling.  Since that is our primary
> development platform, it makes our code a lot more portable.

Good realtime support is looking like a certainty at this point,
though it may take a while for all the bits to be fully merged.
 
> And, this is not just about JACK.  We could change to call nice()
> instead of pthread_setschedparam() on Linux, but that about all the
> other audio applications?  I don't think this is a reasonable thing to
> ask of people.  It would take a year just to get them all changed,
> like herding cats.

If we can get high priority SCHED_OTHER working sufficiently well,
that will be preferable in the long run as the security implications
are slightly less dire. It's already been noted that it doesn't solve
your privilege problem, but it's still interesting to us because it
has potential to address the deadlock issue.

Doesn't mean you have to use it (though you'll probably want to give
your users the option).

> This whole approach seems like a "dry well" to me.

It may turn out to be. Please continue testing it though - you've got
a good test case handy.

> Tomorrow, I'll try the test again after making a new kernel with
> STARVATION_LIMIT set to zero.  
> 
> Anything else I should try?

Testing feedback on the bits from Ingo that have gone to -mm will
probably help speed their acceptance in mainline.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13  0:44                                                                                 ` Jack O'Quin
@ 2005-01-13  7:28                                                                                   ` Arjan van de Ven
  2005-01-13 21:04                                                                                     ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-13  7:28 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Paul Davis, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

On Wed, Jan 12, 2005 at 06:44:23PM -0600, Jack O'Quin wrote:
> Arjan van de Ven <arjanv@redhat.com> writes:
> 
> > On Tue, Jan 11, 2005 at 07:43:29PM -0600, Jack O'Quin wrote:
> >> Lexicographic ambiguity: Lee and Paul are using "trash" for things
> >> like installing a hidden suid root shell or co-opting sendmail into an
> >> open spam relay.  Arjan just means crashing the system which forces
> >> reboot to run fsck.
> >
> > I actually meant data corruption.
> 
> Are you concerned about something different from the "normal" risk of
> data corruption when the kernel panics or someone trips over the power
> cord?

yes; the "normal" risk is time limited, eg the kernel will wait at most 30
seconds before writing back your dirty data, 5 seconds for ext3 actually.
With the "RT-abuse" hang, this 30 second thing goes on hold (because it's
done from those kernel threads that cause you those hickups in sound :-) and
you can starve a far longer period of time.. which may well mean a far
larger dataset not hitting the disk.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13  6:34                                                                     ` Matt Mackall
@ 2005-01-13 19:17                                                                       ` Jack O'Quin
  2005-01-14 20:52                                                                         ` Lee Revell
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-13 19:17 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Chris Wright, Paul Davis, Christoph Hellwig,
	Con Kolivas, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

Matt Mackall <mpm@selenic.com> writes:

> If we can get high priority SCHED_OTHER working sufficiently well,
> that will be preferable in the long run as the security implications
> are slightly less dire. It's already been noted that it doesn't solve
> your privilege problem, but it's still interesting to us because it
> has potential to address the deadlock issue.

True.  

But there may be other, better solutions to the deadlock problem.
Several years ago, Roger Larsson wrote a completely user-space
realtime monitor program that works perfectly well for revoking
realtime privileges when it detects CPU starvation.  I still use it
occasionally to help debug problems if the built-in JACK watchdog
timer doesn't catch them.

In my view, Con Kolivas' SCHED_ISO prototype is a good avenue to
explore for mainstream kernel support.  With that approach, it is
relatively easy to build in protection against programs that abuse
their promised cycle reservations.  This appears to be similar to what
Apple is doing.

SCHED_OTHER is so timesharing oriented, that I seriously doubt its
appropriateness for soft realtime.  I say this naively without any
first-hand study of the current Linux implementation.  I do understand
traditional Unix schedulers (at one time in detail).  The general idea
was to punish CPU-bound processes and reward I/O-bound processes.

> Doesn't mean you have to use it (though you'll probably want to give
> your users the option).

We already do.  That's why I was able to experiment with nice --20 so
quickly.  In fact, SCHED_OTHER is the default.  Users have to specify
-R (--realtime) before JACK requests SCHED_FIFO privileges.

> On Wed, Jan 12, 2005 at 11:44:34PM -0600, Jack O'Quin wrote:
>> This whole approach seems like a "dry well" to me.
>
> It may turn out to be. Please continue testing it though - you've got
> a good test case handy.

Sure.

I didn't write that test script, BTW.  (I'd like to know who did.)
IIUC, it is one Lee Revell and Rui Nuno Capela have been using to test
Ingo's RP patches.  I got it from Rui.  I chose it because it was
handy and I figured Ingo would be familiar with its output.

We are considering including it (or some variant) in the JACK sources,
so any interested user can download JACK, configure, compile, and then
run `make test'.

It is a fairly heavy test.  The system takes an interrupt from the
audio card every 1.45 msec, then must schedule 22 realtime threads
belonging to 21 different processes (the JACK server and twenty
clients) before the next interrupt arrives.  An XRUN means the system
was late servicing the interrupt (very bad).  The "DSP load" indicates
that these threads are using a little over 1/3 of the total bandwidth
of my 1.5GHz Athlon XP.

>> Tomorrow, I'll try the test again after making a new kernel with
>> STARVATION_LIMIT set to zero.  
>> 
>> Anything else I should try?
>
> Testing feedback on the bits from Ingo that have gone to -mm will
> probably help speed their acceptance in mainline.

Several people continue working with him on that.  Lee and Rui have
been instrumental in testing with Ingo's kernels and in developing
JACK patches to gather needed information.  Much of their
instrumentation will be included in the next JACK release.

We are all highly motivated to help.  We want Linux to have the best
soft realtime possible, while working within the very real constraints
of what it is practical to do in a general-purpose OS.  

I hate for the OSX folks to do better.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13  7:28                                                                                   ` Arjan van de Ven
@ 2005-01-13 21:04                                                                                     ` Jack O'Quin
  2005-01-13 21:07                                                                                       ` Arjan van de Ven
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-13 21:04 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Chris Wright, Paul Davis, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel


>> > On Tue, Jan 11, 2005 at 07:43:29PM -0600, Jack O'Quin wrote:
>> >> Arjan just means crashing the system which forces reboot to run
>> >> fsck.

>> Arjan van de Ven <arjanv@redhat.com> writes:
>> > I actually meant data corruption.

> On Wed, Jan 12, 2005 at 06:44:23PM -0600, Jack O'Quin wrote:
>> Are you concerned about something different from the "normal" risk of
>> data corruption when the kernel panics or someone trips over the power
>> cord?

Arjan van de Ven <arjanv@redhat.com> writes:
> yes; the "normal" risk is time limited, eg the kernel will wait at most 30
> seconds before writing back your dirty data, 5 seconds for ext3 actually.
> With the "RT-abuse" hang, this 30 second thing goes on hold (because it's
> done from those kernel threads that cause you those hickups in sound :-) and
> you can starve a far longer period of time.. which may well mean a far
> larger dataset not hitting the disk.

Ah, good point.

Just thinking about this naively, I come up with two scenarios:

  (1) SMP -- RT thread hangs one CPU.  Kernel threads can still run on
  other processors.  Rest of system continues running (degraded) until
  more RT threads hang the remaining CPUs at which time we end up
  with...

  (2) UP -- RT thread hangs the last remaining CPU.  Kernel threads
  can't run.  User processes can no longer write data to FS.

(Probably, this simplistic analysis misses some other, more subtle,
factors.)

RT threads should not do FS writes of their own.  But, a badly broken
or malicious one could, I suppose.  So, that might provide a mechanism
for losing more data than usual.  Is that what you had in mind?
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 21:04                                                                                     ` Jack O'Quin
@ 2005-01-13 21:07                                                                                       ` Arjan van de Ven
  2005-01-13 21:25                                                                                         ` Lee Revell
  0 siblings, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-13 21:07 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Paul Davis, Lee Revell, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel

On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
> 
> (Probably, this simplistic analysis misses some other, more subtle,
> factors.)

I think you can do nasty things to the locks held by those threads too

> 
> RT threads should not do FS writes of their own.  But, a badly broken
> or malicious one could, I suppose.  So, that might provide a mechanism
> for losing more data than usual.  Is that what you had in mind?

basically yes.
note that "FS writes" can come from various things, including library calls
made and such. But I think you got my point; even though it might seem a bit
theoretical it sure is unpleasant.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 21:07                                                                                       ` Arjan van de Ven
@ 2005-01-13 21:25                                                                                         ` Lee Revell
  2005-01-13 21:43                                                                                           ` Arjan van de Ven
  2005-01-14  2:05                                                                                           ` utz lehmann
  0 siblings, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-01-13 21:25 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Jack O'Quin, Chris Wright, Paul Davis, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel,
	Con Kolivas

On Thu, 2005-01-13 at 22:07 +0100, Arjan van de Ven wrote:
> On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
> > 
> > (Probably, this simplistic analysis misses some other, more subtle,
> > factors.)
> 
> I think you can do nasty things to the locks held by those threads too
> 
> > 
> > RT threads should not do FS writes of their own.  But, a badly broken
> > or malicious one could, I suppose.  So, that might provide a mechanism
> > for losing more data than usual.  Is that what you had in mind?
> 
> basically yes.
> note that "FS writes" can come from various things, including library calls
> made and such. But I think you got my point; even though it might seem a bit
> theoretical it sure is unpleasant.
> 

I added Con to the cc: because this thread is starting to converge with
an email discussion we've been having.

The basic issue is that the current semantics of SCHED_FIFO seem make
the deadlock/data corruption due to runaway RT thread issue difficult.
The obvious solution is a new scheduling class equivalent to SCHED_FIFO
but with a mechanism for the kernel to demote the offending thread to
SCHED_OTHER in an emergency.  The problem can be solved in userspace
with a SCHED_FIFO watchdog thread that runs at a higher RT priority than
all other RT processes.

This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
excellent solution.  The RT watchdog thread could run as root, and the
rlimit would be used to ensure than even nonroot users in the RT group
could never preempt the watchdog thread.

Lee 


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 21:25                                                                                         ` Lee Revell
@ 2005-01-13 21:43                                                                                           ` Arjan van de Ven
  2005-01-13 23:31                                                                                             ` Jack O'Quin
  2005-01-14  2:05                                                                                           ` utz lehmann
  1 sibling, 1 reply; 266+ messages in thread
From: Arjan van de Ven @ 2005-01-13 21:43 UTC (permalink / raw)
  To: Lee Revell
  Cc: Jack O'Quin, Chris Wright, Paul Davis, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel,
	Con Kolivas


On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
> The basic issue is that the current semantics of SCHED_FIFO seem make
> the deadlock/data corruption due to runaway RT thread issue difficult.
> The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> but with a mechanism for the kernel to demote the offending thread to
> SCHED_OTHER in an emergency. 

and this is getting really close to the original "counter proposal" to the
LSM module that was basically "lets make lower nice limit an rlimit, and
have -20 mean "basically FIFO" *if* the task behaves itself".


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 21:43                                                                                           ` Arjan van de Ven
@ 2005-01-13 23:31                                                                                             ` Jack O'Quin
  2005-01-14  0:33                                                                                               ` Chris Wright
                                                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-13 23:31 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Lee Revell, Chris Wright, Paul Davis, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel,
	Con Kolivas

Arjan van de Ven <arjanv@redhat.com> writes:

> On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
>> The basic issue is that the current semantics of SCHED_FIFO seem make
>> the deadlock/data corruption due to runaway RT thread issue difficult.
>> The obvious solution is a new scheduling class equivalent to SCHED_FIFO
>> but with a mechanism for the kernel to demote the offending thread to
>> SCHED_OTHER in an emergency. 
>
> and this is getting really close to the original "counter proposal" to the
> LSM module that was basically "lets make lower nice limit an rlimit, and
> have -20 mean "basically FIFO" *if* the task behaves itself".

Yes.  However, my tests have so far shown a need for "actual FIFO as
long as the task behaves itself."

Otherwise, your rlimits proposal is fine.  I still think it puts more
of a burden on the sysadmin, but nobody else seems to care about that.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 23:31                                                                                             ` Jack O'Quin
@ 2005-01-14  0:33                                                                                               ` Chris Wright
  2005-01-14  0:50                                                                                               ` Con Kolivas
  2005-01-14 17:20                                                                                               ` Mike Galbraith
  2 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-14  0:33 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Arjan van de Ven, Lee Revell, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel, Con Kolivas

* Jack O'Quin (joq@io.com) wrote:
> Otherwise, your rlimits proposal is fine.  I still think it puts more
> of a burden on the sysadmin, but nobody else seems to care about that.

Actually, I care.  However, I don't think the burden is really too
much greater.  It may put some extra burden on the how-to-audio writer.
But adding a group and editing /etc/security/limits.conf doesn't sound
too bad to me.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 23:31                                                                                             ` Jack O'Quin
  2005-01-14  0:33                                                                                               ` Chris Wright
@ 2005-01-14  0:50                                                                                               ` Con Kolivas
  2005-01-14  1:20                                                                                                 ` Matt Mackall
  2005-01-14 17:20                                                                                               ` Mike Galbraith
  2 siblings, 1 reply; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  0:50 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Arjan van de Ven, Lee Revell, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3222 bytes --]

Jack O'Quin wrote:
> Arjan van de Ven <arjanv@redhat.com> writes:
> 
> 
>>On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
>>
>>>The basic issue is that the current semantics of SCHED_FIFO seem make
>>>the deadlock/data corruption due to runaway RT thread issue difficult.
>>>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
>>>but with a mechanism for the kernel to demote the offending thread to
>>>SCHED_OTHER in an emergency. 
>>
>>and this is getting really close to the original "counter proposal" to the
>>LSM module that was basically "lets make lower nice limit an rlimit, and
>>have -20 mean "basically FIFO" *if* the task behaves itself".
> 
> 
> Yes.  However, my tests have so far shown a need for "actual FIFO as
> long as the task behaves itself."

I should comment on this thread on lkml. After some 
investigation/discussion and testing I came up with a proposal for this 
problem. Since we are a general purpose operating system and not a hard 
rt system (although addon patches are clearly making that a future 
possibility) we need a solution that is satisfactory to a general...

There are two ways I suggested for this.

First, (and I am increasingly believing in the second) is to implement a 
new scheduling class for isochronous scheduling. This would be a class 
for unprivileged users, and behave like SCHED_RR (to avoid complications 
of QoS features we dont have infrastrucutre for) at a priority just 
above SCHED_NORMAL, but below all privileged SCHED_RR and SCHED_FIFO. 
Importantly, a soft cpu limit and rate period can be set by default for 
this scheduling class that provides good true SCHED_RR performance, and 
is configurable. Literature suggests that 70% is adequate cpu for good 
real time performance and would be starvation free. I believe setting 
70% with 10% hysteresis (dropping to say 63% on hitting limit) would be 
a good start. Beyond this, however, to satisfy the needs of those with 
more demanding setups, a simple configurable runtime setting to set both 
the cpu% and the rate period could be available to something as simple 
as proc
/proc/sys/kernel/iso_cpu
/proc/sys/kernel/iso_cpu_period
where iso_cpu is set to 70, and period to maybe 1 second. The actual 
mode of setting this tunable is not important, and could be in /sys or 
whatever

The second option is to not implement a new scheduling class at all, and 
allow unprivileged users to use either SCHED_FIFO or SCHED_RR, but to 
make the cpu constraints described for SCHED_ISO above apply to their 
use of those classes. Supporting priority settings for these could be 
possible, but in my opinion, it would work as a better class if they 
only had one priority level, as for the SCHED_ISO implementation above 
(better than any SCHED_NORMAL, but lower than privileged SCHED_RR/FIFO).

This latter approach to me seems the least invasive and most user and 
sysadmin friendly method.

What was amusing to me was that after I suggested the latter option, I 
discovered that was basically what OSX does, however being not a real 
multi-user operating system they had absurd limits for cpu at 90% by 
default. Theory suggests 70% should be a good default limit.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  0:50                                                                                               ` Con Kolivas
@ 2005-01-14  1:20                                                                                                 ` Matt Mackall
  2005-01-14  1:27                                                                                                   ` Con Kolivas
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-14  1:20 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Jack O'Quin, Arjan van de Ven, Lee Revell, Chris Wright,
	Paul Davis, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel

On Fri, Jan 14, 2005 at 11:50:14AM +1100, Con Kolivas wrote:
> Jack O'Quin wrote:
> >Arjan van de Ven <arjanv@redhat.com> writes:
> >
> >
> >>On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
> >>
> >>>The basic issue is that the current semantics of SCHED_FIFO seem make
> >>>the deadlock/data corruption due to runaway RT thread issue difficult.
> >>>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> >>>but with a mechanism for the kernel to demote the offending thread to
> >>>SCHED_OTHER in an emergency. 
> >>
> >>and this is getting really close to the original "counter proposal" to the
> >>LSM module that was basically "lets make lower nice limit an rlimit, and
> >>have -20 mean "basically FIFO" *if* the task behaves itself".
> >
> >
> >Yes.  However, my tests have so far shown a need for "actual FIFO as
> >long as the task behaves itself."
> 
> I should comment on this thread on lkml. After some 
> investigation/discussion and testing I came up with a proposal for this 
> problem. Since we are a general purpose operating system and not a hard 
> rt system (although addon patches are clearly making that a future 
> possibility) we need a solution that is satisfactory to a general...
> 
> There are two ways I suggested for this.
> 
> First, (and I am increasingly believing in the second) is to implement a 
> new scheduling class for isochronous scheduling. This would be a class 
> for unprivileged users, and behave like SCHED_RR (to avoid complications 
> of QoS features we dont have infrastrucutre for) at a priority just 
> above SCHED_NORMAL, but below all privileged SCHED_RR and SCHED_FIFO. 
> Importantly, a soft cpu limit and rate period can be set by default for 
> this scheduling class that provides good true SCHED_RR performance, and 
> is configurable. Literature suggests that 70% is adequate cpu for good 
> real time performance and would be starvation free. I believe setting 
> 70% with 10% hysteresis (dropping to say 63% on hitting limit) would be 
> a good start. Beyond this, however, to satisfy the needs of those with 
> more demanding setups, a simple configurable runtime setting to set both 
> the cpu% and the rate period could be available to something as simple 
> as proc
> /proc/sys/kernel/iso_cpu
> /proc/sys/kernel/iso_cpu_period
> where iso_cpu is set to 70, and period to maybe 1 second. The actual 
> mode of setting this tunable is not important, and could be in /sys or 
> whatever

This sounds promising, but I think it still needs to be privileged.
See a) below.

> The second option is to not implement a new scheduling class at all, and 
> allow unprivileged users to use either SCHED_FIFO or SCHED_RR, but to 
> make the cpu constraints described for SCHED_ISO above apply to their 
> use of those classes. Supporting priority settings for these could be 
> possible, but in my opinion, it would work as a better class if they 
> only had one priority level, as for the SCHED_ISO implementation above 
> (better than any SCHED_NORMAL, but lower than privileged SCHED_RR/FIFO).

a) How to arbitrate between competing unprivileged users that want
pseudo-SCHED_FIFO? Do they all lose?
b) Priority levels are important here, we want to be able to do things
like have audio run at higher priority than video.

> 
> This latter approach to me seems the least invasive and most user and 
> sysadmin friendly method.
> 
> What was amusing to me was that after I suggested the latter option, I 
> discovered that was basically what OSX does, however being not a real 
> multi-user operating system they had absurd limits for cpu at 90% by 
> default. Theory suggests 70% should be a good default limit.
> 
> Cheers,
> Con



-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  1:20                                                                                                 ` Matt Mackall
@ 2005-01-14  1:27                                                                                                   ` Con Kolivas
  0 siblings, 0 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  1:27 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Arjan van de Ven, Lee Revell, Chris Wright,
	Paul Davis, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3954 bytes --]

Matt Mackall wrote:
> On Fri, Jan 14, 2005 at 11:50:14AM +1100, Con Kolivas wrote:
> 
>>Jack O'Quin wrote:
>>
>>>Arjan van de Ven <arjanv@redhat.com> writes:
>>>
>>>
>>>
>>>>On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
>>>>
>>>>
>>>>>The basic issue is that the current semantics of SCHED_FIFO seem make
>>>>>the deadlock/data corruption due to runaway RT thread issue difficult.
>>>>>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
>>>>>but with a mechanism for the kernel to demote the offending thread to
>>>>>SCHED_OTHER in an emergency. 
>>>>
>>>>and this is getting really close to the original "counter proposal" to the
>>>>LSM module that was basically "lets make lower nice limit an rlimit, and
>>>>have -20 mean "basically FIFO" *if* the task behaves itself".
>>>
>>>
>>>Yes.  However, my tests have so far shown a need for "actual FIFO as
>>>long as the task behaves itself."
>>
>>I should comment on this thread on lkml. After some 
>>investigation/discussion and testing I came up with a proposal for this 
>>problem. Since we are a general purpose operating system and not a hard 
>>rt system (although addon patches are clearly making that a future 
>>possibility) we need a solution that is satisfactory to a general...
>>
>>There are two ways I suggested for this.
>>
>>First, (and I am increasingly believing in the second) is to implement a 
>>new scheduling class for isochronous scheduling. This would be a class 
>>for unprivileged users, and behave like SCHED_RR (to avoid complications 
>>of QoS features we dont have infrastrucutre for) at a priority just 
>>above SCHED_NORMAL, but below all privileged SCHED_RR and SCHED_FIFO. 
>>Importantly, a soft cpu limit and rate period can be set by default for 
>>this scheduling class that provides good true SCHED_RR performance, and 
>>is configurable. Literature suggests that 70% is adequate cpu for good 
>>real time performance and would be starvation free. I believe setting 
>>70% with 10% hysteresis (dropping to say 63% on hitting limit) would be 
>>a good start. Beyond this, however, to satisfy the needs of those with 
>>more demanding setups, a simple configurable runtime setting to set both 
>>the cpu% and the rate period could be available to something as simple 
>>as proc
>>/proc/sys/kernel/iso_cpu
>>/proc/sys/kernel/iso_cpu_period
>>where iso_cpu is set to 70, and period to maybe 1 second. The actual 
>>mode of setting this tunable is not important, and could be in /sys or 
>>whatever
> 
> 
> This sounds promising, but I think it still needs to be privileged.
> See a) below.

>>The second option is to not implement a new scheduling class at all, and 
>>allow unprivileged users to use either SCHED_FIFO or SCHED_RR, but to 
>>make the cpu constraints described for SCHED_ISO above apply to their 
>>use of those classes. Supporting priority settings for these could be 
>>possible, but in my opinion, it would work as a better class if they 
>>only had one priority level, as for the SCHED_ISO implementation above 
>>(better than any SCHED_NORMAL, but lower than privileged SCHED_RR/FIFO).
> 
> 
> a) How to arbitrate between competing unprivileged users that want
> pseudo-SCHED_FIFO? Do they all lose?
> b) Priority levels are important here, we want to be able to do things
> like have audio run at higher priority than video.

Indeed how OSX does it is to pretend that it is paying any attention at 
all to the QoS requests that is given to it. In actual fact all it does 
is RR frequently enough and then everything tends to work anyway...

The reason I suggest not supporting priorities (at the moment) is for 
them to even work I would need to implement a complete Earliest Deadline 
First scheduler, ideally with more syscalls and transferring QoS 
requests to the scheduler. This is umm non-trivial to say the least... 
but possible on a landscape not living in a 2.6 forever development world.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 21:25                                                                                         ` Lee Revell
  2005-01-13 21:43                                                                                           ` Arjan van de Ven
@ 2005-01-14  2:05                                                                                           ` utz lehmann
  2005-01-14  2:08                                                                                             ` Con Kolivas
  2005-01-14  2:24                                                                                             ` Nick Piggin
  1 sibling, 2 replies; 266+ messages in thread
From: utz lehmann @ 2005-01-14  2:05 UTC (permalink / raw)
  To: Lee Revell
  Cc: Arjan van de Ven, Jack O'Quin, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	LKML, Con Kolivas

On Thu, 2005-01-13 at 16:25 -0500, Lee Revell wrote:
> On Thu, 2005-01-13 at 22:07 +0100, Arjan van de Ven wrote:
> > On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
> > > 
> > > (Probably, this simplistic analysis misses some other, more subtle,
> > > factors.)
> > 
> > I think you can do nasty things to the locks held by those threads too
> > 
> > > 
> > > RT threads should not do FS writes of their own.  But, a badly broken
> > > or malicious one could, I suppose.  So, that might provide a mechanism
> > > for losing more data than usual.  Is that what you had in mind?
> > 
> > basically yes.
> > note that "FS writes" can come from various things, including library calls
> > made and such. But I think you got my point; even though it might seem a bit
> > theoretical it sure is unpleasant.
> > 
> 
> I added Con to the cc: because this thread is starting to converge with
> an email discussion we've been having.
> 
> The basic issue is that the current semantics of SCHED_FIFO seem make
> the deadlock/data corruption due to runaway RT thread issue difficult.
> The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> but with a mechanism for the kernel to demote the offending thread to
> SCHED_OTHER in an emergency.  The problem can be solved in userspace
> with a SCHED_FIFO watchdog thread that runs at a higher RT priority than
> all other RT processes.
> 
> This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
> excellent solution.  The RT watchdog thread could run as root, and the
> rlimit would be used to ensure than even nonroot users in the RT group
> could never preempt the watchdog thread.

Just an idea. What about throttling runaway RT tasks?
If the system spend more than 98% in RT tasks for 5s consider this as a
_fatal error_. Print an error message and throttle RT tasks by inserting
ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
means one SCHED_OTHER only tick all 50 ticks.

The limit and timeout should be configurable and of course it can be
disabled.

I know this is against RT task preempt all SCHED_OTHER but this is only
for a fatal system state to be able to recover sanely. A locked up
machine is is the worse alternative.



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:05                                                                                           ` utz lehmann
@ 2005-01-14  2:08                                                                                             ` Con Kolivas
  2005-01-14  2:23                                                                                               ` Andrew Morton
  2005-01-14  2:35                                                                                               ` utz lehmann
  2005-01-14  2:24                                                                                             ` Nick Piggin
  1 sibling, 2 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  2:08 UTC (permalink / raw)
  To: utz lehmann
  Cc: Lee Revell, Arjan van de Ven, Jack O'Quin, Chris Wright,
	Paul Davis, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML

[-- Attachment #1: Type: text/plain, Size: 2561 bytes --]

utz lehmann wrote:
> On Thu, 2005-01-13 at 16:25 -0500, Lee Revell wrote:
> 
>>On Thu, 2005-01-13 at 22:07 +0100, Arjan van de Ven wrote:
>>
>>>On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
>>>
>>>>(Probably, this simplistic analysis misses some other, more subtle,
>>>>factors.)
>>>
>>>I think you can do nasty things to the locks held by those threads too
>>>
>>>
>>>>RT threads should not do FS writes of their own.  But, a badly broken
>>>>or malicious one could, I suppose.  So, that might provide a mechanism
>>>>for losing more data than usual.  Is that what you had in mind?
>>>
>>>basically yes.
>>>note that "FS writes" can come from various things, including library calls
>>>made and such. But I think you got my point; even though it might seem a bit
>>>theoretical it sure is unpleasant.
>>>
>>
>>I added Con to the cc: because this thread is starting to converge with
>>an email discussion we've been having.
>>
>>The basic issue is that the current semantics of SCHED_FIFO seem make
>>the deadlock/data corruption due to runaway RT thread issue difficult.
>>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
>>but with a mechanism for the kernel to demote the offending thread to
>>SCHED_OTHER in an emergency.  The problem can be solved in userspace
>>with a SCHED_FIFO watchdog thread that runs at a higher RT priority than
>>all other RT processes.
>>
>>This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
>>excellent solution.  The RT watchdog thread could run as root, and the
>>rlimit would be used to ensure than even nonroot users in the RT group
>>could never preempt the watchdog thread.
> 
> 
> Just an idea. What about throttling runaway RT tasks?
> If the system spend more than 98% in RT tasks for 5s consider this as a
> _fatal error_. Print an error message and throttle RT tasks by inserting
> ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
> means one SCHED_OTHER only tick all 50 ticks.
> 
> The limit and timeout should be configurable and of course it can be
> disabled.
> 
> I know this is against RT task preempt all SCHED_OTHER but this is only
> for a fatal system state to be able to recover sanely. A locked up
> machine is is the worse alternative.

There is a patch in -mm currently designed to use a sysrq key 
combination which converts all real time tasks to sched normal to save 
you if you desire in a lockup situation. We do want to preserve RT 
scheduling behaviour at all times without caveats for privileged users.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:08                                                                                             ` Con Kolivas
@ 2005-01-14  2:23                                                                                               ` Andrew Morton
  2005-01-14  2:35                                                                                               ` utz lehmann
  1 sibling, 0 replies; 266+ messages in thread
From: Andrew Morton @ 2005-01-14  2:23 UTC (permalink / raw)
  To: Con Kolivas
  Cc: lkml, rlrevell, arjanv, joq, chrisw, paul, mpm, hch, mingo, alan,
	linux-kernel

Con Kolivas <kernel@kolivas.org> wrote:
>
> There is a patch in -mm currently designed to use a sysrq key 
>  combination which converts all real time tasks to sched normal to save 
>  you if you desire in a lockup situation.

That's in 2.6.11-rc1 now.  sysrq-n.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:05                                                                                           ` utz lehmann
  2005-01-14  2:08                                                                                             ` Con Kolivas
@ 2005-01-14  2:24                                                                                             ` Nick Piggin
  2005-01-14  2:40                                                                                               ` Paul Davis
  1 sibling, 1 reply; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  2:24 UTC (permalink / raw)
  To: utz lehmann
  Cc: Lee Revell, Arjan van de Ven, Jack O'Quin, Chris Wright,
	Paul Davis, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML, Con Kolivas

On Fri, 2005-01-14 at 03:05 +0100, utz lehmann wrote:
> On Thu, 2005-01-13 at 16:25 -0500, Lee Revell wrote:

> > This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
> > excellent solution.  The RT watchdog thread could run as root, and the
> > rlimit would be used to ensure than even nonroot users in the RT group
> > could never preempt the watchdog thread.
> 
> Just an idea. What about throttling runaway RT tasks?
> If the system spend more than 98% in RT tasks for 5s consider this as a
> _fatal error_. Print an error message and throttle RT tasks by inserting
> ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
> means one SCHED_OTHER only tick all 50 ticks.
> 
> The limit and timeout should be configurable and of course it can be
> disabled.
> 

This is just a hack. Realtime scheduling is pretty rigidly specified,
and we satisfy that. Thus it is useful for systems that need to make
use of it. The way SCHED_FIFO and SCHED_RR scheduling is specified is
inherently insecure/incompatible with a multi user machine; I don't
understand why people are getting heated with this debate. You literally
can't run more than one realtime system on the same CPU(s) if they don't
have a knowledge of one another.

SCHED_FIFO and SCHED_RR are definitely privileged operations and you
can't really change them without making them useless to legitimate
users, I think.




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:08                                                                                             ` Con Kolivas
  2005-01-14  2:23                                                                                               ` Andrew Morton
@ 2005-01-14  2:35                                                                                               ` utz lehmann
  2005-01-14  2:42                                                                                                 ` Con Kolivas
  1 sibling, 1 reply; 266+ messages in thread
From: utz lehmann @ 2005-01-14  2:35 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Lee Revell, Arjan van de Ven, Jack O'Quin, Chris Wright,
	Paul Davis, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML

On Fri, 2005-01-14 at 13:08 +1100, Con Kolivas wrote:
> utz lehmann wrote:
> > On Thu, 2005-01-13 at 16:25 -0500, Lee Revell wrote:
> > 
> >>On Thu, 2005-01-13 at 22:07 +0100, Arjan van de Ven wrote:
> >>
> >>>On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
> >>>
> >>>>(Probably, this simplistic analysis misses some other, more subtle,
> >>>>factors.)
> >>>
> >>>I think you can do nasty things to the locks held by those threads too
> >>>
> >>>
> >>>>RT threads should not do FS writes of their own.  But, a badly broken
> >>>>or malicious one could, I suppose.  So, that might provide a mechanism
> >>>>for losing more data than usual.  Is that what you had in mind?
> >>>
> >>>basically yes.
> >>>note that "FS writes" can come from various things, including library calls
> >>>made and such. But I think you got my point; even though it might seem a bit
> >>>theoretical it sure is unpleasant.
> >>>
> >>
> >>I added Con to the cc: because this thread is starting to converge with
> >>an email discussion we've been having.
> >>
> >>The basic issue is that the current semantics of SCHED_FIFO seem make
> >>the deadlock/data corruption due to runaway RT thread issue difficult.
> >>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> >>but with a mechanism for the kernel to demote the offending thread to
> >>SCHED_OTHER in an emergency.  The problem can be solved in userspace
> >>with a SCHED_FIFO watchdog thread that runs at a higher RT priority than
> >>all other RT processes.
> >>
> >>This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
> >>excellent solution.  The RT watchdog thread could run as root, and the
> >>rlimit would be used to ensure than even nonroot users in the RT group
> >>could never preempt the watchdog thread.
> > 
> > 
> > Just an idea. What about throttling runaway RT tasks?
> > If the system spend more than 98% in RT tasks for 5s consider this as a
> > _fatal error_. Print an error message and throttle RT tasks by inserting
> > ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
> > means one SCHED_OTHER only tick all 50 ticks.
> > 
> > The limit and timeout should be configurable and of course it can be
> > disabled.
> > 
> > I know this is against RT task preempt all SCHED_OTHER but this is only
> > for a fatal system state to be able to recover sanely. A locked up
> > machine is is the worse alternative.
> 
> There is a patch in -mm currently designed to use a sysrq key 
> combination which converts all real time tasks to sched normal to save 
> you if you desire in a lockup situation. We do want to preserve RT 
> scheduling behaviour at all times without caveats for privileged users.

The sysrq is already in 2.6.10. I had to use it the last days a few
times. But it does help if you have no access to the console.

The RT throttling idea is not to change the behavior in normal
conditions. It's only for a fatal system state. If you have a runaway RT
task you can't guarantee the system is work properly anyway. It's
blocking vital kernel threads, filesystems, swap, keyboard, ...

It's a bit like out of memory. You can do nothing and panic. Or trying
something bad (killing processes) which is hopefully better as the
former.
btw: Are RT tasks excluded by the oom killer?



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:24                                                                                             ` Nick Piggin
@ 2005-01-14  2:40                                                                                               ` Paul Davis
  2005-01-14  2:57                                                                                                 ` Nick Piggin
  2005-01-14  3:12                                                                                                 ` Andrew Morton
  0 siblings, 2 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-14  2:40 UTC (permalink / raw)
  To: Nick Piggin
  Cc: utz lehmann, Lee Revell, Arjan van de Ven, Jack O'Quin,
	Chris Wright, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML, Con Kolivas

>SCHED_FIFO and SCHED_RR are definitely privileged operations and you

this is the crux of what this whole debate is about. for all of you
people who think about linux on multi-user systems with network
connectivity, running servers and so forth, this is clearly a given.

but there is large and growing body of machines that run linux where
the sole human user of the machine has a strong and overwhelming
desire to have tasks run with the characteristics offered by
SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
this class of linux system? what about linux installed on an embedded
system, with a small LCD screen and the sole purpose of running audio
apps live? are they still privileged then?

i think there is room for debate, but its clear that in general,
SCHED_FIFO/SCHED_RR's "definite" status as privileged operations is
not clear. we are trying to find ways to provide access to it in ways
that don't conflict with the other categories of linux systems where
it clearly needs to be off-limits to unprivileged users.

--p



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:35                                                                                               ` utz lehmann
@ 2005-01-14  2:42                                                                                                 ` Con Kolivas
  2005-01-14  3:20                                                                                                   ` Andrew Morton
  2005-01-14  3:26                                                                                                   ` utz lehmann
  0 siblings, 2 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  2:42 UTC (permalink / raw)
  To: utz lehmann
  Cc: Lee Revell, Arjan van de Ven, Jack O'Quin, Chris Wright,
	Paul Davis, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML

[-- Attachment #1: Type: text/plain, Size: 1942 bytes --]

utz lehmann wrote:
> On Fri, 2005-01-14 at 13:08 +1100, Con Kolivas wrote:
> 
>>utz lehmann wrote:
>>>Just an idea. What about throttling runaway RT tasks?
>>>If the system spend more than 98% in RT tasks for 5s consider this as a
>>>_fatal error_. Print an error message and throttle RT tasks by inserting
>>>ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
>>>means one SCHED_OTHER only tick all 50 ticks.
>>>
>>>The limit and timeout should be configurable and of course it can be
>>>disabled.
>>>
>>>I know this is against RT task preempt all SCHED_OTHER but this is only
>>>for a fatal system state to be able to recover sanely. A locked up
>>>machine is is the worse alternative.
>>
>>There is a patch in -mm currently designed to use a sysrq key 
>>combination which converts all real time tasks to sched normal to save 
>>you if you desire in a lockup situation. We do want to preserve RT 
>>scheduling behaviour at all times without caveats for privileged users.
> 
> 
> The sysrq is already in 2.6.10. I had to use it the last days a few
> times. But it does help if you have no access to the console.
> 
> The RT throttling idea is not to change the behavior in normal
> conditions. It's only for a fatal system state. If you have a runaway RT
> task you can't guarantee the system is work properly anyway. It's
> blocking vital kernel threads, filesystems, swap, keyboard, ...

I understand fully your concern. If such a thing were to be introduced 
it would have to be disabled by default. Since I'm looking at 
implementing such throttling for user RT tasks, it should be trivial to 
add it to other RT tasks, and have 100% as the default cpu limit. How 
does that sound?

> It's a bit like out of memory. You can do nothing and panic. Or trying
> something bad (killing processes) which is hopefully better as the
> former.
> btw: Are RT tasks excluded by the oom killer?

I haven't looked. VM hackers?

Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:40                                                                                               ` Paul Davis
@ 2005-01-14  2:57                                                                                                 ` Nick Piggin
  2005-01-14  3:12                                                                                                 ` Andrew Morton
  1 sibling, 0 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  2:57 UTC (permalink / raw)
  To: Paul Davis
  Cc: utz lehmann, Lee Revell, Arjan van de Ven, Jack O'Quin,
	Chris Wright, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML, Con Kolivas

On Thu, 2005-01-13 at 21:40 -0500, Paul Davis wrote:
> >SCHED_FIFO and SCHED_RR are definitely privileged operations and you
> 
> this is the crux of what this whole debate is about. for all of you
> people who think about linux on multi-user systems with network
> connectivity, running servers and so forth, this is clearly a given.
> 
> but there is large and growing body of machines that run linux where
> the sole human user of the machine has a strong and overwhelming
> desire to have tasks run with the characteristics offered by
> SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
> this class of linux system? what about linux installed on an embedded
> system, with a small LCD screen and the sole purpose of running audio
> apps live? are they still privileged then?
> 

I think yes, because their misuse can trivially take down the
machine by definition. So it is still privileged in the context
of that system.

> i think there is room for debate, but its clear that in general,
> SCHED_FIFO/SCHED_RR's "definite" status as privileged operations is
> not clear. we are trying to find ways to provide access to it in ways
> that don't conflict with the other categories of linux systems where
> it clearly needs to be off-limits to unprivileged users.
> 

In such a system, sure you could make allowances by elevating
privileges or what have you.

I guess the tricky part is exactly how to make these allowances. I've
joined the thread too late (and don't have the knowledge) to really get
into that... but I just wanted to be clear that watering down SCHED_RR
and SCHED_FIFO basically just makes them no good to anyone.

I personally can't see how a scheduling policy can allow deterministic
access to the CPU without being a privileged operation. If you don't
need deterministic access to the scheduler, then let's talk about why
SCHED_OTHER isn't good enough. If you do, then we're talking about
security access I think.

Nick


http://mobile.yahoo.com.au - Yahoo! Mobile
- Check & compose your email via SMS on your Telstra or Vodafone mobile.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:40                                                                                               ` Paul Davis
  2005-01-14  2:57                                                                                                 ` Nick Piggin
@ 2005-01-14  3:12                                                                                                 ` Andrew Morton
  2005-01-14  3:18                                                                                                   ` Con Kolivas
  2005-01-14  6:57                                                                                                   ` Matt Mackall
  1 sibling, 2 replies; 266+ messages in thread
From: Andrew Morton @ 2005-01-14  3:12 UTC (permalink / raw)
  To: Paul Davis
  Cc: nickpiggin, lkml, rlrevell, arjanv, joq, chrisw, mpm, hch, mingo,
	alan, linux-kernel, kernel

Paul Davis <paul@linuxaudiosystems.com> wrote:
>
> >SCHED_FIFO and SCHED_RR are definitely privileged operations and you
> 
> this is the crux of what this whole debate is about. for all of you
> people who think about linux on multi-user systems with network
> connectivity, running servers and so forth, this is clearly a given.
> 
> but there is large and growing body of machines that run linux where
> the sole human user of the machine has a strong and overwhelming
> desire to have tasks run with the characteristics offered by
> SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
> this class of linux system? what about linux installed on an embedded
> system, with a small LCD screen and the sole purpose of running audio
> apps live? are they still privileged then?
> 

Paul.  Everyone agrees with you.  I think.  We just need to work out
the best way of doing it.

Would I be right in suspecting that we know what to do, but nobody has
stepped up to write the code?  It's kinda looking like that?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:12                                                                                                 ` Andrew Morton
@ 2005-01-14  3:18                                                                                                   ` Con Kolivas
  2005-01-14  3:30                                                                                                     ` Paul Davis
  2005-01-14  3:31                                                                                                     ` Nick Piggin
  2005-01-14  6:57                                                                                                   ` Matt Mackall
  1 sibling, 2 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  3:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Davis, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw, mpm,
	hch, mingo, alan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

Andrew Morton wrote:
> Paul Davis <paul@linuxaudiosystems.com> wrote:
> 
>>>SCHED_FIFO and SCHED_RR are definitely privileged operations and you
>>
>>this is the crux of what this whole debate is about. for all of you
>>people who think about linux on multi-user systems with network
>>connectivity, running servers and so forth, this is clearly a given.
>>
>>but there is large and growing body of machines that run linux where
>>the sole human user of the machine has a strong and overwhelming
>>desire to have tasks run with the characteristics offered by
>>SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
>>this class of linux system? what about linux installed on an embedded
>>system, with a small LCD screen and the sole purpose of running audio
>>apps live? are they still privileged then?
>>
> 
> 
> Paul.  Everyone agrees with you.  I think.  We just need to work out
> the best way of doing it.
> 
> Would I be right in suspecting that we know what to do, but nobody has
> stepped up to write the code?  It's kinda looking like that?

I thought I made it clear i had already volunteered. I was after a 
response to my proposal for how to do it.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:42                                                                                                 ` Con Kolivas
@ 2005-01-14  3:20                                                                                                   ` Andrew Morton
  2005-01-14  3:28                                                                                                     ` utz lehmann
  2005-01-14  3:26                                                                                                   ` utz lehmann
  1 sibling, 1 reply; 266+ messages in thread
From: Andrew Morton @ 2005-01-14  3:20 UTC (permalink / raw)
  To: Con Kolivas
  Cc: lkml, rlrevell, arjanv, joq, chrisw, paul, mpm, hch, mingo, alan,
	linux-kernel

Con Kolivas <kernel@kolivas.org> wrote:
>
>  > btw: Are RT tasks excluded by the oom killer?
> 
>  I haven't looked. VM hackers?

Nope.  We're nastier to tasks which have been niced down, but we're not
nicer to tasks which have been given elevated priority/policy.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  2:42                                                                                                 ` Con Kolivas
  2005-01-14  3:20                                                                                                   ` Andrew Morton
@ 2005-01-14  3:26                                                                                                   ` utz lehmann
  1 sibling, 0 replies; 266+ messages in thread
From: utz lehmann @ 2005-01-14  3:26 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Lee Revell, Arjan van de Ven, Jack O'Quin, Chris Wright,
	Paul Davis, Matt Mackall, Christoph Hellwig, Andrew Morton,
	mingo, alan, LKML

On Fri, 2005-01-14 at 13:42 +1100, Con Kolivas wrote:
> utz lehmann wrote:
> > On Fri, 2005-01-14 at 13:08 +1100, Con Kolivas wrote:
> > 
> >>utz lehmann wrote:
> >>>Just an idea. What about throttling runaway RT tasks?
> >>>If the system spend more than 98% in RT tasks for 5s consider this as a
> >>>_fatal error_. Print an error message and throttle RT tasks by inserting
> >>>ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
> >>>means one SCHED_OTHER only tick all 50 ticks.
> >>>
> >>>The limit and timeout should be configurable and of course it can be
> >>>disabled.
> >>>
> >>>I know this is against RT task preempt all SCHED_OTHER but this is only
> >>>for a fatal system state to be able to recover sanely. A locked up
> >>>machine is is the worse alternative.
> >>
> >>There is a patch in -mm currently designed to use a sysrq key 
> >>combination which converts all real time tasks to sched normal to save 
> >>you if you desire in a lockup situation. We do want to preserve RT 
> >>scheduling behaviour at all times without caveats for privileged users.
> > 
> > 
> > The sysrq is already in 2.6.10. I had to use it the last days a few
> > times. But it does help if you have no access to the console.
> > 
> > The RT throttling idea is not to change the behavior in normal
> > conditions. It's only for a fatal system state. If you have a runaway RT
> > task you can't guarantee the system is work properly anyway. It's
> > blocking vital kernel threads, filesystems, swap, keyboard, ...
> 
> I understand fully your concern. If such a thing were to be introduced 
> it would have to be disabled by default. Since I'm looking at 
> implementing such throttling for user RT tasks, it should be trivial to 
> add it to other RT tasks, and have 100% as the default cpu limit. How 
> does that sound?

Sounds good.-)
The kernel should have 100% limit (disable) as default. Users and
distros can change it to a sane value for there needs.



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:20                                                                                                   ` Andrew Morton
@ 2005-01-14  3:28                                                                                                     ` utz lehmann
  0 siblings, 0 replies; 266+ messages in thread
From: utz lehmann @ 2005-01-14  3:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Con Kolivas, rlrevell, arjanv, joq, chrisw, paul, mpm, hch,
	mingo, alan, LKML

On Thu, 2005-01-13 at 19:20 -0800, Andrew Morton wrote:
> Con Kolivas <kernel@kolivas.org> wrote:
> >
> >  > btw: Are RT tasks excluded by the oom killer?
> > 
> >  I haven't looked. VM hackers?
> 
> Nope.  We're nastier to tasks which have been niced down, but we're not
> nicer to tasks which have been given elevated priority/policy.

Maybe this should be done?
RT tasks are somewhat important i think.



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:18                                                                                                   ` Con Kolivas
@ 2005-01-14  3:30                                                                                                     ` Paul Davis
  2005-01-14  3:38                                                                                                       ` Con Kolivas
  2005-01-14  3:31                                                                                                     ` Nick Piggin
  1 sibling, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-14  3:30 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Andrew Morton, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

>> Paul.  Everyone agrees with you.  I think.  We just need to work out
>> the best way of doing it.
>> 
>> Would I be right in suspecting that we know what to do, but nobody has
>> stepped up to write the code?  It's kinda looking like that?
>
>I thought I made it clear i had already volunteered. I was after a 
>response to my proposal for how to do it.

I think your proposal is a good (maybe even excellent) one, but it
somewhat sidesteps the issue (which may be the best thing to
do). Rather than answering the question "how best to allow regular
users access to SCHED_FIFO", it says "lets offer regular users
SCHED_ISO which is essentially identical to SCHED_FIFO unless tasks
running SCHED_ISO use too much cpu time".

its a fine answer, but its the answer to a slightly different
question. if anyone (maybe us audio freaks, maybe someone else) comes
up with a reason to want "The Real SCHED_FIFO", the original question
will have gone unanswered.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:18                                                                                                   ` Con Kolivas
  2005-01-14  3:30                                                                                                     ` Paul Davis
@ 2005-01-14  3:31                                                                                                     ` Nick Piggin
  2005-01-14  3:34                                                                                                       ` Paul Davis
                                                                                                                         ` (2 more replies)
  1 sibling, 3 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  3:31 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Andrew Morton, Paul Davis, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

On Fri, 2005-01-14 at 14:18 +1100, Con Kolivas wrote:
> Andrew Morton wrote:
> > Paul Davis <paul@linuxaudiosystems.com> wrote:
> > 
> >>>SCHED_FIFO and SCHED_RR are definitely privileged operations and you
> >>
> >>this is the crux of what this whole debate is about. for all of you
> >>people who think about linux on multi-user systems with network
> >>connectivity, running servers and so forth, this is clearly a given.
> >>
> >>but there is large and growing body of machines that run linux where
> >>the sole human user of the machine has a strong and overwhelming
> >>desire to have tasks run with the characteristics offered by
> >>SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
> >>this class of linux system? what about linux installed on an embedded
> >>system, with a small LCD screen and the sole purpose of running audio
> >>apps live? are they still privileged then?
> >>
> > 
> > 
> > Paul.  Everyone agrees with you.  I think.  We just need to work out
> > the best way of doing it.
> > 
> > Would I be right in suspecting that we know what to do, but nobody has
> > stepped up to write the code?  It's kinda looking like that?
> 
> I thought I made it clear i had already volunteered. I was after a 
> response to my proposal for how to do it.
> 

It sounds to me like both your proposals may be too complex and not
sufficiently deterministic (I don't know for sure, maybe that's
exactly what the RT people want).

I wouldn't have thought it is so much a matter of having real-time-ish
scheduling available that tries to play nicely in a multi user machine.
That must still imply that either the user is able to unduly tie up
resources (and thus it has to be a privileged operation), or that it
sometimes can't meet its "guarantees" (in which case, is it useful?).

I was thinking that the solution might be more along the lines of
a nice way to handle privileges for these guys.

I could be completely off the rails though. I haven't really been
following this thread so please shoot me in my foot if I have put it
in my mouth.




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:31                                                                                                     ` Nick Piggin
@ 2005-01-14  3:34                                                                                                       ` Paul Davis
  2005-01-14  4:11                                                                                                       ` Con Kolivas
  2005-01-14  9:21                                                                                                       ` Will Dyson
  2 siblings, 0 replies; 266+ messages in thread
From: Paul Davis @ 2005-01-14  3:34 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Con Kolivas, Andrew Morton, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

>I wouldn't have thought it is so much a matter of having real-time-ish
>scheduling available that tries to play nicely in a multi user machine.
>That must still imply that either the user is able to unduly tie up
>resources (and thus it has to be a privileged operation), or that it
>sometimes can't meet its "guarantees" (in which case, is it useful?).

most audio hackers and users are perfectly comfortable with the OSX
compromise - tasks with no special privilege get deterministic access
to the CPU as long as they do not consume excessive cycles.

this begs the question about what happens when the entire class of
SCHED_ISO (to use Con's working name for such a scheduling class)
tasks is eating too much CPU, rather than any one of them, but i'll
leave that to Con :)

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:30                                                                                                     ` Paul Davis
@ 2005-01-14  3:38                                                                                                       ` Con Kolivas
  2005-01-14  3:51                                                                                                         ` Paul Davis
  2005-01-14  4:04                                                                                                         ` Nick Piggin
  0 siblings, 2 replies; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  3:38 UTC (permalink / raw)
  To: Paul Davis
  Cc: Andrew Morton, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1117 bytes --]

Paul Davis wrote:
>>>Paul.  Everyone agrees with you.  I think.  We just need to work out
>>>the best way of doing it.
>>>
>>>Would I be right in suspecting that we know what to do, but nobody has
>>>stepped up to write the code?  It's kinda looking like that?
>>
>>I thought I made it clear i had already volunteered. I was after a 
>>response to my proposal for how to do it.
> 
> 
> I think your proposal is a good (maybe even excellent) one, but it
> somewhat sidesteps the issue (which may be the best thing to
> do). Rather than answering the question "how best to allow regular
> users access to SCHED_FIFO", it says "lets offer regular users
> SCHED_ISO which is essentially identical to SCHED_FIFO unless tasks
> running SCHED_ISO use too much cpu time".
> 
> its a fine answer, but its the answer to a slightly different
> question. if anyone (maybe us audio freaks, maybe someone else) comes
> up with a reason to want "The Real SCHED_FIFO", the original question
> will have gone unanswered.

Ah then  you missed something. You can set the max cpu of SCHED_ISO to 
100% and then you have it.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:38                                                                                                       ` Con Kolivas
@ 2005-01-14  3:51                                                                                                         ` Paul Davis
  2005-01-14  4:00                                                                                                           ` Con Kolivas
  2005-01-14  4:04                                                                                                         ` Nick Piggin
  1 sibling, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-14  3:51 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Andrew Morton, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

>> its a fine answer, but its the answer to a slightly different
>> question. if anyone (maybe us audio freaks, maybe someone else) comes
>> up with a reason to want "The Real SCHED_FIFO", the original question
>> will have gone unanswered.
>
>Ah then  you missed something. You can set the max cpu of SCHED_ISO to 
>100% and then you have it.

true, i missed that :) but i also recall you saying you were thinking
of having no prioritization within SCHED_ISO ... or am i remembering
wrong? also, is it just me, or having to ways to achieve the exact
same result seems very un-linux-like ... and if they are not exact
same results, how does a regular user get the SCHED_FIFO ones? is the
answer just "they don't" ?

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:51                                                                                                         ` Paul Davis
@ 2005-01-14  4:00                                                                                                           ` Con Kolivas
  2005-01-14  4:16                                                                                                             ` Nick Piggin
  0 siblings, 1 reply; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  4:00 UTC (permalink / raw)
  To: Paul Davis
  Cc: Andrew Morton, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1981 bytes --]

Paul Davis wrote:
>>>its a fine answer, but its the answer to a slightly different
>>>question. if anyone (maybe us audio freaks, maybe someone else) comes
>>>up with a reason to want "The Real SCHED_FIFO", the original question
>>>will have gone unanswered.
>>
>>Ah then  you missed something. You can set the max cpu of SCHED_ISO to 
>>100% and then you have it.
> 
> 
> true, i missed that :) but i also recall you saying you were thinking
> of having no prioritization within SCHED_ISO ... or am i remembering
> wrong? 

Nothing is set in stone.  I wont even look at code until Ingo or Linus 
rules on this. Ingo has expressed interest in SCHED_ISO on a previous 
thread with me.

> also, is it just me, or having to ways to achieve the exact
> same result seems very un-linux-like ... and if they are not exact
> same results, how does a regular user get the SCHED_FIFO ones? is the
> answer just "they don't" ?

To answer your question, the second of my proposals was to not have a 
separate scheduling class at all. To let normal users set SCHED_FIFO and 
SCHED_RR, possibly with all their priorities intact, but for there to be 
limits placed on their usage of these classes. The reason I suggested 
not supporting priorities is that proper real time scheduling would 
entail being able to say "I need x cycles, to complete by y time and I 
can or cannot be preempted". With these QoS requirements, a whole new 
scheduling style (EDF) would need to be implemented. Without actually 
implementing this, if you set a limit of cpu to 70%, all it takes is one 
FIFO process to run long enough at high enough priority and all your 
other soft real time tasks go to SCHED_NORMAL, which is nothing like 
what happens with true RT scheduling. Forcing all soft RT threads to 
round robin at the same priority would make them sort themselves out. 
It's a compromise either way, and in fact this latter way is what OSX 
does and works well in practice as well as theory.

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:38                                                                                                       ` Con Kolivas
  2005-01-14  3:51                                                                                                         ` Paul Davis
@ 2005-01-14  4:04                                                                                                         ` Nick Piggin
  1 sibling, 0 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  4:04 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Paul Davis, Andrew Morton, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

On Fri, 2005-01-14 at 14:38 +1100, Con Kolivas wrote:
> Paul Davis wrote:

> > its a fine answer, but its the answer to a slightly different
> > question. if anyone (maybe us audio freaks, maybe someone else) comes
> > up with a reason to want "The Real SCHED_FIFO", the original question
> > will have gone unanswered.
> 
> Ah then  you missed something. You can set the max cpu of SCHED_ISO to 
> 100% and then you have it.
> 

Is that a good solution? I'm not sure if it is wise to try to
masquerade SCHED_ISO as an unprivileged RT class.

I mean what happens if two users are trying to run independent
SCHED_ISO systems? Both will probably break, right?

And how can you provide _any_ guarantees in an arbitrary environment
without this becoming a privileged operation? I can't quite get my head
around that at the moment...

I guess if you have SCHED_ISO start out with 0 guarantees, and have root
dole some out, then it may be workable. But then that is just another
specialised ad hoc sort of hack wouldn't it? (not talking about
SCHED_ISO itself, but the granting of the privilege to use it).





^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:31                                                                                                     ` Nick Piggin
  2005-01-14  3:34                                                                                                       ` Paul Davis
@ 2005-01-14  4:11                                                                                                       ` Con Kolivas
  2005-01-14  4:23                                                                                                         ` Nick Piggin
  2005-01-14  9:21                                                                                                       ` Will Dyson
  2 siblings, 1 reply; 266+ messages in thread
From: Con Kolivas @ 2005-01-14  4:11 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, Paul Davis, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 620 bytes --]

Nick Piggin wrote:
> It sounds to me like both your proposals may be too complex and not
> sufficiently deterministic (I don't know for sure, maybe that's
> exactly what the RT people want).

This is the solution already employed in the real world by OSX. It works
well, and the audio people have told me they are happy with it.

> I could be completely off the rails though. I haven't really been
> following this thread so please shoot me in my foot if I have put it
> in my mouth.

If your foot is in your mouth and you ask me to shoot you in the foot it
would blow your head off... Hmm it's tempting...

Cheers,
Con

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  4:00                                                                                                           ` Con Kolivas
@ 2005-01-14  4:16                                                                                                             ` Nick Piggin
  0 siblings, 0 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  4:16 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Paul Davis, Andrew Morton, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

On Fri, 2005-01-14 at 15:00 +1100, Con Kolivas wrote:
> Paul Davis wrote:
> >>>its a fine answer, but its the answer to a slightly different
> >>>question. if anyone (maybe us audio freaks, maybe someone else) comes
> >>>up with a reason to want "The Real SCHED_FIFO", the original question
> >>>will have gone unanswered.
> >>
> >>Ah then  you missed something. You can set the max cpu of SCHED_ISO to 
> >>100% and then you have it.
> > 
> > 
> > true, i missed that :) but i also recall you saying you were thinking
> > of having no prioritization within SCHED_ISO ... or am i remembering
> > wrong? 
> 
> Nothing is set in stone.  I wont even look at code until Ingo or Linus 
> rules on this. Ingo has expressed interest in SCHED_ISO on a previous 
> thread with me.
> 

You may have a chicken and egg problem :) I don't think anybody will
rule on this unless there is at least a demand for it. For there to
be a demand for it I think you'd need to come up with a rigorous
specification, wouldn't you? And then implement it even.

Unfortunately this is just how kernel development goes if you're brave
enough to try out new things.

I'm leaning toward the opinion that the entire problem would be better
handled purely with the existing RT scheduling classes, and a good way
to handle the security side of things.

> > also, is it just me, or having to ways to achieve the exact
> > same result seems very un-linux-like ... and if they are not exact
> > same results, how does a regular user get the SCHED_FIFO ones? is the
> > answer just "they don't" ?
> 
> To answer your question, the second of my proposals was to not have a 
> separate scheduling class at all. To let normal users set SCHED_FIFO and 
> SCHED_RR, possibly with all their priorities intact, but for there to be 
> limits placed on their usage of these classes. The reason I suggested 
> not supporting priorities is that proper real time scheduling would 
> entail being able to say "I need x cycles, to complete by y time and I 
> can or cannot be preempted". With these QoS requirements, a whole new 
> scheduling style (EDF) would need to be implemented. Without actually 

This sort of thing is pretty well specialised enough that it doesn't
belong in the kernel scheduler, however. Often it can be satisfied by
userspace managers... but you'd have to be talking about a hard RT
system anyway, which Linux isn't.


http://mobile.yahoo.com.au - Yahoo! Mobile
- Check & compose your email via SMS on your Telstra or Vodafone mobile.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  4:11                                                                                                       ` Con Kolivas
@ 2005-01-14  4:23                                                                                                         ` Nick Piggin
  2005-01-14  4:45                                                                                                           ` Paul Davis
  0 siblings, 1 reply; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  4:23 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Andrew Morton, Paul Davis, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

On Fri, 2005-01-14 at 15:11 +1100, Con Kolivas wrote:
> Nick Piggin wrote:
> > It sounds to me like both your proposals may be too complex and not
> > sufficiently deterministic (I don't know for sure, maybe that's
> > exactly what the RT people want).
> 
> This is the solution already employed in the real world by OSX. It works
> well, and the audio people have told me they are happy with it.
> 

Alternatively, could you grant the required capabilities to use real
RT scheduling and not foul up the scheduler?

Or do a similar sort of thing with a userspace daemon that manages
priorities and watches CPU usage?

Basically I'd prefer not to put hacks in the (mainline) scheduler to
handle this pretty specific special case.

> > I could be completely off the rails though. I haven't really been
> > following this thread so please shoot me in my foot if I have put it
> > in my mouth.
> 
> If your foot is in your mouth and you ask me to shoot you in the foot it
> would blow your head off... Hmm it's tempting...
> 

Meeeow!




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  4:23                                                                                                         ` Nick Piggin
@ 2005-01-14  4:45                                                                                                           ` Paul Davis
  2005-01-14  5:14                                                                                                             ` Nick Piggin
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-14  4:45 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Con Kolivas, Andrew Morton, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

>Alternatively, could you grant the required capabilities to use real
>RT scheduling and not foul up the scheduler?

this is precisely the point i was making. either you agree that
unprivileged users can get easy access to a scheduling class that can
reliably DOS the system, or they can't. if they can't, what kind of
scheduling class can they access easily?

according to andrew, and i agree with his conclusion, many people
agree that its OK for them to get access to the DOS class, but there's
little agreement on the security model to allow this. Con is
suggesting that they are not, but instead get a different scheduling
class that is functionally equivalent except that it can't
(theoretically) be used to DOS the system.

--p

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  4:45                                                                                                           ` Paul Davis
@ 2005-01-14  5:14                                                                                                             ` Nick Piggin
  0 siblings, 0 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  5:14 UTC (permalink / raw)
  To: Paul Davis
  Cc: Con Kolivas, Andrew Morton, lkml, rlrevell, arjanv, joq, chrisw,
	mpm, hch, mingo, alan, linux-kernel

On Thu, 2005-01-13 at 23:45 -0500, Paul Davis wrote:
> >Alternatively, could you grant the required capabilities to use real
> >RT scheduling and not foul up the scheduler?
> 
> this is precisely the point i was making. either you agree that
> unprivileged users can get easy access to a scheduling class that can
> reliably DOS the system, or they can't. if they can't, what kind of
> scheduling class can they access easily?
> 
> according to andrew, and i agree with his conclusion, many people
> agree that its OK for them to get access to the DOS class, but there's
> little agreement on the security model to allow this. Con is
> suggesting that they are not, but instead get a different scheduling
> class that is functionally equivalent except that it can't
> (theoretically) be used to DOS the system.

Well IMO that would be preferable if there are no other objections.
I can't think how any sort of unprivileged "real time" scheduling
would have a place on multi-user systems.

And if it is only really used on single user systems then presumably
the priority elevation isn't a big problem provided it can be properly
managed. So this would appear to be the better solution.

Supposing you do want some sort of DOS prevention in the system, I'd
much prefer it be handled by a trusted user-space daemon for example,
rather than scheduler smarts (which may require a little bit of work
to limit priorities but would be relatively straightforward).




^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:12                                                                                                 ` Andrew Morton
  2005-01-14  3:18                                                                                                   ` Con Kolivas
@ 2005-01-14  6:57                                                                                                   ` Matt Mackall
  2005-01-14  7:04                                                                                                     ` Andrew Morton
  2005-01-14 20:10                                                                                                     ` Chris Wright
  1 sibling, 2 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-14  6:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Davis, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw, hch,
	mingo, alan, linux-kernel, kernel

On Thu, Jan 13, 2005 at 07:12:37PM -0800, Andrew Morton wrote:
> Paul Davis <paul@linuxaudiosystems.com> wrote:
> >
> > >SCHED_FIFO and SCHED_RR are definitely privileged operations and you
> > 
> > this is the crux of what this whole debate is about. for all of you
> > people who think about linux on multi-user systems with network
> > connectivity, running servers and so forth, this is clearly a given.
> > 
> > but there is large and growing body of machines that run linux where
> > the sole human user of the machine has a strong and overwhelming
> > desire to have tasks run with the characteristics offered by
> > SCHED_FIFO and/or SCHED_RR. are they still "privileged" operations on
> > this class of linux system? what about linux installed on an embedded
> > system, with a small LCD screen and the sole purpose of running audio
> > apps live? are they still privileged then?
> > 
> 
> Paul.  Everyone agrees with you.  I think.  We just need to work out
> the best way of doing it.
> 
> Would I be right in suspecting that we know what to do, but nobody has
> stepped up to write the code?  It's kinda looking like that?

The closest thing to concensus I've seen yet was a new rlimit for
scheduling with code from Chris Wright. The version I last saw had
some rough edges on the API (exposing the internal scheduler priority
levels) but wasn't too bad in principle. We really ought not get in
the habit of adding new rlimits though.

Perhaps he can post whatever he has again, I'm not sure what the
current state is.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  6:57                                                                                                   ` Matt Mackall
@ 2005-01-14  7:04                                                                                                     ` Andrew Morton
  2005-01-14  7:55                                                                                                       ` Chris Wright
  2005-01-14 20:10                                                                                                     ` Chris Wright
  1 sibling, 1 reply; 266+ messages in thread
From: Andrew Morton @ 2005-01-14  7:04 UTC (permalink / raw)
  To: Matt Mackall
  Cc: paul, nickpiggin, lkml, rlrevell, arjanv, joq, chrisw, hch,
	mingo, alan, linux-kernel, kernel

Matt Mackall <mpm@selenic.com> wrote:
>
> The closest thing to concensus I've seen yet was a new rlimit for
>  scheduling with code from Chris Wright.

hmm, yes.  It doesn't feel like an rlimity thing to me, unless the rlimit
actually _limits_ something.  Say, minimum permissible nice level.  But
scheduling policy sounds more like a capability than an rlimit.

>  We really ought not get in
>  the habit of adding new rlimits though.

How come?  It's a real pita that the standard shells don't appear to have a
way of setting an unknown rlimit.  But what else?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  7:04                                                                                                     ` Andrew Morton
@ 2005-01-14  7:55                                                                                                       ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-01-14  7:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matt Mackall, paul, nickpiggin, lkml, rlrevell, arjanv, joq,
	chrisw, hch, mingo, alan, linux-kernel, kernel

* Andrew Morton (akpm@osdl.org) wrote:
> Matt Mackall <mpm@selenic.com> wrote:
> >
> > The closest thing to concensus I've seen yet was a new rlimit for
> >  scheduling with code from Chris Wright.
> 
> hmm, yes.  It doesn't feel like an rlimity thing to me, unless the rlimit
> actually _limits_ something.  Say, minimum permissible nice level.  But
> scheduling policy sounds more like a capability than an rlimit.

It's had a few incarnations with minor tweaks.  But they each did
provide a limit, an upper bound, on how the user could prioritize it's
task with the scheduler (both nice values and rt priorities).

> >  We really ought not get in
> >  the habit of adding new rlimits though.
> 
> How come?  It's a real pita that the standard shells don't appear to have a
> way of setting an unknown rlimit.  But what else?

It's got that slippery slope feeling.  When do you decided that you're
just punting everything to an rlimit and it becomes an unmanaged mess?
However, in this case, at least it's easy to justify cpu time as a
resource.  I'll repost in the AM...sleep calls.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  3:31                                                                                                     ` Nick Piggin
  2005-01-14  3:34                                                                                                       ` Paul Davis
  2005-01-14  4:11                                                                                                       ` Con Kolivas
@ 2005-01-14  9:21                                                                                                       ` Will Dyson
  2005-01-14  9:54                                                                                                         ` Nick Piggin
  2 siblings, 1 reply; 266+ messages in thread
From: Will Dyson @ 2005-01-14  9:21 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Con Kolivas, Andrew Morton, Paul Davis, lkml, rlrevell, arjanv,
	joq, chrisw, mpm, hch, mingo, alan, linux-kernel

On Fri, 14 Jan 2005 14:31:21 +1100, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
 
> It sounds to me like both your proposals may be too complex and not
> sufficiently deterministic (I don't know for sure, maybe that's
> exactly what the RT people want).
> 
> I wouldn't have thought it is so much a matter of having real-time-ish
> scheduling available that tries to play nicely in a multi user machine.
> That must still imply that either the user is able to unduly tie up
> resources (and thus it has to be a privileged operation), or that it
> sometimes can't meet its "guarantees" (in which case, is it useful?).

The VM system with overcommit is in a similar pickle. It can't honor
the "guarantees" it makes. Yet, I think it is in wide use. Overcommit
is a useful behavior for many people, despite the fact that it allows
any user to turn loose the oom_killer on the system.

So I think many people would also find a best-effort-at-realtime
SCHED_ISO type thing pretty useful, even if it allowed unprivileged
users to tie up resources (while protecting the system from DOS).
Heck, we don't have to allow unprivileged users to tie up resources.
SCHED_ISO use could be limited to members of a certain group, possibly
implemented using some sort of LSM module... :)

Of course, suggesting that access to SCHED_ISO be limited pretty much
admits that running processes as SCHED_ISO should be a privileged
operation, like accessing /dev/dsp (a privilege that is granted
through group membership on most desktops).

> I was thinking that the solution might be more along the lines of
> a nice way to handle privileges for these guys.

A nice,  flexible way to hand out scheduler (and perhaps other)
privileges would be... nice. Are you thinking of something more
fine-grained than per-user?

-- 
Will Dyson

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  9:21                                                                                                       ` Will Dyson
@ 2005-01-14  9:54                                                                                                         ` Nick Piggin
  0 siblings, 0 replies; 266+ messages in thread
From: Nick Piggin @ 2005-01-14  9:54 UTC (permalink / raw)
  To: Will Dyson
  Cc: Con Kolivas, Andrew Morton, Paul Davis, lkml, rlrevell, arjanv,
	joq, chrisw, mpm, hch, mingo, alan, linux-kernel

Will Dyson wrote:
> On Fri, 14 Jan 2005 14:31:21 +1100, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>  
> 
>>It sounds to me like both your proposals may be too complex and not
>>sufficiently deterministic (I don't know for sure, maybe that's
>>exactly what the RT people want).
>>
>>I wouldn't have thought it is so much a matter of having real-time-ish
>>scheduling available that tries to play nicely in a multi user machine.
>>That must still imply that either the user is able to unduly tie up
>>resources (and thus it has to be a privileged operation), or that it
>>sometimes can't meet its "guarantees" (in which case, is it useful?).
> 
> 
> The VM system with overcommit is in a similar pickle. It can't honor
> the "guarantees" it makes. Yet, I think it is in wide use. Overcommit
> is a useful behavior for many people, despite the fact that it allows
> any user to turn loose the oom_killer on the system.
> 

I'm not sure if that is a really good comparison.

> So I think many people would also find a best-effort-at-realtime
> SCHED_ISO type thing pretty useful, even if it allowed unprivileged
> users to tie up resources (while protecting the system from DOS).
> Heck, we don't have to allow unprivileged users to tie up resources.
> SCHED_ISO use could be limited to members of a certain group, possibly
> implemented using some sort of LSM module... :)
> 
> Of course, suggesting that access to SCHED_ISO be limited pretty much
> admits that running processes as SCHED_ISO should be a privileged
> operation, like accessing /dev/dsp (a privilege that is granted
> through group membership on most desktops).
> 

Now I'm not adverse to cool hacks, and I haven't thought about
SCHED_ISO enough to comment on it much (nor has its behaviour
even been firmly defined as far as I know).

But regarding the kernel in general and the scheduler especially:
it is pretty important to fight feature creep. SCHED_ISO will have
a non zero cost in terms of complexity, maintainability, and
probably performance.

So the only way it can go in is if a non trivial number of people
really need it for things that can't be satisfied in userspace or
with a good privilege system or [something elegant], etc.

I'm not by any means stopping anyone from coming up with a firm
definition for SCHED_ISO, implementing it, and demonstrating that
it is the best way to solve a problem that X people care about,
and that its benefits outweigh its inevitable costs...
I'm just giving some frank advice.

> 
>>I was thinking that the solution might be more along the lines of
>>a nice way to handle privileges for these guys.
> 
> 
> A nice,  flexible way to hand out scheduler (and perhaps other)
> privileges would be... nice. Are you thinking of something more
> fine-grained than per-user?
> 

I'm not too sure, that topic's out of my league... But that is
basically what is sought after by the guys behind the realtime LSM.
So I'd better stop hijacking their thread!


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 23:31                                                                                             ` Jack O'Quin
  2005-01-14  0:33                                                                                               ` Chris Wright
  2005-01-14  0:50                                                                                               ` Con Kolivas
@ 2005-01-14 17:20                                                                                               ` Mike Galbraith
  2005-01-15  1:14                                                                                                 ` Jack O'Quin
  2 siblings, 1 reply; 266+ messages in thread
From: Mike Galbraith @ 2005-01-14 17:20 UTC (permalink / raw)
  To: Jack O'Quin, Arjan van de Ven
  Cc: Lee Revell, Chris Wright, Paul Davis, Matt Mackall,
	Christoph Hellwig, Andrew Morton, mingo, alan, linux-kernel,
	Con Kolivas

At 05:31 PM 1/13/2005 -0600, Jack O'Quin wrote:
>Arjan van de Ven <arjanv@redhat.com> writes:
>
> > On Thu, Jan 13, 2005 at 04:25:08PM -0500, Lee Revell wrote:
> >> The basic issue is that the current semantics of SCHED_FIFO seem make
> >> the deadlock/data corruption due to runaway RT thread issue difficult.
> >> The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> >> but with a mechanism for the kernel to demote the offending thread to
> >> SCHED_OTHER in an emergency.
> >
> > and this is getting really close to the original "counter proposal" to the
> > LSM module that was basically "lets make lower nice limit an rlimit, and
> > have -20 mean "basically FIFO" *if* the task behaves itself".
>
>Yes.  However, my tests have so far shown a need for "actual FIFO as
>long as the task behaves itself."

I for one wonder why that appears to be so.  What happens if you use 
SCHED_RR instead of SCHED_FIFO?

(ie is the problem just one of running out of slice at a bad time, or is it 
the dynamic priority adjustment)

         -Mike 


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14  6:57                                                                                                   ` Matt Mackall
  2005-01-14  7:04                                                                                                     ` Andrew Morton
@ 2005-01-14 20:10                                                                                                     ` Chris Wright
  2005-01-14 20:55                                                                                                       ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-14 20:10 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, Paul Davis, nickpiggin, lkml, rlrevell, arjanv,
	joq, chrisw, hch, mingo, alan, linux-kernel, kernel

* Matt Mackall (mpm@selenic.com) wrote:
> The closest thing to concensus I've seen yet was a new rlimit for
> scheduling with code from Chris Wright. The version I last saw had
> some rough edges on the API (exposing the internal scheduler priority
> levels) but wasn't too bad in principle. We really ought not get in
> the habit of adding new rlimits though.
> 
> Perhaps he can post whatever he has again, I'm not sure what the
> current state is.

This is the latest version, with the idea from Utz to break nice and
rtprio apart.

The basic issue on the rlimit value is how to sanely encode nice values,
realtime prioroties and scheduler policies into a number.  The first
incarnation was the clumsiest, and tried to pack it all into a number
in range of [0,139].  This, as many agree, too closely reflects kernel
internal values.  This one gives 0-39 (nice values 19,-20) to RLIMIT_NICE,
and 0-99 (rt priorities) to RLIMIT_RTPRIO.  There's no distinction in rt
policy, and the traditional override (CAP_SYS_NICE) is still in place.
The defaults for both rlimits are 0, and behaviour should be backwards
compatible.  I tested this one a bit, and it worked as expected.  I've
got a patch to pam_limits as well, although it's untested.

thanks,
-chris
-- 

===== include/asm-i386/resource.h 1.5 vs edited =====
--- 1.5/include/asm-i386/resource.h	2004-08-23 01:15:26 -07:00
+++ edited/include/asm-i386/resource.h	2005-01-14 10:28:19 -08:00
@@ -18,8 +18,11 @@
 #define RLIMIT_LOCKS	10		/* maximum file locks held */
 #define RLIMIT_SIGPENDING 11		/* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12		/* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO	14		/* maximum realtime priority */
 
-#define RLIM_NLIMITS	13
+#define RLIM_NLIMITS	15
 
 
 /*
@@ -45,6 +48,8 @@
 	{ RLIM_INFINITY, RLIM_INFINITY },		\
 	{ MAX_SIGPENDING, MAX_SIGPENDING },		\
 	{ MQ_BYTES_MAX, MQ_BYTES_MAX },			\
+	{            0,	             0 },		\
+	{            0,	             0 },		\
 }
 
 #endif /* __KERNEL__ */
===== include/linux/sched.h 1.291 vs edited =====
--- 1.291/include/linux/sched.h	2005-01-11 16:42:57 -08:00
+++ edited/include/linux/sched.h	2005-01-14 10:11:13 -08:00
@@ -767,6 +767,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern unsigned long nice_to_rlimit_nice(const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
===== kernel/sched.c 1.407 vs edited =====
--- 1.407/kernel/sched.c	2005-01-11 16:42:35 -08:00
+++ edited/kernel/sched.c	2005-01-14 10:38:21 -08:00
@@ -68,6 +68,12 @@
 #define MAX_USER_PRIO		(USER_PRIO(MAX_PRIO))
 
 /*
+ * convert nice to RLIMIT_NICE values ([ 19 ... -20 ] to [ 0 ... 39 ])
+ */
+
+#define NICE_TO_RLIMIT_NICE(nice)	(19 - nice)
+
+/*
  * Some helpers for converting nanosecond timing to jiffy resolution
  */
 #define NS_TO_JIFFIES(TIME)	((TIME) / (1000000000 / HZ))
@@ -3140,12 +3146,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3155,6 +3157,12 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if (increment < 0 && 
+		NICE_TO_RLIMIT_NICE(nice) >
+		current->signal->rlim[RLIMIT_NICE].rlim_cur &&
+		!capable(CAP_SYS_NICE))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3188,6 +3196,15 @@ int task_nice(const task_t *p)
 }
 
 /**
+ * nice_to_rlimit_nice - return rlimit_nice priority of give nice value
+ * @nice: nice value
+ */
+unsigned long nice_to_rlimit_nice(const int nice)
+{
+	return NICE_TO_RLIMIT_NICE(nice);
+}
+
+/**
  * idle_cpu - is a given cpu idle currently?
  * @cpu: the processor in question.
  */
@@ -3252,6 +3269,7 @@ recheck:
 		return -EINVAL;
 
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur && 
 	    !capable(CAP_SYS_NICE))
 		return -EPERM;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
===== kernel/sys.c 1.104 vs edited =====
--- 1.104/kernel/sys.c	2005-01-11 16:42:35 -08:00
+++ edited/kernel/sys.c	2005-01-14 10:11:13 -08:00
@@ -225,7 +225,10 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if (niceval < task_nice(p) &&
+		nice_to_rlimit_nice(niceval) >
+		p->signal->rlim[RLIMIT_NICE].rlim_cur &&
+		!capable(CAP_SYS_NICE)) {
 		error = -EACCES;
 		goto out;
 	}


-----
And the patch for pam.

--- Linux-PAM-0.77/modules/pam_limits/pam_limits.c.prio	2005-01-14 10:47:03.000000000 -0800
+++ Linux-PAM-0.77/modules/pam_limits/pam_limits.c	2005-01-14 10:55:13.000000000 -0800
@@ -39,6 +39,11 @@
 #include <grp.h>
 #include <pwd.h>
 
+/* Hack to test new rlimit values */
+#define RLIMIT_NICE	13
+#define RLIMIT_RTPRIO	14
+#define RLIM_NLIMITS	15
+
 /* Module defines */
 #define LINE_LENGTH 1024
 
@@ -293,6 +298,10 @@ static void process_limit(int source, co
     else if (strcmp(lim_item, "locks") == 0)
 	limit_item = RLIMIT_LOCKS;
 #endif
+    else if (strcmp(lim_item, "rt_priority") == 0)
+	limit_item = RLIMIT_RTPRIO;
+    else if (strcmp(lim_item, "nice") == 0)
+	limit_item = RLIMIT_NICE;
     else if (strcmp(lim_item, "maxlogins") == 0) {
 	limit_item = LIMIT_LOGIN;
 	pl->flag_numsyslogins = 0;
@@ -360,6 +369,19 @@ static void process_limit(int source, co
         case RLIMIT_AS:
             limit_value *= 1024;
             break;
+        case RLIMIT_NICE:
+            limit_value = 19 - limit_value;
+            if (limit_value > 39)
+		limit_value = 39;
+	    if (limit_value < 0);
+		limit_value = 0;
+            break;
+        case RLIMIT_RTPRIO:
+            if (limit_value > 99)
+		limit_value = 99;
+	    if (limit_value < 0);
+		limit_value = 0;
+            break;
     }
 
     if ( (limit_item != LIMIT_LOGIN)

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13 19:17                                                                       ` Jack O'Quin
@ 2005-01-14 20:52                                                                         ` Lee Revell
  2005-01-15  0:42                                                                           ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-01-14 20:52 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Matt Mackall, Ingo Molnar, Chris Wright, Paul Davis,
	Christoph Hellwig, Con Kolivas, Andrew Morton, arjanv, alan,
	linux-kernel

On Thu, 2005-01-13 at 13:17 -0600, Jack O'Quin wrote:
> But there may be other, better solutions to the deadlock problem.
> Several years ago, Roger Larsson wrote a completely user-space
> realtime monitor program that works perfectly well for revoking
> realtime privileges when it detects CPU starvation.  I still use it
> occasionally to help debug problems if the built-in JACK watchdog
> timer doesn't catch them.

Jack,

Do you have a link to Roger Larsson's RT watchdog?

Since we seem to have a consensus that the rlimit approach is the way to
go, I think it will be important to have a generic watchdog thread
running as root at a higher RT prio than the RT group.  JACK solves the
problem with its own watchdog thread but as more and more apps migrate
to (in our opinion) the "correct" RT programming model, where you have
multithreaded apps with normal prio disk and GUI threads feeding an RT
rendering thread, a system wide watchdog daemon becomes more attractive.
Keep in mind there are many other applications than audio, for example
CD burning has an obvious RT constraint and cdrecord will take advantage
of SCHED_FIFO and mlockall() if it can get them.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14 20:10                                                                                                     ` Chris Wright
@ 2005-01-14 20:55                                                                                                       ` Matt Mackall
  2005-01-14 23:04                                                                                                         ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-14 20:55 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andrew Morton, Paul Davis, nickpiggin, lkml, rlrevell, arjanv,
	joq, hch, mingo, alan, linux-kernel, kernel

On Fri, Jan 14, 2005 at 12:10:21PM -0800, Chris Wright wrote:
> * Matt Mackall (mpm@selenic.com) wrote:
> > The closest thing to concensus I've seen yet was a new rlimit for
> > scheduling with code from Chris Wright. The version I last saw had
> > some rough edges on the API (exposing the internal scheduler priority
> > levels) but wasn't too bad in principle. We really ought not get in
> > the habit of adding new rlimits though.
> > 
> > Perhaps he can post whatever he has again, I'm not sure what the
> > current state is.
> 
> This is the latest version, with the idea from Utz to break nice and
> rtprio apart.
> 
> The basic issue on the rlimit value is how to sanely encode nice values,
> realtime prioroties and scheduler policies into a number.  The first
> incarnation was the clumsiest, and tried to pack it all into a number
> in range of [0,139].  This, as many agree, too closely reflects kernel
> internal values.  This one gives 0-39 (nice values 19,-20) to RLIMIT_NICE,
> and 0-99 (rt priorities) to RLIMIT_RTPRIO.  There's no distinction in rt
> policy, and the traditional override (CAP_SYS_NICE) is still in place.
> The defaults for both rlimits are 0, and behaviour should be backwards
> compatible.  I tested this one a bit, and it worked as expected.  I've
> got a patch to pam_limits as well, although it's untested.

This is looking pretty good.

> +#define NICE_TO_RLIMIT_NICE(nice)	(19 - nice)
...
> +unsigned long nice_to_rlimit_nice(const int nice)
> +{
> +	return NICE_TO_RLIMIT_NICE(nice);
> +}

This is a bit silly.

> -	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
> +	if (niceval < task_nice(p) &&
> +		nice_to_rlimit_nice(niceval) >
> +		p->signal->rlim[RLIMIT_NICE].rlim_cur &&
> +		!capable(CAP_SYS_NICE)) {

Perhaps we want another helper function to do the rlim and
CAP_SYS_NICE check together.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14 20:55                                                                                                       ` Matt Mackall
@ 2005-01-14 23:04                                                                                                         ` Chris Wright
  2005-01-15  0:58                                                                                                           ` Matt Mackall
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-01-14 23:04 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Chris Wright, Andrew Morton, Paul Davis, nickpiggin, lkml,
	rlrevell, arjanv, joq, hch, mingo, alan, linux-kernel, kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Fri, Jan 14, 2005 at 12:10:21PM -0800, Chris Wright wrote:
> > The basic issue on the rlimit value is how to sanely encode nice values,
> > realtime prioroties and scheduler policies into a number.  The first
> > incarnation was the clumsiest, and tried to pack it all into a number
> > in range of [0,139].  This, as many agree, too closely reflects kernel
> > internal values.  This one gives 0-39 (nice values 19,-20) to RLIMIT_NICE,
> > and 0-99 (rt priorities) to RLIMIT_RTPRIO.  There's no distinction in rt
> > policy, and the traditional override (CAP_SYS_NICE) is still in place.
> > The defaults for both rlimits are 0, and behaviour should be backwards
> > compatible.  I tested this one a bit, and it worked as expected.  I've
> > got a patch to pam_limits as well, although it's untested.
> 
> This is looking pretty good.
> 
> > +#define NICE_TO_RLIMIT_NICE(nice)	(19 - nice)
> ...
> > +unsigned long nice_to_rlimit_nice(const int nice)
> > +{
> > +	return NICE_TO_RLIMIT_NICE(nice);
> > +}
> 
> This is a bit silly.

Heh, I wondered what comment that would get ;-)  It's gone.

> > -	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
> > +	if (niceval < task_nice(p) &&
> > +		nice_to_rlimit_nice(niceval) >
> > +		p->signal->rlim[RLIMIT_NICE].rlim_cur &&
> > +		!capable(CAP_SYS_NICE)) {
> 
> Perhaps we want another helper function to do the rlim and
> CAP_SYS_NICE check together.

Sure.
-chris
-- 

===== include/asm-i386/resource.h 1.5 vs edited =====
--- 1.5/include/asm-i386/resource.h	2004-08-23 01:15:26 -07:00
+++ edited/include/asm-i386/resource.h	2005-01-14 13:48:53 -08:00
@@ -18,8 +18,11 @@
 #define RLIMIT_LOCKS	10		/* maximum file locks held */
 #define RLIMIT_SIGPENDING 11		/* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12		/* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO	14		/* maximum realtime priority */
 
-#define RLIM_NLIMITS	13
+#define RLIM_NLIMITS	15
 
 
 /*
@@ -45,6 +48,8 @@
 	{ RLIM_INFINITY, RLIM_INFINITY },		\
 	{ MAX_SIGPENDING, MAX_SIGPENDING },		\
 	{ MQ_BYTES_MAX, MQ_BYTES_MAX },			\
+	{            0,	             0 },		\
+	{            0,	             0 },		\
 }
 
 #endif /* __KERNEL__ */
===== include/asm-x86_64/resource.h 1.5 vs edited =====
--- 1.5/include/asm-x86_64/resource.h	2004-08-23 01:15:26 -07:00
+++ edited/include/asm-x86_64/resource.h	2005-01-14 14:17:38 -08:00
@@ -18,8 +18,11 @@
 #define RLIMIT_LOCKS	10		/* maximum file locks held */
 #define RLIMIT_SIGPENDING 11		/* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12		/* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO	14		/* maximum realtime priority */
 
-#define RLIM_NLIMITS	13
+#define RLIM_NLIMITS	15
 
 /*
  * SuS says limits have to be unsigned.
@@ -44,6 +47,8 @@
 	{ RLIM_INFINITY, RLIM_INFINITY },		\
 	{ MAX_SIGPENDING, MAX_SIGPENDING },		\
 	{ MQ_BYTES_MAX, MQ_BYTES_MAX },			\
+	{             0,             0 },		\
+	{             0,             0 },		\
 }
 
 #endif /* __KERNEL__ */
===== include/linux/sched.h 1.291 vs edited =====
--- 1.291/include/linux/sched.h	2005-01-11 16:42:57 -08:00
+++ edited/include/linux/sched.h	2005-01-14 13:58:32 -08:00
@@ -767,6 +767,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern int can_nice(const task_t *p, const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
===== kernel/sched.c 1.407 vs edited =====
--- 1.407/kernel/sched.c	2005-01-11 16:42:35 -08:00
+++ edited/kernel/sched.c	2005-01-14 15:03:44 -08:00
@@ -3121,6 +3121,19 @@ out_unlock:
 
 EXPORT_SYMBOL(set_user_nice);
 
+/**
+ * can_nice - check if a task can reduce its nice value
+   @p: task
+ * @nice: nice value
+ */
+int can_nice(const task_t *p, const int nice)
+{
+	/* convert nice value [19,-20] to rlimit style value [0,39] */
+	int nice_rlim = 19 - nice;
+	return (nice_rlim <= p->signal->rlim[RLIMIT_NICE].rlim_cur || 
+		capable(CAP_SYS_NICE));
+}
+
 #ifdef __ARCH_WANT_SYS_NICE
 
 /*
@@ -3140,12 +3153,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3155,6 +3164,9 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if (increment < 0 && !can_nice(current, nice))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3252,6 +3264,7 @@ recheck:
 		return -EINVAL;
 
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur && 
 	    !capable(CAP_SYS_NICE))
 		return -EPERM;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
===== kernel/sys.c 1.104 vs edited =====
--- 1.104/kernel/sys.c	2005-01-11 16:42:35 -08:00
+++ edited/kernel/sys.c	2005-01-14 14:10:11 -08:00
@@ -225,7 +225,7 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if (niceval < task_nice(p) && !can_nice(p, niceval)) {
 		error = -EACCES;
 		goto out;
 	}


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14 20:52                                                                         ` Lee Revell
@ 2005-01-15  0:42                                                                           ` Jack O'Quin
  2005-01-15  2:19                                                                             ` Randy.Dunlap
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15  0:42 UTC (permalink / raw)
  To: Lee Revell
  Cc: Matt Mackall, Ingo Molnar, Chris Wright, Paul Davis,
	Christoph Hellwig, Con Kolivas, Andrew Morton, arjanv, alan,
	linux-kernel


> On Thu, 2005-01-13 at 13:17 -0600, Jack O'Quin wrote:
>> But there may be other, better solutions to the deadlock problem.
>> Several years ago, Roger Larsson wrote a completely user-space
>> realtime monitor program that works perfectly well for revoking
>> realtime privileges when it detects CPU starvation.  I still use it
>> occasionally to help debug problems if the built-in JACK watchdog
>> timer doesn't catch them.

Lee Revell <rlrevell@joe-job.com> writes:
> Do you have a link to Roger Larsson's RT watchdog?

No official, supported version.  With his permission, I posted a copy
on my home system a year ago for some audio users who had inquired
about it.  That copy is here...

  http://www.joq.us/joq/rt_monitor.tgz
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14 23:04                                                                                                         ` Chris Wright
@ 2005-01-15  0:58                                                                                                           ` Matt Mackall
  0 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-01-15  0:58 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andrew Morton, Paul Davis, nickpiggin, lkml, rlrevell, arjanv,
	joq, hch, mingo, alan, linux-kernel, kernel

On Fri, Jan 14, 2005 at 03:04:18PM -0800, Chris Wright wrote:
> > Perhaps we want another helper function to do the rlim and
> > CAP_SYS_NICE check together.
> 
> Sure.
> -chris

This last version looks good to me. My only concern right now is
increasing the list of rlimits, but I can probably save those for the
next rlimit addition.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-14 17:20                                                                                               ` Mike Galbraith
@ 2005-01-15  1:14                                                                                                 ` Jack O'Quin
  2005-01-15  8:06                                                                                                   ` Mike Galbraith
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15  1:14 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Arjan van de Ven, Lee Revell, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel, Con Kolivas

Mike Galbraith <efault@gmx.de> writes:

> At 05:31 PM 1/13/2005 -0600, Jack O'Quin wrote:
>>Yes.  However, my tests have so far shown a need for "actual FIFO as
>>long as the task behaves itself."
>
> I for one wonder why that appears to be so.  What happens if you use
> SCHED_RR instead of SCHED_FIFO?
>
> (ie is the problem just one of running out of slice at a bad time, or
> is it the dynamic priority adjustment)

I have no quick and easy test for that.  

If it's important, I can modify a version of JACK to use SCHED_RR,
instead.

I very much doubt it would make any difference, since we normally only
run one realtime thread at a time.  Each client taps the next on the
shoulder when it is time for it to run, so there is essentially no
concurrency among them.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15  0:42                                                                           ` Jack O'Quin
@ 2005-01-15  2:19                                                                             ` Randy.Dunlap
  2005-01-15  4:06                                                                               ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Randy.Dunlap @ 2005-01-15  2:19 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Lee Revell, Matt Mackall, Ingo Molnar, Chris Wright, Paul Davis,
	Christoph Hellwig, Con Kolivas, Andrew Morton, arjanv, alan,
	linux-kernel

Jack O'Quin wrote:
>>On Thu, 2005-01-13 at 13:17 -0600, Jack O'Quin wrote:
>>
>>>But there may be other, better solutions to the deadlock problem.
>>>Several years ago, Roger Larsson wrote a completely user-space
>>>realtime monitor program that works perfectly well for revoking
>>>realtime privileges when it detects CPU starvation.  I still use it
>>>occasionally to help debug problems if the built-in JACK watchdog
>>>timer doesn't catch them.
> 
> 
> Lee Revell <rlrevell@joe-job.com> writes:
> 
>>Do you have a link to Roger Larsson's RT watchdog?
> 
> 
> No official, supported version.  With his permission, I posted a copy
> on my home system a year ago for some audio users who had inquired
> about it.  That copy is here...
> 
>   http://www.joq.us/joq/rt_monitor.tgz

Bad URL, not found....

-- 
~Randy

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15  2:19                                                                             ` Randy.Dunlap
@ 2005-01-15  4:06                                                                               ` Jack O'Quin
  0 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15  4:06 UTC (permalink / raw)
  To: Randy.Dunlap
  Cc: Lee Revell, Matt Mackall, Ingo Molnar, Chris Wright, Paul Davis,
	Christoph Hellwig, Con Kolivas, Andrew Morton, arjanv, alan,
	linux-kernel

>> Lee Revell <rlrevell@joe-job.com> writes:
>>>Do you have a link to Roger Larsson's RT watchdog?

> Jack O'Quin wrote:
>> No official, supported version.  With his permission, I posted a copy
>> on my home system a year ago for some audio users who had inquired
>> about it.  That copy is here...
>>   http://www.joq.us/joq/rt_monitor.tgz

"Randy.Dunlap" <rddunlap@osdl.org> writes:
> Bad URL, not found....

Sorry, that was a typo...

 http://www.joq.us/jack/rt_monitor.tgz
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-11 21:21                                                     ` Ingo Molnar
  2005-01-12  2:10                                                       ` Jack O'Quin
@ 2005-01-15  4:56                                                       ` Jack O'Quin
  2005-01-15 14:43                                                         ` Ingo Molnar
  1 sibling, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15  4:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


Ingo Molnar <mingo@elte.hu> writes:

> what kind of non-audio workload was there during this test? 43 xruns
> arent nice but arent that bad either.

Audio playback through JACK didn't work well at all with nice --20,
even at relatively high latencies (23 msec cycle).  It was bad enough
that I would not want to use it for anything.  

The 1/2 second max delay was probably more of an issue than the number
of xruns.  Something really bad happened there.

> this will turn off starvation checking, for testing purposes. (to see
> whether there's anything else but anti-starvation causing xruns.)

I build a 2.6.10 kernel with just these two changes...

--- kernel/sched.c~	Fri Dec 24 15:35:24 2004
+++ kernel/sched.c	Wed Jan 12 23:48:49 2005
@@ -95,7 +95,7 @@
 #define MAX_BONUS		(MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)
 #define INTERACTIVE_DELTA	  2
 #define MAX_SLEEP_AVG		(DEF_TIMESLICE * MAX_BONUS)
-#define STARVATION_LIMIT	(MAX_SLEEP_AVG)
+#define STARVATION_LIMIT	0
 #define NS_MAX_SLEEP_AVG	(JIFFIES_TO_NS(MAX_SLEEP_AVG))
 #define CREDIT_LIMIT		100
 
--- kernel/workqueue.c~	Fri Dec 24 15:35:40 2004
+++ kernel/workqueue.c	Fri Jan 14 19:34:10 2005
@@ -188,7 +188,7 @@
 
 	current->flags |= PF_NOFREEZE;
 
-	set_user_nice(current, -10);
+	set_user_nice(current, -5);
 
 	/* Block and flush all signals */
 	sigfillset(&blocked);

Since realtime-lsm was not available, I ran the test as root.  Overall
system performance was not good.  Trying to do mail with xemacs and
gnus (as I had done before) hung for long periods of time.

The test did not work correctly.  A number of segfaults occurred.  The
jackd server hung and had to be killed manually.

So, these results aren't worth much, but here's what it reported
(compared with earlier results)...


                                 With -R       Without -R      Without -R
                               (SCHED_FIFO)    (nice --20) (STARVATION_LIMIT=0)

************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    1)          (    1)       (    2)	      
XRUN Count  . . . . . . . . . :     2               43            46	      
Delay Count (>spare time) . . :     0                0             0	      
Delay Count (>1000 usecs) . . :     0                0             0	      
Delay Maximum . . . . . . . . :  3130 usecs   501374 usecs       0 usecs 
Cycle Maximum . . . . . . . . :   960 usecs     1036 usecs       0 usecs 
Average DSP Load. . . . . . . :    34.3 %         34.3 %        19.5 %     

The "{Delay|Cycle} Maximum" values were apparently not reported
because jackd hung.  I suspect the DSP load went down because many of
the clients crashed.

I ran it again with -R on this kernel, just to check.  The DSP load
was back to 33.2%.  It performed as before, except the "{Delay|Cycle}
Maximum" values were also reported as zero.  Not sure why, don't have
time to debug it right now.  Running with -R did not impact other
interactive processes as badly as nice --20 on this kernel.

If you want, I can dig into this some more and try to figure out what
went wrong.  Did I make the exact changes you wanted?

If it's not interesting, I probably won't bother.
--
 joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15  1:14                                                                                                 ` Jack O'Quin
@ 2005-01-15  8:06                                                                                                   ` Mike Galbraith
  2005-01-15 23:48                                                                                                     ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Mike Galbraith @ 2005-01-15  8:06 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Arjan van de Ven, Lee Revell, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel, Con Kolivas

At 07:14 PM 1/14/2005 -0600, Jack O'Quin wrote:
>Mike Galbraith <efault@gmx.de> writes:
>
> > At 05:31 PM 1/13/2005 -0600, Jack O'Quin wrote:
> >>Yes.  However, my tests have so far shown a need for "actual FIFO as
> >>long as the task behaves itself."
> >
> > I for one wonder why that appears to be so.  What happens if you use
> > SCHED_RR instead of SCHED_FIFO?
> >
> > (ie is the problem just one of running out of slice at a bad time, or
> > is it the dynamic priority adjustment)
>
>I have no quick and easy test for that.
>
>If it's important, I can modify a version of JACK to use SCHED_RR,
>instead.

I think the problem you're seeing is strange enough to consider trying the 
(possibly odd sounding) test.  I haven't seen an explanation of why nice 
-20 doesn't work for you.

>I very much doubt it would make any difference, since we normally only
>run one realtime thread at a time.  Each client taps the next on the
>shoulder when it is time for it to run, so there is essentially no
>concurrency among them.

It may not make any difference.  Seeing that would at least be an 
additional datapoint.  The only significant difference I see between a 
gaggle of SCHED_FIFO tasks and one of nice -20 tasks, who are alone in 
their top-of-the-heap queue, and who are not cpu hogs, is the timeslice.  I 
don't recall there being any wakeup/preempt logic differences, ergo the 
SCHED_RR suggestion.

         -Mike 


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-13  5:44                                                                   ` Jack O'Quin
  2005-01-13  6:34                                                                     ` Matt Mackall
@ 2005-01-15 13:49                                                                     ` Ingo Molnar
  2005-01-15 23:02                                                                       ` Jack O'Quin
  1 sibling, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-15 13:49 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> OK, I reran with just 5 processes reniced from -10 to -5.  On my
> system they were: events, khelper, kblockd, aio and reiserfs.  In
> addition, I reniced loop0 from -20 to -5.

> One major problem: this `nice --20' hack affects every thread, not
> just the critical realtime ones.  That's not what we want.  Audio
> applications make very conscious choices which threads run with high
> priority and which do not.

how much did this problem affect your test? Could the source of the 500
msec delays be the non-highprio components of the test that somehow
became nice --20?

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15  4:56                                                       ` Jack O'Quin
@ 2005-01-15 14:43                                                         ` Ingo Molnar
  2005-01-15 23:10                                                           ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-15 14:43 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> --- kernel/sched.c~	Fri Dec 24 15:35:24 2004
> +++ kernel/sched.c	Wed Jan 12 23:48:49 2005
> @@ -95,7 +95,7 @@
>  #define MAX_BONUS		(MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)
>  #define INTERACTIVE_DELTA	  2
>  #define MAX_SLEEP_AVG		(DEF_TIMESLICE * MAX_BONUS)
> -#define STARVATION_LIMIT	(MAX_SLEEP_AVG)
> +#define STARVATION_LIMIT	0
>  #define NS_MAX_SLEEP_AVG	(JIFFIES_TO_NS(MAX_SLEEP_AVG))
>  #define CREDIT_LIMIT		100

could you try the patch below? The above patch wasnt enough. With the
patch below we turn off the starvation limits for nice --20 tasks only. 
This is still a hack only. If we cannot make nice --20 perform like
RT-prio-1 then there's some problem with SCHED_OTHER scheduling.

	Ingo

--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -2245,10 +2245,10 @@ EXPORT_PER_CPU_SYMBOL(kstat);
  * if a better static_prio task has expired:
  */
 #define EXPIRED_STARVING(rq) \
-	((STARVATION_LIMIT && ((rq)->expired_timestamp && \
+	((task_nice(current) > -20) && ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
 		(jiffies - (rq)->expired_timestamp >= \
 			STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
-			((rq)->curr->static_prio > (rq)->best_expired_prio))
+			((rq)->curr->static_prio > (rq)->best_expired_prio)))
 
 /*
  * Do the virtual cpu time signal calculations.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15 13:49                                                                     ` Ingo Molnar
@ 2005-01-15 23:02                                                                       ` Jack O'Quin
  2005-01-15 23:38                                                                         ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15 23:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Jack O'Quin <joq@io.com> wrote:
>
>> OK, I reran with just 5 processes reniced from -10 to -5.  On my
>> system they were: events, khelper, kblockd, aio and reiserfs.  In
>> addition, I reniced loop0 from -20 to -5.
>
>> One major problem: this `nice --20' hack affects every thread, not
>> just the critical realtime ones.  That's not what we want.  Audio
>> applications make very conscious choices which threads run with high
>> priority and which do not.
>
> how much did this problem affect your test? Could the source of the 500
> msec delays be the non-highprio components of the test that somehow
> became nice --20?

Some interference is definitely possible.  But, the test does not
involve any graphical interface, so I'd expect that to be small.
Looking at jack_test3_client.cpp, the main thread just does a sleep()
while the process cycle is running.

Still, it's hard to be sure.  

Probably, the best way to tell would be patching JACK so it uses
nice(-20) instead of pthread_setschedparam() for the realtime threads.
As a hack, that looks easy.  I'll build a working directory with just
that change, so we can experiment with it better.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15 14:43                                                         ` Ingo Molnar
@ 2005-01-15 23:10                                                           ` Jack O'Quin
  2005-01-16  1:48                                                             ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15 23:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


> * Jack O'Quin <joq@io.com> wrote:
>
>> --- kernel/sched.c~	Fri Dec 24 15:35:24 2004
>> +++ kernel/sched.c	Wed Jan 12 23:48:49 2005
>> @@ -95,7 +95,7 @@
>>  #define MAX_BONUS		(MAX_USER_PRIO * PRIO_BONUS_RATIO / 100)
>>  #define INTERACTIVE_DELTA	  2
>>  #define MAX_SLEEP_AVG		(DEF_TIMESLICE * MAX_BONUS)
>> -#define STARVATION_LIMIT	(MAX_SLEEP_AVG)
>> +#define STARVATION_LIMIT	0
>>  #define NS_MAX_SLEEP_AVG	(JIFFIES_TO_NS(MAX_SLEEP_AVG))
>>  #define CREDIT_LIMIT		100

Ingo Molnar <mingo@elte.hu> writes:
> could you try the patch below? The above patch wasnt enough. With the
> patch below we turn off the starvation limits for nice --20 tasks only. 
> This is still a hack only. If we cannot make nice --20 perform like
> RT-prio-1 then there's some problem with SCHED_OTHER scheduling.

I am building again with your new patch and with STARVATION_LIMIT
defined as (MAX_SLEEP_AVG) again.  I'll run that with a modified JACK
to reduce the interference of all those other non-realtime threads.

Will let you know what happens.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15 23:02                                                                       ` Jack O'Quin
@ 2005-01-15 23:38                                                                         ` Jack O'Quin
  2005-01-16 23:13                                                                           ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15 23:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel



>> * Jack O'Quin <joq@io.com> wrote:
>>> One major problem: this `nice --20' hack affects every thread, not
>>> just the critical realtime ones.  That's not what we want.  Audio
>>> applications make very conscious choices which threads run with high
>>> priority and which do not.

> Ingo Molnar <mingo@elte.hu> writes:
>> how much did this problem affect your test? Could the source of the 500
>> msec delays be the non-highprio components of the test that somehow
>> became nice --20?

Jack O'Quin <joq@io.com> writes:
> Probably, the best way to tell would be patching JACK so it uses
> nice(-20) instead of pthread_setschedparam() for the realtime threads.
> As a hack, that looks easy.  I'll build a working directory with just
> that change, so we can experiment with it better.

Bah!  Nothing is ever as easy as it looks.

According to the manpage, nice(2) is per-process not per-thread.  That
does not give the granularity we need.  

Is that correct?  If so, I can't think of any way to make this work.
Suggestions?

We need to allow both realtime and non-realtime threads in the same
process.  Anything less would require an enormous rewrite for most
audio programs, an unreasonable thing to ask.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15  8:06                                                                                                   ` Mike Galbraith
@ 2005-01-15 23:48                                                                                                     ` Jack O'Quin
  0 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-15 23:48 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Arjan van de Ven, Lee Revell, Chris Wright, Paul Davis,
	Matt Mackall, Christoph Hellwig, Andrew Morton, mingo, alan,
	linux-kernel, Con Kolivas

Mike Galbraith <efault@gmx.de> writes:

> At 07:14 PM 1/14/2005 -0600, Jack O'Quin wrote:
>>Mike Galbraith <efault@gmx.de> writes:
>>
>> > At 05:31 PM 1/13/2005 -0600, Jack O'Quin wrote:
>> >>Yes.  However, my tests have so far shown a need for "actual FIFO as
>> >>long as the task behaves itself."
>> >
>> > I for one wonder why that appears to be so.  What happens if you use
>> > SCHED_RR instead of SCHED_FIFO?
>> >
>> > (ie is the problem just one of running out of slice at a bad time, or
>> > is it the dynamic priority adjustment)
>>
>>I have no quick and easy test for that.
>>
>>If it's important, I can modify a version of JACK to use SCHED_RR,
>>instead.
>
> I think the problem you're seeing is strange enough to consider trying
> the (possibly odd sounding) test.  I haven't seen an explanation of
> why nice -20 doesn't work for you.

The simplest explanation that makes any sense to me is that the
non-realtime threads are interfering with the realtime ones.  These
threads don't do much in this test, although they would in a real
audio application.  Still, there are enough things going on before and
after the sleep() in the main thread to possibly generate the number
of xruns we're seeing.

This is why I don't think nice is an appropriate solution for the
problem we're trying to solve.  It's too blunt an instrument for audio
work.

>>I very much doubt it would make any difference, since we normally only
>>run one realtime thread at a time.  Each client taps the next on the
>>shoulder when it is time for it to run, so there is essentially no
>>concurrency among them.
>
> It may not make any difference.  Seeing that would at least be an
> additional datapoint.  The only significant difference I see between a
> gaggle of SCHED_FIFO tasks and one of nice -20 tasks, who are alone in
> their top-of-the-heap queue, and who are not cpu hogs, is the
> timeslice.  I don't recall there being any wakeup/preempt logic
> differences, ergo the SCHED_RR suggestion.

I think you're missing the fact that SCHED_FIFO is per-thread while
nice() is per-process.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15 23:10                                                           ` Jack O'Quin
@ 2005-01-16  1:48                                                             ` Jack O'Quin
  2005-01-16  4:30                                                               ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-16  1:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


> Ingo Molnar <mingo@elte.hu> writes:
>> could you try the patch below? The above patch wasnt enough. With the
>> patch below we turn off the starvation limits for nice --20 tasks only. 
>> This is still a hack only. If we cannot make nice --20 perform like
>> RT-prio-1 then there's some problem with SCHED_OTHER scheduling.

I made your suggested sched.c change.  It works much better.  I was
not able to modify JACK (for reasons explained in an earlier message).
So, this test has the same interference problems with non-realtime
threads.  Of course, I don't know for sure that they are the source of
our xruns and long delays, but that's my suspicion.

I didn't have the same problems with normal processes being
unresponsive (compared to the previous sched.c patch).  The test ran
normally, with about the same results we saw before for the nice --20
experiments.

*** Terminated Sat Jan 15 18:15:13 CST 2005 ***
************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    1)
XRUN Count  . . . . . . . . . :    47
Delay Count (>spare time) . . :     0
Delay Count (>1000 usecs) . . :     0
Delay Maximum . . . . . . . . : 500544   usecs
Cycle Maximum . . . . . . . . :  1086   usecs
Average DSP Load. . . . . . . :    36.1 %
Average CPU System Load . . . :     8.2 %
Average CPU User Load . . . . :    26.3 %
Average CPU Nice Load . . . . :     0.0 %
Average CPU I/O Wait Load . . :     0.4 %
Average CPU IRQ Load  . . . . :     0.7 %
Average CPU Soft-IRQ Load . . :     0.0 %
Average Interrupt Rate  . . . :  1703.3 /sec
Average Context-Switch Rate . : 11600.6 /sec
*********************************************

I think this means the starvation test was not the problem.  So far,
I've seen no proof that there is any problem with the 2.6.10
scheduler, just some evidence that nice --20 does not work for
multi-threaded realtime audio.

If someone can suggest a way to run certain threads of a process with
a different nice value than the others, I can probably hack that into
JACK in some crude way.  That should tell us whether my intuition is
right about the source of scheduling interference.  

Otherwise, I'm out of ideas at the moment.  I don't think SCHED_RR
will be any different from SCHED_FIFO in this test.  Even if it were,
I'm not sure what that would prove.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-16  1:48                                                             ` Jack O'Quin
@ 2005-01-16  4:30                                                               ` Jack O'Quin
  2005-01-16 23:22                                                                 ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-16  4:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel

Jack O'Quin <joq@io.com> writes:

> *** Terminated Sat Jan 15 18:15:13 CST 2005 ***
> ************* SUMMARY RESULT ****************
> Total seconds ran . . . . . . :   300
> Number of clients . . . . . . :    20
> Ports per client  . . . . . . :     4
> Frames per buffer . . . . . . :    64
> *********************************************
> Timeout Count . . . . . . . . :(    1)
> XRUN Count  . . . . . . . . . :    47
> Delay Count (>spare time) . . :     0
> Delay Count (>1000 usecs) . . :     0
> Delay Maximum . . . . . . . . : 500544   usecs
> Cycle Maximum . . . . . . . . :  1086   usecs
> Average DSP Load. . . . . . . :    36.1 %
> Average CPU System Load . . . :     8.2 %
> Average CPU User Load . . . . :    26.3 %
> Average CPU Nice Load . . . . :     0.0 %
> Average CPU I/O Wait Load . . :     0.4 %
> Average CPU IRQ Load  . . . . :     0.7 %
> Average CPU Soft-IRQ Load . . :     0.0 %
> Average Interrupt Rate  . . . :  1703.3 /sec
> Average Context-Switch Rate . : 11600.6 /sec
> *********************************************
>
> I think this means the starvation test was not the problem.  So far,
> I've seen no proof that there is any problem with the 2.6.10
> scheduler, just some evidence that nice --20 does not work for
> multi-threaded realtime audio.
>
> If someone can suggest a way to run certain threads of a process with
> a different nice value than the others, I can probably hack that into
> JACK in some crude way.  That should tell us whether my intuition is
> right about the source of scheduling interference.  
>
> Otherwise, I'm out of ideas at the moment.  I don't think SCHED_RR
> will be any different from SCHED_FIFO in this test.  Even if it were,
> I'm not sure what that would prove.

Studying the test script, I discovered that it starts a separate
program running in the background.  So, I hacked the script to run it
with nice -15 in order not to interfere with the realtime threads.
The XRUNS didn't get much better, but the maximum delay went way down,
from 1/2 sec to a much more believable (but still too high) 32.5 msec.
I ran this with the same patched scheduler.

*** Terminated Sat Jan 15 21:22:00 CST 2005 ***
************* SUMMARY RESULT ****************
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    20
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
*********************************************
Timeout Count . . . . . . . . :(    0)
XRUN Count  . . . . . . . . . :    43
Delay Count (>spare time) . . :     0
Delay Count (>1000 usecs) . . :     0
Delay Maximum . . . . . . . . : 32518   usecs
Cycle Maximum . . . . . . . . :   820   usecs
Average DSP Load. . . . . . . :    34.9 %
Average CPU System Load . . . :     8.5 %
Average CPU User Load . . . . :    23.8 %
Average CPU Nice Load . . . . :     0.0 %
Average CPU I/O Wait Load . . :     0.0 %
Average CPU IRQ Load  . . . . :     0.7 %
Average CPU Soft-IRQ Load . . :     0.0 %
Average Interrupt Rate  . . . :  1688.5 /sec
Average Context-Switch Rate . : 11704.9 /sec
*********************************************

This supports my intuition that lack of per-thread granularity is the
main problem.  Where I was able to isolate some non-realtime code and
run it at lower priority, it helped quite a bit.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-15 23:38                                                                         ` Jack O'Quin
@ 2005-01-16 23:13                                                                           ` Ingo Molnar
  2005-01-16 23:57                                                                             ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-16 23:13 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> According to the manpage, nice(2) is per-process not per-thread.  That
> does not give the granularity we need. 

the manpage is incorrect - sys_nice() is per-thread. (Btw., you could
use setpriority() too.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-16  4:30                                                               ` Jack O'Quin
@ 2005-01-16 23:22                                                                 ` Ingo Molnar
  0 siblings, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-16 23:22 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Christoph Hellwig, Andrew Morton, Lee Revell, paul,
	arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> Studying the test script, I discovered that it starts a separate
> program running in the background.  So, I hacked the script to run it
> with nice -15 in order not to interfere with the realtime threads. The
> XRUNS didn't get much better, but the maximum delay went way down,
> from 1/2 sec to a much more believable (but still too high) 32.5 msec.
> I ran this with the same patched scheduler.

> This supports my intuition that lack of per-thread granularity is the
> main problem.  Where I was able to isolate some non-realtime code and
> run it at lower priority, it helped quite a bit.

ok, makes perfect sense. My suggestion for the next step would be to try
nice() or setpriority() to do priority isolation.

If that experiment works out fine (i.e. the xrun count is comparable to
the SCHED_FIFO case) then it would also be nice to do a nice --19 run
(under the hacked kernel), which is a priority level that doesnt have
starvation turned off in the patched kernel but is otherwise very close
in behavior to nice --20.

i.e. as an end result we'd have the following 3 priority setups
compared: SCHED_FIFO:RT-prio-1, SCHED_NORMAL:nice--20,
SCHED_NORMAL:nice--19. The (ideal) goal would be for them to have
near-identical audio-latency performance.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-16 23:13                                                                           ` Ingo Molnar
@ 2005-01-16 23:57                                                                             ` Jack O'Quin
  2005-01-17  9:17                                                                               ` Sytse Wielinga
  2005-01-17 10:06                                                                               ` Ingo Molnar
  0 siblings, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-16 23:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


> * Jack O'Quin <joq@io.com> wrote:
>> According to the manpage, nice(2) is per-process not per-thread.  That
>> does not give the granularity we need. 

Ingo Molnar <mingo@elte.hu> writes:
> the manpage is incorrect - sys_nice() is per-thread. (Btw., you could
> use setpriority() too.)

OK.  Where is this stuff documented?

BTW, I think this violates POSIX, which states...

  The nice value set with nice() shall be applied to the process. If
  the process is multi-threaded, the nice value shall affect all
  system scope threads in the process.

(It does not affect SCHED_FIFO or SCHED_RR threads, however.)

Is it possible to call sched_setscheduler() with a thread ID instead
of a pid?  That's what I really need.  JACK sets and resets the thread
priorities from a different thread.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-16 23:57                                                                             ` Jack O'Quin
@ 2005-01-17  9:17                                                                               ` Sytse Wielinga
  2005-01-17 14:36                                                                                 ` Ingo Molnar
  2005-01-17 10:06                                                                               ` Ingo Molnar
  1 sibling, 1 reply; 266+ messages in thread
From: Sytse Wielinga @ 2005-01-17  9:17 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Ingo Molnar, Chris Wright, Matt Mackall, Paul Davis,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel

On Sun, Jan 16, 2005 at 05:57:23PM -0600, Jack O'Quin wrote:
> > * Jack O'Quin <joq@io.com> wrote:
> >> According to the manpage, nice(2) is per-process not per-thread.  That
> >> does not give the granularity we need. 
> 
> Ingo Molnar <mingo@elte.hu> writes:
> > the manpage is incorrect - sys_nice() is per-thread. (Btw., you could
> > use setpriority() too.)
> 
> OK.  Where is this stuff documented?
> 
> BTW, I think this violates POSIX, which states...
> 
>   The nice value set with nice() shall be applied to the process. If
>   the process is multi-threaded, the nice value shall affect all
>   system scope threads in the process.

We are talking about two different things here. POSIX is just about API and
has, correct me if I'm wrong, nothing to do with system calls whatsoever. The
manpage nice(2) is about the libc library call nice(), which is per-process,
which it should be according to POSIX. The system call, called sys_nice() in C,
is per-thread. Apparently glibc or some thread library contains some magic to
make the translation.

Sytse

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-16 23:57                                                                             ` Jack O'Quin
  2005-01-17  9:17                                                                               ` Sytse Wielinga
@ 2005-01-17 10:06                                                                               ` Ingo Molnar
  2005-01-18  5:02                                                                                 ` Jack O'Quin
  1 sibling, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-17 10:06 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> Is it possible to call sched_setscheduler() with a thread ID instead
> of a pid?  That's what I really need.  JACK sets and resets the thread
> priorities from a different thread.

yes. The PID arguments in these APIs are all treated as 'TIDs'. One day
the APIs themselves might switch over to what POSIX specifies, and there
will be new, thread-specific APIs - but at the moment they are all
thread-granular.

(If then this switchover will happen in a controlled manner via glibc,
not via the kernel. I.e. kernel will introduce new syscalls to do the
per-process priority changing, then newest glibc will utilize it - i.e.
already existing apps will stay compatible.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-17  9:17                                                                               ` Sytse Wielinga
@ 2005-01-17 14:36                                                                                 ` Ingo Molnar
  0 siblings, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-17 14:36 UTC (permalink / raw)
  To: Sytse Wielinga
  Cc: Jack O'Quin, Chris Wright, Matt Mackall, Paul Davis,
	Christoph Hellwig, Andrew Morton, Lee Revell, arjanv, alan,
	linux-kernel


* Sytse Wielinga <s.b.wielinga@student.utwente.nl> wrote:

> We are talking about two different things here. POSIX is just about
> API and has, correct me if I'm wrong, nothing to do with system calls
> whatsoever. The manpage nice(2) is about the libc library call nice(),
> which is per-process, which it should be according to POSIX. The
> system call, called sys_nice() in C, is per-thread. Apparently glibc
> or some thread library contains some magic to make the translation.

AFAIK there's no such translation at the glibc level - i.e. you'll get
per-thread semantics. (glibc really needs kernel help to do the
per-process things cleanly.) Anyway, this hasnt been a big issue in the
past, and especially for the current testing purpose this behavior is
what we need right now.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-17 10:06                                                                               ` Ingo Molnar
@ 2005-01-18  5:02                                                                                 ` Jack O'Quin
  2005-01-18  8:02                                                                                   ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-18  5:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1560 bytes --]

Ingo Molnar <mingo@elte.hu> writes:

> * Jack O'Quin <joq@io.com> wrote:
>
>> Is it possible to call sched_setscheduler() with a thread ID instead
>> of a pid?  That's what I really need.  JACK sets and resets the thread
>> priorities from a different thread.
>
> yes. The PID arguments in these APIs are all treated as 'TIDs'. One day
> the APIs themselves might switch over to what POSIX specifies, and there
> will be new, thread-specific APIs - but at the moment they are all
> thread-granular.

In the absence of any documentation, I'm guessing about storing the
nice value in the priority field of the sched_param struct.  But, I
have not been able to figure out how to make that work.

While intializing the "realtime" thread, I modified JACK to do this
instead of setting SCHED_FIFO and the desired RT priority...

	policy = SCHED_OTHER;
	param.sched_priority = -20;

Is that even a reasonable guess?  It doesn't work.  

All the relevant pthread_xxx() services seem to return EINVAL given
these values.  When I change to use sched_setscheduler() instead of
pthread_setschedparam(), I get ESRCH.  Is the thread_t returned from
pthread_create() different from the thread ID used by the kernel?  If
so, how do I obtain the right value?

Is this stuff documented somewhere?  How is anyone expected to use it?

I'm attaching the relevant JACK thread.c source file, so you all can
appreciate how much "fun" it is trying to do realtime programming
under Linux.  BTW, the #ifdef JACK_USE_MACH_THREADS parts are the Mac
OS X version.  Much cleaner, isn't it?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: JACK threading interfaces --]
[-- Type: text/x-csrc, Size: 7782 bytes --]

/*
  Copyright (C) 2004 Paul Davis
  
  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU Lesser General Public License as published by
  the Free Software Foundation; either version 2.1 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU Lesser General Public License for more details.

  You should have received a copy of the GNU Lesser General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

  Thread creation function including workarounds for real-time scheduling
  behaviour on different glibc versions.

  $Id: thread.c,v 1.6 2004/09/15 14:59:06 joq Exp $

*/

#include <config.h>

#include <jack/thread.h>
#include <jack/internal.h>

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

#ifdef JACK_USE_MACH_THREADS
#include <sysdeps/pThreadUtilities.h>
#endif

static inline void
log_result (char *msg, int res)
{
	char outbuf[500];
	snprintf(outbuf, sizeof(outbuf),
		 "jack_create_thread: error %d %s: %s",
		 res, msg, strerror(res));
	jack_error(outbuf);
}

int
jack_create_thread (pthread_t* thread,
		    int priority,
		    int realtime,
		    void*(*start_routine)(void*),
		    void* arg)
{
#ifndef JACK_USE_MACH_THREADS
	pthread_attr_t attr;
	int policy;
	struct sched_param param;
	int actual_policy;
	struct sched_param actual_param;
#endif /* !JACK_USE_MACH_THREADS */

	int result = 0;

	if (!realtime) {
		result = pthread_create (thread, 0, start_routine, arg);
		if (result) {
			log_result("creating thread with default parameters",
				   result);
		}
		return result;
	}

	/* realtime thread. this disgusting mess is a reflection of
	 * the 2nd-class nature of RT programming under POSIX in
	 * general and Linux in particular.
	 */

#ifndef JACK_USE_MACH_THREADS

	pthread_attr_init(&attr);
	policy = SCHED_FIFO;
	param.sched_priority = priority;
	result = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
	if (result) {
		log_result("requesting explicit scheduling", result);
		return result;
	}
	result = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
	if (result) {
		log_result("requesting joinable thread creation", result);
		return result;
	}
	result = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
	if (result) {
		log_result("requesting system scheduling scope", result);
		return result;
	}
	result = pthread_attr_setschedpolicy(&attr, policy);
	if (result) {
		log_result("requesting non-standard scheduling policy", result);
		return result;
	}
	result = pthread_attr_setschedparam(&attr, &param);
	if (result) {
		log_result("requesting thread priority", result);
		return result;
	}
	
	/* with respect to getting scheduling class+priority set up
	   correctly, there are three possibilities here: 

	   a) the call sets them and returns zero
	      ===================================

	      this is great, obviously.

	   b) the call fails to set them and returns an error code
	      ====================================================

  	      this could happen for a number of reasons,
	      but the important one is that we do not have the
	      priviledges required to create a realtime
	      thread. this could be correct, or it could be
	      bogus: there is at least one version of glibc
	      that checks only for UID in
	      pthread_attr_setschedpolicy(), and does not
	      check capabilities.
	      
	   c) the call fails to set them and does not return an error code
              ============================================================

	      this last case is caused by a stupid workaround in NPTL 0.60
	      where scheduling parameters are simply ignored, with no indication
	      of an error.
	*/

	result = pthread_create (thread, &attr, start_routine, arg);

	if (result) {

		/* this workaround temporarily switches the
		   current thread to the proper scheduler
		   and priority, using a call that
		   correctly checks for capabilities, then
		   starts the realtime thread so that it
		   can inherit them and finally switches
		   the current thread back to what it was
		   before.
		*/
		
		int current_policy;
		struct sched_param current_param;
		pthread_attr_t inherit_attr;
		
		current_policy = sched_getscheduler (0);
		sched_getparam (0, &current_param);
		
		result = sched_setscheduler (0, policy, &param);
		if (result) {
			log_result("switching current thread to rt for "
				   "inheritance", result);
			return result;
		}
		
		pthread_attr_init (&inherit_attr);
		result = pthread_attr_setscope (&inherit_attr,
						PTHREAD_SCOPE_SYSTEM);
		if (result) {
			log_result("requesting system scheduling scope "
				   "for inheritance", result);
			return result;
		}
		result = pthread_attr_setinheritsched (&inherit_attr,
						       PTHREAD_INHERIT_SCHED);
		if (result) {
			log_result("requesting inheritance of scheduling "
				   "parameters", result);
			return result;
		}
		result = pthread_create (thread, &inherit_attr, start_routine,
					 arg);
		if (result) {
			log_result("creating real-time thread by inheritance",
				   result);
		}
		
		sched_setscheduler (0, current_policy, &current_param);
		
		if (result)
			return result;
	}
	
	/* Still here? Good. Let's see if this worked... */

	result = pthread_getschedparam (*thread, &actual_policy, &actual_param);
	if (result) {
		log_result ("verifying scheduler parameters", result);
		return result;
	}

	if (actual_policy == policy &&
	    actual_param.sched_priority == param.sched_priority) {
		return 0;		/* everything worked OK */
	}

	/* we failed to set the sched class and priority,
	 * even though no error was returned by
	 * pthread_create(). fix this by setting them
	 * explicitly, which as far as is known will
	 * work even when using thread attributes does not.
	 */

	result = pthread_setschedparam (*thread, policy, &param);
	if (result) {
		log_result("setting scheduler parameters after thread "
			   "creation", result);
		return result;
	}

#else /* JACK_USE_MACH_THREADS */

	result = pthread_create (thread, 0, start_routine, arg);
	if (result) {
		log_result ("creating realtime thread", result);
		return result;
	}

	/* time constraint thread */
	setThreadToPriority (*thread, 96, TRUE, 10000000);
	
#endif /* JACK_USE_MACH_THREADS */

	return 0;
}

#if JACK_USE_MACH_THREADS 

int
jack_drop_real_time_scheduling (pthread_t thread)
{
	setThreadToPriority(thread, 31, FALSE, 10000000);
	return 0;       
}

int
jack_acquire_real_time_scheduling (pthread_t thread, int priority)
	//priority is unused
{
	setThreadToPriority(thread, 96, TRUE, 10000000);
	return 0;
}

#else /* !JACK_USE_MACH_THREADS */

int
jack_drop_real_time_scheduling (pthread_t thread)
{
	struct sched_param rtparam;
	int x;
	
	memset (&rtparam, 0, sizeof (rtparam));
	rtparam.sched_priority = 0;
	
	if ((x = pthread_setschedparam (thread, SCHED_OTHER, &rtparam)) != 0) {
		jack_error ("cannot switch to normal scheduling priority(%s)\n",
			    strerror (errno));
		return -1;
	}
        return 0;
}

int
jack_acquire_real_time_scheduling (pthread_t thread, int priority)
{
	struct sched_param rtparam;
	int x;
	
	memset (&rtparam, 0, sizeof (rtparam));
	rtparam.sched_priority = priority;
	
	if ((x = pthread_setschedparam (thread, SCHED_FIFO, &rtparam)) != 0) {
		jack_error ("cannot use real-time scheduling (FIFO/%d) "
			    "(%d: %s)", rtparam.sched_priority, x,
			    strerror (x));
		return -1;
	}
        return 0;
}

#endif /* JACK_USE_MACH_THREADS */

[-- Attachment #3: Type: text/plain, Size: 58 bytes --]


Maybe someone can give me a clue what to do...
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-18  5:02                                                                                 ` Jack O'Quin
@ 2005-01-18  8:02                                                                                   ` Ingo Molnar
  2005-01-18 17:05                                                                                     ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-18  8:02 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> In the absence of any documentation, I'm guessing about storing the
> nice value in the priority field of the sched_param struct.  But, I
> have not been able to figure out how to make that work.

the call you need is:

       setpriority(PRIO_PROCESS, tid, -20);

where 'tid' is the TID (pid) of the thread in question. There's no way i
know of to utilize the pthread_t ID to do this, so you'll have to figure
the TID out via gettid() - which needs to happen in the child context -
how hard would it be to attach the TID field to some per-thread Jack
structure? [while the purpose is still a quick hack.]

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-18  8:02                                                                                   ` Ingo Molnar
@ 2005-01-18 17:05                                                                                     ` Jack O'Quin
  2005-01-19  8:24                                                                                       ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-01-18 17:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Jack O'Quin <joq@io.com> wrote:
>
>> In the absence of any documentation, I'm guessing about storing the
>> nice value in the priority field of the sched_param struct.  But, I
>> have not been able to figure out how to make that work.
>
> the call you need is:
>
>        setpriority(PRIO_PROCESS, tid, -20);
>
> where 'tid' is the TID (pid) of the thread in question. There's no way i
> know of to utilize the pthread_t ID to do this, so you'll have to figure
> the TID out via gettid() - which needs to happen in the child context -
> how hard would it be to attach the TID field to some per-thread Jack
> structure? [while the purpose is still a quick hack.]

Adding a tid field is relatively easy.  Fixing the race condition
between setting it in the new thread and using it in the creating
thread is harder, but not impossible.  But, even setting it in the new
thread would create an incompatible interface.  With hundreds of JACK
client applications, binary compatibility is a serious consideration.

Due to the absurd difficulty of successfully creating a realtime
thread under the various incompatible Linux kernels and pthread
libraries, we export jack_create_thread() to applications.  That way,
they can take advantage of our latest fix for the latest NPTL botch
(0.60 was particularly bad).

So, the new thread's start_routine is not necessarily ours.  I suppose
we could provide an internal function to intialize the thread and then
call the requester's start_routine.  But, this is getting to be a
significant time sink.

Eventually, I can probably cobble something together that will
establish whether your current 2.6.10 SCHED_OTHER works with nice -20.
Is that all we're trying to accomplish?  I do think it can be made to
work (on some kernel versions, given appropriate privileges, with
kernel thread priorities adjusted properly, etc.).

But, that does not meet any of my needs.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-18 17:05                                                                                     ` Jack O'Quin
@ 2005-01-19  8:24                                                                                       ` Ingo Molnar
  2005-01-19 14:39                                                                                         ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-19  8:24 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> Adding a tid field is relatively easy.  Fixing the race condition
> between setting it in the new thread and using it in the creating
> thread is harder, but not impossible.  But, even setting it in the new
> thread would create an incompatible interface.  With hundreds of JACK
> client applications, binary compatibility is a serious consideration.

i'm not suggesting that this is the way to go, it's just to test how
nice--20 tasks would perform (on the hacked kernel). We still dont have
this data, because in the other tests you tried, some non-highprio
threads got nice--20 priority as well, which can (and apparently do)
interfere with the highprio threads.

is it possible to call a function from the highprio-threads (and only
from them) themselves, during the setup of those threads? If this is
possible then all you need to add is a nice(-20); function call, which
only affects the current thread. (you dont have to know the TID or PID
and dont have to extend any Jack APIs and structures for this hack.)

('highprio threads' are the ones that normally get SCHED_FIFO priority
with -R, 'lowprio threads' are the other client-side threads, if any.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-19  8:24                                                                                       ` Ingo Molnar
@ 2005-01-19 14:39                                                                                         ` Ingo Molnar
  2005-01-19 17:45                                                                                           ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-01-19 14:39 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Ingo Molnar <mingo@elte.hu> wrote:

> i'm not suggesting that this is the way to go, it's just to test how
> nice--20 tasks would perform (on the hacked kernel). We still dont
> have this data, because in the other tests you tried, some
> non-highprio threads got nice--20 priority as well, which can (and
> apparently do) interfere with the highprio threads.

to make it easier to test, i've written an API hack: with the kernel
patch below setscheduler() will set the task to nice --20 if you use
SCHED_FIFO and sched_priority of 1. I.e. all you need to do is to run
Jack with -R and use an RT priority of 1 - all the highprio threads
should then become nice --20. If you use RT prio 2 (or higher) it should
be SCHED_FIFO again. Just apply the patch to 2.6.11-rc1 (2.6.10 might
work too) and it will work automatically. (the hack also includes the
earlier 'no starvation for nice--20 tasks' hack.)

	Ingo

--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -2245,10 +2245,10 @@ EXPORT_PER_CPU_SYMBOL(kstat);
  * if a better static_prio task has expired:
  */
 #define EXPIRED_STARVING(rq) \
-	((STARVATION_LIMIT && ((rq)->expired_timestamp && \
+	((task_nice(current) > -20) && ((STARVATION_LIMIT && ((rq)->expired_timestamp && \
 		(jiffies - (rq)->expired_timestamp >= \
 			STARVATION_LIMIT * ((rq)->nr_running) + 1))) || \
-			((rq)->curr->static_prio > (rq)->best_expired_prio))
+			((rq)->curr->static_prio > (rq)->best_expired_prio)))
 
 /*
  * Do the virtual cpu time signal calculations.
@@ -3211,6 +3211,12 @@ static inline task_t *find_process_by_pi
 static void __setscheduler(struct task_struct *p, int policy, int prio)
 {
 	BUG_ON(p->array);
+	if (prio == 1 && policy != SCHED_NORMAL) {
+		p->policy = SCHED_NORMAL;
+		p->static_prio = NICE_TO_PRIO(-20);
+		p->prio = p->static_prio;
+		return;
+	}
 	p->policy = policy;
 	p->rt_priority = prio;
 	if (policy != SCHED_NORMAL)

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-19 14:39                                                                                         ` Ingo Molnar
@ 2005-01-19 17:45                                                                                           ` Jack O'Quin
  2005-01-19 18:32                                                                                             ` Matt Mackall
  2005-01-20  8:05                                                                                             ` Ingo Molnar
  0 siblings, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-01-19 17:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> * Ingo Molnar <mingo@elte.hu> wrote:
>
>> i'm not suggesting that this is the way to go, it's just to test how
>> nice--20 tasks would perform (on the hacked kernel). We still dont
>> have this data, because in the other tests you tried, some
>> non-highprio threads got nice--20 priority as well, which can (and
>> apparently do) interfere with the highprio threads.

I could hack the threads that the test actually uses just to get some
numbers.  But, that will break some existing JACK clients.

> ('highprio threads' are the ones that normally get SCHED_FIFO priority
> with -R, 'lowprio threads' are the other client-side threads, if any.)

I usually call them `realtime threads' and `non-realtime threads'.
Means the same thing.  I think of them that way, because any code
running in a realtime thread is severely constrained.  It must be
written *very* carefully, almost like a hardware interrupt handler.

> to make it easier to test, i've written an API hack: with the kernel
> patch below setscheduler() will set the task to nice --20 if you use
> SCHED_FIFO and sched_priority of 1. I.e. all you need to do is to run
> Jack with -R and use an RT priority of 1 - all the highprio threads
> should then become nice --20. If you use RT prio 2 (or higher) it should
> be SCHED_FIFO again. Just apply the patch to 2.6.11-rc1 (2.6.10 might
> work too) and it will work automatically. (the hack also includes the
> earlier 'no starvation for nice--20 tasks' hack.)

Good idea, thanks.

These tests mean a lot more running "real" audio programs.  :-)

> @@ -3211,6 +3211,12 @@ static inline task_t *find_process_by_pi
>  static void __setscheduler(struct task_struct *p, int policy, int prio)
>  {
>  	BUG_ON(p->array);
> +	if (prio == 1 && policy != SCHED_NORMAL) {
> +		p->policy = SCHED_NORMAL;
> +		p->static_prio = NICE_TO_PRIO(-20);
> +		p->prio = p->static_prio;
> +		return;
> +	}
>  	p->policy = policy;
>  	p->rt_priority = prio;
>  	if (policy != SCHED_NORMAL)
>

JACK actually uses three different priorities, the defaults are 9, 10
and 20.  How about if I change this test?

	if (prio <= 20 && policy != SCHED_NORMAL) {

Or, should that be?

	if (prio > 0 && prio <= 20 && policy != SCHED_NORMAL) {
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-19 17:45                                                                                           ` Jack O'Quin
@ 2005-01-19 18:32                                                                                             ` Matt Mackall
  2005-01-20  8:07                                                                                               ` Ingo Molnar
  2005-01-20  8:05                                                                                             ` Ingo Molnar
  1 sibling, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-01-19 18:32 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Ingo Molnar, Chris Wright, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel

> > @@ -3211,6 +3211,12 @@ static inline task_t *find_process_by_pi
> >  static void __setscheduler(struct task_struct *p, int policy, int prio)
> >  {
> >  	BUG_ON(p->array);
> > +	if (prio == 1 && policy != SCHED_NORMAL) {
> > +		p->policy = SCHED_NORMAL;
> > +		p->static_prio = NICE_TO_PRIO(-20);
> > +		p->prio = p->static_prio;
> > +		return;
> > +	}
> >  	p->policy = policy;
> >  	p->rt_priority = prio;
> >  	if (policy != SCHED_NORMAL)
> >
> 
> JACK actually uses three different priorities, the defaults are 9, 10
> and 20.  How about if I change this test?
> 
> 	if (prio <= 20 && policy != SCHED_NORMAL) {
> 
> Or, should that be?
> 
> 	if (prio > 0 && prio <= 20 && policy != SCHED_NORMAL) {

Or you can just drop the 'prio == 1 &&' part for this test. Ingo was
trying to be clever to allow some RT bits, but that's not really
necessary.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-19 17:45                                                                                           ` Jack O'Quin
  2005-01-19 18:32                                                                                             ` Matt Mackall
@ 2005-01-20  8:05                                                                                             ` Ingo Molnar
  1 sibling, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-20  8:05 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Chris Wright, Matt Mackall, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Jack O'Quin <joq@io.com> wrote:

> JACK actually uses three different priorities, the defaults are 9, 10
> and 20.  How about if I change this test?
> 
> 	if (prio <= 20 && policy != SCHED_NORMAL) {

yeah, this is OK. 20 is used for the watchdog thread, right? (so it has
minimal latency impact). What's the difference between prio 9 and 10
threads? You might want to map prio 9 ones to nice--15 and prio 10 ones
to nice--20, if there's a real difference between them. But for the
first test i'd suggest to use nice--20 for both. (to make sure
SCHED_OTHER tasks interfere as rarely as possible.)

> Or, should that be?
> 
> 	if (prio > 0 && prio <= 20 && policy != SCHED_NORMAL) {

'prio' cannot get negative here, so the first test is just as fine.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-19 18:32                                                                                             ` Matt Mackall
@ 2005-01-20  8:07                                                                                               ` Ingo Molnar
  0 siblings, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-01-20  8:07 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Chris Wright, Paul Davis, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, alan, linux-kernel


* Matt Mackall <mpm@selenic.com> wrote:

> > Or, should that be?
> > 
> > 	if (prio > 0 && prio <= 20 && policy != SCHED_NORMAL) {
> 
> Or you can just drop the 'prio == 1 &&' part for this test. Ingo was
> trying to be clever to allow some RT bits, but that's not really
> necessary.

actually, there may be some kernel threads that may run at RT priority
99. But i agree, dropping the test for prio==1 should work just as fine.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:30         ` Jack O'Quin
  2005-03-08  6:33           ` Matt Mackall
@ 2005-03-10 14:01           ` Pavel Machek
  1 sibling, 0 replies; 266+ messages in thread
From: Pavel Machek @ 2005-03-10 14:01 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Andrew Morton, Matt Mackall, paul, cfriesen, chrisw, hch,
	rlrevell, arjanv, mingo, alan, linux-kernel

On Mon 07-03-05 23:30:57, Jack O'Quin wrote:
> Andrew Morton <akpm@osdl.org> writes:
> 
> > Matt Mackall <mpm@selenic.com> wrote:
> >>
> >> I think Chris Wright's last rlimit patch is more sensible and ready to
> >>  go.
> >
> > I must say that I like rlimits - very straightforward, although somewhat
> > awkward to use from userspace due to shortsighted shell design.
> >
> > Does anyone have serious objections to this approach?
> 
> 1. is likely to introduce multiuser system security holes like the one
> created recently when the mlock() rlimits bug was fixed (DoS attacks)

Default is unchanged and you claim your boxes are single-user-a-time,
anyway.

> 2. requires updates to all the shells

No. Just set it during login.

> 3. forces Windows and Mac musicians to learn and understand PAM

While you force them to mess with security modules. I'd say thats and improvement.
And "understanding PAM" in this case means updating two files, adding one
line to each.
 
> 4. is undocumented and has never been tested in any real music studios

So write the docs and test it.

				Pavel
-- 
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms         


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-09  3:44               ` Matt Mackall
@ 2005-03-09  4:04                 ` Jack O'Quin
  0 siblings, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-03-09  4:04 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, paul, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

Matt Mackall <mpm@selenic.com> writes:

> On Tue, Mar 08, 2005 at 09:39:24PM -0600, Jack O'Quin wrote:
>> >> 4. is undocumented and has never been tested in any real music studios
>> >
>> > Well you'll have a bit to test it before it goes to Linus.
>> 
>> Only toy tests will be possible without the required userspace tools.
>
> Chris posted the requisite change to pam_limits as well.

Sure.  

You and Chris and I can find a way to test it.  Those are "toy tests".

The RT-LSM has been used for over a year by hundreds (probably
thousands) of musicians in studios making real music.  That's what I
mean by "real music studios".  We won't be able to do that kind of
testing for the rlimits solution until next year.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-09  3:39             ` Jack O'Quin
@ 2005-03-09  3:44               ` Matt Mackall
  2005-03-09  4:04                 ` Jack O'Quin
  0 siblings, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-03-09  3:44 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Andrew Morton, paul, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

On Tue, Mar 08, 2005 at 09:39:24PM -0600, Jack O'Quin wrote:
> >> 4. is undocumented and has never been tested in any real music studios
> >
> > Well you'll have a bit to test it before it goes to Linus.
> 
> Only toy tests will be possible without the required userspace tools.

Chris posted the requisite change to pam_limits as well.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  6:33           ` Matt Mackall
@ 2005-03-09  3:39             ` Jack O'Quin
  2005-03-09  3:44               ` Matt Mackall
  0 siblings, 1 reply; 266+ messages in thread
From: Jack O'Quin @ 2005-03-09  3:39 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, paul, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel


>> Andrew Morton <akpm@osdl.org> writes:
>> > Does anyone have serious objections to this approach?

> On Mon, Mar 07, 2005 at 11:30:57PM -0600, Jack O'Quin wrote:
>> 1. is likely to introduce multiuser system security holes like the one
>> created recently when the mlock() rlimits bug was fixed (DoS attacks)

Matt Mackall <mpm@selenic.com> writes:
> I wouldn't say "likely". But anything's possible, so I wouldn't rule
> it out entirely.

I wasn't predicting a bug in your code, just pointing to a known PAM
problem.  The lack of good documentation and overly obscure PAM
interfaces cause some (most?) distributions to ship with broken PAM
configurations.  Debian includes pam_limits.so in seven different
/etc/pam.d files, yet their /etc/security/limits.conf is empty.

When the recent mlock() rlimits bug fix was merged, it had the
unintended effect of suddenly granting almost every user unlimited
mlock() privileges.  I suspect something similar will happen for this
new rlimit.  Mounting a DoS attack becomes child's play for anyone.

This is OK for me, but a disaster for shared system admins.  That is
why these kinds of API changes should be avoided in a stable release.

The big advantage of the LSM approach is that we can be confident it
will have no effect on systems that do not load it.  Further, the
sysadmin can easily check that it's not present.  None of that is true
for this rlimits API change.

>> 2. requires updates to all the shells
>
> Requires update to the PAM distro for our purposes. 

That, too.

>> 3. forces Windows and Mac musicians to learn and understand PAM
>
> Or for the distro (ubuntu or whatever) to catch up. The alternative is
> for the user to compile their own kernel module and mess with its
> arcane interface.

No, this LSM is already included in several distributions.

>> 4. is undocumented and has never been tested in any real music studios
>
> Well you'll have a bit to test it before it goes to Linus.

Only toy tests will be possible without the required userspace tools.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08 21:34                   ` Lee Revell
@ 2005-03-08 23:55                     ` James Morris
  0 siblings, 0 replies; 266+ messages in thread
From: James Morris @ 2005-03-08 23:55 UTC (permalink / raw)
  To: Lee Revell
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, paul, mpm, joq,
	cfriesen, Chris Wright, Arjan van de Ven, Alan Cox, linux-kernel,
	Stephen Smalley

On Tue, 8 Mar 2005, Lee Revell wrote:

> I am still confused about why the LSM framework was merged in the first
> place.

The purpose of LSM is to allow different security models to be
implemented.  IMHO, a security model here meaning a complete or otherwise
significantly enhancing system-wide framework, such as SELinux.

I don't think LSM is a suitable framework for upstream merging of trivial
or experimental access control enhancements.  They should either be made
part of the core kernel under LSM control or incorporated directly into an
existing LSM.

One of the reasons I would put forward for this is that it can be
dangerous to allow the user to arbitrarily compose security modules.

Also, from an architectural point of view, it's better to think about
security models at a high level with broadly defined components (e.g.  
"DAC" and "MAC"), not as a collection of miscellaneous features.

In the case of this code, I would suggest integrating it into the core
kernel, and providing an LSM hook to allow other LSMs to mediate it.

As an example, see the vm_enough_memory hook.

- James
-- 
James Morris
<jmorris@redhat.com>



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08 21:20                 ` Christoph Hellwig
@ 2005-03-08 21:34                   ` Lee Revell
  2005-03-08 23:55                     ` James Morris
  0 siblings, 1 reply; 266+ messages in thread
From: Lee Revell @ 2005-03-08 21:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, paul, mpm, joq, cfriesen,
	Chris Wright, arjanv, alan, linux-kernel

On Tue, 2005-03-08 at 21:20 +0000, Christoph Hellwig wrote:
> On Tue, Mar 08, 2005 at 01:55:55PM -0500, Lee Revell wrote:
> > And as I mentioned a few times, the authors have neither the inclination
> > nor the ability to do that, because they are not kernel hackers.  The
> > realtime LSM was written by users (not developers) of the kernel, to
> > solve a specific real world problem.  No one ever claimed it was the
> > correct solution from the kernel POV.
> 
> And I told you that doesn't matter.  If someone wants a feature in they
> should find a way to make it palable.  We're not accepting such excuses
> to put in crap.
> 

Fine.  Consider it a proof of concept.  I'm satisfied if any solution
gets merged, it doesn't have to be this one.

I am still confused about why the LSM framework was merged in the first
place.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08 18:55               ` Lee Revell
  2005-03-08 19:11                 ` Paul Davis
@ 2005-03-08 21:20                 ` Christoph Hellwig
  2005-03-08 21:34                   ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-03-08 21:20 UTC (permalink / raw)
  To: Lee Revell
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, paul, mpm, joq,
	cfriesen, Chris Wright, arjanv, alan, linux-kernel

On Tue, Mar 08, 2005 at 01:55:55PM -0500, Lee Revell wrote:
> And as I mentioned a few times, the authors have neither the inclination
> nor the ability to do that, because they are not kernel hackers.  The
> realtime LSM was written by users (not developers) of the kernel, to
> solve a specific real world problem.  No one ever claimed it was the
> correct solution from the kernel POV.

And I told you that doesn't matter.  If someone wants a feature in they
should find a way to make it palable.  We're not accepting such excuses
to put in crap.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08 19:11                 ` Paul Davis
@ 2005-03-08 20:29                   ` Andrew Morton
  0 siblings, 0 replies; 266+ messages in thread
From: Andrew Morton @ 2005-03-08 20:29 UTC (permalink / raw)
  To: Paul Davis
  Cc: rlrevell, hch, mingo, mpm, joq, cfriesen, chrisw, arjanv, alan,
	linux-kernel

Paul Davis <paul@linuxaudiosystems.com> wrote:
>
> >And as I mentioned a few times, the authors have neither the inclination
> >nor the ability to do that, because they are not kernel hackers.  The
> >realtime LSM was written by users (not developers) of the kernel, to
> >solve a specific real world problem.  No one ever claimed it was the
> >correct solution from the kernel POV.
> 
> i would just like to add that its very disappointing that the LSM,
> having been included in the kernel (apparently very much against
> Christoph's and others' advice) turns out to be so useless. from
> outside lkml, LSM appeared to be a mechanism to allow
> non-kernel-developers to create new security policies (perhaps even
> mechanisms) without trying to tackle the entire kernel. instead, we
> are now getting a fix which, while it solves the same problem, has
> required substantive analysis of its effect on the overall kernel, and
> will require continued vigilance to ensure that it doesn't now or
> later cause unintended side effects. LSM appeared to be the "right"
> way to do this in terms of modularity - it is disappointing to find it
> has so little support (close to zero to judge from this debate) on
> LKML despite being present in the kernel.
> 

That, plus the fact that inherited capabilities could also be used here,
except they don't work right.  That's a nice, simple and long-standing
kernel feature which I think we should have fixed up before piling in more
security features.

But I've said that often enough.  If nobody has a sufficient need for
fixed-up-caps to actually put work into it, nothing happens.  And it's a
lot of work, because this is a scary feature.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:33     ` Matt Mackall
                         ` (3 preceding siblings ...)
  2005-03-08  6:55       ` Andrew Morton
@ 2005-03-08 19:17       ` utz lehmann
  4 siblings, 0 replies; 266+ messages in thread
From: utz lehmann @ 2005-03-08 19:17 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, Paul Davis, joq, cfriesen, chrisw, hch, rlrevell,
	arjanv, mingo, alan, LKML

On Mon, 2005-03-07 at 20:33 -0800, Matt Mackall wrote:
> On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> > 
> > So I still have the rt-lsm patch floating about, saying "merge me, merge
> > me!".  I'm not sure that the world would end were I to do so.
> > 
> > Consider this a prod in the direction of those who were pushing
> > alternatives ;)
> 
> I think Chris Wright's last rlimit patch is more sensible and ready to
> go. And I think I may have even convinced Ingo on this point before
> the conversation died last time around. So here's that patch again,
> updated to 2.6.11. Compiles cleanly. Chris, please add a signed-off-by.
> 
> <snip>
> 
> Add a pair of rlimits for allowing non-root tasks to raise nice and rt
> priorities. Defaults to traditional behavior. Originally written by
> Chris Wright.

The nice part is really useful for me. With it i can allow users to
renice their previously niced jobs (eg. from 19 to 0). At the moment
they need to call me and i do this as root.

If this rlimit approach is not the solution for the audio RT stuff, can
the nice part merged anyway?

utz



^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08 18:55               ` Lee Revell
@ 2005-03-08 19:11                 ` Paul Davis
  2005-03-08 20:29                   ` Andrew Morton
  2005-03-08 21:20                 ` Christoph Hellwig
  1 sibling, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-03-08 19:11 UTC (permalink / raw)
  To: Lee Revell
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, mpm, joq,
	cfriesen, Chris Wright, arjanv, alan, linux-kernel

>And as I mentioned a few times, the authors have neither the inclination
>nor the ability to do that, because they are not kernel hackers.  The
>realtime LSM was written by users (not developers) of the kernel, to
>solve a specific real world problem.  No one ever claimed it was the
>correct solution from the kernel POV.

i would just like to add that its very disappointing that the LSM,
having been included in the kernel (apparently very much against
Christoph's and others' advice) turns out to be so useless. from
outside lkml, LSM appeared to be a mechanism to allow
non-kernel-developers to create new security policies (perhaps even
mechanisms) without trying to tackle the entire kernel. instead, we
are now getting a fix which, while it solves the same problem, has
required substantive analysis of its effect on the overall kernel, and
will require continued vigilance to ensure that it doesn't now or
later cause unintended side effects. LSM appeared to be the "right"
way to do this in terms of modularity - it is disappointing to find it
has so little support (close to zero to judge from this debate) on
LKML despite being present in the kernel.

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:32             ` Christoph Hellwig
  2005-03-08  4:47               ` Matt Mackall
@ 2005-03-08 18:55               ` Lee Revell
  2005-03-08 19:11                 ` Paul Davis
  2005-03-08 21:20                 ` Christoph Hellwig
  1 sibling, 2 replies; 266+ messages in thread
From: Lee Revell @ 2005-03-08 18:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Ingo Molnar, paul, mpm, joq, cfriesen,
	Chris Wright, arjanv, alan, linux-kernel

On Tue, 2005-03-08 at 04:32 +0000, Christoph Hellwig wrote:
> and as I mentioned a few times if we really want to go for a magic
> uid/gid-based approach we should at least have one that's useable for
> all capabilities so it can replace the oracle hack aswell.  But the
> proponents of the patch weren't iterested to invest the tiniest bit
> of work over what they submited.

And as I mentioned a few times, the authors have neither the inclination
nor the ability to do that, because they are not kernel hackers.  The
realtime LSM was written by users (not developers) of the kernel, to
solve a specific real world problem.  No one ever claimed it was the
correct solution from the kernel POV.

I know Jack disagrees but I for one am glad to see the max-RT-prio
rlimit patch going in.  This probably reflects my sysadmin background,
PAM does not scare me at all.  Anyway it solves the same problem and
will be invisible to any user with a reasonable distro.  If musicians
end up having to tweak the PAM configuration, then I would say the
distro has failed miserably.

Lee


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  6:55       ` Andrew Morton
@ 2005-03-08  8:45         ` Matt Mackall
  0 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  8:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: paul, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo, alan,
	linux-kernel

On Mon, Mar 07, 2005 at 10:55:35PM -0800, Andrew Morton wrote:
> Matt Mackall <mpm@selenic.com> wrote:
> >
> >  Add a pair of rlimits for allowing non-root tasks to raise nice and rt
> >  priorities. Defaults to traditional behavior. Originally written by
> >  Chris Wright.
> 
> It needs some dinking with because Ingo has been playing games in my
> resource.h.  Here's the end result.  Unlike yours, this will work on alpha,
> mips and sparc[64], too ;)

Boggle. The diffstat of this patch (and Chris') looks identical to
mine. Whatever..

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:33     ` Matt Mackall
                         ` (2 preceding siblings ...)
  2005-03-08  6:45       ` Chris Wright
@ 2005-03-08  6:55       ` Andrew Morton
  2005-03-08  8:45         ` Matt Mackall
  2005-03-08 19:17       ` utz lehmann
  4 siblings, 1 reply; 266+ messages in thread
From: Andrew Morton @ 2005-03-08  6:55 UTC (permalink / raw)
  To: Matt Mackall
  Cc: paul, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo, alan,
	linux-kernel

Matt Mackall <mpm@selenic.com> wrote:
>
>  Add a pair of rlimits for allowing non-root tasks to raise nice and rt
>  priorities. Defaults to traditional behavior. Originally written by
>  Chris Wright.

It needs some dinking with because Ingo has been playing games in my
resource.h.  Here's the end result.  Unlike yours, this will work on alpha,
mips and sparc[64], too ;)




From: Matt Mackall <mpm@selenic.com>

Add a pair of rlimits for allowing non-root tasks to raise nice and rt
priorities. Defaults to traditional behavior. Originally written by
Chris Wright.

The patch implements a simple rlimit ceiling for the RT (and nice) priorities
a task can set.  The rlimit defaults to 0, meaning no change in behavior by
default.  A value of 50 means RT priority levels 1-50 are allowed.  A value of
100 means all 99 privilege levels from 1 to 99 are allowed.  CAP_SYS_NICE is
blanket permission.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/include/asm-generic/resource.h |    7 ++++++-
 25-akpm/include/linux/sched.h          |    1 +
 25-akpm/kernel/sched.c                 |   25 +++++++++++++++++++------
 25-akpm/kernel/sys.c                   |    2 +-
 4 files changed, 27 insertions(+), 8 deletions(-)

diff -puN include/asm-generic/resource.h~nice-and-rt-prio-rlimits include/asm-generic/resource.h
--- 25/include/asm-generic/resource.h~nice-and-rt-prio-rlimits	2005-03-07 22:50:45.000000000 -0800
+++ 25-akpm/include/asm-generic/resource.h	2005-03-07 22:52:10.000000000 -0800
@@ -41,8 +41,11 @@
 #define RLIMIT_LOCKS		10	/* maximum file locks held */
 #define RLIMIT_SIGPENDING	11	/* max number of pending signals */
 #define RLIMIT_MSGQUEUE		12	/* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE		13	/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO		14	/* maximum realtime priority */
 
-#define RLIM_NLIMITS		13
+#define RLIM_NLIMITS		15
 
 /*
  * SuS says limits have to be unsigned.
@@ -81,6 +84,8 @@
 	[RLIMIT_LOCKS]		= {  RLIM_INFINITY,  RLIM_INFINITY },	\
 	[RLIMIT_SIGPENDING]	= { 		0,	       0 },	\
 	[RLIMIT_MSGQUEUE]	= {   MQ_BYTES_MAX,   MQ_BYTES_MAX },	\
+	[RLIMIT_NICE]		= { 0, 0 },				\
+	[RLIMIT_RTPRIO]		= { 0, 0 },				\
 }
 
 #endif	/* __KERNEL__ */
diff -puN include/linux/sched.h~nice-and-rt-prio-rlimits include/linux/sched.h
--- 25/include/linux/sched.h~nice-and-rt-prio-rlimits	2005-03-07 22:50:45.000000000 -0800
+++ 25-akpm/include/linux/sched.h	2005-03-07 22:50:45.000000000 -0800
@@ -872,6 +872,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern int can_nice(const task_t *p, const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
diff -puN kernel/sched.c~nice-and-rt-prio-rlimits kernel/sched.c
--- 25/kernel/sched.c~nice-and-rt-prio-rlimits	2005-03-07 22:50:45.000000000 -0800
+++ 25-akpm/kernel/sched.c	2005-03-07 22:50:45.000000000 -0800
@@ -3304,6 +3304,19 @@ struct task_struct *kgdb_get_idle(int th
 }
 #endif
 
+/*
+ * can_nice - check if a task can reduce its nice value
+ * @p: task
+ * @nice: nice value
+ */
+int can_nice(const task_t *p, const int nice)
+{
+	/* convert nice value [19,-20] to rlimit style value [0,39] */
+	int nice_rlim = 19 - nice;
+	return (nice_rlim <= p->signal->rlim[RLIMIT_NICE].rlim_cur ||
+		capable(CAP_SYS_NICE));
+}
+
 #ifdef __ARCH_WANT_SYS_NICE
 
 /*
@@ -3323,12 +3336,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3338,6 +3347,9 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if (increment < 0 && !can_nice(current, nice))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3453,6 +3465,7 @@ recheck:
 		return -EINVAL;
 
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur &&
 	    !capable(CAP_SYS_NICE))
 		return -EPERM;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
diff -puN kernel/sys.c~nice-and-rt-prio-rlimits kernel/sys.c
--- 25/kernel/sys.c~nice-and-rt-prio-rlimits	2005-03-07 22:50:45.000000000 -0800
+++ 25-akpm/kernel/sys.c	2005-03-07 22:50:45.000000000 -0800
@@ -229,7 +229,7 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if (niceval < task_nice(p) && !can_nice(p, niceval)) {
 		error = -EACCES;
 		goto out;
 	}
_


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  6:45       ` Chris Wright
@ 2005-03-08  6:49         ` Matt Mackall
  0 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  6:49 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andrew Morton, Paul Davis, joq, cfriesen, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

On Mon, Mar 07, 2005 at 10:45:05PM -0800, Chris Wright wrote:
> * Matt Mackall (mpm@selenic.com) wrote:
> > On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> > > Consider this a prod in the direction of those who were pushing
> > > alternatives ;)
> > 
> > I think Chris Wright's last rlimit patch is more sensible and ready to
> > go. And I think I may have even convinced Ingo on this point before
> > the conversation died last time around. So here's that patch again,
> > updated to 2.6.11. Compiles cleanly. Chris, please add a signed-off-by.
> 
> Only very minor nits below.
> 
> > Add a pair of rlimits for allowing non-root tasks to raise nice and rt
> > priorities. Defaults to traditional behavior. Originally written by
> > Chris Wright.
> > 
> > Signed-off-by: Matt Mackall <mpm@selenic.com>
> > 
> > Index: rlimits/include/asm-generic/resource.h
> > ===================================================================
> > --- rlimits.orig/include/asm-generic/resource.h	2005-03-02 18:30:27.000000000 -0800
> > +++ rlimits/include/asm-generic/resource.h	2005-03-07 20:21:04.000000000 -0800
> > @@ -20,8 +20,10 @@
> >  #define RLIMIT_LOCKS		10	/* maximum file locks held */
> >  #define RLIMIT_SIGPENDING	11	/* max number of pending signals */
> >  #define RLIMIT_MSGQUEUE		12	/* maximum bytes in POSIX mqueues */
> > -
> > -#define RLIM_NLIMITS		13
> > +#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
> > +					   0-39 for nice level 19 .. -20 */
> > +#define RLIMIT_RTPRIO	14		/* maximum realtime priority */
> 
> Needs one more tab to keep in line with rest.

That's just tab damage from the patch.
 
> > +#define RLIM_NLIMITS		15
> 
> 
> >  #endif
> >  
> >  /*
> > @@ -53,6 +55,8 @@
> >  	[RLIMIT_LOCKS]		= { RLIM_INFINITY, RLIM_INFINITY },	\
> >  	[RLIMIT_SIGPENDING]	= { MAX_SIGPENDING, MAX_SIGPENDING },	\
> >  	[RLIMIT_MSGQUEUE]	= { MQ_BYTES_MAX, MQ_BYTES_MAX },	\
> > +	[RLIMIT_NICE]		= { 0, 0 }, \
> > +	[RLIMIT_RTPRIO]		= { 0, 0 }, \
> 
> Might as well fit in with rest of file on these too.
> 
> Also, missed alpha, sparc, sparc64, and mips.  BTW, where's that last
> cleanup from Ingo to consolidate these?  Ah, just saw these are inflight
> to Linus' tree, nevermind.

Uhh, is that not what I've already done above?
 
> -#define RLIM_NLIMITS		13
> +#define RLIMIT_NICE		13	/* max nice prio allowed to raise to
> +					   0-39 for nice level 19 .. -20 */
> +#define RLIMIT_RTPRIO		14	/* maximum realtime priority */

Heh.

Anyway...

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:33     ` Matt Mackall
  2005-03-08  4:40       ` Andrew Morton
  2005-03-08  5:38       ` Ingo Molnar
@ 2005-03-08  6:45       ` Chris Wright
  2005-03-08  6:49         ` Matt Mackall
  2005-03-08  6:55       ` Andrew Morton
  2005-03-08 19:17       ` utz lehmann
  4 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-03-08  6:45 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, Paul Davis, joq, cfriesen, chrisw, hch, rlrevell,
	arjanv, mingo, alan, linux-kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> > Consider this a prod in the direction of those who were pushing
> > alternatives ;)
> 
> I think Chris Wright's last rlimit patch is more sensible and ready to
> go. And I think I may have even convinced Ingo on this point before
> the conversation died last time around. So here's that patch again,
> updated to 2.6.11. Compiles cleanly. Chris, please add a signed-off-by.

Only very minor nits below.

> Add a pair of rlimits for allowing non-root tasks to raise nice and rt
> priorities. Defaults to traditional behavior. Originally written by
> Chris Wright.
> 
> Signed-off-by: Matt Mackall <mpm@selenic.com>
> 
> Index: rlimits/include/asm-generic/resource.h
> ===================================================================
> --- rlimits.orig/include/asm-generic/resource.h	2005-03-02 18:30:27.000000000 -0800
> +++ rlimits/include/asm-generic/resource.h	2005-03-07 20:21:04.000000000 -0800
> @@ -20,8 +20,10 @@
>  #define RLIMIT_LOCKS		10	/* maximum file locks held */
>  #define RLIMIT_SIGPENDING	11	/* max number of pending signals */
>  #define RLIMIT_MSGQUEUE		12	/* maximum bytes in POSIX mqueues */
> -
> -#define RLIM_NLIMITS		13
> +#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
> +					   0-39 for nice level 19 .. -20 */
> +#define RLIMIT_RTPRIO	14		/* maximum realtime priority */

Needs one more tab to keep in line with rest.

> +#define RLIM_NLIMITS		15


>  #endif
>  
>  /*
> @@ -53,6 +55,8 @@
>  	[RLIMIT_LOCKS]		= { RLIM_INFINITY, RLIM_INFINITY },	\
>  	[RLIMIT_SIGPENDING]	= { MAX_SIGPENDING, MAX_SIGPENDING },	\
>  	[RLIMIT_MSGQUEUE]	= { MQ_BYTES_MAX, MQ_BYTES_MAX },	\
> +	[RLIMIT_NICE]		= { 0, 0 }, \
> +	[RLIMIT_RTPRIO]		= { 0, 0 }, \

Might as well fit in with rest of file on these too.

Also, missed alpha, sparc, sparc64, and mips.  BTW, where's that last
cleanup from Ingo to consolidate these?  Ah, just saw these are inflight
to Linus' tree, nevermind.

Below fixes those nits, and rediffs against that inflight cleanup so it
should apply cleanly on top of that.

thanks,
-chris
-- 

Add a pair of rlimits for allowing non-root tasks to raise nice and rt
priorities. Defaults to traditional behavior. Originally written by
Chris Wright.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Chris Wright <chrisw@osdl.org>

===== include/asm-generic/resource.h 1.1 vs edited =====
--- 1.1/include/asm-generic/resource.h	2005-01-20 21:00:51 -08:00
+++ edited/include/asm-generic/resource.h	2005-03-07 21:15:00 -08:00
@@ -41,8 +41,10 @@
 #define RLIMIT_LOCKS		10	/* maximum file locks held */
 #define RLIMIT_SIGPENDING	11	/* max number of pending signals */
 #define RLIMIT_MSGQUEUE		12	/* maximum bytes in POSIX mqueues */
-
-#define RLIM_NLIMITS		13
+#define RLIMIT_NICE		13	/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO		14	/* maximum realtime priority */
+#define RLIM_NLIMITS		15
 
 /*
  * SuS says limits have to be unsigned.
@@ -81,6 +83,8 @@
 	[RLIMIT_LOCKS]		= {  RLIM_INFINITY,  RLIM_INFINITY },	\
 	[RLIMIT_SIGPENDING]	= { MAX_SIGPENDING, MAX_SIGPENDING },	\
 	[RLIMIT_MSGQUEUE]	= {   MQ_BYTES_MAX,   MQ_BYTES_MAX },	\
+	[RLIMIT_NICE]		= {              0,              0 },	\
+	[RLIMIT_RTPRIO]		= {              0,              0 },	\
 }
 
 #endif	/* __KERNEL__ */
===== include/linux/sched.h 1.279 vs edited =====
--- 1.279/include/linux/sched.h	2005-03-04 22:41:13 -08:00
+++ edited/include/linux/sched.h	2005-03-07 21:43:39 -08:00
@@ -792,6 +792,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern int can_nice(const task_t *p, const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
===== kernel/sched.c 1.394 vs edited =====
--- 1.394/kernel/sched.c	2005-03-04 22:41:14 -08:00
+++ edited/kernel/sched.c	2005-03-07 21:43:39 -08:00
@@ -3278,6 +3278,19 @@ out_unlock:
 
 EXPORT_SYMBOL(set_user_nice);
 
+/*
+ * can_nice - check if a task can reduce its nice value
+ * @p: task
+ * @nice: nice value
+ */
+int can_nice(const task_t *p, const int nice)
+{
+	/* convert nice value [19,-20] to rlimit style value [0,39] */
+	int nice_rlim = 19 - nice;
+	return (nice_rlim <= p->signal->rlim[RLIMIT_NICE].rlim_cur || 
+		capable(CAP_SYS_NICE));
+}
+
 #ifdef __ARCH_WANT_SYS_NICE
 
 /*
@@ -3297,12 +3310,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3312,6 +3321,9 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if (increment < 0 && !can_nice(current, nice))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3427,6 +3439,7 @@ recheck:
 		return -EINVAL;
 
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur && 
 	    !capable(CAP_SYS_NICE))
 		return -EPERM;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
===== kernel/sys.c 1.104 vs edited =====
--- 1.104/kernel/sys.c	2005-01-11 16:42:35 -08:00
+++ edited/kernel/sys.c	2005-03-07 21:10:23 -08:00
@@ -225,7 +225,7 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if (niceval < task_nice(p) && !can_nice(p, niceval)) {
 		error = -EACCES;
 		goto out;
 	}

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  6:40               ` Chris Wright
@ 2005-03-08  6:42                 ` Ingo Molnar
  0 siblings, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-03-08  6:42 UTC (permalink / raw)
  To: Chris Wright
  Cc: Peter Williams, Andrew Morton, Matt Mackall, paul, joq, cfriesen,
	hch, rlrevell, arjanv, alan, linux-kernel


* Chris Wright <chrisw@osdl.org> wrote:

> > Yes.  In kernel "damage control" is an optional extra not a necessity 
> > with this solution.  Not so sure about with the RT LSB solution though.
> 
> This has one advantage over RT LSM in that area, which is it places an
> upper bound on the priority (in control of the admin).  So it's
> possible to save some space for damage control in the top few prio
> slots.

it's not just purely for damage control - there have been requests of
being able to 'partition' the RT priorities space between applications. 
(It's an afterthought but nice nevertheless.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  6:28             ` Peter Williams
@ 2005-03-08  6:40               ` Chris Wright
  2005-03-08  6:42                 ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Chris Wright @ 2005-03-08  6:40 UTC (permalink / raw)
  To: Peter Williams
  Cc: Ingo Molnar, Andrew Morton, Matt Mackall, paul, joq, cfriesen,
	chrisw, hch, rlrevell, arjanv, alan, linux-kernel

* Peter Williams (pwil3058@bigpond.net.au) wrote:
> But the patch you describe still seems a little loose to me in that it 
> doesn't control both which users AND which programs they can run. 
> Although I suppose that can be managed by suitable setting of file 
> permissions?

rlimits are typically handled per user or per group.  this is set during
login and the limits apply to the users session.  none of the solutions
limit which programs the user can run, however strictly group based priv
granting can reduce the number of processes with the privs (using setgid
programs).

> Also I presume that root privileges are needed to set the rlimits which 
> means that the program has to be setuid root or run from a setuid root 
> wrapper.  In the first of these cases the program will be running for a 
> (hopefully) short while with way more privilege than it needs.  This is 
> why I'm attracted to mechanisms that allow programs to be given a subset 
> of root's privileges and only for specified users.

typically this is handled via pam during login, so yes, root (or more
specifically CAP_SYS_RESOURCE) is required, but need not be in any
wrapper.  limiting the allowed programs a user/role/domain/context/etc
can run is the goal of other type of security restrictions (such as
SELinux).

> I would be nice to have a solution to this particular problem that fits 
> in with such a generalized "granular" privilege mechanism (when/if such 
> a mechanism becomes available in the future) rather than a quirky fix 
> that is specific to this problem and doesn't generalize well to similar 
> problems when they arise in the future.  However, I agree with your 
> opinion that granting CAP_SYS_NICE is dangerous without some limit on 
> the priority levels is dangerous and think that a generalized "granular" 
> privilege mechanism would need to include such restrictions.
> 
> >The patch does not attempt to do any
> >"damage control" of abuse caused by RT tasks, and is hence much simpler
> >than my patch or Con's SCHED_ISO patch. ("damage control" could be done
> >from userspace anyway)
> 
> Yes.  In kernel "damage control" is an optional extra not a necessity 
> with this solution.  Not so sure about with the RT LSB solution though.

This has one advantage over RT LSM in that area, which is it places an
upper bound on the priority (in control of the admin).  So it's possible
to save some space for damage control in the top few prio slots.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:30         ` Jack O'Quin
@ 2005-03-08  6:33           ` Matt Mackall
  2005-03-09  3:39             ` Jack O'Quin
  2005-03-10 14:01           ` Pavel Machek
  1 sibling, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  6:33 UTC (permalink / raw)
  To: Jack O'Quin
  Cc: Andrew Morton, paul, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

On Mon, Mar 07, 2005 at 11:30:57PM -0600, Jack O'Quin wrote:
> Andrew Morton <akpm@osdl.org> writes:
> 
> > Matt Mackall <mpm@selenic.com> wrote:
> >>
> >> I think Chris Wright's last rlimit patch is more sensible and ready to
> >>  go.
> >
> > I must say that I like rlimits - very straightforward, although somewhat
> > awkward to use from userspace due to shortsighted shell design.
> >
> > Does anyone have serious objections to this approach?
> 
> 1. is likely to introduce multiuser system security holes like the one
> created recently when the mlock() rlimits bug was fixed (DoS attacks)

I wouldn't say "likely". But anything's possible, so I wouldn't rule
it out entirely.
 
> 2. requires updates to all the shells

Requires update to the PAM distro for our purposes. 

> 3. forces Windows and Mac musicians to learn and understand PAM

Or for the distro (ubuntu or whatever) to catch up. The alternative is
for the user to compile their own kernel module and mess with its
arcane interface.

> 4. is undocumented and has never been tested in any real music studios

Well you'll have a bit to test it before it goes to Linus.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:49           ` Ingo Molnar
@ 2005-03-08  6:28             ` Peter Williams
  2005-03-08  6:40               ` Chris Wright
  0 siblings, 1 reply; 266+ messages in thread
From: Peter Williams @ 2005-03-08  6:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Matt Mackall, paul, joq, cfriesen, chrisw, hch,
	rlrevell, arjanv, alan, linux-kernel

Ingo Molnar wrote:
> * Peter Williams <pwil3058@bigpond.net.au> wrote:
> 
> 
>>I don't object to rlimits per se and I think that they are useful but
>>not as a sole solution to this problem.  Being able to give a task
>>preferential treatment is a permissions issue and should be solved as
>>one.
>>
>>Having RT cpu usage limits on tasks is a useful tool to have when
>>granting normal users the privilege of running tasks as RT tasks so
>>that you can limit the damage that they can do BUT the presence of a
>>limit on a task is not a very good criterion for granting that
>>privilege.
> 
> 
> i think you are talking about my rlimit patch (the 'RT CPU limit' patch)
> - but that one is not in discussion here.
> 
> what is being discussed currently is the other rlimit patch (from Chris
> Wright and Matt Mackall) which implements a simple rlimit ceiling for
> the RT (and nice) priorities a task can set. The rlimit defaults to 0,
> meaning no change in behavior by default. A value of 50 means RT
> priority levels 1-50 are allowed. A value of 100 means all 99 privilege
> levels from 1 to 99 are allowed. CAP_SYS_NICE is blanket permission.
> It's all pretty finegrained and and it's a quite straightforward
> extension of what we have today.

OK.  My misunderstanding.

But the patch you describe still seems a little loose to me in that it 
doesn't control both which users AND which programs they can run. 
Although I suppose that can be managed by suitable setting of file 
permissions?

Also I presume that root privileges are needed to set the rlimits which 
means that the program has to be setuid root or run from a setuid root 
wrapper.  In the first of these cases the program will be running for a 
(hopefully) short while with way more privilege than it needs.  This is 
why I'm attracted to mechanisms that allow programs to be given a subset 
of root's privileges and only for specified users.

I would be nice to have a solution to this particular problem that fits 
in with such a generalized "granular" privilege mechanism (when/if such 
a mechanism becomes available in the future) rather than a quirky fix 
that is specific to this problem and doesn't generalize well to similar 
problems when they arise in the future.  However, I agree with your 
opinion that granting CAP_SYS_NICE is dangerous without some limit on 
the priority levels is dangerous and think that a generalized "granular" 
privilege mechanism would need to include such restrictions.

> The patch does not attempt to do any
> "damage control" of abuse caused by RT tasks, and is hence much simpler
> than my patch or Con's SCHED_ISO patch. ("damage control" could be done
> from userspace anyway)

Yes.  In kernel "damage control" is an optional extra not a necessity 
with this solution.  Not so sure about with the RT LSB solution though.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:40         ` Peter Williams
  2005-03-08  5:49           ` Ingo Molnar
  2005-03-08  6:00           ` Chris Wright
@ 2005-03-08  6:18           ` Matt Mackall
  2 siblings, 0 replies; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  6:18 UTC (permalink / raw)
  To: Peter Williams
  Cc: Andrew Morton, paul, joq, cfriesen, chrisw, hch, rlrevell,
	arjanv, mingo, alan, linux-kernel

On Tue, Mar 08, 2005 at 04:40:02PM +1100, Peter Williams wrote:
> The granting of the ability to switch to and from RT mode should require 
> a means to specify which users it applies to and also which programs it 
> applies to.  The RT rlimits mechanism doesn't meet these criteria.

a) rlimits are per-process
b) rlimits are typically administered per-user
c) any user can trivially gain any privilege of any process they own
so in some sense per-process limits are meaningless

So rlimits are in fact as granular as can be, both in theory and in
practice.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:40         ` Peter Williams
  2005-03-08  5:49           ` Ingo Molnar
@ 2005-03-08  6:00           ` Chris Wright
  2005-03-08  6:18           ` Matt Mackall
  2 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-03-08  6:00 UTC (permalink / raw)
  To: Peter Williams
  Cc: Andrew Morton, Matt Mackall, paul, joq, cfriesen, chrisw, hch,
	rlrevell, arjanv, mingo, alan, linux-kernel

* Peter Williams (pwil3058@bigpond.net.au) wrote:
> Andrew Morton wrote:
> >Matt Mackall <mpm@selenic.com> wrote:
> >
> >>I think Chris Wright's last rlimit patch is more sensible and ready to
> >>go.
> >
> >
> >I must say that I like rlimits - very straightforward, although somewhat
> >awkward to use from userspace due to shortsighted shell design.
> >
> >Does anyone have serious objections to this approach?
> 
> I don't object to rlimits per se and I think that they are useful but 
> not as a sole solution to this problem.  Being able to give a task 
> preferential treatment is a permissions issue and should be solved as one.
> 
> Having RT cpu usage limits on tasks is a useful tool to have when 
> granting normal users the privilege of running tasks as RT tasks so that 
> you can limit the damage that they can do BUT the presence of a limit on 
> a task is not a very good criterion for granting that privilege.
> 
> The granting of the ability to switch to and from RT mode should require 
> a means to specify which users it applies to and also which programs it 
> applies to.  The RT rlimits mechanism doesn't meet these criteria.
> 
> In summary, IMHO you should put them both in but modify the RT rlimits 
> patch so that it plays no part in the decision as to whether the task is 
> allowed to run as RT or not.

I'm not sure I follow you.  This patch just sets the max RT priority a
process can have (defaults to 0, as w/out the patch).  Increasing that
value is a form of permission granting, giving the process the ability
to increase its RT prio if it chooses to ask for it.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  5:40         ` Peter Williams
@ 2005-03-08  5:49           ` Ingo Molnar
  2005-03-08  6:28             ` Peter Williams
  2005-03-08  6:00           ` Chris Wright
  2005-03-08  6:18           ` Matt Mackall
  2 siblings, 1 reply; 266+ messages in thread
From: Ingo Molnar @ 2005-03-08  5:49 UTC (permalink / raw)
  To: Peter Williams
  Cc: Andrew Morton, Matt Mackall, paul, joq, cfriesen, chrisw, hch,
	rlrevell, arjanv, alan, linux-kernel


* Peter Williams <pwil3058@bigpond.net.au> wrote:

> I don't object to rlimits per se and I think that they are useful but
> not as a sole solution to this problem.  Being able to give a task
> preferential treatment is a permissions issue and should be solved as
> one.
> 
> Having RT cpu usage limits on tasks is a useful tool to have when
> granting normal users the privilege of running tasks as RT tasks so
> that you can limit the damage that they can do BUT the presence of a
> limit on a task is not a very good criterion for granting that
> privilege.

i think you are talking about my rlimit patch (the 'RT CPU limit' patch)
- but that one is not in discussion here.

what is being discussed currently is the other rlimit patch (from Chris
Wright and Matt Mackall) which implements a simple rlimit ceiling for
the RT (and nice) priorities a task can set. The rlimit defaults to 0,
meaning no change in behavior by default. A value of 50 means RT
priority levels 1-50 are allowed. A value of 100 means all 99 privilege
levels from 1 to 99 are allowed. CAP_SYS_NICE is blanket permission.
It's all pretty finegrained and and it's a quite straightforward
extension of what we have today. The patch does not attempt to do any
"damage control" of abuse caused by RT tasks, and is hence much simpler
than my patch or Con's SCHED_ISO patch. ("damage control" could be done
from userspace anyway)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:40       ` Andrew Morton
  2005-03-08  5:30         ` Jack O'Quin
@ 2005-03-08  5:40         ` Peter Williams
  2005-03-08  5:49           ` Ingo Molnar
                             ` (2 more replies)
  1 sibling, 3 replies; 266+ messages in thread
From: Peter Williams @ 2005-03-08  5:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matt Mackall, paul, joq, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

Andrew Morton wrote:
> Matt Mackall <mpm@selenic.com> wrote:
> 
>>I think Chris Wright's last rlimit patch is more sensible and ready to
>> go.
> 
> 
> I must say that I like rlimits - very straightforward, although somewhat
> awkward to use from userspace due to shortsighted shell design.
> 
> Does anyone have serious objections to this approach?

I don't object to rlimits per se and I think that they are useful but 
not as a sole solution to this problem.  Being able to give a task 
preferential treatment is a permissions issue and should be solved as one.

Having RT cpu usage limits on tasks is a useful tool to have when 
granting normal users the privilege of running tasks as RT tasks so that 
you can limit the damage that they can do BUT the presence of a limit on 
a task is not a very good criterion for granting that privilege.

The granting of the ability to switch to and from RT mode should require 
a means to specify which users it applies to and also which programs it 
applies to.  The RT rlimits mechanism doesn't meet these criteria.

In summary, IMHO you should put them both in but modify the RT rlimits 
patch so that it plays no part in the decision as to whether the task is 
allowed to run as RT or not.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:33     ` Matt Mackall
  2005-03-08  4:40       ` Andrew Morton
@ 2005-03-08  5:38       ` Ingo Molnar
  2005-03-08  6:45       ` Chris Wright
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-03-08  5:38 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Andrew Morton, Paul Davis, joq, cfriesen, chrisw, hch, rlrevell,
	arjanv, alan, linux-kernel


* Matt Mackall <mpm@selenic.com> wrote:

> Add a pair of rlimits for allowing non-root tasks to raise nice and rt
> priorities. Defaults to traditional behavior. Originally written by
> Chris Wright.
> 
> Signed-off-by: Matt Mackall <mpm@selenic.com>

this too looks good to me.

  Acked-by: Ingo Molnar <mingo@elte.hu>

(no strong feelings either way, other than rlimits feel a bit less
hackish.)

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:40       ` Andrew Morton
@ 2005-03-08  5:30         ` Jack O'Quin
  2005-03-08  6:33           ` Matt Mackall
  2005-03-10 14:01           ` Pavel Machek
  2005-03-08  5:40         ` Peter Williams
  1 sibling, 2 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-03-08  5:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matt Mackall, paul, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> Matt Mackall <mpm@selenic.com> wrote:
>>
>> I think Chris Wright's last rlimit patch is more sensible and ready to
>>  go.
>
> I must say that I like rlimits - very straightforward, although somewhat
> awkward to use from userspace due to shortsighted shell design.
>
> Does anyone have serious objections to this approach?

1. is likely to introduce multiuser system security holes like the one
created recently when the mlock() rlimits bug was fixed (DoS attacks)

2. requires updates to all the shells

3. forces Windows and Mac musicians to learn and understand PAM

4. is undocumented and has never been tested in any real music studios

-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:22         ` Ingo Molnar
  2005-03-08  4:28           ` Andrew Morton
@ 2005-03-08  5:19           ` Jack O'Quin
  1 sibling, 0 replies; 266+ messages in thread
From: Jack O'Quin @ 2005-03-08  5:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Christoph Hellwig, paul, mpm, cfriesen, chrisw,
	rlrevell, arjanv, alan, linux-kernel


> * Andrew Morton <akpm@osdl.org> wrote:
>
>> Still.  It seems to be what we deserve if all that fancy stuff we have
>> cannot address this very simple and very real-world problem.

Ingo Molnar <mingo@elte.hu> writes:
> please describe this "very simple and very real-world problem" in simple
> terms. Lets make sure "problem" and "solution" didnt become detached.

Linux audio users need to run large, complex low-latency desktop audio
applications without granting them full root privileges.  These
applications require reliable SCHED_FIFO (or equivalent) scheduling,
and the ability to lock process images into memory.  We need to be
able to drop and reacquire these privileges from time to time.  We
strongly prefer using the POSIX realtime interfaces.

For desktop musicians this needs to be simple to administer, yet still
reasonably secure.  Denial of service attacks are not a serious threat
in our environment, but we really don't want people turning our
systems into open spam relays or creating hidden setuid root shells.

Ours is *not* a timesharing multiuser environment.  Multiple users may
access these systems, but only one at a time.  Many musicians have a
Mac or Windows background, systems which grant realtime privileges to
all tasks indiscriminantly.  The realtime LSM allows us to grant
similar privileges while maintaining better control over who gets
them, a significant improvement over our competition.

AFAICT, video and probably some other desktop multimedia applications
have similar needs, but others should speak for them.  I do know that
audio is highly sensitive to realtime performance glitches.

We believe that this LSM meets our needs, because hundreds of us have
used it successfully for over a year.  This is the last missing piece
that allows us to reap the benefits of the excellent kernel latency
improvements Ingo, Andrew and others have made over the last several
years.
-- 
  joq

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:47               ` Matt Mackall
@ 2005-03-08  4:58                 ` Chris Wright
  0 siblings, 0 replies; 266+ messages in thread
From: Chris Wright @ 2005-03-08  4:58 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Hellwig, Andrew Morton, Ingo Molnar, paul, joq,
	cfriesen, chrisw, rlrevell, arjanv, alan, linux-kernel

* Matt Mackall (mpm@selenic.com) wrote:
> On Tue, Mar 08, 2005 at 04:32:50AM +0000, Christoph Hellwig wrote:
> > On Mon, Mar 07, 2005 at 08:28:21PM -0800, Andrew Morton wrote:
> > > > please describe this "very simple and very real-world problem" in simple
> > > > terms. Lets make sure "problem" and "solution" didnt become detached.
> > > > 
> > > 
> > > Well others can do that better than I but I'd describe it as
> > > 
> > > - Audio apps need to meet their realtime requirements
> 
> Add video, data acquisition, motion control, CD burning, etc..
> 
> > > - The way to implement that is to give them !SCHED_OTHER and mlockall
> > >   capabilities.
> > > 
> > > - But they don't want to run as root.
> > 
> > Which all fits very nicely with MEMLOCK rlimit and a tiny wrapper
> > that sets !SCHED_OTHER and execs the audio app..
> 
> This is somewhat complicated by the fact that the existing apps are
> already running and instead need promotion. Then we run into problems
> lie set_rlimit doesn't want to work on other processes and issues with
> sched_setparam on other threads, etc.
> 
> Part of me wants to say, well you designed it wrong. You should have
> planned a setuid launcher for the rt threads. But at the same time,
> the rlimits thing seems like a reasonably clean way to give RT access
> to users, and still allows for protect watchdog processes..
>  
> > and as I mentioned a few times if we really want to go for a magic
> > uid/gid-based approach we should at least have one that's useable for
> > all capabilities so it can replace the oracle hack aswell.  But the
> > proponents of the patch weren't iterested to invest the tiniest bit
> > of work over what they submited.
> 
> Does the mlock rlimit not already address the Oracle problem?

It does, that's effectively dead code as far as I'm concerned.  The mlock
bit just came in a bit later.  I had a patch around here to rip it out,
this should be a good time to dust that off.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:32             ` Christoph Hellwig
@ 2005-03-08  4:47               ` Matt Mackall
  2005-03-08  4:58                 ` Chris Wright
  2005-03-08 18:55               ` Lee Revell
  1 sibling, 1 reply; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  4:47 UTC (permalink / raw)
  To: Christoph Hellwig, Andrew Morton, Ingo Molnar, paul, joq,
	cfriesen, chrisw, rlrevell, arjanv, alan, linux-kernel

On Tue, Mar 08, 2005 at 04:32:50AM +0000, Christoph Hellwig wrote:
> On Mon, Mar 07, 2005 at 08:28:21PM -0800, Andrew Morton wrote:
> > > please describe this "very simple and very real-world problem" in simple
> > > terms. Lets make sure "problem" and "solution" didnt become detached.
> > > 
> > 
> > Well others can do that better than I but I'd describe it as
> > 
> > - Audio apps need to meet their realtime requirements

Add video, data acquisition, motion control, CD burning, etc..

> > - The way to implement that is to give them !SCHED_OTHER and mlockall
> >   capabilities.
> > 
> > - But they don't want to run as root.
> 
> Which all fits very nicely with MEMLOCK rlimit and a tiny wrapper
> that sets !SCHED_OTHER and execs the audio app..

This is somewhat complicated by the fact that the existing apps are
already running and instead need promotion. Then we run into problems
lie set_rlimit doesn't want to work on other processes and issues with
sched_setparam on other threads, etc.

Part of me wants to say, well you designed it wrong. You should have
planned a setuid launcher for the rt threads. But at the same time,
the rlimits thing seems like a reasonably clean way to give RT access
to users, and still allows for protect watchdog processes..
 
> and as I mentioned a few times if we really want to go for a magic
> uid/gid-based approach we should at least have one that's useable for
> all capabilities so it can replace the oracle hack aswell.  But the
> proponents of the patch weren't iterested to invest the tiniest bit
> of work over what they submited.

Does the mlock rlimit not already address the Oracle problem?

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:33     ` Matt Mackall
@ 2005-03-08  4:40       ` Andrew Morton
  2005-03-08  5:30         ` Jack O'Quin
  2005-03-08  5:40         ` Peter Williams
  2005-03-08  5:38       ` Ingo Molnar
                         ` (3 subsequent siblings)
  4 siblings, 2 replies; 266+ messages in thread
From: Andrew Morton @ 2005-03-08  4:40 UTC (permalink / raw)
  To: Matt Mackall
  Cc: paul, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo, alan,
	linux-kernel

Matt Mackall <mpm@selenic.com> wrote:
>
> I think Chris Wright's last rlimit patch is more sensible and ready to
>  go.

I must say that I like rlimits - very straightforward, although somewhat
awkward to use from userspace due to shortsighted shell design.

Does anyone have serious objections to this approach?

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  3:50   ` Andrew Morton
  2005-03-08  3:55     ` Christoph Hellwig
@ 2005-03-08  4:33     ` Matt Mackall
  2005-03-08  4:40       ` Andrew Morton
                         ` (4 more replies)
  1 sibling, 5 replies; 266+ messages in thread
From: Matt Mackall @ 2005-03-08  4:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Davis, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo,
	alan, linux-kernel

On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> 
> So I still have the rt-lsm patch floating about, saying "merge me, merge
> me!".  I'm not sure that the world would end were I to do so.
> 
> Consider this a prod in the direction of those who were pushing
> alternatives ;)

I think Chris Wright's last rlimit patch is more sensible and ready to
go. And I think I may have even convinced Ingo on this point before
the conversation died last time around. So here's that patch again,
updated to 2.6.11. Compiles cleanly. Chris, please add a signed-off-by.

<snip>

Add a pair of rlimits for allowing non-root tasks to raise nice and rt
priorities. Defaults to traditional behavior. Originally written by
Chris Wright.

Signed-off-by: Matt Mackall <mpm@selenic.com>

Index: rlimits/include/linux/sched.h
===================================================================
--- rlimits.orig/include/linux/sched.h	2005-03-03 22:50:14.000000000 -0800
+++ rlimits/include/linux/sched.h	2005-03-07 20:18:30.000000000 -0800
@@ -791,6 +791,7 @@ extern void sched_idle_next(void);
 extern void set_user_nice(task_t *p, long nice);
 extern int task_prio(const task_t *p);
 extern int task_nice(const task_t *p);
+extern int can_nice(const task_t *p, const int nice);
 extern int task_curr(const task_t *p);
 extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
Index: rlimits/kernel/sched.c
===================================================================
--- rlimits.orig/kernel/sched.c	2005-03-02 22:51:08.000000000 -0800
+++ rlimits/kernel/sched.c	2005-03-07 20:23:17.000000000 -0800
@@ -3273,6 +3273,19 @@ out_unlock:
 
 EXPORT_SYMBOL(set_user_nice);
 
+/*
+ * can_nice - check if a task can reduce its nice value
+ * @p: task
+ * @nice: nice value
+ */
+int can_nice(const task_t *p, const int nice)
+{
+	/* convert nice value [19,-20] to rlimit style value [0,39] */
+	int nice_rlim = 19 - nice;
+	return (nice_rlim <= p->signal->rlim[RLIMIT_NICE].rlim_cur || 
+		capable(CAP_SYS_NICE));
+}
+
 #ifdef __ARCH_WANT_SYS_NICE
 
 /*
@@ -3292,12 +3305,8 @@ asmlinkage long sys_nice(int increment)
 	 * We don't have to worry. Conceptually one call occurs first
 	 * and we have a single winner.
 	 */
-	if (increment < 0) {
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		if (increment < -40)
-			increment = -40;
-	}
+	if (increment < -40)
+		increment = -40;
 	if (increment > 40)
 		increment = 40;
 
@@ -3307,6 +3316,9 @@ asmlinkage long sys_nice(int increment)
 	if (nice > 19)
 		nice = 19;
 
+	if (increment < 0 && !can_nice(current, nice))
+		return -EPERM;
+
 	retval = security_task_setnice(current, nice);
 	if (retval)
 		return retval;
@@ -3422,6 +3434,7 @@ recheck:
 		return -EINVAL;
 
 	if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
+	    param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur && 
 	    !capable(CAP_SYS_NICE))
 		return -EPERM;
 	if ((current->euid != p->euid) && (current->euid != p->uid) &&
Index: rlimits/kernel/sys.c
===================================================================
--- rlimits.orig/kernel/sys.c	2005-03-02 22:51:07.000000000 -0800
+++ rlimits/kernel/sys.c	2005-03-07 20:18:30.000000000 -0800
@@ -225,7 +225,7 @@ static int set_one_prio(struct task_stru
 		error = -EPERM;
 		goto out;
 	}
-	if (niceval < task_nice(p) && !capable(CAP_SYS_NICE)) {
+	if (niceval < task_nice(p) && !can_nice(p, niceval)) {
 		error = -EACCES;
 		goto out;
 	}
Index: rlimits/include/asm-generic/resource.h
===================================================================
--- rlimits.orig/include/asm-generic/resource.h	2005-03-02 18:30:27.000000000 -0800
+++ rlimits/include/asm-generic/resource.h	2005-03-07 20:21:04.000000000 -0800
@@ -20,8 +20,10 @@
 #define RLIMIT_LOCKS		10	/* maximum file locks held */
 #define RLIMIT_SIGPENDING	11	/* max number of pending signals */
 #define RLIMIT_MSGQUEUE		12	/* maximum bytes in POSIX mqueues */
-
-#define RLIM_NLIMITS		13
+#define RLIMIT_NICE	13		/* max nice prio allowed to raise to
+					   0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO	14		/* maximum realtime priority */
+#define RLIM_NLIMITS		15
 #endif
 
 /*
@@ -53,6 +55,8 @@
 	[RLIMIT_LOCKS]		= { RLIM_INFINITY, RLIM_INFINITY },	\
 	[RLIMIT_SIGPENDING]	= { MAX_SIGPENDING, MAX_SIGPENDING },	\
 	[RLIMIT_MSGQUEUE]	= { MQ_BYTES_MAX, MQ_BYTES_MAX },	\
+	[RLIMIT_NICE]		= { 0, 0 }, \
+	[RLIMIT_RTPRIO]		= { 0, 0 }, \
 }
 
 #endif	/* __KERNEL__ */


-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:28           ` Andrew Morton
@ 2005-03-08  4:32             ` Christoph Hellwig
  2005-03-08  4:47               ` Matt Mackall
  2005-03-08 18:55               ` Lee Revell
  0 siblings, 2 replies; 266+ messages in thread
From: Christoph Hellwig @ 2005-03-08  4:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, hch, paul, mpm, joq, cfriesen, chrisw, rlrevell,
	arjanv, alan, linux-kernel

On Mon, Mar 07, 2005 at 08:28:21PM -0800, Andrew Morton wrote:
> > please describe this "very simple and very real-world problem" in simple
> > terms. Lets make sure "problem" and "solution" didnt become detached.
> > 
> 
> Well others can do that better than I but I'd describe it as
> 
> - Audio apps need to meet their realtime requirements
> 
> - The way to implement that is to give them !SCHED_OTHER and mlockall
>   capabilities.
> 
> - But they don't want to run as root.

Which all fits very nicely with MEMLOCK rlimit and a tiny wrapper
that sets !SCHED_OTHER and execs the audio app..

and as I mentioned a few times if we really want to go for a magic
uid/gid-based approach we should at least have one that's useable for
all capabilities so it can replace the oracle hack aswell.  But the
proponents of the patch weren't iterested to invest the tiniest bit
of work over what they submited.

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:22         ` Ingo Molnar
@ 2005-03-08  4:28           ` Andrew Morton
  2005-03-08  4:32             ` Christoph Hellwig
  2005-03-08  5:19           ` Jack O'Quin
  1 sibling, 1 reply; 266+ messages in thread
From: Andrew Morton @ 2005-03-08  4:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: hch, paul, mpm, joq, cfriesen, chrisw, rlrevell, arjanv, alan,
	linux-kernel

Ingo Molnar <mingo@elte.hu> wrote:
>
> 
> * Andrew Morton <akpm@osdl.org> wrote:
> 
> > > next we
> > > $CAPABILITY for $FOO and we're headed straight to interface-hell.
> > 
> > "interface hell"?  Wow.
> > 
> > Still.  It seems to be what we deserve if all that fancy stuff we have
> > cannot address this very simple and very real-world problem.
> 
> please describe this "very simple and very real-world problem" in simple
> terms. Lets make sure "problem" and "solution" didnt become detached.
> 

Well others can do that better than I but I'd describe it as

- Audio apps need to meet their realtime requirements

- The way to implement that is to give them !SCHED_OTHER and mlockall
  capabilities.

- But they don't want to run as root.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  4:16       ` Andrew Morton
@ 2005-03-08  4:22         ` Ingo Molnar
  2005-03-08  4:28           ` Andrew Morton
  2005-03-08  5:19           ` Jack O'Quin
  0 siblings, 2 replies; 266+ messages in thread
From: Ingo Molnar @ 2005-03-08  4:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, paul, mpm, joq, cfriesen, chrisw, rlrevell,
	arjanv, alan, linux-kernel


* Andrew Morton <akpm@osdl.org> wrote:

> > next we
> > $CAPABILITY for $FOO and we're headed straight to interface-hell.
> 
> "interface hell"?  Wow.
> 
> Still.  It seems to be what we deserve if all that fancy stuff we have
> cannot address this very simple and very real-world problem.

please describe this "very simple and very real-world problem" in simple
terms. Lets make sure "problem" and "solution" didnt become detached.

	Ingo

^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  3:55     ` Christoph Hellwig
@ 2005-03-08  4:16       ` Andrew Morton
  2005-03-08  4:22         ` Ingo Molnar
  0 siblings, 1 reply; 266+ messages in thread
From: Andrew Morton @ 2005-03-08  4:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: paul, mpm, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo,
	alan, linux-kernel

Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> > 
> > So I still have the rt-lsm patch floating about, saying "merge me, merge
> > me!".  I'm not sure that the world would end were I to do so.
> > 
> > Consider this a prod in the direction of those who were pushing
> > alternatives ;)
> 
> It's still a really bad idea.

It solves a real problem and is well encapsulated.  The world won't end if
we merge it.

Still.  My point is: we're still awaiting anything better and thei is just
hanging around and hanging around.

>  You let the magic gid for oracle hugetlb
> patch go in with that reasonsing

Which continues to cause zero problems.

> now ew have relatime-lsm,

Not yet.

> next we
> $CAPABILITY for $FOO and we're headed straight to interface-hell.

"interface hell"?  Wow.

Still.  It seems to be what we deserve if all that fancy stuff we have
cannot address this very simple and very real-world problem.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-03-08  3:50   ` Andrew Morton
@ 2005-03-08  3:55     ` Christoph Hellwig
  2005-03-08  4:16       ` Andrew Morton
  2005-03-08  4:33     ` Matt Mackall
  1 sibling, 1 reply; 266+ messages in thread
From: Christoph Hellwig @ 2005-03-08  3:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Davis, mpm, joq, cfriesen, chrisw, hch, rlrevell, arjanv,
	mingo, alan, linux-kernel

On Mon, Mar 07, 2005 at 07:50:20PM -0800, Andrew Morton wrote:
> 
> So I still have the rt-lsm patch floating about, saying "merge me, merge
> me!".  I'm not sure that the world would end were I to do so.
> 
> Consider this a prod in the direction of those who were pushing
> alternatives ;)

It's still a really bad idea.  You let the magic gid for oracle hugetlb
patch go in with that reasonsing, now ew have relatime-lsm, next we
$CAPABILITY for $FOO and we're headed straight to interface-hell.


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
  2005-01-12 21:16 ` Paul Davis
@ 2005-03-08  3:50   ` Andrew Morton
  2005-03-08  3:55     ` Christoph Hellwig
  2005-03-08  4:33     ` Matt Mackall
  0 siblings, 2 replies; 266+ messages in thread
From: Andrew Morton @ 2005-03-08  3:50 UTC (permalink / raw)
  To: Paul Davis
  Cc: mpm, joq, cfriesen, chrisw, hch, rlrevell, arjanv, mingo, alan,
	linux-kernel


So I still have the rt-lsm patch floating about, saying "merge me, merge
me!".  I'm not sure that the world would end were I to do so.

Consider this a prod in the direction of those who were pushing
alternatives ;)


^ permalink raw reply	[flat|nested] 266+ messages in thread

* Re: [PATCH] [request for inclusion] Realtime LSM
       [not found] <20050112185258.GG2940@waste.org>
@ 2005-01-12 21:16 ` Paul Davis
  2005-03-08  3:50   ` Andrew Morton
  0 siblings, 1 reply; 266+ messages in thread
From: Paul Davis @ 2005-01-12 21:16 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Jack O'Quin, Chris Friesen, Chris Wright, Christoph Hellwig,
	Andrew Morton, Lee Revell, arjanv, mingo, alan, linux-kernel

>What I find offensive is you repeatedly telling me I'm naive, when
>I've actually written a proper RT kernel AND run a music production
>company.

But you are being naive Matt! Nobody claims that OSX is hard-RT, or
even that OS9 is hard-RT, but people are able to Get Work Done on
those systems without jumping through hoops or arguing with kernel
developers. This happens because (a) the OS is enough-RT and (b) the
developers in question accepted the requirements of DAW and sequencer
users as totally legitimate.

As I stated before, we already *have* Linux kernels whose performance
in the RT area is at least as good as OSX (thanks to Andrew and Ingo,
primarily), but users cannot access these facilities without doing a
song and dance and a download or two. This is the issue that requires
fixing, from our perspective.

--p


^ permalink raw reply	[flat|nested] 266+ messages in thread

end of thread, other threads:[~2005-03-10 21:20 UTC | newest]

Thread overview: 266+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-30  2:43 [PATCH] [request for inclusion] Realtime LSM Lee Revell
2005-01-03 14:03 ` Christoph Hellwig
2005-01-03 14:15   ` Arjan van de Ven
2005-01-07 16:40     ` Lee Revell
2005-01-04 18:16   ` Lee Revell
2005-01-04 18:20     ` Christoph Hellwig
2005-01-04 18:55       ` Jack O'Quin
2005-01-04 18:59         ` Lee Revell
2005-01-05  0:01           ` Alan Cox
2005-01-05  1:28             ` Lee Revell
2005-01-05  1:30             ` Lee Revell
2005-01-05  1:50             ` Chris Wright
2005-01-05  1:55               ` Lee Revell
2005-01-05  2:05                 ` Chris Wright
2005-01-05  2:58                   ` Kyle Moffett
2005-01-05  3:45                     ` Chris Wright
2005-01-05  4:06                   ` Jack O'Quin
2005-01-05 11:52                 ` Ingo Molnar
2005-01-05 15:19                   ` Lee Revell
2005-01-05 15:21                   ` Lee Revell
2005-01-07 12:56                     ` Paul Davis
2005-01-07 13:04                       ` Christoph Hellwig
2005-01-07 14:16                         ` Paul Davis
2005-01-07 14:26                           ` Arjan van de Ven
2005-01-07 14:38                             ` Paul Davis
2005-01-07 14:42                               ` Arjan van de Ven
2005-01-07 15:27                                 ` Paul Davis
2005-01-07 15:33                                   ` Arjan van de Ven
2005-01-07 15:41                                     ` Paul Davis
2005-01-07 16:03                                       ` Arjan van de Ven
2005-01-07 16:20                                         ` Takashi Iwai
2005-01-08  5:36                                           ` Con Kolivas
2005-01-08  6:21                                             ` Jack O'Quin
2005-01-07 16:20                                         ` Paul Davis
2005-01-07 21:12                                           ` Lee Revell
2005-01-07 21:49                                             ` Andrew Morton
2005-01-07 22:07                                               ` Valdis.Kletnieks
2005-01-07 22:36                                                 ` Chris Wright
2005-01-07 23:01                                                   ` Valdis.Kletnieks
2005-01-07 23:20                                                     ` Andrew Morton
2005-01-07 23:34                                                       ` Valdis.Kletnieks
2005-01-10 21:05                                                       ` Matt Mackall
2005-01-07 22:10                                               ` Christoph Hellwig
2005-01-07 22:26                                                 ` Paul Davis
2005-01-07 22:29                                                 ` Chris Wright
2005-01-08  6:12                                                   ` Jack O'Quin
2005-01-08 16:56                                                     ` ross
2005-01-08 18:25                                                       ` Christoph Hellwig
2005-01-08 22:20                                                       ` Lee Revell
2005-01-08 22:27                                                         ` Andreas Steinmetz
2005-01-08 22:14                                                     ` Lee Revell
2005-01-10 21:20                                                     ` Matt Mackall
2005-01-11 13:05                                                       ` Paul Davis
2005-01-11 16:28                                                         ` Jack O'Quin
2005-01-11 18:59                                                           ` Matt Mackall
2005-01-11 20:47                                                           ` utz lehmann
2005-01-11 21:07                                                           ` Lee Revell
2005-01-11 19:17                                                         ` Matt Mackall
2005-01-11 19:42                                                           ` Jack O'Quin
2005-01-11 20:50                                                           ` Chris Wright
2005-01-11 20:58                                                             ` Ingo Molnar
2005-01-11 21:14                                                               ` Chris Wright
2005-01-11 21:27                                                                 ` Ingo Molnar
2005-01-11 22:13                                                                   ` Chris Wright
2005-01-11 22:26                                                                     ` Con Kolivas
2005-01-12  3:21                                                                   ` Jack O'Quin
2005-01-12  4:29                                                                     ` Chris Wright
2005-01-13  5:44                                                                   ` Jack O'Quin
2005-01-13  6:34                                                                     ` Matt Mackall
2005-01-13 19:17                                                                       ` Jack O'Quin
2005-01-14 20:52                                                                         ` Lee Revell
2005-01-15  0:42                                                                           ` Jack O'Quin
2005-01-15  2:19                                                                             ` Randy.Dunlap
2005-01-15  4:06                                                                               ` Jack O'Quin
2005-01-15 13:49                                                                     ` Ingo Molnar
2005-01-15 23:02                                                                       ` Jack O'Quin
2005-01-15 23:38                                                                         ` Jack O'Quin
2005-01-16 23:13                                                                           ` Ingo Molnar
2005-01-16 23:57                                                                             ` Jack O'Quin
2005-01-17  9:17                                                                               ` Sytse Wielinga
2005-01-17 14:36                                                                                 ` Ingo Molnar
2005-01-17 10:06                                                                               ` Ingo Molnar
2005-01-18  5:02                                                                                 ` Jack O'Quin
2005-01-18  8:02                                                                                   ` Ingo Molnar
2005-01-18 17:05                                                                                     ` Jack O'Quin
2005-01-19  8:24                                                                                       ` Ingo Molnar
2005-01-19 14:39                                                                                         ` Ingo Molnar
2005-01-19 17:45                                                                                           ` Jack O'Quin
2005-01-19 18:32                                                                                             ` Matt Mackall
2005-01-20  8:07                                                                                               ` Ingo Molnar
2005-01-20  8:05                                                                                             ` Ingo Molnar
2005-01-11 14:30                                                       ` Jack O'Quin
2005-01-11 19:50                                                         ` Matt Mackall
2005-01-11 19:57                                                           ` Jack O'Quin
2005-01-11 20:05                                                             ` Matt Mackall
2005-01-11 20:29                                                               ` Lee Revell
2005-01-11 20:47                                                                 ` Chris Wright
2005-01-11 21:10                                                                   ` Lee Revell
2005-01-11 21:20                                                                     ` Chris Wright
2005-01-11 21:28                                                                   ` Matt Mackall
2005-01-11 21:38                                                                     ` Lee Revell
2005-01-11 21:41                                                                       ` Arjan van de Ven
2005-01-11 22:51                                                                         ` Paul Davis
2005-01-11 23:05                                                                           ` Chris Wright
2005-01-12  1:43                                                                             ` Jack O'Quin
2005-01-12  7:49                                                                               ` Arjan van de Ven
2005-01-12 21:12                                                                                 ` Lee Revell
2005-01-13  0:44                                                                                 ` Jack O'Quin
2005-01-13  7:28                                                                                   ` Arjan van de Ven
2005-01-13 21:04                                                                                     ` Jack O'Quin
2005-01-13 21:07                                                                                       ` Arjan van de Ven
2005-01-13 21:25                                                                                         ` Lee Revell
2005-01-13 21:43                                                                                           ` Arjan van de Ven
2005-01-13 23:31                                                                                             ` Jack O'Quin
2005-01-14  0:33                                                                                               ` Chris Wright
2005-01-14  0:50                                                                                               ` Con Kolivas
2005-01-14  1:20                                                                                                 ` Matt Mackall
2005-01-14  1:27                                                                                                   ` Con Kolivas
2005-01-14 17:20                                                                                               ` Mike Galbraith
2005-01-15  1:14                                                                                                 ` Jack O'Quin
2005-01-15  8:06                                                                                                   ` Mike Galbraith
2005-01-15 23:48                                                                                                     ` Jack O'Quin
2005-01-14  2:05                                                                                           ` utz lehmann
2005-01-14  2:08                                                                                             ` Con Kolivas
2005-01-14  2:23                                                                                               ` Andrew Morton
2005-01-14  2:35                                                                                               ` utz lehmann
2005-01-14  2:42                                                                                                 ` Con Kolivas
2005-01-14  3:20                                                                                                   ` Andrew Morton
2005-01-14  3:28                                                                                                     ` utz lehmann
2005-01-14  3:26                                                                                                   ` utz lehmann
2005-01-14  2:24                                                                                             ` Nick Piggin
2005-01-14  2:40                                                                                               ` Paul Davis
2005-01-14  2:57                                                                                                 ` Nick Piggin
2005-01-14  3:12                                                                                                 ` Andrew Morton
2005-01-14  3:18                                                                                                   ` Con Kolivas
2005-01-14  3:30                                                                                                     ` Paul Davis
2005-01-14  3:38                                                                                                       ` Con Kolivas
2005-01-14  3:51                                                                                                         ` Paul Davis
2005-01-14  4:00                                                                                                           ` Con Kolivas
2005-01-14  4:16                                                                                                             ` Nick Piggin
2005-01-14  4:04                                                                                                         ` Nick Piggin
2005-01-14  3:31                                                                                                     ` Nick Piggin
2005-01-14  3:34                                                                                                       ` Paul Davis
2005-01-14  4:11                                                                                                       ` Con Kolivas
2005-01-14  4:23                                                                                                         ` Nick Piggin
2005-01-14  4:45                                                                                                           ` Paul Davis
2005-01-14  5:14                                                                                                             ` Nick Piggin
2005-01-14  9:21                                                                                                       ` Will Dyson
2005-01-14  9:54                                                                                                         ` Nick Piggin
2005-01-14  6:57                                                                                                   ` Matt Mackall
2005-01-14  7:04                                                                                                     ` Andrew Morton
2005-01-14  7:55                                                                                                       ` Chris Wright
2005-01-14 20:10                                                                                                     ` Chris Wright
2005-01-14 20:55                                                                                                       ` Matt Mackall
2005-01-14 23:04                                                                                                         ` Chris Wright
2005-01-15  0:58                                                                                                           ` Matt Mackall
2005-01-11 22:05                                                                       ` Matt Mackall
2005-01-11 21:42                                                                     ` Chris Wright
2005-01-11 22:16                                                                       ` Matt Mackall
2005-01-11 22:21                                                                         ` Chris Wright
2005-01-11 22:36                                                                           ` utz lehmann
2005-01-11 22:41                                                                             ` Chris Wright
2005-01-11 22:17                                                                     ` utz
2005-01-11 22:48                                                                     ` Paul Davis
2005-01-11 23:06                                                                       ` Matt Mackall
2005-01-12  2:13                                                                         ` Paul Davis
2005-01-12 19:09                                                                           ` Matt Mackall
2005-01-12 21:25                                                                             ` Lee Revell
2005-01-11 20:19                                                             ` Chris Friesen
2005-01-11 22:45                                                           ` Paul Davis
2005-01-11 21:21                                                     ` Ingo Molnar
2005-01-12  2:10                                                       ` Jack O'Quin
2005-01-15  4:56                                                       ` Jack O'Quin
2005-01-15 14:43                                                         ` Ingo Molnar
2005-01-15 23:10                                                           ` Jack O'Quin
2005-01-16  1:48                                                             ` Jack O'Quin
2005-01-16  4:30                                                               ` Jack O'Quin
2005-01-16 23:22                                                                 ` Ingo Molnar
2005-01-07 23:00                                                 ` Lee Revell
2005-01-07 22:22                                               ` Paul Davis
2005-01-07 22:44                                               ` Andreas Steinmetz
2005-01-07 16:03                                       ` Martin Mares
2005-01-07 16:22                                         ` Paul Davis
2005-01-08 13:04                                           ` Paul Jakma
2005-01-07 14:47                               ` Christoph Hellwig
2005-01-07 15:26                                 ` Paul Davis
2005-01-07 16:08                                   ` Martin Mares
2005-01-07 16:14                                     ` Paul Davis
2005-01-07 16:29                                       ` Martin Mares
2005-01-07 16:36                                         ` Paul Davis
2005-01-07 17:06                                           ` Martin Mares
2005-01-07 17:29                                             ` Chris Wright
2005-01-07 17:32                                               ` Martin Mares
2005-01-07 17:38                                                 ` Chris Wright
2005-01-07 19:55                                                 ` Jack O'Quin
2005-01-07 16:37                                         ` Takashi Iwai
2005-01-07 16:41                                           ` Martin Mares
2005-01-07 17:53                                   ` Chris Wright
2005-01-07 18:01                             ` Chris Wright
2005-01-05 18:18                   ` Jack O'Quin
2005-01-05  4:04             ` Jack O'Quin
2005-01-05 11:25           ` Christoph Hellwig
2005-01-05 17:32             ` Lee Revell
2005-01-05 19:11               ` Christoph Hellwig
2005-01-05 11:20         ` Christoph Hellwig
2005-01-04 18:57       ` Lee Revell
2005-01-05  1:35         ` Andreas Steinmetz
2005-01-05  4:18           ` Alan Cox
2005-01-05  5:50             ` Andrew Morton
2005-01-05 12:06               ` Herbert Poetzl
2005-01-07  1:13                 ` Matt Mackall
2005-01-07  1:55                   ` Alan Cox
2005-01-07 20:05                     ` Matt Mackall
2005-01-05 20:09               ` Olaf Dietsche
2005-01-07  1:18             ` Matt Mackall
2005-01-07  2:36               ` Lee Revell
2005-01-07  5:54               ` Jack O'Quin
2005-01-07 20:02                 ` Matt Mackall
2005-01-07 20:21                   ` Chris Wright
2005-01-07 20:27                   ` Jack O'Quin
2005-01-07 20:46                     ` Matt Mackall
2005-01-07 20:55                       ` Lee Revell
2005-01-07 21:20                         ` Matt Mackall
2005-01-07 21:29                           ` Chris Wright
2005-01-07 20:45                   ` Lee Revell
2005-01-05 11:39           ` Christoph Hellwig
2005-01-05 17:35             ` Lee Revell
2005-01-05 19:11               ` Christoph Hellwig
2005-01-05 11:24         ` Christoph Hellwig
     [not found] <20050112185258.GG2940@waste.org>
2005-01-12 21:16 ` Paul Davis
2005-03-08  3:50   ` Andrew Morton
2005-03-08  3:55     ` Christoph Hellwig
2005-03-08  4:16       ` Andrew Morton
2005-03-08  4:22         ` Ingo Molnar
2005-03-08  4:28           ` Andrew Morton
2005-03-08  4:32             ` Christoph Hellwig
2005-03-08  4:47               ` Matt Mackall
2005-03-08  4:58                 ` Chris Wright
2005-03-08 18:55               ` Lee Revell
2005-03-08 19:11                 ` Paul Davis
2005-03-08 20:29                   ` Andrew Morton
2005-03-08 21:20                 ` Christoph Hellwig
2005-03-08 21:34                   ` Lee Revell
2005-03-08 23:55                     ` James Morris
2005-03-08  5:19           ` Jack O'Quin
2005-03-08  4:33     ` Matt Mackall
2005-03-08  4:40       ` Andrew Morton
2005-03-08  5:30         ` Jack O'Quin
2005-03-08  6:33           ` Matt Mackall
2005-03-09  3:39             ` Jack O'Quin
2005-03-09  3:44               ` Matt Mackall
2005-03-09  4:04                 ` Jack O'Quin
2005-03-10 14:01           ` Pavel Machek
2005-03-08  5:40         ` Peter Williams
2005-03-08  5:49           ` Ingo Molnar
2005-03-08  6:28             ` Peter Williams
2005-03-08  6:40               ` Chris Wright
2005-03-08  6:42                 ` Ingo Molnar
2005-03-08  6:00           ` Chris Wright
2005-03-08  6:18           ` Matt Mackall
2005-03-08  5:38       ` Ingo Molnar
2005-03-08  6:45       ` Chris Wright
2005-03-08  6:49         ` Matt Mackall
2005-03-08  6:55       ` Andrew Morton
2005-03-08  8:45         ` Matt Mackall
2005-03-08 19:17       ` utz lehmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).