linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: 2.6.28, rlimits, performance and debian etch
@ 2009-02-26 21:48 Frans Pop
  2009-02-26 22:01 ` Steve Langasek
  2009-02-27  7:30 ` Peter Palfrader
  0 siblings, 2 replies; 14+ messages in thread
From: Frans Pop @ 2009-02-26 21:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: libpam-modules, debian-admin

Steve Langasek wrote:
> It has already been mentioned that this does not apply to the upcoming
> Debian 5.0 release (lenny); this patch is only present in the 4.0
> release (etch), it was actually fixed in the development series to not
> use RLIM_INFINITY *because* previous kernels didn't support this and
> would cause pam_limits to throw log warnings.

I've just migrated my home servers from Debian etch to lenny and bind9 now 
gives me this:

named[17207]: max open files (1024) is smaller than max sockets (4096)

Redhat's BTS [1] tells me this is a kernel issue thatshould be solved [2] 
by the "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" patch in 
2.6.28.

One of my servers (DNS slave) is running the standard Debian 2.6.26 
kernel, the other (DNS master) is running 2.6.29-rc6 so that does include 
that patch. But both show the error!

I'd appreciate your input where to take this.

Cheers,
FJP

[1] https://bugzilla.redhat.com/show_bug.cgi?id=477540
[2] https://bugzilla.redhat.com/show_bug.cgi?id=461458

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-02-26 21:48 2.6.28, rlimits, performance and debian etch Frans Pop
@ 2009-02-26 22:01 ` Steve Langasek
  2009-02-27  7:30 ` Peter Palfrader
  1 sibling, 0 replies; 14+ messages in thread
From: Steve Langasek @ 2009-02-26 22:01 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel, libpam-modules, debian-admin

On Thu, Feb 26, 2009 at 10:48:25PM +0100, Frans Pop wrote:
> Steve Langasek wrote:
> > It has already been mentioned that this does not apply to the upcoming
> > Debian 5.0 release (lenny); this patch is only present in the 4.0
> > release (etch), it was actually fixed in the development series to not
> > use RLIM_INFINITY *because* previous kernels didn't support this and
> > would cause pam_limits to throw log warnings.

> I've just migrated my home servers from Debian etch to lenny and bind9 now 
> gives me this:

> named[17207]: max open files (1024) is smaller than max sockets (4096)

That's a separate issue; Debian bug #515673.  I intend to solve this for
Debian 5.0.1.

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-02-26 21:48 2.6.28, rlimits, performance and debian etch Frans Pop
  2009-02-26 22:01 ` Steve Langasek
@ 2009-02-27  7:30 ` Peter Palfrader
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Palfrader @ 2009-02-27  7:30 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel, libpam-modules, debian-admin

On Thu, 26 Feb 2009, Frans Pop wrote:

> Steve Langasek wrote:
> > It has already been mentioned that this does not apply to the upcoming
> > Debian 5.0 release (lenny); this patch is only present in the 4.0
> > release (etch), it was actually fixed in the development series to not
> > use RLIM_INFINITY *because* previous kernels didn't support this and
> > would cause pam_limits to throw log warnings.
> 
> I've just migrated my home servers from Debian etch to lenny and bind9 now 
> gives me this:
> 
> named[17207]: max open files (1024) is smaller than max sockets (4096)
> 
> Redhat's BTS [1] tells me this is a kernel issue thatshould be solved [2] 
> by the "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" patch in 
> 2.6.28.
> 
> One of my servers (DNS slave) is running the standard Debian 2.6.26 
> kernel, the other (DNS master) is running 2.6.29-rc6 so that does include 
> that patch. But both show the error!

That patch has been reverted for 2.6.29 and 2.6.28.(5 or 6 or so).

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-29 12:19   ` Adam Tkac
  2009-01-29 18:05     ` Andrew Morton
@ 2009-02-08 22:31     ` Steve Langasek
  1 sibling, 0 replies; 14+ messages in thread
From: Steve Langasek @ 2009-02-08 22:31 UTC (permalink / raw)
  To: Adam Tkac
  Cc: Andrew Morton, Peter Palfrader, linux-kernel, debian-admin, team,
	libpam-modules, stable

On Thu, Jan 29, 2009 at 01:19:00PM +0100, Adam Tkac wrote:
> Hi all,

> I don't think the "rlim_infinity" patch should be reverted. Let me try
> to explain why.

> First, code which sets limits to RLIM_INFINITY is very bad idea and
> that code is Debian specific. I downloaded original pam 0.79 (stripped):

> for(i = 0; i < RLIM_NLIMITS; i++) {
> 	...
> 	int r = getrlimit(i, &pl->limits[i].limit);
> 	...

> as you can see original pam sets limits to inherited defaults. After
> code written above Debian adds their patch:

> if (limits_not_defined_in_limits_conf) {
> 	...
> 	case RLIMIT_NOFILE:
> 	...
> 		pl->limits[i].limit.rlim_cur = RLIM_INFINITY;
> 		pl->limits[i].limit.rlim_max = RLIM_INFINITY;
> 	...
> }

> so as you can see inherited default limits are overriden to infinity.
> In my opinion Debian should revert their patch which is, at least,
> pretty incorrect and unsecure.

It has already been mentioned that this does not apply to the upcoming
Debian 5.0 release (lenny); this patch is only present in the 4.0 release
(etch), it was actually fixed in the development series to not use
RLIM_INFINITY *because* previous kernels didn't support this and would cause
pam_limits to throw log warnings.

Nevertheless, this is not "incorrect and [in]secure."  Raising the default
limits to their maximum values in Debian's Linux-PAM is a deliberate
decision to ensure that the limits applied when su'ing from another user are
the same as when logging in directly as that user, without requiring full
specification of all limits in limits.conf for all users.  It happens to
fail in this case because the default set by the kernel for all users is
different from the maximum allowed by the kernel; but if setting NOFILE to
RLIM_INFINITY isn't a sane thing to do, why does the kernel allow it?  Note
that reverting the Debian PAM change doesn't prevent a local admin from
accidentally triggering this problem because they want to unlimit one or
more of their users.

The purpose of pam_limits is to enable admins to easily apply resource
policies, not to dictate what the set of valid policies is.  If
RLIM_INFINITY is not a sane value for RLIMIT_NOFILE, the kernel should
disallow it.

> Next argument is POSIX compatibility (from setrlimit() specification):

> "The value RLIM_INFINITY, defined in <sys/resource.h>, shall be
> considered to be larger than any other limit value. If a call to
> getrlimit() returns RLIM_INFINITY for a resource, it means the
> implementation shall not enforce limits on that resource. Specifying
> RLIM_INFINITY as any resource limit value on a successful call to
> setrlimit() shall inhibit enforcement of that resource limit."

> So kernel does what is expected. If you want "unlimited" number of
> descriptors, you have it.

This is not a POSIX compliance issue.  POSIX does not require that
setrlimit(RLIMIT_NOFILE, RLIM_INFINITY) *succeeds*; it only requires that,
*if* it succeeds, it has the described semantics.

Cheers,
-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-29 18:05     ` Andrew Morton
  2009-01-29 18:10       ` Peter Palfrader
@ 2009-02-02 16:20       ` Adam Tkac
  1 sibling, 0 replies; 14+ messages in thread
From: Adam Tkac @ 2009-02-02 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Palfrader, linux-kernel, debian-admin, team,
	libpam-modules, stable

On Thu, Jan 29, 2009 at 10:05:09AM -0800, Andrew Morton wrote:
> On Thu, 29 Jan 2009 13:19:00 +0100 Adam Tkac <vonsch@gmail.com> wrote:
> 
> > 
> > Hi all,
> > 
> > I don't think the "rlim_infinity" patch should be reverted. Let me try
> > to explain why.
> > 
> > First, code which sets limits to RLIM_INFINITY is very bad idea and
> > that code is Debian specific. I downloaded original pam 0.79 (stripped):
> > 
> > for(i = 0; i < RLIM_NLIMITS; i++) {
> > 	...
> > 	int r = getrlimit(i, &pl->limits[i].limit);
> > 	...
> > 
> > as you can see original pam sets limits to inherited defaults. After
> > code written above Debian adds their patch:
> > 
> > if (limits_not_defined_in_limits_conf) {
> > 	...
> > 	case RLIMIT_NOFILE:
> > 	...
> > 		pl->limits[i].limit.rlim_cur = RLIM_INFINITY;
> > 		pl->limits[i].limit.rlim_max = RLIM_INFINITY;
> > 	...
> > }
> > 
> > so as you can see inherited default limits are overriden to infinity.
> > In my opinion Debian should revert their patch which is, at least,
> > pretty incorrect and unsecure.
> > 
> > Next argument is POSIX compatibility (from setrlimit() specification):
> > 
> > "The value RLIM_INFINITY, defined in <sys/resource.h>, shall be
> > considered to be larger than any other limit value. If a call to
> > getrlimit() returns RLIM_INFINITY for a resource, it means the
> > implementation shall not enforce limits on that resource. Specifying
> > RLIM_INFINITY as any resource limit value on a successful call to
> > setrlimit() shall inhibit enforcement of that resource limit."
> > 
> > So kernel does what is expected. If you want "unlimited" number of
> > descriptors, you have it.
> > 
> > Please consider again where exactly problem is, if in Debian patch or
> > in kernel patch. From my point of view Debian patch should be
> > reverted, not the kernel one.
> > 
> 
> Sure, debian might well be wrong.  But the bottom line is that the
> kernel changed, and people's machines broke.
> 
> If the kernel change was really really important then we might just
> grit our teeth and live with the breakage.  But this change _wasn't_ a
> terribly important one.  So I think we should back it out while we find
> another way of implementing it which does not break currently deployed
> installations.

Well, if we should fix Debian's problem I think better would be to set
limit of files to reasonable number instead of sysctl_nr_open (for
example 4096 or 8192).

Adam

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-29 18:05     ` Andrew Morton
@ 2009-01-29 18:10       ` Peter Palfrader
  2009-02-02 16:20       ` Adam Tkac
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Palfrader @ 2009-01-29 18:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Adam Tkac, linux-kernel, debian-admin, team, libpam-modules, stable

On Thu, 29 Jan 2009, Andrew Morton wrote:

> Sure, debian might well be wrong.  But the bottom line is that the
> kernel changed, and people's machines broke.
> 
> If the kernel change was really really important then we might just
> grit our teeth and live with the breakage.  But this change _wasn't_ a
> terribly important one.  So I think we should back it out while we find
> another way of implementing it which does not break currently deployed
> installations.

FWIWs, if Debian stable (4.0) is really the only place that breaks with
this then re-introducing it in a couple of months when the new version
of Debian (5.0) has been out for a while might well be an option.

Peter
-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-29 12:19   ` Adam Tkac
@ 2009-01-29 18:05     ` Andrew Morton
  2009-01-29 18:10       ` Peter Palfrader
  2009-02-02 16:20       ` Adam Tkac
  2009-02-08 22:31     ` Steve Langasek
  1 sibling, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2009-01-29 18:05 UTC (permalink / raw)
  To: Adam Tkac
  Cc: Peter Palfrader, linux-kernel, debian-admin, team,
	libpam-modules, stable

On Thu, 29 Jan 2009 13:19:00 +0100 Adam Tkac <vonsch@gmail.com> wrote:

> On Tue, Jan 27, 2009 at 03:17:03PM -0800, Andrew Morton wrote:
> > On Wed, 21 Jan 2009 12:52:19 +0100
> > Peter Palfrader <weasel@debian.org> wrote:
> > 
> > > Hi,
> > > 
> > > I spent several hours trying to get to the bottom of a serious
> > > performance issue that appeared on one of our servers after upgrading to
> > > 2.6.28.  In the end it's what could be considered a userspace bug that
> > > was triggered by a change in 2.6.28.  Since this might also affect other
> > > people I figured I'd at least document what I found here, and maybe we
> > > can even do something about it:
> > > 
> > > 
> > > So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
> > > the team maintaining our ftp archive complained that one of their
> > > scripts that previously ran in a few minutes still hadn't even come
> > > close to being done after an hour or so.  Downgrading to 2.6.27 fixed
> > > that.
> > > 
> > > Turns out that script is forking a lot and something in it or python or
> > > whereever closes all the file descriptors it doesn't want to pass on.
> > > That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
> > > closes them all with a few exceptions.
> > > 
> > > Turns out that takes a long time when your limit -n is now 2^20 (1048576).
> > > 
> > > With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
> > > now a thousand times that.
> > > 
> > > 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
> > > RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
> > > allows, as the title implies, to set the limit for number of files to
> > > infinity.
> > > 
> > > Closer investigation showed that the broken default ulimit did not apply
> > > to "system" processes (like stuff started from init).  In the end I
> > > could establish that all processes that passed through pam_limit at one
> > > point had the bad resource limit.
> > > 
> > > Apparently the pam library in Debian etch (4.0) initializes the limits
> > > to some default values when it doesn't have any settings in limit.conf
> > > to override them.  Turns out that for nofiles this is RLIM_INFINITY.
> > > Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
> > > package version 0.79-5 fixes that - tho I'm not sure what side effects
> > > that has.
> > > 
> > > Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
> > > uses a different pam (version).
> > > 
> > > 
> > > I'm a bit unsure where to go from here.  Maybe the pam library in etch
> > > should be fixed.  Maybe the patch should be reverted (but then it may be
> > > more correct now and that's what the changelog entry suggests).
> > > As a stopgap measure I could also just define nofile in limits.conf.
> > > 
> > > Thanks for listening.  Also thanks to Rik and Nocholas who helped track
> > > some of this down.
> > > 
> > > Cheers,
> > > Peter
> > > 1. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
> > >    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
> > 
> > Ho hum, thanks.
> > 
> > Well, I think we just revert it for now.  We can bring it back later
> > if someone is thus inclined.  Along with some sort of opt-in control,
> > perhaps in /proc.  Which defaults to "off".
> > 
> 
> Hi all,
> 
> I don't think the "rlim_infinity" patch should be reverted. Let me try
> to explain why.
> 
> First, code which sets limits to RLIM_INFINITY is very bad idea and
> that code is Debian specific. I downloaded original pam 0.79 (stripped):
> 
> for(i = 0; i < RLIM_NLIMITS; i++) {
> 	...
> 	int r = getrlimit(i, &pl->limits[i].limit);
> 	...
> 
> as you can see original pam sets limits to inherited defaults. After
> code written above Debian adds their patch:
> 
> if (limits_not_defined_in_limits_conf) {
> 	...
> 	case RLIMIT_NOFILE:
> 	...
> 		pl->limits[i].limit.rlim_cur = RLIM_INFINITY;
> 		pl->limits[i].limit.rlim_max = RLIM_INFINITY;
> 	...
> }
> 
> so as you can see inherited default limits are overriden to infinity.
> In my opinion Debian should revert their patch which is, at least,
> pretty incorrect and unsecure.
> 
> Next argument is POSIX compatibility (from setrlimit() specification):
> 
> "The value RLIM_INFINITY, defined in <sys/resource.h>, shall be
> considered to be larger than any other limit value. If a call to
> getrlimit() returns RLIM_INFINITY for a resource, it means the
> implementation shall not enforce limits on that resource. Specifying
> RLIM_INFINITY as any resource limit value on a successful call to
> setrlimit() shall inhibit enforcement of that resource limit."
> 
> So kernel does what is expected. If you want "unlimited" number of
> descriptors, you have it.
> 
> Please consider again where exactly problem is, if in Debian patch or
> in kernel patch. From my point of view Debian patch should be
> reverted, not the kernel one.
> 

Sure, debian might well be wrong.  But the bottom line is that the
kernel changed, and people's machines broke.

If the kernel change was really really important then we might just
grit our teeth and live with the breakage.  But this change _wasn't_ a
terribly important one.  So I think we should back it out while we find
another way of implementing it which does not break currently deployed
installations.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-27 23:17 ` Andrew Morton
@ 2009-01-29 12:19   ` Adam Tkac
  2009-01-29 18:05     ` Andrew Morton
  2009-02-08 22:31     ` Steve Langasek
  0 siblings, 2 replies; 14+ messages in thread
From: Adam Tkac @ 2009-01-29 12:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Palfrader, linux-kernel, debian-admin, team,
	libpam-modules, stable

On Tue, Jan 27, 2009 at 03:17:03PM -0800, Andrew Morton wrote:
> On Wed, 21 Jan 2009 12:52:19 +0100
> Peter Palfrader <weasel@debian.org> wrote:
> 
> > Hi,
> > 
> > I spent several hours trying to get to the bottom of a serious
> > performance issue that appeared on one of our servers after upgrading to
> > 2.6.28.  In the end it's what could be considered a userspace bug that
> > was triggered by a change in 2.6.28.  Since this might also affect other
> > people I figured I'd at least document what I found here, and maybe we
> > can even do something about it:
> > 
> > 
> > So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
> > the team maintaining our ftp archive complained that one of their
> > scripts that previously ran in a few minutes still hadn't even come
> > close to being done after an hour or so.  Downgrading to 2.6.27 fixed
> > that.
> > 
> > Turns out that script is forking a lot and something in it or python or
> > whereever closes all the file descriptors it doesn't want to pass on.
> > That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
> > closes them all with a few exceptions.
> > 
> > Turns out that takes a long time when your limit -n is now 2^20 (1048576).
> > 
> > With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
> > now a thousand times that.
> > 
> > 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
> > RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
> > allows, as the title implies, to set the limit for number of files to
> > infinity.
> > 
> > Closer investigation showed that the broken default ulimit did not apply
> > to "system" processes (like stuff started from init).  In the end I
> > could establish that all processes that passed through pam_limit at one
> > point had the bad resource limit.
> > 
> > Apparently the pam library in Debian etch (4.0) initializes the limits
> > to some default values when it doesn't have any settings in limit.conf
> > to override them.  Turns out that for nofiles this is RLIM_INFINITY.
> > Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
> > package version 0.79-5 fixes that - tho I'm not sure what side effects
> > that has.
> > 
> > Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
> > uses a different pam (version).
> > 
> > 
> > I'm a bit unsure where to go from here.  Maybe the pam library in etch
> > should be fixed.  Maybe the patch should be reverted (but then it may be
> > more correct now and that's what the changelog entry suggests).
> > As a stopgap measure I could also just define nofile in limits.conf.
> > 
> > Thanks for listening.  Also thanks to Rik and Nocholas who helped track
> > some of this down.
> > 
> > Cheers,
> > Peter
> > 1. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
> >    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
> 
> Ho hum, thanks.
> 
> Well, I think we just revert it for now.  We can bring it back later
> if someone is thus inclined.  Along with some sort of opt-in control,
> perhaps in /proc.  Which defaults to "off".
> 

Hi all,

I don't think the "rlim_infinity" patch should be reverted. Let me try
to explain why.

First, code which sets limits to RLIM_INFINITY is very bad idea and
that code is Debian specific. I downloaded original pam 0.79 (stripped):

for(i = 0; i < RLIM_NLIMITS; i++) {
	...
	int r = getrlimit(i, &pl->limits[i].limit);
	...

as you can see original pam sets limits to inherited defaults. After
code written above Debian adds their patch:

if (limits_not_defined_in_limits_conf) {
	...
	case RLIMIT_NOFILE:
	...
		pl->limits[i].limit.rlim_cur = RLIM_INFINITY;
		pl->limits[i].limit.rlim_max = RLIM_INFINITY;
	...
}

so as you can see inherited default limits are overriden to infinity.
In my opinion Debian should revert their patch which is, at least,
pretty incorrect and unsecure.

Next argument is POSIX compatibility (from setrlimit() specification):

"The value RLIM_INFINITY, defined in <sys/resource.h>, shall be
considered to be larger than any other limit value. If a call to
getrlimit() returns RLIM_INFINITY for a resource, it means the
implementation shall not enforce limits on that resource. Specifying
RLIM_INFINITY as any resource limit value on a successful call to
setrlimit() shall inhibit enforcement of that resource limit."

So kernel does what is expected. If you want "unlimited" number of
descriptors, you have it.

Please consider again where exactly problem is, if in Debian patch or
in kernel patch. From my point of view Debian patch should be
reverted, not the kernel one.

Please add me to CC, I'm not member of the list.

Regards, Adam

___
Adam Tkac

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-21 11:52 Peter Palfrader
  2009-01-23 21:07 ` Florian Weimer
@ 2009-01-27 23:17 ` Andrew Morton
  2009-01-29 12:19   ` Adam Tkac
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2009-01-27 23:17 UTC (permalink / raw)
  To: Peter Palfrader
  Cc: linux-kernel, debian-admin, team, libpam-modules, Adam Tkac, stable

On Wed, 21 Jan 2009 12:52:19 +0100
Peter Palfrader <weasel@debian.org> wrote:

> Hi,
> 
> I spent several hours trying to get to the bottom of a serious
> performance issue that appeared on one of our servers after upgrading to
> 2.6.28.  In the end it's what could be considered a userspace bug that
> was triggered by a change in 2.6.28.  Since this might also affect other
> people I figured I'd at least document what I found here, and maybe we
> can even do something about it:
> 
> 
> So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
> the team maintaining our ftp archive complained that one of their
> scripts that previously ran in a few minutes still hadn't even come
> close to being done after an hour or so.  Downgrading to 2.6.27 fixed
> that.
> 
> Turns out that script is forking a lot and something in it or python or
> whereever closes all the file descriptors it doesn't want to pass on.
> That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
> closes them all with a few exceptions.
> 
> Turns out that takes a long time when your limit -n is now 2^20 (1048576).
> 
> With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
> now a thousand times that.
> 
> 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
> RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
> allows, as the title implies, to set the limit for number of files to
> infinity.
> 
> Closer investigation showed that the broken default ulimit did not apply
> to "system" processes (like stuff started from init).  In the end I
> could establish that all processes that passed through pam_limit at one
> point had the bad resource limit.
> 
> Apparently the pam library in Debian etch (4.0) initializes the limits
> to some default values when it doesn't have any settings in limit.conf
> to override them.  Turns out that for nofiles this is RLIM_INFINITY.
> Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
> package version 0.79-5 fixes that - tho I'm not sure what side effects
> that has.
> 
> Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
> uses a different pam (version).
> 
> 
> I'm a bit unsure where to go from here.  Maybe the pam library in etch
> should be fixed.  Maybe the patch should be reverted (but then it may be
> more correct now and that's what the changelog entry suggests).
> As a stopgap measure I could also just define nofile in limits.conf.
> 
> Thanks for listening.  Also thanks to Rik and Nocholas who helped track
> some of this down.
> 
> Cheers,
> Peter
> 1. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
>    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f

Ho hum, thanks.

Well, I think we just revert it for now.  We can bring it back later
if someone is thus inclined.  Along with some sort of opt-in control,
perhaps in /proc.  Which defaults to "off".




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-23 22:02   ` David Daney
  2009-01-23 23:11     ` Peter Palfrader
@ 2009-01-25 10:59     ` Florian Weimer
  1 sibling, 0 replies; 14+ messages in thread
From: Florian Weimer @ 2009-01-25 10:59 UTC (permalink / raw)
  To: David Daney; +Cc: linux-kernel, DSA, team, libpam-modules

* David Daney:

> One problem is that for values of RLIMIT_NOFILE less than something
> like 4096, it is much faster to call sys_close() on all possible
> values than iterate through a handful of open files from /proc/self/fd
> using opendir(3)/readdir(3).

Really?  Yuck.

> The real solution is to convert your user space programs to use the
> new syscalls that allow for race-free setting of close-on-exec.
> Then you no longer need to mess around with iterating over these
> things.

These system calls are too recent to use in the next two Debian
releases.  In addition, we can't really be sure that all libraries use
the new calls.

I find the design of the CLOEXEC business somewhat revulsive, by the
way.  It reminds me of those DoSomethingEx APIs in another platform.
The *_at system calls are a similar wart.  Even with this stuff, I
still can't safely open a file with a different effective user ID in a
multithreaded application, or create a AF_UNIX socket with specific
file system permissions.  Some thread-specific context with what have
been traditionally considered per-process attributes might have been
better. 8-(

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-23 22:02   ` David Daney
@ 2009-01-23 23:11     ` Peter Palfrader
  2009-01-25 10:59     ` Florian Weimer
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Palfrader @ 2009-01-23 23:11 UTC (permalink / raw)
  To: David Daney; +Cc: Florian Weimer, linux-kernel, DSA, team, libpam-modules

On Fri, 23 Jan 2009, David Daney wrote:

> The real solution is to convert your user space programs to use the new 
> syscalls that allow for race-free setting of close-on-exec.  Then you no 
> longer need to mess around with iterating over these things.

It's python's popen2 implementation that does that for us. At least for
python2.4 and 2.5.

In our particular case moving away from calling external tools and doing
more within the scripts themselves brought a real speedup, but it's
probably not just us :)

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-23 21:07 ` Florian Weimer
@ 2009-01-23 22:02   ` David Daney
  2009-01-23 23:11     ` Peter Palfrader
  2009-01-25 10:59     ` Florian Weimer
  0 siblings, 2 replies; 14+ messages in thread
From: David Daney @ 2009-01-23 22:02 UTC (permalink / raw)
  To: Florian Weimer; +Cc: linux-kernel, DSA, team, libpam-modules

Florian Weimer wrote:
> * Peter Palfrader:
> 
>> Turns out that script is forking a lot and something in it or python or
>> whereever closes all the file descriptors it doesn't want to pass on.
>> That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
>> closes them all with a few exceptions.
>>
>> Turns out that takes a long time when your limit -n is now 2^20 (1048576).
> 
> Interesting.
> 
> Can we make /proc more-or-less mandatory, so that the file descriptor
> list can be retrieved explicitly?

One problem is that for values of RLIMIT_NOFILE less than something like 
4096, it is much faster to call sys_close() on all possible values than 
iterate through a handful of open files from /proc/self/fd using 
opendir(3)/readdir(3).

Obviously for some large values of RLIMIT_NOFILE, this is no longer true.

People who have written code based on measuring the difference end up 
getting screwed when RLIMIT_NOFILE unexpectedly increases.

The real solution is to convert your user space programs to use the new 
syscalls that allow for race-free setting of close-on-exec.  Then you no 
longer need to mess around with iterating over these things.

David Daney

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: 2.6.28, rlimits, performance and debian etch
  2009-01-21 11:52 Peter Palfrader
@ 2009-01-23 21:07 ` Florian Weimer
  2009-01-23 22:02   ` David Daney
  2009-01-27 23:17 ` Andrew Morton
  1 sibling, 1 reply; 14+ messages in thread
From: Florian Weimer @ 2009-01-23 21:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: DSA, team, libpam-modules

* Peter Palfrader:

> Turns out that script is forking a lot and something in it or python or
> whereever closes all the file descriptors it doesn't want to pass on.
> That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
> closes them all with a few exceptions.
>
> Turns out that takes a long time when your limit -n is now 2^20 (1048576).

Interesting.

Can we make /proc more-or-less mandatory, so that the file descriptor
list can be retrieved explicitly?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* 2.6.28, rlimits, performance and debian etch
@ 2009-01-21 11:52 Peter Palfrader
  2009-01-23 21:07 ` Florian Weimer
  2009-01-27 23:17 ` Andrew Morton
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Palfrader @ 2009-01-21 11:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: DSA, team, libpam-modules

Hi,

I spent several hours trying to get to the bottom of a serious
performance issue that appeared on one of our servers after upgrading to
2.6.28.  In the end it's what could be considered a userspace bug that
was triggered by a change in 2.6.28.  Since this might also affect other
people I figured I'd at least document what I found here, and maybe we
can even do something about it:


So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
the team maintaining our ftp archive complained that one of their
scripts that previously ran in a few minutes still hadn't even come
close to being done after an hour or so.  Downgrading to 2.6.27 fixed
that.

Turns out that script is forking a lot and something in it or python or
whereever closes all the file descriptors it doesn't want to pass on.
That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
closes them all with a few exceptions.

Turns out that takes a long time when your limit -n is now 2^20 (1048576).

With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
now a thousand times that.

2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
allows, as the title implies, to set the limit for number of files to
infinity.

Closer investigation showed that the broken default ulimit did not apply
to "system" processes (like stuff started from init).  In the end I
could establish that all processes that passed through pam_limit at one
point had the bad resource limit.

Apparently the pam library in Debian etch (4.0) initializes the limits
to some default values when it doesn't have any settings in limit.conf
to override them.  Turns out that for nofiles this is RLIM_INFINITY.
Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
package version 0.79-5 fixes that - tho I'm not sure what side effects
that has.

Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
uses a different pam (version).


I'm a bit unsure where to go from here.  Maybe the pam library in etch
should be fixed.  Maybe the patch should be reverted (but then it may be
more correct now and that's what the changelog entry suggests).
As a stopgap measure I could also just define nofile in limits.conf.

Thanks for listening.  Also thanks to Rik and Nocholas who helped track
some of this down.

Cheers,
Peter
1. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-02-27  7:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-26 21:48 2.6.28, rlimits, performance and debian etch Frans Pop
2009-02-26 22:01 ` Steve Langasek
2009-02-27  7:30 ` Peter Palfrader
  -- strict thread matches above, loose matches on Subject: below --
2009-01-21 11:52 Peter Palfrader
2009-01-23 21:07 ` Florian Weimer
2009-01-23 22:02   ` David Daney
2009-01-23 23:11     ` Peter Palfrader
2009-01-25 10:59     ` Florian Weimer
2009-01-27 23:17 ` Andrew Morton
2009-01-29 12:19   ` Adam Tkac
2009-01-29 18:05     ` Andrew Morton
2009-01-29 18:10       ` Peter Palfrader
2009-02-02 16:20       ` Adam Tkac
2009-02-08 22:31     ` Steve Langasek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).