* RE: multiple servers per automount
@ 2003-10-10 15:16 Ogden, Aaron A.
  2003-10-13  3:23 ` [NFS] " Ian Kent
  0 siblings, 1 reply; 21+ messages in thread
From: Ogden, Aaron A. @ 2003-10-10 15:16 UTC (permalink / raw)
  To: Ian Kent, Mike Waychison; +Cc: autofs mailing list, nfs



-----Original Message-----
From: Ian Kent [mailto:raven@themaw.net] 
Sent: Thursday, October 09, 2003 8:09 PM
To: Mike Waychison
Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
Subject: Re: [autofs] multiple servers per automount

>> The maximum number of plain pseudo-block device filesystems on a
>> given system is limited to 256. (This includes proc, autofs, nfs...)
>>
>> This is because pseudo-block filesystems all use major 0, and each
>> has a different minor (thus the 256 limit).
>>
>> There are however patches floating around (look at SuSE's kernels,
>> I'm not sure about RH) that allow n majors to be used (default 5).
>> This gives you 1280 mounts, a big step up :)
>>
>
> But as Aaron and I know, things go pear-shaped at just shy of 800
> mounts with Red Hat kernels. They have the more-unnamed patch.
>
> So this would indicate that even if the device layer can be extended
> to provide more unnamed devices, subsystems like NFS cannot handle
> this many mounts.

Maybe.  I'm not 100% certain though.  Currently I am holding steady at
710 active mounts.  I am going to write a little script to mount more in
small increments, i.e. read a list of ~1000 mountpoints from /home,
mount a few of them, check the filesystems, and repeat.  This way I will
know exactly where things break down.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: multiple servers per automount
  2003-10-10 15:16 multiple servers per automount Ogden, Aaron A.
@ 2003-10-13  3:23 ` Ian Kent
  2003-10-14  7:05   ` Joseph V Moss
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Kent @ 2003-10-13  3:23 UTC (permalink / raw)
  To: Ogden, Aaron A.; +Cc: autofs mailing list, nfs, Mike Waychison

On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:

>
>
> > So this would indicate that even if there is a device system that can
> > increase the number of unnamed devices that subsystems like NFS cannot
> > handle this many mounts.
>
> Maybe.  I'm not 100% certain though.  Currently I am holding steady at
> 710 active mounts, I am going to write a little script to mount more in
> small increments, ie. read a list of ~1000 mountpoints from /home, mount
> a few of them, check the filesystems, and repeat... this way I will know
> exactly where things break down.

Interesting.

If you can edge it up then it's probably not an available port
restriction.

There may be more than one issue at work here.

-- 

   ,-._|\    Ian Kent
  /      \   Perth, Western Australia
  *_.--._/   E-mail: raven@themaw.net
        v    Web: http://themaw.net/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: multiple servers per automount
  2003-10-13  3:23 ` [NFS] " Ian Kent
@ 2003-10-14  7:05   ` Joseph V Moss
  2003-10-14 13:37       ` Ian Kent
  0 siblings, 1 reply; 21+ messages in thread
From: Joseph V Moss @ 2003-10-14  7:05 UTC (permalink / raw)
  To: Ian Kent; +Cc: Ogden, Aaron A., autofs mailing list, nfs, Mike Waychison

> On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
> 
> >
> >
> > > So this would indicate that even if there is a device system that can
> > > increase the number of unnamed devices that subsystems like NFS cannot
> > > handle this many mounts.
> >
> > Maybe.  I'm not 100% certain though.  Currently I am holding steady at
> > 710 active mounts, I am going to write a little script to mount more in
> > small increments, ie. read a list of ~1000 mountpoints from /home, mount
> > a few of them, check the filesystems, and repeat... this way I will know
> > exactly where things break down.
> 
> Interesting.
> 
> If you can edge it up then it's probably not an available port
> restriction.
> 
> There may be more than one issue at work here.
> 

The limit is 800, as others have stated, although it can be less than
that if something else is already using some of the reserved UDP ports.

I wrote a patch long ago against a 2.2.x kernel to enable it to use
multiple majors for NFS mounts (like the patches now common in several
distros).  I then ran into the 800 limit in the RPC layer.  After changing
the RPC layer to count up from 0, instead of down from 800, with no real
upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
I'm sure I could have done many thousand if I had had that many filesystems
around to mount.  Obviously, after 1024, it wasn't using reserved ports
anymore, but it didn't seem to matter.

Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
the RPC layer is different enough between 2.2 and 2.4 that it didn't work
right off.  Bumping it up to somewhere around 1024 should work, but using
non-reserved ports didn't seem to work when I made a simple attempt.

Of course, the real fix for the NFS layer is the expansion of the minor
numbers that has already occurred in 2.6, and the RPC layer problems
should be fixed by multiplexing multiple mounts over the same port.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: RE: [autofs] multiple servers per automount
  2003-10-14  7:05   ` Joseph V Moss
@ 2003-10-14 13:37       ` Ian Kent
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Kent @ 2003-10-14 13:37 UTC (permalink / raw)
  To: Joseph V Moss; +Cc: Ogden, Aaron A., autofs mailing list, nfs, Mike Waychison

On Tue, 14 Oct 2003, Joseph V Moss wrote:

> The limit is 800 as others have stated.  Although, it can be less than that
> if something else is already using up some of the reserved UDP ports.
> 
> I wrote a patch long ago against a 2.2.x kernel to enable it to use
> multiple majors for NFS mounts (like the patches now common in several
> distros).  I then ran into the 800 limit in the RPC layer.  After changing
> the RPC layer to count up from 0, instead of down from 800, with no real
> upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
> I'm sure I could have done many thousand if I had had that many filesystems
> around to mount.  Obviously, after 1024, it wasn't using reserved ports
> anymore, but it didn't seem to matter.
> 
> Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
> the RPC layer is different enough between 2.2 and 2.4 that it didn't work
> right off.  Bumping it up to somewhere around 1024 should work, but using
> non-reserved ports didn't seem to work when I made a simple attempt.
> 
> Of course, the real fix for the NFS layer is the expansion of the minor
> numbers that's already occurred in 2.6 and the RPC layer problems should
> be fixed by multiplexing multiple mounts on the same port.
> 
> 

I don't see that expansion in 2.6 (test6). It looks to me like the 
allocation is done in set_anon_super (in fs/super.c), and that looks 
like it is restricted to 256. Please correct me if I'm wrong. I can't 
see how there is any change to the number of unnamed devices.

-- 

   ,-._|\    Ian Kent
  /      \   Perth, Western Australia
  *_.--._/   E-mail: raven@themaw.net
        v    Web: http://themaw.net/



-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
SourceForge.net hosts over 70,000 Open Source Projects.
See the people who have HELPED US provide better services:
Click here: http://sourceforge.net/supporters.php
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-14 13:37       ` Ian Kent
@ 2003-10-14 15:52         ` Mike Waychison
  -1 siblings, 0 replies; 21+ messages in thread
From: Mike Waychison @ 2003-10-14 15:52 UTC (permalink / raw)
  To: Ian Kent
  Cc: Joseph V Moss, Ogden, Aaron A.,
	autofs mailing list, nfs, Kernel Mailing List

Ian Kent wrote:

>On Tue, 14 Oct 2003, Joseph V Moss wrote:
>
>  
>
>>The limit is 800 as others have stated.  Although, it can be less than that
>>if something else is already using up some of the reserved UDP ports.
>>
>>I wrote a patch long ago against a 2.2.x kernel to enable it to use
>>multiple majors for NFS mounts (like the patches now common in several
>>distros).  I then ran into the 800 limit in the RPC layer.  After changing
>>the RPC layer to count up from 0, instead of down from 800, with no real
>>upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
>>I'm sure I could have done many thousand if I had had that many filesystems
>>around to mount.  Obviously, after 1024, it wasn't using reserved ports
>>anymore, but it didn't seem to matter.
>>
>>Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
>>the RPC layer is different enough between 2.2 and 2.4 that it didn't work
>>right off.  Bumping it up to somewhere around 1024 should work, but using
>>non-reserved ports didn't seem to work when I made a simple attempt.
>>
>>Of course, the real fix for the NFS layer is the expansion of the minor
>>numbers that's already occurred in 2.6 and the RPC layer problems should
>>be fixed by multiplexing multiple mounts on the same port.
>>
>>
>>    
>>
>
>I don't see that expansion in 2.6 (test6). It looks to me like the 
>allocation is done in set_anon_super (in fs/super.c) and that looks like 
>it is restricted to 256. Please correct this for me. I can't see how there 
>is any change to the number of unnamed devices.
>
>  
>

Here is the quick fix for this in RH 2.1AS kernels:

http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch

It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0. 

I don't know if anyone is working out a better scheme for 
get_unnamed_dev in 2.6 yet.  It does need to be done though.  A simple 
patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to 
PAGE_SIZE, automatically allowing for 32768 unnamed devices.

Mike Waychison


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-14 15:52         ` [NFS] " Mike Waychison
  (?)
@ 2003-10-14 20:44         ` H. Peter Anvin
  2003-10-14 23:12           ` Mike Waychison
  -1 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2003-10-14 20:44 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <3F8C1BB6.9010202@sun.com>
By author:    Mike Waychison <Michael.Waychison@Sun.COM>
In newsgroup: linux.dev.kernel
> 
> Here is the quick fix for this in RH 2.1AS kernels:
> 
> http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
> 
> It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0. 
> 
> I don't know if anyone is working out a better scheme for 
> get_unnamed_dev in 2.6 yet.  It does need to be done though.  A simple 
> patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to 
> PAGE_SIZE, automatically allowing for 32768 unnamed devices.
> 

dev_t enlargement, which solves this without a bunch of auxiliary
majors, should be in 2.6.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
If you send me mail in HTML format I will assume it's spam.
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-14 20:44         ` [NFS] RE: [autofs] " H. Peter Anvin
@ 2003-10-14 23:12           ` Mike Waychison
  2003-10-15 10:28             ` Ingo Oeser
  0 siblings, 1 reply; 21+ messages in thread
From: Mike Waychison @ 2003-10-14 23:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ian Kent

[-- Attachment #1: Type: text/plain, Size: 1092 bytes --]

H. Peter Anvin wrote:
> Followup to:  <3F8C1BB6.9010202@sun.com>
> By author:    Mike Waychison <Michael.Waychison@Sun.COM>
> In newsgroup: linux.dev.kernel
> 
>>Here is the quick fix for this in RH 2.1AS kernels:
>>
>>http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
>>
>>It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0. 
>>
>>I don't know if anyone is working out a better scheme for 
>>get_unnamed_dev in 2.6 yet.  It does need to be done though.  A simple 
>>patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to 
>>PAGE_SIZE, automatically allowing for 32768 unnamed devices.
>>
> 
> 
> dev_t enlargement, which solves this without a bunch of auxiliary
> majors, should be in 2.6.
> 
> 	-hpa

The problem remains in 2.6: we still limit the count to 256.  I've 
attached a quick patch that I've compiled and tested.  I don't know if 
there is a better way to handle dynamic assignment of minors (I haven't 
kept up to date in that realm), but if there is, then we should probably 
use it instead.

Mike Waychison

[-- Attachment #2: max_anon.patch --]
[-- Type: text/plain, Size: 881 bytes --]

===== fs/super.c 1.108 vs edited =====
--- 1.108/fs/super.c	Wed Oct  1 15:36:45 2003
+++ edited/fs/super.c	Tue Oct 14 22:52:12 2003
@@ -528,14 +528,22 @@
  * filesystems which don't use real block-devices.  -- jrs
  */
 
-enum {Max_anon = 256};
-static unsigned long unnamed_dev_in_use[Max_anon/(8*sizeof(unsigned long))];
+enum {Max_anon = PAGE_SIZE * 8};
+static void *unnamed_dev_in_use = NULL;
 static spinlock_t unnamed_dev_lock = SPIN_LOCK_UNLOCKED;/* protects the above */
 
 int set_anon_super(struct super_block *s, void *data)
 {
 	int dev;
 	spin_lock(&unnamed_dev_lock);
+
+	if (!unnamed_dev_in_use)
+		unnamed_dev_in_use = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!unnamed_dev_in_use) {
+		spin_unlock(&unnamed_dev_lock);
+		return -ENOMEM;
+	}
+
 	dev = find_first_zero_bit(unnamed_dev_in_use, Max_anon);
 	if (dev == Max_anon) {
 		spin_unlock(&unnamed_dev_lock);

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-14 15:52         ` [NFS] " Mike Waychison
  (?)
@ 2003-10-15  7:22           ` Ian Kent
  -1 siblings, 0 replies; 21+ messages in thread
From: Ian Kent @ 2003-10-15  7:22 UTC (permalink / raw)
  To: Mike Waychison
  Cc: Joseph V Moss, Ogden, Aaron A.,
	autofs mailing list, nfs, Kernel Mailing List

On Tue, 14 Oct 2003, Mike Waychison wrote:

> Ian Kent wrote:
>
> >On Tue, 14 Oct 2003, Joseph V Moss wrote:
> >
> >
> >
> >>The limit is 800 as others have stated.  Although, it can be less than that
> >>if something else is already using up some of the reserved UDP ports.
> >>
> >>I wrote a patch long ago against a 2.2.x kernel to enable it to use
> >>multiple majors for NFS mounts (like the patches now common in several
> >>distros).  I then ran into the 800 limit in the RPC layer.  After changing
> >>the RPC layer to count up from 0, instead of down from 800, with no real
> >>upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
> >>I'm sure I could have done many thousand if I had had that many filesystems
> >>around to mount.  Obviously, after 1024, it wasn't using reserved ports
> >>anymore, but it didn't seem to matter.
> >>
> >>Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
> >>the RPC layer is different enough between 2.2 and 2.4 that it didn't work
> >>right off.  Bumping it up to somewhere around 1024 should work, but using
> >>non-reserved ports didn't seem to work when I made a simple attempt.
> >>
> >>Of course, the real fix for the NFS layer is the expansion of the minor
> >>numbers that's already occurred in 2.6 and the RPC layer problems should
> >>be fixed by multiplexing multiple mounts on the same port.
> >>
> >>
> >>
> >>
> >
> >I don't see that expansion in 2.6 (test6). It looks to me like the
> >allocation is done in set_anon_super (in fs/super.c) and that looks like
> >it is restricted to 256. Please correct this for me. I can't see how there
> >is any change to the number of unnamed devices.
> >
> >
> >
>
> Here is the quick fix for this in RH 2.1AS kernels:
>
> http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
>
> It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
>
> I don't know if anyone is working out a better scheme for
> get_unnamed_dev in 2.6 yet.  It does need to be done though.  A simple
> patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
> PAGE_SIZE, automatically allowing for 32768 unnamed devices.
>

OK. Sounds like a good job for me to do (simple, maybe).
I'll spend a while looking for possible side effects.

Do you think the possible NFS port allocation problems should hold up
this work, or should this work drive updates to NFS?

Comments from anyone about where to check and what to watch out for are
welcome.

-- 

   ,-._|\    Ian Kent
  /      \   Perth, Western Australia
  *_.--._/   E-mail: raven@themaw.net
        v    Web: http://themaw.net/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-14 23:12           ` Mike Waychison
@ 2003-10-15 10:28             ` Ingo Oeser
  2003-10-15 16:16               ` Mike Waychison
  2003-10-23 13:37               ` Ian Kent
  0 siblings, 2 replies; 21+ messages in thread
From: Ingo Oeser @ 2003-10-15 10:28 UTC (permalink / raw)
  To: Mike Waychison
  Cc: linux-kernel, Ian Kent

On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
> The problem still remains in 2.6 that we limit the count to 256.  I've
> attached a quick patch that I've compiled and tested.  I don't know if
> there is a better way to handle dynamic assignment of minors (haven't
> kept up to date in that realm), but if there is, then we should probably
>   use it instead.


In your patch you allocate inside the spinlock.

I would suggest doing something like the following:

unsigned long local = 0;

/* Allocate the page before taking the non-sleeping lock. */
if (!unnamed_dev_in_use) {
	local = get_zeroed_page(GFP_KERNEL);
	if (!local)
		return -ENOMEM;
}

spin_lock(&unnamed_dev_lock);
if (!unnamed_dev_in_use) {
	unnamed_dev_in_use = (void *)local;
	/* Now used globally, don't free it below. */
	local = 0;
}

/*
 * Do the lookup and allocation.
 */

spin_unlock(&unnamed_dev_lock);

/* Free the page if we lost the allocation race. */
if (local)
	free_page(local);

This swaps the pointer in atomically while still doing the allocation
outside the non-sleeping lock.


Regards

Ingo Oeser



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-15 10:28             ` Ingo Oeser
@ 2003-10-15 16:16               ` Mike Waychison
  2003-10-23 13:37               ` Ian Kent
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Waychison @ 2003-10-15 16:16 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: Mike Waychison, linux-kernel, Ian Kent

[-- Attachment #1: Type: text/plain, Size: 717 bytes --]

Ingo Oeser wrote:
> On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
> 
>>The problem still remains in 2.6 that we limit the count to 256.  I've
>>attached a quick patch that I've compiled and tested.  I don't know if
>>there is a better way to handle dynamic assignment of minors (haven't
>>kept up to date in that realm), but if there is, then we should probably
>>  use it instead.
> 
> 
> 
> In your patch you allocate inside the spinlock.
> 
> I would suggest to do sth. like the following:
> 

Better yet, we could move the allocation into an __init section that 
panics if it fails (which should be the desired behaviour).  That way 
we don't have to grab the lock during setup at all.

Mike Waychison

[-- Attachment #2: max_anon_2.patch --]
[-- Type: text/plain, Size: 1592 bytes --]

===== fs/namespace.c 1.49 vs edited =====
--- 1.49/fs/namespace.c	Thu Jul 17 22:30:49 2003
+++ edited/fs/namespace.c	Wed Oct 15 15:59:11 2003
@@ -23,6 +23,7 @@
 #include <linux/mount.h>
 #include <asm/uaccess.h>
 
+extern void __init super_init(void);
 extern int __init init_rootfs(void);
 extern int __init sysfs_init(void);
 
@@ -1154,6 +1155,7 @@
 		d++;
 		i--;
 	} while (i);
+	super_init();
 	sysfs_init();
 	init_rootfs();
 	init_mount_tree();
===== fs/super.c 1.108 vs edited =====
--- 1.108/fs/super.c	Wed Oct  1 15:36:45 2003
+++ edited/fs/super.c	Wed Oct 15 15:59:50 2003
@@ -24,6 +24,7 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/smp_lock.h>
+#include <linux/init.h>
 #include <linux/acct.h>
 #include <linux/blkdev.h>
 #include <linux/quotaops.h>
@@ -527,15 +528,22 @@
  * Unnamed block devices are dummy devices used by virtual
  * filesystems which don't use real block-devices.  -- jrs
  */
-
-enum {Max_anon = 256};
-static unsigned long unnamed_dev_in_use[Max_anon/(8*sizeof(unsigned long))];
+enum {Max_anon = PAGE_SIZE * 8};
+static void *unnamed_dev_in_use;
 static spinlock_t unnamed_dev_lock = SPIN_LOCK_UNLOCKED;/* protects the above */
 
+void __init super_init(void)
+{
+	unnamed_dev_in_use = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!unnamed_dev_in_use)
+		panic("Could not allocate anonymous device map");
+}
+
 int set_anon_super(struct super_block *s, void *data)
 {
 	int dev;
 	spin_lock(&unnamed_dev_lock);
+
 	dev = find_first_zero_bit(unnamed_dev_in_use, Max_anon);
 	if (dev == Max_anon) {
 		spin_unlock(&unnamed_dev_lock);

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-15 10:28             ` Ingo Oeser
  2003-10-15 16:16               ` Mike Waychison
@ 2003-10-23 13:37               ` Ian Kent
  2003-10-23 17:00                 ` Mike Waychison
  1 sibling, 1 reply; 21+ messages in thread
From: Ian Kent @ 2003-10-23 13:37 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: Mike Waychison, Kernel Mailing List


Please forgive my ignorance Ingo but ...

I suffer from race condition blindness. A terrible affliction when one is 
trying to understand the subtleties of the kernel, but I'm trying.

While I am not questioning your suggestion, I have thought about the code 
and fail to see the race you point out. Please help me along.

On Wed, 15 Oct 2003, Ingo Oeser wrote:

> On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
> > The problem still remains in 2.6 that we limit the count to 256.  I've
> > attached a quick patch that I've compiled and tested.  I don't know if
> > there is a better way to handle dynamic assignment of minors (haven't
> > kept up to date in that realm), but if there is, then we should probably
> >   use it instead.
> 
> 
> In your patch you allocate inside the spinlock.

Do you mean we don't want to sleep under the spin lock?
Would a GFP_ATOMIC make a difference to the analysis?

> 
> I would suggest to do sth. like the following:
> 
> void *local = NULL;
> if (!unnamed_dev_in_use) {
>     local = (void *)get_zeroed_page(GFP_KERNEL);
> 
>     if (!local)
>         return -ENOMEM;
> }
> 
> spin_lock(&unnamed_dev_lock);
> mb();
> if (!unnamed_dev_in_use) {
>     unnamed_dev_in_use = local;
> 
>     /* Used globally, don't free now */
>     local = NULL;
> }
> 
> /*
>  * Do the lookup and alloc
>  */
> 
> spin_unlock(&unnamed_dev_lock);
> 
> /* Free page, because of race on allocation. */
> if (local)
>     free_page((unsigned long)local);
> 
> 
> Which will swap the pointers atomically and still alloc outside the
> non-sleeping locking.

As I said, please give me a hint about your thinking here.
And about the use of the memory barrier as well ... umm?

-- 

   ,-._|\    Ian Kent
  /      \   Perth, Western Australia
  *_.--._/   E-mail: raven@themaw.net
        v    Web: http://themaw.net/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-23 13:37               ` Ian Kent
@ 2003-10-23 17:00                 ` Mike Waychison
  2003-10-23 17:09                   ` Tim Hockin
  2003-10-24  0:47                   ` Ian Kent
  0 siblings, 2 replies; 21+ messages in thread
From: Mike Waychison @ 2003-10-23 17:00 UTC (permalink / raw)
  To: Ian Kent; +Cc: Ingo Oeser, Kernel Mailing List

Ian Kent wrote:

>On Wed, 15 Oct 2003, Ingo Oeser wrote:
>  
>
>>In your patch you allocate inside the spinlock.
>>    
>>
>
>Do you mean we don't want to sleep under the spin lock?
>Would a GFP_ATOMIC make a difference to the analysis?
>  
>
Yes, sleeping within a spinlock is bad practice because it may 
eventually deadlock.  Pretend that the lock is taken, the call to 
kmalloc is made, the mm system doesn't have any immediately free memory 
and, through some flow of execution, requires that some pseudo-block 
device backed filesystem be mounted -> deadlock.  I have no 
idea if this is currently a likely scenario, however not sleeping within 
a lock is 'The Right Thing'; sleeping under one should be avoided at all costs.

GFP_ATOMIC should be avoided in most circumstances, particularly in 
environments where the code can be refactored to allow for the sleep.  
It is less likely to find free memory atomically and is thus more likely 
to fail.

>>I would suggest to do sth. like the following:
>>
>>void *local = NULL;
>>if (!unnamed_dev_in_use) {
>>    local = (void *)get_zeroed_page(GFP_KERNEL);
>>
>>    if (!local)
>>        return -ENOMEM;
>>}
>>
>>spin_lock(&unnamed_dev_lock);
>>mb();
>>if (!unnamed_dev_in_use) {
>>    unnamed_dev_in_use = local;
>>
>>    /* Used globally, don't free now */
>>    local = NULL;
>>}
>>
>>/*
>> * Do the lookup and alloc
>> */
>>
>>spin_unlock(&unnamed_dev_lock);
>>
>>/* Free page, because of race on allocation. */
>>if (local)
>>    free_page((unsigned long)local);
>>
>>
>>Which will swap the pointers atomically and still alloc outside the
>>non-sleeping locking.
>>    
>>
>
>As I said please give me a hint about your thinking here.
>And the use of a memory barrier as well ... umm?
>
>  
>

Ingo's suggestion simply moved the allocation outside the spinlock.  See my 
later patch about moving the allocation to an __init section, which is 
probably the cleaner thing to do and doesn't require grabbing the page 
and using it conditionally.

As for the mb(), I *thought* that a spinlock implied a memory barrier, 
however I think he put it there because it solves the age-old badness of 
double-checked locking (search google for good explanations of the badness).

-- 
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me, 
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-23 17:00                 ` Mike Waychison
@ 2003-10-23 17:09                   ` Tim Hockin
  2003-10-24  0:47                   ` Ian Kent
  1 sibling, 0 replies; 21+ messages in thread
From: Tim Hockin @ 2003-10-23 17:09 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Ian Kent, Ingo Oeser, Kernel Mailing List

On Thu, Oct 23, 2003 at 01:00:57PM -0400, Mike Waychison wrote:
> >Would a GFP_ATOMIC make a difference to the analysis?
 
> Yes, sleeping within a spinlock is bad practice because it may 
> eventually deadlock.  Pretend that the lock is taken, the call to 
> kmalloc is made, the mm system doesn't have any immidiately free memory 
> and through some flow of execution requires that a some pseudo-block 
> device backed filesystem needs to be mounted -> deadlock.  I have no 
> idea if this is currently a likely scenario, however not sleeping within 
> a lock is 'The Right Thing' and should be avoided at all costs. 

it's worse than that.  It's forbidden.  It's a VERY likely deadlock scenario
in the general sense, even if this particular case is not.  If you need to
lock something and you need to sleep holding that lock, use a semaphore.

-- 
Notice that as computers are becoming easier and easier to use,
suddenly there's a big market for "Dummies" books.  Cause and effect,
or merely an ironic juxtaposition of unrelated facts?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-23 17:00                 ` Mike Waychison
  2003-10-23 17:09                   ` Tim Hockin
@ 2003-10-24  0:47                   ` Ian Kent
  2003-10-24  1:42                     ` Tim Hockin
  1 sibling, 1 reply; 21+ messages in thread
From: Ian Kent @ 2003-10-24  0:47 UTC (permalink / raw)
  To: Mike Waychison; +Cc: Ingo Oeser, Kernel Mailing List


Thanks for the description.

I thought it was bad to call a function that could block while
holding a lock. At least I was close to right this time.

I wasn't aware of the badness; I'll see what I can find.

On Thu, 23 Oct 2003, Mike Waychison wrote:

>
> Ingo's patch simply moved the allocation outside the spinlock..  See my
> later patch about moving the allocation to and __init section, which is
> probably the cleaner thing to do and doesn't require grabbing the page
> and using it conditionally.
>

Missed that when I returned to it. Found it now.

That is clearly a better way to do it.

Is there any chance this would be accepted into 2.6.0?

I think it's quite important, hopefully others do as well.


-- 

   ,-._|\    Ian Kent
  /      \   Perth, Western Australia
  *_.--._/   E-mail: raven@themaw.net
        v    Web: http://themaw.net/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [NFS] RE: [autofs] multiple servers per automount
  2003-10-24  0:47                   ` Ian Kent
@ 2003-10-24  1:42                     ` Tim Hockin
  0 siblings, 0 replies; 21+ messages in thread
From: Tim Hockin @ 2003-10-24  1:42 UTC (permalink / raw)
  To: Ian Kent; +Cc: Mike Waychison, Ingo Oeser, Kernel Mailing List, torvalds

Recap: Mike Waychison posted a simple patch to make the Max_anon bit array
(NFS mounts etc.) use exactly one page.

On Fri, Oct 24, 2003 at 08:47:57AM +0800, Ian Kent wrote:
> I there any chance this would be accepted into 2.6.0?
> 
> I think it's quite important, hopefully others do as well.


Wouldn't it be saner to have a sysctl to adjust that?  From 1 page to
2^20/(PAGE_SIZE * CHAR_BIT) pages?  Perhaps just in page-sized increments?

This would be a simple patch... But maybe it's not 'stabilization' for
2.6.0.

Maybe the simple version in 2.6.0 and the right version in 2.6.1?

Linus?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [NFS] RE: [autofs] multiple servers per automount
@ 2003-10-15 14:31 ` Lever, Charles
  0 siblings, 0 replies; 21+ messages in thread
From: Lever, Charles @ 2003-10-15 14:31 UTC (permalink / raw)
  To: Ian Kent
  Cc: Joseph V Moss, Ogden, Aaron A.,
	Mike Waychison, autofs mailing list, nfs, Kernel Mailing List

Ian Kent said:
> Do you think that the possible NFS port allocation problems
> should hold up this work or should it drive updates to NFS?

hi ian-

the port stuff has to be addressed at some point, but i don't
think you should wait for it, because it is behind a long queue
of other RPC work (like Kerberos for Linux NFS) that has a
higher priority.  also, there are other patches that partially
address this limitation, and certainly those will be used by
the desperate few who need it now. :^)

IMHO.

^ permalink raw reply	[flat|nested] 21+ messages in thread


end of thread, other threads:[~2003-10-24  1:52 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-10-10 15:16 multiple servers per automount Ogden, Aaron A.
2003-10-13  3:23 ` [NFS] " Ian Kent
2003-10-14  7:05   ` Joseph V Moss
2003-10-14 13:37     ` RE: [autofs] " Ian Kent
2003-10-14 13:37       ` Ian Kent
2003-10-14 15:52       ` [NFS] " Mike Waychison
2003-10-14 15:52         ` [NFS] " Mike Waychison
2003-10-14 20:44         ` [NFS] RE: [autofs] " H. Peter Anvin
2003-10-14 23:12           ` Mike Waychison
2003-10-15 10:28             ` Ingo Oeser
2003-10-15 16:16               ` Mike Waychison
2003-10-23 13:37               ` Ian Kent
2003-10-23 17:00                 ` Mike Waychison
2003-10-23 17:09                   ` Tim Hockin
2003-10-24  0:47                   ` Ian Kent
2003-10-24  1:42                     ` Tim Hockin
2003-10-15  7:22         ` Ian Kent
2003-10-15  7:22           ` Ian Kent
2003-10-15  7:22           ` Ian Kent
2003-10-15 14:31 [NFS] " Lever, Charles
2003-10-15 14:31 ` Lever, Charles
