linux-kernel.vger.kernel.org archive mirror
* Re: Swapping for diskless nodes
@ 2001-08-09 14:26 Bulent Abali
  2001-08-09 15:13 ` Alan Cox
  0 siblings, 1 reply; 32+ messages in thread
From: Bulent Abali @ 2001-08-09 14:26 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: Ingo Oeser, linux-kernel, linux-mm, Alan Cox




>In such a scenario I would disagree with Alan that network paging is
>high latency as compared to disk access. I have a fully switched 100 Mbps
>full-duplex ethernet network, and sending a page across the net into
>the memory of a fast server could have much less latency than writing
>that page out to a local old, slow IDE disk.

Have you actually tried swapping over the network using nbd or any other
network device mounted as a swap disk?  Never mind the latency.  Does it
work at all?  I am curious to know.

Last time I checked swapping over nbd required patching the network stack.
Because swapping occurs when memory is low and when memory is low TCP
doesn't do what you expect it to do...
Bulent
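
The latency claim quoted at the top of this message can be sanity-checked with back-of-envelope arithmetic. The sketch below is only that: the link speed is the one given in the thread, while the seek time, rotational delay and disk throughput are assumed, illustrative values for an old IDE drive, not measurements. It also ignores protocol overhead and the low-memory behaviour of the network stack, which is the real concern raised here.

```python
# Back-of-envelope time to move one 4 KiB page.
# Link speed is taken from the thread; every ASSUMED_* value is an
# illustrative guess for an old IDE disk, not a measurement.

PAGE_BYTES = 4096
LINK_BITS_PER_S = 100e6           # 100 Mbps full-duplex Ethernet

ASSUMED_SEEK_S = 0.012            # assumption: ~12 ms average seek
ASSUMED_ROTATION_S = 0.0055       # assumption: ~half a rotation at 5400 rpm
ASSUMED_DISK_BYTES_PER_S = 5e6    # assumption: ~5 MB/s sustained transfer


def wire_time(nbytes, bits_per_s=LINK_BITS_PER_S):
    """Serialization time only; ignores headers, ACKs and server work."""
    return nbytes * 8 / bits_per_s


def disk_time(nbytes):
    """Seek + rotational delay + transfer for one random page write."""
    return ASSUMED_SEEK_S + ASSUMED_ROTATION_S + nbytes / ASSUMED_DISK_BYTES_PER_S


if __name__ == "__main__":
    print(f"wire: {wire_time(PAGE_BYTES) * 1e3:.3f} ms")
    print(f"disk: {disk_time(PAGE_BYTES) * 1e3:.3f} ms")
```

Under these assumptions the wire time is dominated by everything the sketch leaves out, so it says nothing about whether swapping over the network works reliably under memory pressure.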





* Re: Swapping for diskless nodes
  2001-08-09 14:26 Swapping for diskless nodes Bulent Abali
@ 2001-08-09 15:13 ` Alan Cox
  2001-08-09 20:57   ` Rik van Riel
  2001-08-11  1:13   ` Pavel Machek
  0 siblings, 2 replies; 32+ messages in thread
From: Alan Cox @ 2001-08-09 15:13 UTC (permalink / raw)
  To: Bulent Abali
  Cc: Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm, Alan Cox

> Last time I checked swapping over nbd required patching the network stack.
> Because swapping occurs when memory is low and when memory is low TCP
> doesn't do what you expect it to do...

It's a case of having sufficient memory in the atomic pools. It's possible to
do some ugly quick kernel hack to make the pool commit less likely to be a
problem.

Ultimately it's an insoluble problem: neither SunOS, Solaris nor NetBSD is
infallible; they just never fail for any normal situation, and that's good
enough for me as a solution.


* Re: Swapping for diskless nodes
  2001-08-09 15:13 ` Alan Cox
@ 2001-08-09 20:57   ` Rik van Riel
  2001-08-09 22:46     ` Alan Cox
  2001-08-11  1:13   ` Pavel Machek
  1 sibling, 1 reply; 32+ messages in thread
From: Rik van Riel @ 2001-08-09 20:57 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bulent Abali, Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

On Thu, 9 Aug 2001, Alan Cox wrote:

> Ultimately it's an insoluble problem: neither SunOS, Solaris nor
> NetBSD is infallible; they just never fail for any normal
> situation, and that's good enough for me as a solution.

Memory reservations, with reservations on a per-socket
basis, can fix the problem.
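
To illustrate the idea of a per-socket reservation, here is a toy model in Python. It is not kernel code and does not reflect any actual Linux allocator; the class and function names are hypothetical. It only shows why a reserve that flood traffic cannot touch lets the swap socket keep allocating after the shared pool is exhausted.

```python
# Toy model (hypothetical names, not kernel code): a shared pool of
# atomic buffers, optionally with a reserve owned by one socket.

class Pool:
    def __init__(self, total, reserved_for=None, reserved=0):
        self.free = total - reserved       # buffers any socket may take
        self.reserved_for = reserved_for   # socket that owns the reserve
        self.reserve = reserved            # buffers only that socket may take

    def alloc(self, sock):
        """Return True if a buffer was granted to `sock`."""
        if self.free > 0:
            self.free -= 1
            return True
        if sock == self.reserved_for and self.reserve > 0:
            self.reserve -= 1
            return True
        return False


def flood_then_swap(pool, flood_packets=1000):
    """Exhaust the pool with flood traffic, then try one swap allocation."""
    for _ in range(flood_packets):
        pool.alloc("icmp")
    return pool.alloc("swap")


if __name__ == "__main__":
    print("shared pool: ", flood_then_swap(Pool(total=64)))
    print("with reserve:", flood_then_swap(
        Pool(total=64, reserved_for="swap", reserved=8)))
```

The model deliberately ignores the harder question of how large the reserve must be and what else the swap path depends on.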

Rik
--
IA64: a worthy successor to the i860.

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



* Re: Swapping for diskless nodes
  2001-08-09 20:57   ` Rik van Riel
@ 2001-08-09 22:46     ` Alan Cox
  2001-08-11  1:16       ` Pavel Machek
  0 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2001-08-09 22:46 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Alan Cox, Bulent Abali, Dirk W. Steinberg, Ingo Oeser,
	linux-kernel, linux-mm

> On Thu, 9 Aug 2001, Alan Cox wrote:
> 
> > Ultimately it's an insoluble problem: neither SunOS, Solaris nor
> > NetBSD is infallible; they just never fail for any normal
> > situation, and that's good enough for me as a solution.
> 
> Memory reservations, with reservations on a per-socket
> basis, can fix the problem.

Only a probabilistic subset of the problem. But yes, enough to make it "work"
except where mathematicians and crazy people are concerned. Do not NFS swap
on a BGP4 router with no fixed route to the server.


* Re: Swapping for diskless nodes
  2001-08-09 15:13 ` Alan Cox
  2001-08-09 20:57   ` Rik van Riel
@ 2001-08-11  1:13   ` Pavel Machek
  2001-08-14 12:57     ` Alan Cox
  1 sibling, 1 reply; 32+ messages in thread
From: Pavel Machek @ 2001-08-11  1:13 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bulent Abali, Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

Hi

> > Last time I checked swapping over nbd required patching the network stack.
> > Because swapping occurs when memory is low and when memory is low TCP
> > doesn't do what you expect it to do...
> 
> It's a case of having sufficient memory in the atomic pools. It's possible to
> do some ugly quick kernel hack to make the pool commit less likely to be a
> problem.
>
> Ultimately it's an insoluble problem: neither SunOS, Solaris nor NetBSD is
> infallible; they just never fail for any normal situation, and that's good
> enough for me as a solution.

Oops, really? And what if I can DoS such a machine with ping -f (to eat atomic
RAM)? And what are you going to tell your users? "It died, so reboot"?
								Pavel
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



* Re: Swapping for diskless nodes
  2001-08-09 22:46     ` Alan Cox
@ 2001-08-11  1:16       ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-11  1:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Rik van Riel, Bulent Abali, Dirk W. Steinberg, Ingo Oeser,
	linux-kernel, linux-mm

Hi!

> > > Ultimately it's an insoluble problem: neither SunOS, Solaris nor
> > > NetBSD is infallible; they just never fail for any normal
> > > situation, and that's good enough for me as a solution.
> > 
> > Memory reservations, with reservations on a per-socket
> > basis, can fix the problem.
> 
> Only a probabilistic subset of the problem. But yes, enough to make it "work"
> except where mathematicians and crazy people are concerned. Do not NFS swap
> on a BGP4 router with no fixed route to the server.

That's a clear misconfiguration, similar to

a# mount b:/xyzzy /bar
b# mount a:/xyzzy /foo

It is also similar to a swapping over nbd to b, b swapping over nbd to c,
and c relying on a for its routing.
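
A circular setup like this can be spotted mechanically. The following is a minimal sketch, assuming the dependencies between hosts have already been written down as a simple mapping; the function name and the data layout are made up for illustration and nothing here inspects a real system.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of hosts, or None.

    `deps` maps each host to the hosts it depends on (swap server,
    NFS server, router, ...). Plain depth-first search with colouring.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:                  # back edge: cycle found
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    # a swaps on b, b swaps on c, c routes via a
    print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))
    # same chain without the routing dependency
    print(find_cycle({"a": ["b"], "b": ["c"], "c": []}))
```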
								Pavel
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



* Re: Swapping for diskless nodes
  2001-08-11  1:13   ` Pavel Machek
@ 2001-08-14 12:57     ` Alan Cox
  2001-08-16 21:46       ` Pavel Machek
  0 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2001-08-14 12:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Cox, Bulent Abali, Dirk W. Steinberg, Ingo Oeser,
	linux-kernel, linux-mm

> > Ultimately it's an insoluble problem: neither SunOS, Solaris nor NetBSD is
> > infallible; they just never fail for any normal situation, and that's good
> > enough for me as a solution.
> 
> Oops, really? And what if I can DoS such a machine with ping -f (to eat atomic
> RAM)? And what are you going to tell your users? "It died, so reboot"?

For the simplistic case you can stop queueing data to user sockets, but that
isn't necessarily a cure - it can lead to bogus OOM by preventing progress
of apps that would otherwise read a packet and then exit.

A good example of the insoluble end of it is a box with no default route
doing BGP4 routing with NFS swap. Now that's an extremely daft practical
proposition, but it illustrates the fact that the priority ordering is not
known to the kernel.


* Re: Swapping for diskless nodes
  2001-08-14 12:57     ` Alan Cox
@ 2001-08-16 21:46       ` Pavel Machek
  2001-08-17  0:46         ` Rik van Riel
  0 siblings, 1 reply; 32+ messages in thread
From: Pavel Machek @ 2001-08-16 21:46 UTC (permalink / raw)
  To: Alan Cox
  Cc: Bulent Abali, Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

Hi!

> A good example of the insoluble end of it is a box with no default route
> doing BGP4 routing with NFS swap. Now that's an extremely daft practical
> proposition, but it illustrates the fact that the priority ordering is not
> known to the kernel.

I'd call that a configuration error. If swap-over-nbd works in all but
such cases, it's okay with me.
								Pavel
-- 
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org


* Re: Swapping for diskless nodes
  2001-08-16 21:46       ` Pavel Machek
@ 2001-08-17  0:46         ` Rik van Riel
  2001-08-17  1:35           ` Jakob Østergaard
                             ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Rik van Riel @ 2001-08-17  0:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Cox, Bulent Abali, Dirk W. Steinberg, Ingo Oeser,
	linux-kernel, linux-mm

On Thu, 16 Aug 2001, Pavel Machek wrote:

> I'd call that a configuration error. If swap-over-nbd works in all but
> such cases, it's okay with me.

Agreed. I'm very interested in this case too, I guess we
should start testing swap-over-nbd and trying to fix things
as we encounter them...

regards,

Rik
--
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)



* Re: Swapping for diskless nodes
  2001-08-17  0:46         ` Rik van Riel
@ 2001-08-17  1:35           ` Jakob Østergaard
  2001-08-17 21:23             ` Pavel Machek
  2001-08-17  6:42           ` Andreas Haumer
  2001-08-17 21:03           ` Andreas Haumer
  2 siblings, 1 reply; 32+ messages in thread
From: Jakob Østergaard @ 2001-08-17  1:35 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Pavel Machek, Alan Cox, Bulent Abali, Dirk W. Steinberg,
	Ingo Oeser, linux-kernel, linux-mm

On Thu, Aug 16, 2001 at 09:46:59PM -0300, Rik van Riel wrote:
> On Thu, 16 Aug 2001, Pavel Machek wrote:
> 
> > I'd call that a configuration error. If swap-over-nbd works in all but
> > such cases, it's okay with me.
> 
> Agreed. I'm very interested in this case too, I guess we
> should start testing swap-over-nbd and trying to fix things
> as we encounter them...

FYI: The following has been rock solid for the past two days, using the
machine mainly for emacs/LaTeX/konqueror/... There's fairly heavy swap
traffic; often 25-40 MB of swap is in use.

joe@rhinehart:~$ free
             total       used       free     shared    buffers     cached
Mem:         38052      37164        888      20616       2864      16968
-/+ buffers/cache:      17332      20720
Swap:        65528      24788      40740
joe@rhinehart:~$ uname -a
Linux rhinehart 2.2.19pre17 #1 Tue Mar 13 22:37:59 EST 2001 i586 unknown
joe@rhinehart:~$ cat /proc/swaps 
Filename                        Type            Size    Used    Priority
/dev/nbd0                       partition       65528   24728   -2
joe@rhinehart:~$

I'm swapping over a 3Com 374TX pcmcia card in a 100Mbit hub (hooked up to a
switch, connected to the nbd-server machine)
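
For anyone who wants to watch a setup like this over time, the `/proc/swaps` table shown above is easy to parse. This is a small sketch that works on the text form only; the sample string is copied from the output above, the helper name is made up, and sizes are taken to be in KiB as the kernel reports them.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in KiB)."""
    entries = []
    for line in text.strip().splitlines()[1:]:      # skip the header row
        name, kind, size, used, prio = line.split()
        entries.append({
            "filename": name,
            "type": kind,
            "size_kb": int(size),
            "used_kb": int(used),
            "priority": int(prio),
        })
    return entries


# Sample copied from the output quoted in this message.
SAMPLE = """\
Filename                        Type            Size    Used    Priority
/dev/nbd0                       partition       65528   24728   -2
"""

if __name__ == "__main__":
    for e in parse_swaps(SAMPLE):
        pct = 100.0 * e["used_kb"] / e["size_kb"]
        on_nbd = e["filename"].startswith("/dev/nbd")
        print(f"{e['filename']}: {pct:.1f}% used, nbd={on_nbd}")
```

On a live system the same function could be fed the contents of `/proc/swaps` instead of `SAMPLE`.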

No problems so far - but then again, this is not an NFS-booting BGP4 router  ;)

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


* Re: Swapping for diskless nodes
  2001-08-17  0:46         ` Rik van Riel
  2001-08-17  1:35           ` Jakob Østergaard
@ 2001-08-17  6:42           ` Andreas Haumer
  2001-08-17 21:25             ` Pavel Machek
  2001-08-17 21:03           ` Andreas Haumer
  2 siblings, 1 reply; 32+ messages in thread
From: Andreas Haumer @ 2001-08-17  6:42 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Pavel Machek, Alan Cox, Bulent Abali, Dirk W. Steinberg,
	Ingo Oeser, linux-kernel, linux-mm

Hi!

Rik van Riel wrote:
> 
> On Thu, 16 Aug 2001, Pavel Machek wrote:
> 
> > I'd call that a configuration error. If swap-over-nbd works in all but
> > such cases, it's okay with me.
> 
> Agreed. I'm very interested in this case too, I guess we
> should start testing swap-over-nbd and trying to fix things
> as we encounter them...
> 
We have been "testing" swap-over-nbd for some time now... :-))

In fact, all our workstations in our office are xS+S Diskless Clients, 
and about 50 Diskless Clients are running at several customer sites.

In order to make Pavel happy :-) we did some more stress testing
now, and here are the results:

We set up a quite old machine (ASUS P55T2P4 motherboard,
64MB RAM, AMD K6/200 CPU, Matrox Millennium II graphics card,
RTL8139 100MBit Ethernet) as a Diskless Client with NBD swap.

We installed our upcoming BLD-3.1 with the Linux-2.2.19 kernel,
Etherboot+initrd, DevFS and NBD swap patches.

We started all kinds of programs (KDE, Netscape, StarOffice,
Acrobat Reader, Emacs, X11 with several background images,
The Gimp and so on). To make memory even tighter, we created
two ramdisks of 16MB each and filled them up (accounting for 32MB
of RAM used in buffers). The machine was slow, but still usable!
(Well, I wouldn't recommend actually _using_ a system under
such load, but anyway... :-)

We let this configuration run for several hours, and it
stayed alive and well. Swapping over NBD was _heavy_.

We also started a ping -f from the server to the client
and let this run for about an hour. The client lost quite a few
packets, and interactive performance was really low (you
had to wait more than a minute for a switch between two
KDE desktops), but the system stayed alive!

Here are some figures from the system in this situation:

root@dws4:~ {138} $ date
Fri Aug 17 08:13:17 CEST 2001

root@dws4:~ {139} $ uname -a
Linux dws4 2.2.19 #1 Thu Aug 9 09:01:01 CEST 2001 i586 unknown

root@dws4:~ {140} $ uptime
  8:13am  up 15:04,  5 users,  load average: 4.84, 5.91, 6.54

root@dws4:~ {141} $ free
             total       used       free     shared    buffers     cached
Mem:         63488      62088       1400      15396      32488      10476
-/+ buffers/cache:      19124      44364
Swap:       204792     117520      87272
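
As a quick cross-check of the `free` figures above: the "-/+ buffers/cache" row is just the "Mem:" row with buffers and cache moved from "used" to "free". The helper below is a hypothetical illustration of that arithmetic, not part of any tool.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of old procps `free` output.

    All values are in KiB, as printed by `free`.
    Returns (used_without_cache, free_with_cache).
    """
    return used - buffers - cached, free + buffers + cached


if __name__ == "__main__":
    # "Mem:" row from the dws4 output above
    print(buffers_cache_line(used=62088, free=1400, buffers=32488, cached=10476))
```

With the two 16MB ramdisks filled, most of the "used" memory sits in buffers, which is why the adjusted "used" figure is so much lower than the raw one.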

root@dws4:~ {142} $ mount
/dev/root on / type nfs (ro,v2,rsize=4096,wsize=4096,nolock,addr=192.168.163.2)
none on /dev type devfs (rw)
/dev/root.old on /initrd type romfs (ro)
proc on /proc type proc (rw)
/dev/rd/1 on /var type ext2 (rw)
server.demo.xss.co.at:/home on /home type nfs (rw,v3,rsize=8192,wsize=8192,soft,addr=server.demo.xss.co.at)

root@dws4:~ {143} $ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  8  3 118092   1196  32488  11044 183  14    46     4  563   115  59  37   5
 0 10  0 117984   1160  32488  10976 1120  16   280     4 24068   791  18  73  10
 2 11  0 117860   1236  32488  10852 704   0   176     0 23064   495  14  61  25
 0  9  1 117792    612  32488  11332 740  56   185    14 22500   657  17  58  24
 2  8  0 117780   1572  32488  10364 480  60   120    15 22563   378  11  63  27
 0  8  0 117860   1192  32488  10812 664 124   166    31 23140   651  22  60  18
 1  7  0 117860   1432  32488  10620 532  16   133     5 22842   443   8  66  26
 2  6  0 117800   1252  32488  10700 512   0   128     0 23410   820  23  70   6
 0  6  0 117764   1608  32488  10376 548   0   137     1 22497   551  25  66   9
 2  5  0 117816   1176  32488  10768 880  88   220    23 24694   538  15  60  25
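
The vmstat rows above are easier to compare once they are turned into records. Here is a minimal sketch, assuming the 16-column layout of this old vmstat version; the function name is made up, and only the first three data rows are included as sample input. Tokens are regrouped 16 at a time, so it works whether or not a mail client has wrapped the lines.

```python
FIELDS = ["r", "b", "w", "swpd", "free", "buff", "cache",
          "si", "so", "bi", "bo", "in", "cs", "us", "sy", "id"]


def parse_vmstat_rows(text):
    """Turn whitespace-separated vmstat data rows into a list of dicts."""
    tokens = [int(t) for t in text.split()]
    if len(tokens) % len(FIELDS):
        raise ValueError("token count is not a multiple of 16")
    return [dict(zip(FIELDS, tokens[i:i + len(FIELDS)]))
            for i in range(0, len(tokens), len(FIELDS))]


# First three data rows from the output above (header lines left out).
SAMPLE = """
 1  8  3 118092   1196  32488  11044 183  14    46     4  563   115
59  37   5
 0 10  0 117984   1160  32488  10976 1120  16   280     4 24068   791
18  73  10
 2 11  0 117860   1236  32488  10852 704   0   176     0 23064   495
14  61  25
"""

if __name__ == "__main__":
    rows = parse_vmstat_rows(SAMPLE)
    for r in rows:
        print(f"swap-in {r['si']:5d} KB/s  swap-out {r['so']:4d} KB/s  "
              f"interrupts {r['in']:6d}/s  sys {r['sy']}%")
```

The jump in the `in` column between the first row (an average since boot) and the later rows is the flood ping showing up as interrupt load.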

root@dws4:~ {144} $ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 5
model           : 6
model name      : AMD-K6tm w/ multimedia extensions
stepping        : 2
cpu MHz         : 200.459
cache size      : 64 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 sep mmx
bogomips        : 399.76

root@dws4:~ {145} $ cat /proc/interrupts
           CPU0
  0:    5446658          XT-PIC  timer
  1:       3295          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  3:      60105          XT-PIC  serial
  8:          2          XT-PIC  rtc
 10:   52056240          XT-PIC  eth0
 13:          1          XT-PIC  fpu
NMI:          0 

root@dws4:~ {146} $ lspci
00:00.0 Host bridge: Intel Corporation 430HX - 82439HX TXC [Triton II] (rev 03)
00:07.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
00:07.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
00:0b.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]

root@dws4:~ {147} $ ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1064   60 ?        S    Aug16   0:01 init [5]
root         2  0.0  0.0     0    0 ?        SW   Aug16   0:00 [kflushd]
root         3  0.0  0.0     0    0 ?        SW   Aug16   0:00 [kupdate]
root         4  0.8  0.0     0    0 ?        SW   Aug16   7:19 [kswapd]
root         5  0.0  0.0     0    0 ?        SW   Aug16   0:00 [kreclaimd]
root         6  0.0  0.0     0    0 ?        SW   Aug16   0:00 [keventd]
root        64  0.0  0.0     0    0 ?        SW   Aug16   0:00 [eth0]
root        70  0.5  0.0     0    0 ?        SW   Aug16   4:42 [rpciod]
daemon     163  0.0  0.0  1060    0 ?        SW   Aug16   0:00 [portmap]
root       165  0.0  0.0  1176    0 ?        SW   Aug16   0:00 [rpc.statd]
root       253  0.0  0.0  1496    0 ?        SW   Aug16   0:00 [devfsd]
root       316  0.0  0.0     0    0 ?        SW   Aug16   0:00 [lockd]
root       413  0.0  0.1  1468   68 ?        S    Aug16   0:01 /sbin/syslogd -p /var/log/log
root       420  0.0  0.0  1328    0 ?        SW   Aug16   0:00 [klogd]
root       465  0.0  0.1  1500   96 ?        S    Aug16   0:00 /usr/sbin/cron
root       555  0.0  0.0  1384    0 ?        SW   Aug16   0:00 [inetd]
root       592  0.0  0.2  1288  148 ?        S    Aug16   0:00 [ypbind]
root       596  0.0  0.2  1288  148 ?        S    Aug16   0:00 [ypbind]
root       597  0.0  0.2  1288  148 ?        S    Aug16   0:00 [ypbind]
root       599  0.0  0.2  1288  148 ?        S    Aug16   0:02 [ypbind]
root       606  0.0  0.0  1072    0 vc/3     SW   Aug16   0:00 [getty]
root       607  0.0  0.0  1072    0 vc/4     SW   Aug16   0:00 [getty]
root       674  0.7  0.0  1368    0 ?        SW   Aug16   7:08 [nbdswapc]
root       744  0.0  0.0  2176    0 ?        SW   Aug16   0:00 [login]
root       749  0.0  0.0  2200    0 vc/2     SW   Aug16   0:00 [sh]
root       786  0.0  0.0 12196    0 ?        SW   Aug16   0:00 [kdm]
root       787 31.9  5.6 42332 3588 ttyS1    D    Aug16 286:37 /usr/X11R6/bin/XFree86 -indirect dws4 :0 -depth 16 -dpi 75 vt12
root       881  0.0  0.0 13364    0 ?        SW   Aug16   0:00 [kdm]
demouser   898  0.0  0.0  1652    0 ?        SW   Aug16   0:00 [startkde]
demouser   926  0.0  0.0 13780    0 ?        SW   Aug16   0:03 [kdeinit]
demouser   928  0.0  0.0 14036    0 ?        SW   Aug16   0:01 [kdeinit]
demouser   930  0.0  0.6 13740  404 ?        S    Aug16   0:04 kdeinit: kded
demouser   938  0.0  0.0 13892    0 ?        SW   Aug16   0:00 [kdeinit]
demouser   950  0.0  0.0 13728    0 ?        SW   Aug16   0:00 [kdeinit]
demouser   952  0.0  0.0 13452    0 ?        SW   Aug16   0:05 [knotify]
demouser   953  0.0  0.0 10204    0 ?        SW   Aug16   0:04 [ksmserver]
demouser   954  0.3  1.6 15276 1032 ?        S    Aug16   3:33 kdeinit: kwin -session 11c0a8a369000098969662300000009350000
demouser   957  0.1  2.0 16852 1284 ?        S    Aug16   1:15 kdeinit: kdesktop
demouser   959  0.3  2.7 17212 1720 ?        S    Aug16   3:06 kdeinit: kicker
demouser   960  0.0  0.0 14232   24 ?        S    Aug16   0:00 kdeinit: kio_file file /tmp/ksocket-demouser/klauncherPfMg8b.slave-socket /tmp/ksocket-demouser/kdesktopsQAzbc.slave-soc
demouser   963  0.1  1.3 14788  856 ?        S    Aug16   1:20 kdeinit: klipper -icon klipper -miniicon klipper
demouser   965  0.0  0.0 14476    0 ?        SW   Aug16   0:01 [kdeinit]
demouser   967  0.0  0.0 14720    0 ?        SW   Aug16   0:02 [kdeinit]
demouser   968  0.0  0.0  1020    0 pts/0    SW   Aug16   0:00 [cat]
demouser   969  4.0  1.7  9624 1128 ?        S    Aug16  36:42 qps
demouser   971  0.0  0.0 15160    0 ?        SW   Aug16   0:05 [kdeinit]
demouser   972  0.0  0.0  2100    0 pts/1    SW   Aug16   0:00 [zsh]
root       991  0.0  0.0  1236   52 ?        S    Aug16   0:05 [in.telnetd]
root       992  0.0  0.0  2252    0 ?        SW   Aug16   0:00 [login]
root       993  0.0  0.0  2188    0 pts/2    SW   Aug16   0:00 [sh]
demouser  1002  0.6  3.3 121272 2140 pts/1   D    Aug16   5:30 /opt/StarOffice-5.2/program/soffice.bin
demouser  1019  0.0  3.3 121272 2140 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1020  0.0  3.3 121272 2140 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1021  0.0  3.3 121272 2140 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1022  0.0  3.3 121272 2140 pts/1   S    Aug16   0:03 /opt/StarOffice-5.2/program/soffice.bin
demouser  1024  0.0  3.3 121272 2140 pts/1   S    Aug16   0:10 /opt/StarOffice-5.2/program/soffice.bin
demouser  1025  0.0  3.3 121272 2140 pts/1   S    Aug16   0:01 /opt/StarOffice-5.2/program/soffice.bin
demouser  1026  0.0  3.3 121272 2140 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1027  0.0  3.3 121272 2140 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1028  0.0  3.3 121272 2148 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1029  0.0  3.3 121272 2148 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1031  0.0  3.3 121272 2148 pts/1   S    Aug16   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1035  0.4  3.3 23028 2104 ?        S    Aug16   4:05 kdeinit: konqueror --silent
demouser  1044  0.0  0.0 14636    0 ?        SW   Aug16   0:01 [kdeinit]
demouser  1048  0.0  0.0 10044   32 ?        S    Aug16   0:00 kdesud
demouser  1058  0.0  0.0 18264    0 ?        SW   Aug16   0:10 [kdeinit]
demouser  1059  0.0  0.0 18592    0 ?        SW   Aug16   0:09 [kdeinit]
demouser  1060  0.0  0.0 23604    0 ?        SW   Aug16   0:27 [netscape]
demouser  1061  0.0  0.0 16892    0 ?        SW   Aug16   0:00 [netscape]
demouser  1067  0.0  0.0  7356    0 ?        SW   Aug16   0:01 [emacs]
demouser  1069  0.0  1.1 12812  736 ?        S    Aug16   0:25 amor
demouser  1072  9.6  4.2 15768 2728 ?        D    Aug16  85:26 ksysguard
demouser  1073  5.2  0.5  1616  368 ?        S    Aug16  46:27 ksysguardd
demouser  1092  0.0  0.0 15392    0 ?        SW   Aug16   0:52 [konsole]
demouser  1094  0.0  0.0  2120    0 pts/3    SW   Aug16   0:00 [zsh]
demouser  1127  0.0  0.6 13656  388 pts/3    S    Aug16   0:18 /opt/Acrobat4/Reader/intellinux/bin/acroread
root      1144  0.2  0.0  1236   56 ?        S    Aug16   2:08 [in.telnetd]
root      1145  0.0  0.0  2252    0 ?        SW   Aug16   0:00 [login]
root      1146  0.0  0.0  2192    0 pts/4    SW   Aug16   0:00 [sh]
root      1165  6.7  0.7  2052  488 pts/4    S    Aug16  58:46 top
demouser  1167  4.7  1.7 15428 1104 ?        S    Aug16  38:19 kdeinit: konsole -ls -icon konsole -miniicon konsole -caption konsole
demouser  1168  0.0  0.0  2112    0 pts/5    SW   Aug16   0:00 [zsh]
demouser  1175 11.9  0.7  2052  488 pts/5    S    Aug16  95:01 top
root      1209  5.9  0.3  1072  216 pts/2    S    07:20   3:27 vmstat 1
demouser  1213  0.0  3.3 121272 2148 pts/1   S    07:25   0:00 /opt/StarOffice-5.2/program/soffice.bin
demouser  1214  0.0  3.3 121272 2148 pts/1   S    07:25   0:00 /opt/StarOffice-5.2/program/soffice.bin
root      1215  0.0  0.0  1236   60 ?        S    07:25   0:00 [in.telnetd]
demouser  1216  0.0  3.3 121272 2148 pts/1   S    07:25   0:00 /opt/StarOffice-5.2/program/soffice.bin
root      1217  0.0  0.0  2252    0 ?        SW   07:25   0:00 [login]
root      1218  0.0  0.6  2192  416 pts/6    S    07:26   0:01 -sh
demouser  1229  0.7  0.0 18564    0 ?        SW   07:35   0:18 [kdeinit]
demouser  1231  1.5  1.4 17640  920 ?        S    07:39   0:35 kdeinit: konqueror --silent
demouser  1234  1.3  0.3 17508  208 ?        S    07:42   0:28 gimp /home/demouser/linux.jpg
demouser  1457  0.1  0.0  6196    0 ?        SW   07:54   0:02 [script-fu]
demouser  1463  0.1  0.0  3732    0 ?        SW   08:04   0:01 [gv]
demouser  1464  0.2  0.0  6128    0 ?        SW   08:05   0:01 [gs]
demouser  1475  0.3  0.1  2088  112 ?        RN   08:17   0:00 /opt/kde2/bin/kmatrix.kss -root
root      1476 15.8  1.7  2904 1104 pts/6    R    08:17   0:00 ps aux

root@dws4:~ {148} $ ifconfig
eth0      Link encap:Ethernet  HWaddr 00:00:21:C6:CE:CE
          inet addr:192.168.163.104  Bcast:192.168.163.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:32988520 errors:2 dropped:0 overruns:0 frame:0
          TX packets:26332220 errors:2 dropped:8 overruns:0 carrier:2
          collisions:6493721 txqueuelen:100
          RX bytes:2465285636 (2351.0 Mb)  TX bytes:3498769136 (3336.6 Mb)
          Interrupt:10 Base address:0x8000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:24508915 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24508915 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1029721773 (982.0 Mb)  TX bytes:1029721773 (982.0 Mb)

(Note: the diskless client was connected to a 100MBit HUB and 
flood-pinged, that's why there's a huge number of collisions)
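
To put a number on "huge", the eth0 counters from the ifconfig output above can be turned into a collision rate. This is a small sketch that assumes the old net-tools text format shown in this message; the function name is made up and the sample contains only the three relevant lines.

```python
import re


def iface_counters(text):
    """Pull packet and collision counters out of net-tools ifconfig text."""
    rx = int(re.search(r"RX packets:(\d+)", text).group(1))
    tx = int(re.search(r"TX packets:(\d+)", text).group(1))
    col = int(re.search(r"collisions:(\d+)", text).group(1))
    return {"rx": rx, "tx": tx, "collisions": col}


# eth0 counters copied from the output above.
SAMPLE = """\
          RX packets:32988520 errors:2 dropped:0 overruns:0 frame:0
          TX packets:26332220 errors:2 dropped:8 overruns:0 carrier:2
          collisions:6493721 txqueuelen:100
"""

if __name__ == "__main__":
    c = iface_counters(SAMPLE)
    print(f"collisions per transmitted packet: {c['collisions'] / c['tx']:.3f}")
```

The ratio is only a rough indicator, since the counters cover the whole uptime and not just the flood-ping hour.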

Now I don't know what you would say, but I would call this
stable enough for real-world use!

We have now updated our system for Linux 2.2.19. I'll try
to create clean nbd-swap-only patches for 2.2.19 over the
weekend (I hope I can find some spare time). I'll announce them
on LKML as soon as they are ready.

I think Linux with NBD swap is ready for production use.
We have been using it for more than a year now on our Diskless Clients.

If anyone wants to change Linux's swap mechanism for 2.5,
please keep this application in mind. It's very nice and
useful to have!

- andreas

-- 
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71


* Re: Swapping for diskless nodes
  2001-08-17  0:46         ` Rik van Riel
  2001-08-17  1:35           ` Jakob Østergaard
  2001-08-17  6:42           ` Andreas Haumer
@ 2001-08-17 21:03           ` Andreas Haumer
  2001-08-17 22:31             ` Dirk W. Steinberg
  2 siblings, 1 reply; 32+ messages in thread
From: Andreas Haumer @ 2001-08-17 21:03 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Pavel Machek, Alan Cox, Bulent Abali, Dirk W. Steinberg,
	Ingo Oeser, linux-kernel, linux-mm

Hi!

Rik van Riel wrote:
> 
> On Thu, 16 Aug 2001, Pavel Machek wrote:
> 
> > I'd call that a configuration error. If swap-over-nbd works in all but
> > such cases, it's okay with me.
> 
> Agreed. I'm very interested in this case too, I guess we
> should start testing swap-over-nbd and trying to fix things
> as we encounter them...
> 
As I promised a few days ago I have just released the newest 
version of our NBD swap patches for Linux-2.2.19.
You can find them together with the NBD swap server and
client source code under the following URL:

<ftp://ftp.xss.co.at/pub/Linux/NBD/nbdswap-1.2-1.tar.gz>

It works for us, and we think it works reasonably well.
YMMV, though. Please check it out and tell us what you
think. We would really like to see something like this
included in Linux-2.5.

Suggestions, improvements and ideas are welcome.

Regards,

- andreas

-- 
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71


* Re: Swapping for diskless nodes
  2001-08-17  1:35           ` Jakob Østergaard
@ 2001-08-17 21:23             ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-17 21:23 UTC (permalink / raw)
  To: Jakob Østergaard, Rik van Riel, Pavel Machek, Alan Cox,
	Bulent Abali, Dirk W. Steinberg, Ingo Oeser, linux-kernel,
	linux-mm

Hi!

> > > I'd call that a configuration error. If swap-over-nbd works in all but
> > > such cases, it's okay with me.
> > 
> > Agreed. I'm very interested in this case too, I guess we
> > should start testing swap-over-nbd and trying to fix things
> > as we encounter them...
> 
> FYI: The following has been rock solid for the past two days, using the
> machine mainly for emacs/LaTeX/konqueror/... There's fairly heavy swap
> traffic; often 25-40 MB of swap is in use.
> 
> joe@rhinehart:~$ free
>              total       used       free     shared    buffers     cached
> Mem:         38052      37164        888      20616       2864      16968
> -/+ buffers/cache:      17332      20720
> Swap:        65528      24788      40740
> joe@rhinehart:~$ uname -a
> Linux rhinehart 2.2.19pre17 #1 Tue Mar 13 22:37:59 EST 2001 i586 unknown
> joe@rhinehart:~$ cat /proc/swaps 
> Filename                        Type            Size    Used    Priority
> /dev/nbd0                       partition       65528   24728   -2
> joe@rhinehart:~$
> 
> I'm swapping over a 3Com 374TX pcmcia card in a 100Mbit hub (hooked up to a
> switch, connected to the nbd-server machine)

Can you try heavy ping -f's onto the swapping machine? Plus put some
*heavy* pressure on it. 40MB in swap is okay... if you have 8MB of main
memory.

								Pavel
-- 
The best software in life is free (not shareware)!		Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+


* Re: Swapping for diskless nodes
  2001-08-17  6:42           ` Andreas Haumer
@ 2001-08-17 21:25             ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-17 21:25 UTC (permalink / raw)
  To: Andreas Haumer
  Cc: Rik van Riel, Pavel Machek, Alan Cox, Bulent Abali,
	Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

Hi!

> > On Thu, 16 Aug 2001, Pavel Machek wrote:
> > 
> > > I'd call that a configuration error. If swap-over-nbd works in all but
> > > such cases, it's okay with me.
> > 
> > Agreed. I'm very interested in this case too, I guess we
> > should start testing swap-over-nbd and trying to fix things
> > as we encounter them...
> > 
> We have been "testing" swap-over-nbd for some time now... :-))
> 
> In fact, all our workstations in our office are xS+S Diskless Clients, 
> and about 50 Diskless Clients are running at several customer sites.
> 
> In order to make Pavel happy :-) we did some more stress testing
> now, and here are the results:

Pavel is happy ;-).

> We set up a quite old machine (ASUS P55T2P4 motherboard,
> 64MB RAM, AMD K6/200 CPU, Matrox Millennium II graphics card,
> RTL8139 100MBit Ethernet) as a Diskless Client with NBD swap.
>
> We installed our upcoming BLD-3.1 with the Linux-2.2.19 kernel,
> Etherboot+initrd, DevFS and NBD swap patches.

Can you revert the NBD swap patch and try again? It should break. If it
does not break, your testing is not good enough.
								Pavel
-- 
The best software in life is free (not shareware)!		Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+


* Re: Swapping for diskless nodes
  2001-08-17 21:03           ` Andreas Haumer
@ 2001-08-17 22:31             ` Dirk W. Steinberg
  2001-08-17 22:57               ` Pavel Machek
  0 siblings, 1 reply; 32+ messages in thread
From: Dirk W. Steinberg @ 2001-08-17 22:31 UTC (permalink / raw)
  To: Andreas Haumer
  Cc: Rik van Riel, Pavel Machek, Alan Cox, Bulent Abali, Ingo Oeser,
	linux-kernel, linux-mm

Andreas Haumer wrote:
> As I promised a few days ago I have just released the newest
> version of our NBD swap patches for Linux-2.2.19.
> You can find them together with the NBD swap server and
> client source code under the following URL:
> 
> <ftp://ftp.xss.co.at/pub/Linux/NBD/nbdswap-1.2-1.tar.gz>

Hi,

do you have NBD swap patches for 2.4.x as well?
Or does it work out-of-the-box with 2.4?

Cheers,
	Dirk

------------------------------------------
Ingenieurbüro Dipl.-Ing. Dirk W. Steinberg
Ringstr. 2, D-53567 Buchholz, Germany
Phone: +49-2683-9793-20, fax: -29
Mobile/GSM: +49-170-818-9793
Email: dws@dirksteinberg.de


* Re: Swapping for diskless nodes
  2001-08-17 22:31             ` Dirk W. Steinberg
@ 2001-08-17 22:57               ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-17 22:57 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: kernel list

Hi!


> > As I promised a few days ago I have just released the newest
> > version of our NBD swap patches for Linux-2.2.19.
> > You can find them together with the NBD swap server and
> > client source code under the following URL:
> > 
> > <ftp://ftp.xss.co.at/pub/Linux/NBD/nbdswap-1.2-1.tar.gz>
> 
> Hi,
> 
> do you have NBD swap patches for 2.4.x as well?
> Or does it work out-of-the-box with 2.4?

It certainly does not work, but I do not have patches.
								Pavel
-- 
The best software in life is free (not shareware)!		Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+


* Re: Swapping for diskless nodes
  2001-08-09 15:14 ` Alan Cox
@ 2001-08-11  1:17   ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-11  1:17 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dirk W. Steinberg, linux-kernel

Hi!

> > what you say sound a lot like a hacker solution ("check that it uses the
> > right GFP_ levels"). I think it's about time that this deficit of linux
> 
> Nope. I'm simply advising people to check that nbd is correctly written.

This bug is unlikely to be in nbd, but you need to check the whole network
stack. Even ARP handling is crucial for working nbd swap!
								Pavel
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



* Re: Swapping for diskless nodes
  2001-08-09 14:36     ` Andreas Haumer
@ 2001-08-11  1:11       ` Pavel Machek
  0 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-11  1:11 UTC (permalink / raw)
  To: Andreas Haumer; +Cc: Dirk W. Steinberg, Alan Cox, linux-kernel

Hi!

> > what you say sound a lot like a hacker solution ("check that it uses the
> > right GFP_ levels"). I think it's about time that this deficit of linux
> > as compared to SunOS or *BSD should be removed. Network paging should be
> > supported as a standard feature of a stock kernel compile.
> > 
> We have swapping over NBD running for some time now on
> our "xS+S Diskless Client" system, and it works really
> fine! No problem running StarOffice, Netscape, The Gimp
> and KDE on a 128MB Diskless Client and 250MB swap over a 
> 100MBit switched ethernet!

Try going down to 8MB of RAM, ping -f the client, and try to compile the kernel.

Netscape + SO + gimp on 128MB is rather light load.

> Check <http://www.xss.co.at/linux/NBD/Applications.html>
> to find our solution for that.
> 
> Kernel patches are a little bit outdated, but we have NBD swap
> for 2.2.19 running internally since this week, and we will
> update our web-page soon.

Be sure to mail me a copy.
								Pavel
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



* Re: Swapping for diskless nodes
  2001-08-09 20:58     ` Rik van Riel
@ 2001-08-10  8:11       ` Eric W. Biederman
  0 siblings, 0 replies; 32+ messages in thread
From: Eric W. Biederman @ 2001-08-10  8:11 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Alan Cox, Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

Rik van Riel <riel@conectiva.com.br> writes:

> On 9 Aug 2001, Eric W. Biederman wrote:
> 
> > I don't know about that.  We already can swap over just about
> > everything because we can swap over the loopback device.
> 
> Last I looked the loopback device could deadlock your
> system without you needing to swap over it ;)

It wouldn't surprise me.  But the fact remains that in 2.4 we allow it.
And if we allow it there is little excuse for doing it wrong.

Actually except for network cases it looks easier to prevent deadlocks
on the swapping path than with the loop back devices.  We can call
aops->prepare_write_out when we place the page in the swap cache
to make certain we aren't over a hole in a file, and there is room in the
filesystem to store the data.

Eric


* Re: Swapping for diskless nodes
  2001-08-09 17:09   ` Eric W. Biederman
@ 2001-08-09 20:58     ` Rik van Riel
  2001-08-10  8:11       ` Eric W. Biederman
  0 siblings, 1 reply; 32+ messages in thread
From: Rik van Riel @ 2001-08-09 20:58 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

On 9 Aug 2001, Eric W. Biederman wrote:

> I don't know about that.  We already can swap over just about
> everything because we can swap over the loopback device.

Last I looked the loopback device could deadlock your
system without you needing to swap over it ;)

Rik
--
IA64: a worthy successor to the i860.

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



* Re: Swapping for diskless nodes
  2001-08-09 10:50   ` Ingo Oeser
  2001-08-09 13:12     ` Dirk W. Steinberg
@ 2001-08-09 20:47     ` Rik van Riel
  1 sibling, 0 replies; 32+ messages in thread
From: Rik van Riel @ 2001-08-09 20:47 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: linux-kernel, linux-mm

On Thu, 9 Aug 2001, Ingo Oeser wrote:

> Are there any races I have to consider?

Well, this IS a big issue against swap over network.

Swap over network is inherently prone to deadlock
situations, due to the following three problems:

1) we swap pages out when we are close to running
   out of free memory
2) to write pages out over the network, we need to
   allocate space to assemble network packets
3) we need to have memory to receive the ACKs on
   the packets we sent out

The only real solution to this would be memory
reservations so we know this memory won't be used
for other purposes.

What we can do right now is be careful about how
many writeouts over the network we do at the same
time, but that will still get us killed in case of
a ping flood ;)
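
A rough user-space model of the reservation idea, under stated assumptions: this is not kernel code, and `struct reserve`, `reserve_get` and `reserve_put` are hypothetical names. The buffers are set aside up front, a writeout may only start if it can take one, and the buffer comes back when the ACK arrives, which also bounds the number of writeouts in flight:

```c
#include <stdbool.h>
#include <stddef.h>

#define RESERVE_SLOTS 4     /* hypothetical: buffers set aside up front */
#define SLOT_BYTES    4096  /* assumed page size */

/* Storage is part of the structure itself, so taking a slot never
 * calls the general allocator while memory is short. */
struct reserve {
    unsigned char buf[RESERVE_SLOTS][SLOT_BYTES];
    bool in_use[RESERVE_SLOTS];
    int in_flight;          /* writeouts currently waiting for an ACK */
};

/* Claim a free slot for one writeout. Returns the slot index, or -1
 * when every slot is busy and the caller has to wait for an ACK. */
static int reserve_get(struct reserve *r)
{
    for (int i = 0; i < RESERVE_SLOTS; i++) {
        if (!r->in_use[i]) {
            r->in_use[i] = true;
            r->in_flight++;
            return i;
        }
    }
    return -1;
}

/* Release a slot once its writeout has been acknowledged.
 * Returns false for an invalid or already free slot. */
static bool reserve_put(struct reserve *r, int slot)
{
    if (slot < 0 || slot >= RESERVE_SLOTS || !r->in_use[slot])
        return false;
    r->in_use[slot] = false;
    r->in_flight--;
    return true;
}
```

Note that this only covers problem 2. Incoming traffic such as a ping flood competes for receive memory (problem 3), which a send-side reserve like this does nothing about.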

regards,

Rik
--
IA64: a worthy successor to the i860.

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



* Re: Swapping for diskless nodes
  2001-08-09  9:08 ` Alan Cox
                     ` (2 preceding siblings ...)
  2001-08-09 19:27   ` Pavel Machek
@ 2001-08-09 20:38   ` Rik van Riel
  3 siblings, 0 replies; 32+ messages in thread
From: Rik van Riel @ 2001-08-09 20:38 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dirk W. Steinberg, linux-kernel

On Thu, 9 Aug 2001, Alan Cox wrote:

> > what is the best/recommended way to do remote swapping via the network
> > for diskless workstations or compute nodes in clusters in Linux 2.4?
> > Last time i checked was linux 2.2, and there were some races related
> > to network swapping back then. Has this been fixed for 2.4?
>
> The best answer probably is "don't". Networks are high latency
> things for paging and paging is latency sensitive.

Actually, swap over network can be faster than local swap
at times. ;)

Don't forget that disks are really high latency devices
and with local swap you are SURE that the data isn't
in memory while with remote swap you have a chance that
the server is caching your data ...

regards,

Rik
--
IA64: a worthy successor to the i860.

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



* Re: Swapping for diskless nodes
  2001-08-09  9:08 ` Alan Cox
  2001-08-09 10:50   ` Ingo Oeser
  2001-08-09 14:17   ` Dirk W. Steinberg
@ 2001-08-09 19:27   ` Pavel Machek
  2001-08-09 20:38   ` Rik van Riel
  3 siblings, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2001-08-09 19:27 UTC (permalink / raw)
  To: Alan Cox, Dirk W. Steinberg; +Cc: linux-kernel

Hi!

> > what is the best/recommended way to do remote swapping via the network
> > for diskless workstations or compute nodes in clusters in Linux 2.4?
> > Last time i checked was linux 2.2, and there were some races related
> > to network swapping back then. Has this been fixed for 2.4?
> 
> The best answer probably is "don't". Networks are high latency things for
> paging and paging is latency sensitive. If performance is not an issue then
> the nbd driver ought to work. You may need to check it uses the
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Alan, are you saying it should work reliably?

> right
> GFP_ levels to avoid deadlocks and you might need to up the amount of atomic
> pool memory. Hopefully other hacks arent needed

There still may be some deadlocks. Swapping over nbd seemed to work
for me... until I used mem=8M and did two ping -f's to the victim.

The issue is that you need to check not only nbd but the whole
network layer, too.
								Pavel
-- 
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org


* Re: Swapping for diskless nodes
  2001-08-09 15:19 ` Alan Cox
@ 2001-08-09 17:09   ` Eric W. Biederman
  2001-08-09 20:58     ` Rik van Riel
  0 siblings, 1 reply; 32+ messages in thread
From: Eric W. Biederman @ 2001-08-09 17:09 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dirk W. Steinberg, Ingo Oeser, linux-kernel, linux-mm

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> > the memory of a fast server could have much less latency than writing
> > that page out to a local old, slow IDE disk. Clusters could even have
> > special high-bandwidth, low latency networks that could be used for
> > remote paging.
> > 
> > In a perfect world, all nodes in a cluster would be able to dynamically 
> > share a pool of "cluster swap" space, so any locally available swap that
> > is not used could be utilized by other nodes in the cluster.
> 
> That I think is a 2.5 problem. One thing that has been talked about several
> times now is removing all the swap special case crap from the mm and making
> swap a file system. That removes special cases and means anyone can write
> or use custom, or multiple swap filesystems, in theory including things like
> swap over a shared GFS pool
> 
> But its not for 2.4, no way

I don't know about that.  We already can swap over just about everything
because we can swap over the loopback device.  So making the swapping
code do the right thing is not that big of an allowance, nor that
much extra code, so if 2.5 actually starts up I can see us doing that.

Eric


* Re: Swapping for diskless nodes
       [not found] <no.id>
  2001-08-09  9:08 ` Alan Cox
  2001-08-09 15:14 ` Alan Cox
@ 2001-08-09 15:19 ` Alan Cox
  2001-08-09 17:09   ` Eric W. Biederman
  2 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2001-08-09 15:19 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: Ingo Oeser, linux-kernel, linux-mm, Alan Cox

> the memory of a fast server could have much less latency than writing
> that page out to a local old, slow IDE disk. Clusters could even have
> special high-bandwidth, low latency networks that could be used for
> remote paging.
> 
> In a perfect world, all nodes in a cluster would be able to dynamically 
> share a pool of "cluster swap" space, so any locally available swap that
> is not used could be utilized by other nodes in the cluster.

That I think is a 2.5 problem. One thing that has been talked about several
times now is removing all the swap special case crap from the mm and making
swap a file system. That removes special cases and means anyone can write
or use custom, or multiple swap filesystems, in theory including things like
swap over a shared GFS pool

But its not for 2.4, no way

Alan


* Re: Swapping for diskless nodes
       [not found] <no.id>
  2001-08-09  9:08 ` Alan Cox
@ 2001-08-09 15:14 ` Alan Cox
  2001-08-11  1:17   ` Pavel Machek
  2001-08-09 15:19 ` Alan Cox
  2 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2001-08-09 15:14 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: Alan Cox, linux-kernel

> 
> Alan,
> 
> what you say sound a lot like a hacker solution ("check that it uses the
> right GFP_ levels"). I think it's about time that this deficit of linux

Nope. I'm simply advising people to check that nbd is correctly written.

> as compared to SunOS or *BSD should be removed. Network paging should be
> supported as a standard feature of a stock kernel compile.

There I'd agree entirely.


* Re: Swapping for diskless nodes
  2001-08-09 14:17   ` Dirk W. Steinberg
@ 2001-08-09 14:36     ` Andreas Haumer
  2001-08-11  1:11       ` Pavel Machek
  0 siblings, 1 reply; 32+ messages in thread
From: Andreas Haumer @ 2001-08-09 14:36 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: Alan Cox, linux-kernel

Hi!

"Dirk W. Steinberg" wrote:
> 
> Alan,
> 
> what you say sound a lot like a hacker solution ("check that it uses the
> right GFP_ levels"). I think it's about time that this deficit of linux
> as compared to SunOS or *BSD should be removed. Network paging should be
> supported as a standard feature of a stock kernel compile.
> 
We have swapping over NBD running for some time now on
our "xS+S Diskless Client" system, and it works really
fine! No problem running StarOffice, Netscape, The Gimp
and KDE on a 128MB Diskless Client and 250MB swap over a 
100MBit switched ethernet!

Check <http://www.xss.co.at/linux/NBD/Applications.html>
to find our solution for that.

Kernel patches are a little bit outdated, but we have NBD swap
for 2.2.19 running internally since this week, and we will
update our web-page soon.

Let us hear if it works for you.

Regards,

- andreas

-- 
Andreas Haumer                     | mailto:andreas@xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71


* Re: Swapping for diskless nodes
  2001-08-09  9:08 ` Alan Cox
  2001-08-09 10:50   ` Ingo Oeser
@ 2001-08-09 14:17   ` Dirk W. Steinberg
  2001-08-09 14:36     ` Andreas Haumer
  2001-08-09 19:27   ` Pavel Machek
  2001-08-09 20:38   ` Rik van Riel
  3 siblings, 1 reply; 32+ messages in thread
From: Dirk W. Steinberg @ 2001-08-09 14:17 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan,

what you say sounds a lot like a hacker solution ("check that it uses the
right GFP_ levels"). I think it's about time that this deficit of linux
as compared to SunOS or *BSD should be removed. Network paging should be
supported as a standard feature of a stock kernel compile.

/ Dirk

Alan Cox wrote:
> > what is the best/recommended way to do remote swapping via the network
> > for diskless workstations or compute nodes in clusters in Linux 2.4?
> > Last time i checked was linux 2.2, and there were some races related
> > to network swapping back then. Has this been fixed for 2.4?
> 
> The best answer probably is "don't". Networks are high latency things for
> paging and paging is latency sensitive. If performance is not an issue then
> the nbd driver ought to work. You may need to check it uses the right
> GFP_ levels to avoid deadlocks and you might need to up the amount of atomic
> pool memory. Hopefully other hacks arent needed
> 
> [The general case of network swap is basically insoluble but its possible to
>  make it perfectly usable as Sun proved]


* Re: Swapping for diskless nodes
  2001-08-09 10:50   ` Ingo Oeser
@ 2001-08-09 13:12     ` Dirk W. Steinberg
  2001-08-09 20:47     ` Rik van Riel
  1 sibling, 0 replies; 32+ messages in thread
From: Dirk W. Steinberg @ 2001-08-09 13:12 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: linux-kernel, linux-mm, Alan Cox

I'd like to second that example where you have weak diskless nodes and
a big server with a lot of memory. The important point here is that the
remote paging does not need to really write to the remote disk, especially
not synchronously. The page could eventually be migrated to the remote
disk asynchronously, or maybe not at all if there is no memory pressure
at the remote system.

In such a scenario I would disagree with Alan that network paging is 
high latency as compared to disk access. I have a fully switched 100 Mbps
full-duplex ethernet network, and sending a page across the net into
the memory of a fast server could have much less latency than writing
that page out to a local old, slow IDE disk. Clusters could even have
special high-bandwidth, low latency networks that could be used for
remote paging.
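
A back-of-the-envelope sketch of the wire time behind that claim, assuming a 4096-byte page and ignoring protocol overhead, switch latency and server processing (`wire_time_us` is a hypothetical helper):

```c
#include <stddef.h>

/* Time in microseconds to put `bytes` onto a link running at
 * `mbit_per_s` megabits per second. Bits divided by Mbit/s yields
 * microseconds directly, since 1 Mbit/s is one bit per microsecond.
 * Payload only: headers, ACKs and queueing are not modelled. */
static double wire_time_us(size_t bytes, double mbit_per_s)
{
    return (double)bytes * 8.0 / mbit_per_s;
}
```

For one page at 100 Mbps this gives 4096 * 8 / 100 = 327.68 microseconds, roughly a third of a millisecond. A seek on an older IDE disk is commonly on the order of ten milliseconds (an assumed ballpark, not a measurement), so the raw transfer time alone does not rule out the network path.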

In a perfect world, all nodes in a cluster would be able to dynamically 
share a pool of "cluster swap" space, so any locally available swap that
is not used could be utilized by other nodes in the cluster.

/ Dirk

Ingo Oeser wrote:
> On Thu, Aug 09, 2001 at 10:08:37AM +0100, Alan Cox wrote:
> > > what is the best/recommended way to do remote swapping via the network
> > > > for diskless workstations or compute nodes in clusters in Linux 2.4?
> > > > Last time i checked was linux 2.2, and there were some races related
> > > to network swapping back then. Has this been fixed for 2.4?
> >
> > The best answer probably is "don't". Networks are high latency things for
> > paging and paging is latency sensitive. If performance is not an issue then
> > the nbd driver ought to work. You may need to check it uses the right
> > GFP_ levels to avoid deadlocks and you might need to up the amount of atomic
> > pool memory. Hopefully other hacks arent needed
> 
> While we are on it: I have an old machine with 64MB of RAM and a
> new, fast machine with 1GB of RAM.
> 
> Sometimes I need more RAM on the old one and asked myself,
> whether I could first swap over network to the other one, into
> its tmpfs, before digging into real swap on a hard disk.
> 
> I have only three machines attached to this small internal
> 100Mbit LAN.
> 
> Both machines use Kernel 2.4.x.


* Re: Swapping for diskless nodes
  2001-08-09  9:08 ` Alan Cox
@ 2001-08-09 10:50   ` Ingo Oeser
  2001-08-09 13:12     ` Dirk W. Steinberg
  2001-08-09 20:47     ` Rik van Riel
  2001-08-09 14:17   ` Dirk W. Steinberg
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 32+ messages in thread
From: Ingo Oeser @ 2001-08-09 10:50 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm

On Thu, Aug 09, 2001 at 10:08:37AM +0100, Alan Cox wrote:
> > what is the best/recommended way to do remote swapping via the network
> > for diskless workstations or compute nodes in clusters in Linux 2.4?
> > Last time i checked was linux 2.2, and there were some races related
> > to network swapping back then. Has this been fixed for 2.4?
> 
> The best answer probably is "don't". Networks are high latency things for
> paging and paging is latency sensitive. If performance is not an issue then
> the nbd driver ought to work. You may need to check it uses the right
> GFP_ levels to avoid deadlocks and you might need to up the amount of atomic
> pool memory. Hopefully other hacks arent needed

While we are on it: I have an old machine with 64MB of RAM and a
new, fast machine with 1GB of RAM. 

Sometimes I need more RAM on the old one and asked myself,
whether I could first swap over network to the other one, into
its tmpfs, before digging into real swap on a hard disk.

I have only three machines attached to this small internal
100Mbit LAN.

Both machines use Kernel 2.4.x.

Are there any races I have to consider?

Thanks & Regards

Ingo Oeser
-- 
In der Wunschphantasie vieler Mann-Typen [ist die Frau] unsigned und
operatorvertraeglich. --- Dietz Proepper in dasr


* Re: Swapping for diskless nodes
       [not found] <no.id>
@ 2001-08-09  9:08 ` Alan Cox
  2001-08-09 10:50   ` Ingo Oeser
                     ` (3 more replies)
  2001-08-09 15:14 ` Alan Cox
  2001-08-09 15:19 ` Alan Cox
  2 siblings, 4 replies; 32+ messages in thread
From: Alan Cox @ 2001-08-09  9:08 UTC (permalink / raw)
  To: Dirk W. Steinberg; +Cc: linux-kernel

> what is the best/recommended way to do remote swapping via the network
> for diskless workstations or compute nodes in clusters in Linux 2.4?
> Last time i checked was linux 2.2, and there were some races related
> to network swapping back then. Has this been fixed for 2.4?

The best answer probably is "don't". Networks are high latency things for
paging and paging is latency sensitive. If performance is not an issue then
the nbd driver ought to work. You may need to check it uses the right
GFP_ levels to avoid deadlocks and you might need to up the amount of atomic
pool memory. Hopefully other hacks aren't needed.

[The general case of network swap is basically insoluble but it's possible to
 make it perfectly usable, as Sun proved]

Alan


* Swapping for diskless nodes
@ 2001-08-09  8:51 Dirk W. Steinberg
  0 siblings, 0 replies; 32+ messages in thread
From: Dirk W. Steinberg @ 2001-08-09  8:51 UTC (permalink / raw)
  To: linux-kernel

Hi,

what is the best/recommended way to do remote swapping via the network
for diskless workstations or compute nodes in clusters in Linux 2.4? 
Last time I checked was Linux 2.2, and there were some races related
to network swapping back then. Has this been fixed for 2.4?

What about the following options: Do they work at all? What are the advantages/
disadvantages? What are the performance implications? Race conditions?

1. Swapping via NFS? There was a patch for this for 2.2? Is there such a
   patch for 2.4 as well? Should one use UDP or TCP? NFSv2? NFSv3?

2. Using some sort of network block device (nbd, new nbd, gnbd, drbd, 
   possibly others?). Which one to use? I suspect that for performance
   a kernel mode implementation is needed for both client and server.

3. iSCSI. There are several implementations, and I don't know if any of 
   these is ready for production use. Both initiator and target 
   implementation would be needed because I don't have any native iSCSI
   targets available.

4. Swapping to GFS? Is that possible? Even if GFS is based on gnbd, not FC?

5. Anything else? Maybe some implementation of network memory in the context
   of a cluster computing environment (MOSIX, etc.).

Thanks for any answers.

Cheers,
	Dirk

------------------------------------------
Ingenieurbüro Dipl.-Ing. Dirk W. Steinberg
Ringstr. 2, D-53567 Buchholz, Germany
Phone: +49-2683-9793-20, fax: -29
Mobile/GSM: +49-170-818-9793
Email: dws@dirksteinberg.de


end of thread, other threads:[~2001-08-17 22:57 UTC | newest]

Thread overview: 32+ messages
2001-08-09 14:26 Swapping for diskless nodes Bulent Abali
2001-08-09 15:13 ` Alan Cox
2001-08-09 20:57   ` Rik van Riel
2001-08-09 22:46     ` Alan Cox
2001-08-11  1:16       ` Pavel Machek
2001-08-11  1:13   ` Pavel Machek
2001-08-14 12:57     ` Alan Cox
2001-08-16 21:46       ` Pavel Machek
2001-08-17  0:46         ` Rik van Riel
2001-08-17  1:35           ` Jakob Østergaard
2001-08-17 21:23             ` Pavel Machek
2001-08-17  6:42           ` Andreas Haumer
2001-08-17 21:25             ` Pavel Machek
2001-08-17 21:03           ` Andreas Haumer
2001-08-17 22:31             ` Dirk W. Steinberg
2001-08-17 22:57               ` Pavel Machek
     [not found] <no.id>
2001-08-09  9:08 ` Alan Cox
2001-08-09 10:50   ` Ingo Oeser
2001-08-09 13:12     ` Dirk W. Steinberg
2001-08-09 20:47     ` Rik van Riel
2001-08-09 14:17   ` Dirk W. Steinberg
2001-08-09 14:36     ` Andreas Haumer
2001-08-11  1:11       ` Pavel Machek
2001-08-09 19:27   ` Pavel Machek
2001-08-09 20:38   ` Rik van Riel
2001-08-09 15:14 ` Alan Cox
2001-08-11  1:17   ` Pavel Machek
2001-08-09 15:19 ` Alan Cox
2001-08-09 17:09   ` Eric W. Biederman
2001-08-09 20:58     ` Rik van Riel
2001-08-10  8:11       ` Eric W. Biederman
  -- strict thread matches above, loose matches on Subject: below --
2001-08-09  8:51 Dirk W. Steinberg
