linux-kernel.vger.kernel.org archive mirror
* Re: Break 2.4 VM in five easy steps
@ 2001-06-06 15:31 Derek Glidden
  2001-06-06 15:46 ` John Alvord
  2001-06-06 21:30 ` Alan Cox
  0 siblings, 2 replies; 35+ messages in thread
From: Derek Glidden @ 2001-06-06 15:31 UTC
  To: Alexander Viro, linux-kernel


> Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD
> systems I've used were broken, but I've never thought that swap==2*RAM rule
> was one of them.

Yes, but Linux isn't 4.3BSD, SunOS or post-4.4 BSD.  Not to mention, all
other OS's I've had experience using *don't* break severely if you don't
follow the "swap==2*RAM" rule.  Except Linux 2.4.

> Not that being more kind on swap would be a bad thing, but that rule for
> amount of swap is pretty common. ISTR similar for (very old) SCO, so it's
> not just BSD world. How are modern Missed'em'V variants in that respect, BTW?

Yes, but that has traditionally been one of the big BENEFITS of Linux,
and other UNIXes.  As Sean Hunter said, "Virtual memory is one of the
killer features of unix."  Linux has *never* in the past REQUIRED me to
follow that rule.  Which is a big reason I use it in so many places.

Take an example mentioned by someone on the list already: a laptop.  I
have two laptops that run Linux.  One has a 4GB disk, one has a 12GB
disk.  Both disks are VERY full of data and both machines get pretty
heavy use.  It's a fact that I just bumped one laptop (with 256MB of
swap configured) from 128MB to 256MB of RAM.  Does this mean that if I
want to upgrade to the 2.4 kernel on that machine I now have to back up
all that data, repartition the drive and restore everything just so I
can fastidiously follow the "swap == 2*RAM" rule, or else the 2.4 VM
subsystem will break?  Bollocks, to quote yet another participant in
this silly discussion.

I'm beginning to be amazed at the Linux VM hackers' attitudes regarding
this problem.  I expect this sort of behaviour from academics - ignoring
real actual problems being reported by real actual people really and
actually experiencing and reporting them because "technically" or
"theoretically" they "shouldn't be an issue" or because "the literature"
[documentation] says otherwise - but not from this group.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-07 20:33 Bulent Abali
  2001-06-07 19:40 ` Marcelo Tosatti
  2001-06-08 21:11 ` Marcelo Tosatti
  0 siblings, 2 replies; 35+ messages in thread
From: Bulent Abali @ 2001-06-07 20:33 UTC
  To: Marcelo Tosatti
  Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm





>This is for the people who have been experiencing the lockups while running
>swapoff.
>
>Please test. (against 2.4.6-pre1)
>
>
>--- linux.orig/mm/swapfile.c Wed Jun  6 18:16:45 2001
>+++ linux/mm/swapfile.c Thu Jun  7 16:06:11 2001
>@@ -345,6 +345,8 @@
>         /*
>          * Find a swap page in use and read it in.
>          */
>+        if (current->need_resched)
>+             schedule();
>         swap_device_lock(si);
>         for (i = 1; i < si->max ; i++) {
>              if (si->swap_map[i] > 0 && si->swap_map[i] != SWAP_MAP_BAD) {


I tested your patch against 2.4.5.  It works.  No more lockups.  Without
the patch it took 14 minutes 51 seconds to complete swapoff (this is to
recover 1.5GB of swap space).  During this time the system was frozen.  No
keyboard, no screen, etc.  Practically locked up.

With the patch there are no more lockups.  Swapoff kept running in the
background.  This is a winner.
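
A rough user-space analogue of the two added lines, offered only as an
illustration and not as kernel code.  It assumes a toy swap map in which 0
means "free" and one reserved value means "bad".  The names
scan_with_yield and TOY_MAP_BAD, and the value given to TOY_MAP_BAD, are
made up for this sketch.  Note one difference: sched_yield() gives up the
CPU unconditionally, whereas the patch only calls schedule() when
current->need_resched is set.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical marker for an unusable slot; the value is illustrative only. */
#define TOY_MAP_BAD 0x8000u

/*
 * Walk a toy swap map and count the entries that are in use.
 * Before handling each in-use entry, offer the CPU to other runnable
 * tasks, so a long scan cannot monopolise the processor.
 * Slot 0 is skipped, as in the loop shown in the patch context.
 */
static size_t scan_with_yield(const unsigned short *swap_map, size_t max,
                              size_t *yields)
{
    size_t in_use = 0;

    assert(swap_map != NULL && yields != NULL);
    *yields = 0;

    for (size_t i = 1; i < max; i++) {
        if (swap_map[i] == 0 || swap_map[i] == TOY_MAP_BAD)
            continue;           /* free or bad slot: nothing to bring in */

        sched_yield();          /* unconditional here; conditional in the patch */
        (*yields)++;
        in_use++;               /* stand-in for "read the page back in" */
    }
    return in_use;
}
```

Yielding changes who gets the CPU between entries, not how much work the
scan does in total, which is consistent with swapoff still burning cycles
until it completes.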

But here is the caveat: swapoff keeps burning 100% of the cycles until it
completes.  This is not going to be a big deal during shutdowns.  It is
only going to be a problem when you run swapoff from the command line.

I looked at try_to_unuse in swapfile.c.  I believe that the algorithm is
broken.  For each and every swap entry it is walking the entire process
list (for_each_task(p)).  It is also grabbing a whole bunch of locks for
each swap entry.  It might be worthwhile processing swap entries in
batches instead of one entry at a time.
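
The cost argument above can be sketched with a toy counter.  This is a
model under stated assumptions, not the try_to_unuse() code: it counts
only how many times a task is visited, and it assumes that one walk of
the task list could serve a whole batch of swap entries.  The function
names and the batch parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Task visits when every swap entry triggers a full walk of the task list. */
static unsigned long visits_per_entry(size_t entries, size_t tasks)
{
    unsigned long visits = 0;

    for (size_t e = 0; e < entries; e++)
        for (size_t t = 0; t < tasks; t++)
            visits++;
    return visits;
}

/* Task visits when one walk of the task list serves `batch` entries at once. */
static unsigned long visits_batched(size_t entries, size_t tasks, size_t batch)
{
    unsigned long visits = 0;

    assert(batch > 0);
    for (size_t start = 0; start < entries; start += batch)
        for (size_t t = 0; t < tasks; t++)
            visits++;   /* covers entries [start, start + batch) */
    return visits;
}
```

With 1000 entries and 50 tasks, the per-entry walk makes 50000 task
visits, while batches of 64 make 800.  The model ignores the page-table
work done for each task, which batching by itself would not remove.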

In any case, I think having this patch is worthwhile as a quick and dirty
remedy.

Bulent Abali




* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-08 23:53 Bulent Abali
  0 siblings, 0 replies; 35+ messages in thread
From: Bulent Abali @ 2001-06-08 23:53 UTC
  To: Marcelo Tosatti
  Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm


>> I looked at try_to_unuse in swapfile.c.  I believe that the algorithm is
>> broken.
>> For each and every swap entry it is walking the entire process list
>> (for_each_task(p)).  It is also grabbing a whole bunch of locks
>> for each swap entry.  It might be worthwhile processing swap entries in
>> batches instead of one entry at a time.
>>
>> In any case, I think having this patch is worthwhile as a quick and
>> dirty remedy.
>
>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?

No problem.  I will check it tomorrow.  I don't think it can be any worse
than it is now.  The patch looks correct in principle.
I believe it should go into 2.4.6.  But I will test it.

On small machines people don't notice it, but if you have a few
GB of memory it really hurts.  Shutdowns take forever since swapoff takes
forever.





* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-09 20:32 Bulent Abali
  2001-06-10  2:12 ` Eric W. Biederman
  0 siblings, 1 reply; 35+ messages in thread
From: Bulent Abali @ 2001-06-09 20:32 UTC
  To: Marcelo Tosatti
  Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm,
	Stephen Tweedie




>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?

Marcelo,

It works as expected.  It doesn't lock up the box, but swapoff keeps
burning CPU cycles.  It took 4 1/2 minutes to swapoff about 256MB of swap
content.  Shutdown took just as long.  I was hoping that shutdown would
kill the swapoff process but it doesn't.  It just hangs there.  Shutdown
is the common case.  Therefore, swapoff needs to be optimized for
shutdowns.  You can imagine users' frustration waiting for a shutdown when
there are gigabytes in the swap.

So to summarize, the schedule patch is better than nothing but falls far
short.  I would put it in 2.4.6.  Read on.

----------

The problem is with the try_to_unuse() algorithm, which is very inefficient.
I searched the linux-mm archives and Tweedie was on to this.  This is what
he wrote:  "it is much cheaper to find a swap entry for a given page than
to find the swap cache page for a given swap entry."  And he posted a
patch: http://mail.nl.linux.org/linux-mm/2001-03/msg00224.html
His patch is in the Red Hat 7.1 kernel 2.4.2-2 and not in 2.4.5.

But in any case I believe the patch will not work as expected.
It seems to me that he is calling the function check_orphaned_swap(page)
in the wrong place.  He is calling the function while scanning the
active_list in refill_inactive_scan().  The problem with that is if you
wait 60 seconds or longer the orphaned swap pages will move from the active
to the inactive lists.  Therefore the function will miss the orphans in the
inactive lists.  Any comments?
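
The concern about where the check is called can be shown with a toy
model.  Everything here is an assumption made for illustration: struct
toy_page, its fields, and the definition of "orphan" (a swap-cache page
whose swap entry has no remaining users) are hypothetical and are not
taken from Tweedie's patch or from the kernel's page structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor used only in this sketch. */
struct toy_page {
    bool in_swap_cache;   /* page is held in the swap cache */
    int  swap_users;      /* processes still referencing its swap entry */
};

/* Assumed definition: in the swap cache, but no process uses the entry. */
static bool toy_is_orphan(const struct toy_page *p)
{
    return p->in_swap_cache && p->swap_users == 0;
}

/* Count the orphans that a scan of one list would find. */
static size_t toy_count_orphans(const struct toy_page *list, size_t n)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++)
        if (toy_is_orphan(&list[i]))
            found++;
    return found;
}
```

If a scan calls toy_count_orphans() on the active list alone, any orphan
that has already aged onto an inactive list is never counted; a scan that
covers both lists finds all of them.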




* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-10 13:56 Bulent Abali
  0 siblings, 0 replies; 35+ messages in thread
From: Bulent Abali @ 2001-06-10 13:56 UTC
  To: Eric W. Biederman
  Cc: Marcelo Tosatti, Mike Galbraith, Derek Glidden, lkml, linux-mm,
	Stephen Tweedie



>The fix is to kill the dead/orphaned swap pages before we get to
>swapoff.  At shutdown time there is practically nothing active in
> ...
>Once the dead swap pages problem is fixed it is time to optimize
>swapoff.

I think fixing the orphaned swap pages problem will eliminate the
problem altogether.  Probably there is no need to optimize
swapoff.

As the system is shutting down, all the processes will be
killed and their pages in swap will be orphaned.  If those pages
were to be reaped in a timely manner there wouldn't be any work
left for swapoff.

Bulent




end of thread, other threads:[~2001-06-11  9:19 UTC | newest]

Thread overview: 35+ messages
2001-06-06 15:31 Break 2.4 VM in five easy steps Derek Glidden
2001-06-06 15:46 ` John Alvord
2001-06-06 15:58   ` Derek Glidden
2001-06-06 18:27     ` Eric W. Biederman
2001-06-06 18:47       ` Derek Glidden
2001-06-06 18:52         ` Eric W. Biederman
2001-06-06 19:06           ` Mike Galbraith
2001-06-06 19:28             ` Eric W. Biederman
2001-06-07  4:32               ` Mike Galbraith
2001-06-07  6:38                 ` Eric W. Biederman
2001-06-07  7:28                   ` Mike Galbraith
2001-06-07  7:59                     ` Eric W. Biederman
2001-06-07  8:15                       ` Mike Galbraith
2001-06-07 17:10                 ` Marcelo Tosatti
2001-06-07 17:43                   ` Please test: workaround to help swapoff behaviour Marcelo Tosatti
2001-06-06 19:28           ` Break 2.4 VM in five easy steps Derek Glidden
2001-06-09  7:55           ` Rik van Riel
2001-06-06 20:43       ` Daniel Phillips
2001-06-06 21:57       ` LA Walsh
2001-06-07  6:35         ` Eric W. Biederman
2001-06-07 15:25           ` LA Walsh
2001-06-07 16:42             ` Eric W. Biederman
2001-06-07 20:47               ` LA Walsh
2001-06-08 19:38                 ` Pavel Machek
2001-06-09  7:34     ` Rik van Riel
2001-06-06 21:30 ` Alan Cox
2001-06-06 21:57   ` Derek Glidden
2001-06-09  8:09     ` Rik van Riel
2001-06-07 20:33 Please test: workaround to help swapoff behaviour Bulent Abali
2001-06-07 19:40 ` Marcelo Tosatti
2001-06-08 21:11 ` Marcelo Tosatti
2001-06-08 23:53 Bulent Abali
2001-06-09 20:32 Bulent Abali
2001-06-10  2:12 ` Eric W. Biederman
2001-06-10 13:56 Bulent Abali
