* Software suspend testing in 2.6.0-test1 @ 2003-07-17 19:46 Peter Osterlund 2003-07-17 20:00 ` Pavel Machek 0 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-17 19:46 UTC (permalink / raw) To: Kernel Mailing List; +Cc: Pavel Machek, Andrew Morton Hi, I have done some testing of the software suspend function in 2.6.0-test1. It works mostly very well, but I have found two problems. The first problem is that software suspend fails if a process is stopped before you invoke suspend. (For example, by starting cat from the shell and pressing ctrl-z.) When the processes are woken up again, the cat process is stuck in the schedule loop in refrigerator(), sucking up all available cpu time. The second problem is that freeing memory seems to be much slower than it has to be. It appears to be caused by the call to blk_congestion_wait() in balance_pgdat(). The patch below makes page freeing much faster, although I'm quite sure the patch is not correct. How can we fix this properly? The disk is mostly idle during page freeing, but it looks like blk_congestion_wait still doesn't return until the timeout expires. I tried HZ/2 and that made the page freeing extremely slow. --- linux/mm/vmscan.c.old Thu Jul 17 21:30:09 2003 +++ linux/mm/vmscan.c Thu Jul 17 21:29:58 2003 @@ -930,7 +930,7 @@ } if (all_zones_ok) break; - blk_congestion_wait(WRITE, HZ/10); + blk_congestion_wait(WRITE, HZ/50); } return nr_pages - to_free; } -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-17 19:46 Software suspend testing in 2.6.0-test1 Peter Osterlund @ 2003-07-17 20:00 ` Pavel Machek 2003-07-17 20:09 ` Andrew Morton 2003-07-21 10:00 ` Peter Osterlund 0 siblings, 2 replies; 26+ messages in thread From: Pavel Machek @ 2003-07-17 20:00 UTC (permalink / raw) To: Peter Osterlund; +Cc: Kernel Mailing List, Andrew Morton Hi! > I have done some testing of the software suspend function in > 2.6.0-test1. It works mostly very well, but I have found two problems. > > The first problem is that software suspend fails if a process is > stopped before you invoke suspend. (For example, by starting cat from > the shell and pressing ctrl-z.) When the processes are woken up again, > the cat process is stuck in the schedule loop in refrigerator(), > sucking up all available cpu time. Thanks for a report. If it came with a patch it would be better ;-). I'll take a look. > The second problem is that freeing memory seems to be much slower than > it has to be. It appears to be caused by the call to > blk_congestion_wait() in balance_pgdat(). The patch below makes page > freeing much faster, although I'm quite sure the patch is not correct. > > How can we fix this properly? The disk is mostly idle during page > freeing, but it looks like blk_congestion_wait still doesn't return > until the timeout expires. I tried HZ/2 and that made the page freeing > extremely slow. > > --- linux/mm/vmscan.c.old Thu Jul 17 21:30:09 2003 > +++ linux/mm/vmscan.c Thu Jul 17 21:29:58 2003 > @@ -930,7 +930,7 @@ > } > if (all_zones_ok) > break; > - blk_congestion_wait(WRITE, HZ/10); > + blk_congestion_wait(WRITE, HZ/50); > } > return nr_pages - to_free; > } This is certainly not okay. Andrew, you know more about vm internals... What does this ugly constant mean? Would it be possible to somehow make it variable and set it to zero during software suspend memory freeing? Pavel -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-17 20:00 ` Pavel Machek @ 2003-07-17 20:09 ` Andrew Morton 2003-07-18 9:59 ` Peter Osterlund 2003-07-21 10:00 ` Peter Osterlund 1 sibling, 1 reply; 26+ messages in thread From: Andrew Morton @ 2003-07-17 20:09 UTC (permalink / raw) To: Pavel Machek; +Cc: petero2, linux-kernel Pavel Machek <pavel@ucw.cz> wrote: > > > --- linux/mm/vmscan.c.old Thu Jul 17 21:30:09 2003 > > +++ linux/mm/vmscan.c Thu Jul 17 21:29:58 2003 > > @@ -930,7 +930,7 @@ > > } > > if (all_zones_ok) > > break; > > - blk_congestion_wait(WRITE, HZ/10); > > + blk_congestion_wait(WRITE, HZ/50); > > } > > return nr_pages - to_free; > > } > > This is certainly not okay. Andrew, you know more about vm > internals... What does this ugly constant mean? Most of the time the timeout is a "can't happen" - blk_congestion_wait() is terminated by completion of writeout. The timeout is mainly there to prevent hangs if weird and rare races happen. Otherwise we'd need lots more locking. I don't think we want to be calling it at all if reclaim is working well. Something like this. diff -puN mm/vmscan.c~a mm/vmscan.c --- 25/mm/vmscan.c~a Thu Jul 17 13:05:36 2003 +++ 25-akpm/mm/vmscan.c Thu Jul 17 13:05:58 2003 @@ -921,7 +921,7 @@ static int balance_pgdat(pg_data_t *pgda if (i < ZONE_HIGHMEM) { reclaim_state->reclaimed_slab = 0; shrink_slab(max_scan + nr_mapped, GFP_KERNEL); - to_free += reclaim_state->reclaimed_slab; + to_free -= reclaim_state->reclaimed_slab; } if (zone->all_unreclaimable) continue; @@ -930,7 +930,8 @@ static int balance_pgdat(pg_data_t *pgda } if (all_zones_ok) break; - blk_congestion_wait(WRITE, HZ/10); + if (to_free) + blk_congestion_wait(WRITE, HZ/10); } return nr_pages - to_free; } _ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-17 20:09 ` Andrew Morton @ 2003-07-18 9:59 ` Peter Osterlund 2003-07-18 10:24 ` Andrew Morton 0 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-18 9:59 UTC (permalink / raw) To: Andrew Morton; +Cc: Pavel Machek, linux-kernel Andrew Morton <akpm@osdl.org> writes: > Pavel Machek <pavel@ucw.cz> wrote: > > > > > --- linux/mm/vmscan.c.old Thu Jul 17 21:30:09 2003 > > > +++ linux/mm/vmscan.c Thu Jul 17 21:29:58 2003 > > > @@ -930,7 +930,7 @@ > > > } > > > if (all_zones_ok) > > > break; > > > - blk_congestion_wait(WRITE, HZ/10); > > > + blk_congestion_wait(WRITE, HZ/50); > > > } > > > return nr_pages - to_free; > > > } > > > > This is certainly not okay. Andrew, you know more about vm > > internals... What does this ugly constant mean? > > Most of the time the timeout is a "can't happen" - blk_congestion_wait() is > terminated by completion of writeout. The timeout is mainly there to > prevent hangs if weird and rare races happen. Otherwise we'd need lots > more locking. > > I don't think we want to be calling it at all if reclaim is working well. > Something like this. ... > - blk_congestion_wait(WRITE, HZ/10); > + if (to_free) > + blk_congestion_wait(WRITE, HZ/10); It doesn't make any difference in my tests. to_free is very large when called from the suspend code, so it almost never becomes zero. With the patch below, page freeing is very fast on my computer. The idea is to not call blk_congestion_wait at all until we can't make any more progress. Then make the call, try again, and if we still can't make any more progress, give up. I've tested this patch in two scenarios. The first is to boot rh7.3 on a 256MB computer to runlevel 3, log in and start emacs, then immediately suspend. The old code could easily take 15s while the new code runs in ~1s. The second test case was to also start a program that allocates 200MB and goes to sleep. The new code swaps out all data and the disk light is turned on constantly. The old code swapped out some data (the disk light was not turned on constantly), then gave up and failed to suspend. (I only tested the old code once. Maybe this was just "bad luck".) If this patch is an acceptable approach to fix the problem, the balance_pgdat function should probably be cleaned up. We either call it with nr_pages=0 (from kswapd) or nr_pages="big number" (from shrink_all_memory). Maybe we should replace the nr_pages parameter with a boolean parameter, or just split the function in two. It should be possible to make some simplifications for the "free as much as possible" case. What do you think? diff -u -r orig/linux-p1/kernel/suspend.c linux-p1/kernel/suspend.c --- orig/linux-p1/kernel/suspend.c Sat Jul 12 00:52:16 2003 +++ linux-p1/kernel/suspend.c Fri Jul 18 11:30:29 2003 @@ -621,9 +621,20 @@ */ static void free_some_memory(void) { + int state = 0; + printk("Freeing memory: "); - while (shrink_all_memory(10000)) + for (;;) { + if (shrink_all_memory(10000)) { + state = 0; + } else { + state++; + if (state > 1) + break; + blk_congestion_wait(WRITE, HZ/5); + } printk("."); + } printk("|\n"); } diff -u -r orig/linux-p1/mm/vmscan.c linux-p1/mm/vmscan.c --- orig/linux-p1/mm/vmscan.c Mon Jul 14 09:07:43 2003 +++ linux-p1/mm/vmscan.c Fri Jul 18 11:30:38 2003 @@ -921,7 +921,7 @@ if (i < ZONE_HIGHMEM) { reclaim_state->reclaimed_slab = 0; shrink_slab(max_scan + nr_mapped, GFP_KERNEL); - to_free += reclaim_state->reclaimed_slab; + to_free -= reclaim_state->reclaimed_slab; } if (zone->all_unreclaimable) continue; @@ -930,7 +930,8 @@ } if (all_zones_ok) break; - blk_congestion_wait(WRITE, HZ/10); + if (nr_pages == 0) + blk_congestion_wait(WRITE, HZ/10); } return nr_pages - to_free; } -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 9:59 ` Peter Osterlund @ 2003-07-18 10:24 ` Andrew Morton 2003-07-18 15:22 ` Pavel Machek 0 siblings, 1 reply; 26+ messages in thread From: Andrew Morton @ 2003-07-18 10:24 UTC (permalink / raw) To: Peter Osterlund; +Cc: pavel, linux-kernel Peter Osterlund <petero2@telia.com> wrote: > > If this patch is an acceptable approach to fix the problem, Seems reasonable. > the balance_pgdat function should probably be cleaned up. Well it was rather bolted on the side of the kswapd code. But from an API perspective, being able to tell it how many page to free is a bit more flexible. Minor point. However I'm trying to remember why the code exists at all. Why doesn't swsusp just allocate lots of pages then free them again? Something like: LIST_HEAD(list); int sleep_count = 0; while (sleep_count < 10) { page = __alloc_pages(0, GFP_ATOMIC); if (page) { list_add(&page->list, &list); } else { blk_congestion_wait(WRITE, HZ/20); sleep_count++; } } <free all the pages on `list'> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 10:24 ` Andrew Morton @ 2003-07-18 15:22 ` Pavel Machek 2003-07-18 15:55 ` Peter Osterlund 0 siblings, 1 reply; 26+ messages in thread From: Pavel Machek @ 2003-07-18 15:22 UTC (permalink / raw) To: Andrew Morton; +Cc: Peter Osterlund, linux-kernel Hi! > > If this patch is an acceptable approach to fix the problem, > > Seems reasonable. > > > the balance_pgdat function should probably be cleaned up. > > Well it was rather bolted on the side of the kswapd code. But from an API > perspective, being able to tell it how many page to free is a bit more > flexible. Minor point. > > However I'm trying to remember why the code exists at all. Why doesn't > swsusp just allocate lots of pages then free them again? Because that either a) does not free enough pages or b) triggers OOM killer. It was actually your idea, IIRC ;-). Ahha, you seem to be addressing that in your code. Peter, perhaps you want to test that one? Pavel > Something like: > > LIST_HEAD(list); > int sleep_count = 0; > > while (sleep_count < 10) { > page = __alloc_pages(0, GFP_ATOMIC); > if (page) { > list_add(&page->list, &list); > } else { > blk_congestion_wait(WRITE, HZ/20); > sleep_count++; > } > } > <free all the pages on `list'> -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 15:22 ` Pavel Machek @ 2003-07-18 15:55 ` Peter Osterlund 2003-07-18 16:45 ` Andrew Morton 0 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-18 15:55 UTC (permalink / raw) To: Pavel Machek; +Cc: Andrew Morton, linux-kernel Pavel Machek <pavel@ucw.cz> writes: > > However I'm trying to remember why the code exists at all. Why doesn't > > swsusp just allocate lots of pages then free them again? > > Because that either > > a) does not free enough pages or > > b) triggers OOM killer. > > It was actually your idea, IIRC ;-). > > Ahha, you seem to be addressing that in your code. Peter, perhaps you > want to test that one? I tried the patch below, but it didn't work. Nothing (or very little) was swapped out to disk. I also tried using GFP_KERNEL, but that seemed to cause a deadlock. (Maybe it would have gone OOM if I had waited long enough). I think the problem is that pdflush and friends are already frozen when this code runs. --- linux/kernel/suspend.c.old Fri Jul 18 15:46:48 2003 +++ linux/kernel/suspend.c Fri Jul 18 15:45:51 2003 @@ -621,10 +621,32 @@ */ static void free_some_memory(void) { - printk("Freeing memory: "); - while (shrink_all_memory(10000)) - printk("."); + LIST_HEAD(list); + struct page *page, *tmp; + int sleep_count = 0; + int i = 0; + + while (sleep_count < 10) { + page = alloc_page(GFP_ATOMIC); + if (page) { + list_add(&page->list, &list); + sleep_count = 0; + } else { + blk_congestion_wait(WRITE, HZ/20); + sleep_count++; + } + i++; + if (!(i%1000)) + printk("."); + } printk("|\n"); + + i = 0; + list_for_each_entry_safe(page, tmp, &list, list) { + __free_page(page); + i++; + } + printk("%d pages freed\n", i); } /* Make disk drivers accept operations, again */ -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 15:55 ` Peter Osterlund @ 2003-07-18 16:45 ` Andrew Morton 2003-07-18 17:50 ` Pavel Machek ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Andrew Morton @ 2003-07-18 16:45 UTC (permalink / raw) To: Peter Osterlund; +Cc: pavel, linux-kernel Peter Osterlund <petero2@telia.com> wrote: > > I tried the patch below, but it didn't work. Nothing (or very little) > was swapped out to disk. I also tried using GFP_KERNEL, but that > seemed to cause a deadlock. (Maybe it would have gone OOM if I had > waited long enough). I think the problem is that pdflush and friends > are already frozen when this code runs. Oh, we shouldn't be doing this sort of thing when the kernel threads are refrigerated. We do need kswapd services for the trick you tried. And all flavours of ext3_writepage() can block on kjournald activity, so if kjournald is refrigerated during the memory shrink the machine can deadlock. It would be much better to freeze kernel threads _after_ doing the big memory shrink. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 16:45 ` Andrew Morton @ 2003-07-18 17:50 ` Pavel Machek 2003-07-18 18:02 ` Patrick Mochel 2003-07-18 19:37 ` Andrew Morton 2003-07-18 19:58 ` Peter Osterlund 2003-07-18 21:05 ` Nigel Cunningham 2 siblings, 2 replies; 26+ messages in thread From: Pavel Machek @ 2003-07-18 17:50 UTC (permalink / raw) To: Andrew Morton; +Cc: Peter Osterlund, linux-kernel Hi! > > I tried the patch below, but it didn't work. Nothing (or very little) > > was swapped out to disk. I also tried using GFP_KERNEL, but that > > seemed to cause a deadlock. (Maybe it would have gone OOM if I had > > waited long enough). I think the problem is that pdflush and friends > > are already frozen when this code runs. > > Oh, we shouldn't be doing this sort of thing when the kernel threads are > refrigerated. We do need kswapd services for the trick you tried. > > And all flavours of ext3_writepage() can block on kjournald activity, so if > kjournald is refrigerated during the memory shrink the machine can deadlock. > > It would be much better to freeze kernel threads _after_ doing the big > memory shrink. I wanted to avoid that: we do want user threads refrigerated at that point so that we know noone is allocating memory as we are trying to do memory shrink. I'd like to avoid having refrigerator run in two phases.... Pavel -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 17:50 ` Pavel Machek @ 2003-07-18 18:02 ` Patrick Mochel 2003-07-18 18:04 ` Pavel Machek 2003-07-18 19:37 ` Andrew Morton 1 sibling, 1 reply; 26+ messages in thread From: Patrick Mochel @ 2003-07-18 18:02 UTC (permalink / raw) To: Pavel Machek; +Cc: Andrew Morton, Peter Osterlund, linux-kernel > I wanted to avoid that: we do want user threads refrigerated at that > point so that we know noone is allocating memory as we are trying to > do memory shrink. I'd like to avoid having refrigerator run in two > phases.... But we should be the only process running, and we can guarantee that by not sleeping and doing preempt_disable() when we begin. Especially if we start the refrigeration sequence after we shrink memory. Right? -pat ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 18:02 ` Patrick Mochel @ 2003-07-18 18:04 ` Pavel Machek 2003-07-18 21:05 ` Nigel Cunningham 0 siblings, 1 reply; 26+ messages in thread From: Pavel Machek @ 2003-07-18 18:04 UTC (permalink / raw) To: Patrick Mochel; +Cc: Andrew Morton, Peter Osterlund, linux-kernel Hi! > > I wanted to avoid that: we do want user threads refrigerated at that > > point so that we know noone is allocating memory as we are trying to > > do memory shrink. I'd like to avoid having refrigerator run in two > > phases.... > > But we should be the only process running, and we can guarantee that by > not sleeping and doing preempt_disable() when we begin. Especially > if we start the refrigeration sequence after we shrink > memory. Right? If we refrigerate after we shrink, userspace can allocate everything just after shrink. Pavel -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 18:04 ` Pavel Machek @ 2003-07-18 21:05 ` Nigel Cunningham 0 siblings, 0 replies; 26+ messages in thread From: Nigel Cunningham @ 2003-07-18 21:05 UTC (permalink / raw) To: Pavel Machek Cc: Patrick Mochel, Andrew Morton, Peter Osterlund, Linux Kernel Mailing List Hi. In 2.4, I ended up fusing the freezer and shrinking memory together. The essence is: freeze all other processes while need to free memory unfreeze processes shrink memory until think we have enough freeze processes Regards, Nigel On Sat, 2003-07-19 at 06:04, Pavel Machek wrote: > Hi! > > > > I wanted to avoid that: we do want user threads refrigerated at that > > > point so that we know noone is allocating memory as we are trying to > > > do memory shrink. I'd like to avoid having refrigerator run in two > > > phases.... > > > > But we should be the only process running, and we can guarantee that by > > not sleeping and doing preempt_disable() when we begin. Especially > > if we start the refrigeration sequence after we shrink > > memory. Right? > > If we refrigerate after we shrink, userspace can allocate everything > just after shrink. > Pavel -- Nigel Cunningham 495 St Georges Road South, Hastings 4201, New Zealand You see, at just the right time, when we were still powerless, Christ died for the ungodly. -- Romans 5:6, NIV. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 17:50 ` Pavel Machek 2003-07-18 18:02 ` Patrick Mochel @ 2003-07-18 19:37 ` Andrew Morton 1 sibling, 0 replies; 26+ messages in thread From: Andrew Morton @ 2003-07-18 19:37 UTC (permalink / raw) To: Pavel Machek; +Cc: petero2, linux-kernel Pavel Machek <pavel@ucw.cz> wrote: > > > It would be much better to freeze kernel threads _after_ doing the big > > memory shrink. > > I wanted to avoid that: we do want user threads refrigerated at that > point so that we know noone is allocating memory as we are trying to > do memory shrink. freeze(threads which have task->mm) shrink_memory(); freeze(threads which have !task->mm) That's pretty simple. > I'd like to avoid having refrigerator run in two phases.... The whole thing can deadlock without kernel thread services. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 16:45 ` Andrew Morton 2003-07-18 17:50 ` Pavel Machek @ 2003-07-18 19:58 ` Peter Osterlund 2003-07-18 20:15 ` Andrew Morton 2003-07-18 21:05 ` Nigel Cunningham 2 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-18 19:58 UTC (permalink / raw) To: Andrew Morton; +Cc: pavel, linux-kernel Andrew Morton <akpm@osdl.org> writes: > Peter Osterlund <petero2@telia.com> wrote: > > > > I tried the patch below, but it didn't work. Nothing (or very little) > > was swapped out to disk. I also tried using GFP_KERNEL, but that > > seemed to cause a deadlock. (Maybe it would have gone OOM if I had > > waited long enough). I think the problem is that pdflush and friends > > are already frozen when this code runs. > > Oh, we shouldn't be doing this sort of thing when the kernel threads are > refrigerated. We do need kswapd services for the trick you tried. > > And all flavours of ext3_writepage() can block on kjournald activity, so if > kjournald is refrigerated during the memory shrink the machine can deadlock. > > It would be much better to freeze kernel threads _after_ doing the big > memory shrink. The patch below works, but doesn't really solve anything, because it is just as slow as the original code. This is not surprising because it is still balance_pgdat() that does the page freeing, the only difference being that it is now called from kswapd which is woken up by the alloc_page() call. Note that I had to use HZ/5 in free_some_memory() or else the loop would terminate too early. The main problem seems to be that balance_pgdat() is too slow when freeing memory. The function can fail to free memory in the inner loop either because the disk is congested or because too few pages are scanned, but in both cases the function calls blk_congestion_wait(), and in the second case the disk may be idle and then blk_congestion_wait() doesn't return until the timeout expires. diff -u -r -N linux-p0/fs/jbd/journal.c linux-p1/fs/jbd/journal.c --- linux-p0/fs/jbd/journal.c Fri Jul 18 21:07:12 2003 +++ linux-p1/fs/jbd/journal.c Fri Jul 18 21:13:15 2003 @@ -132,6 +132,8 @@ daemonize("kjournald"); + current->flags |= PF_IOTHREAD; + /* Set up an interval timer which can be used to trigger a commit wakeup after the commit interval expires */ init_timer(&timer); diff -u -r -N linux-p0/kernel/suspend.c linux-p1/kernel/suspend.c --- linux-p0/kernel/suspend.c Fri Jul 18 21:08:45 2003 +++ linux-p1/kernel/suspend.c Fri Jul 18 21:13:18 2003 @@ -621,10 +621,32 @@ */ static void free_some_memory(void) { - printk("Freeing memory: "); - while (shrink_all_memory(10000)) - printk("."); + LIST_HEAD(list); + struct page *page, *tmp; + int sleep_count = 0; + int i = 0; + + while (sleep_count < 10) { + page = alloc_page(GFP_ATOMIC); + if (page) { + list_add(&page->list, &list); + sleep_count = 0; + } else { + blk_congestion_wait(WRITE, HZ/5); + sleep_count++; + } + i++; + if (!(i%1000)) + printk("."); + } printk("|\n"); + + i = 0; + list_for_each_entry_safe(page, tmp, &list, list) { + __free_page(page); + i++; + } + printk("%d pages freed\n", i); } /* Make disk drivers accept operations, again */ diff -u -r -N linux-p0/mm/pdflush.c linux-p1/mm/pdflush.c --- linux-p0/mm/pdflush.c Fri Jul 18 21:08:47 2003 +++ linux-p1/mm/pdflush.c Fri Jul 18 21:13:20 2003 @@ -88,7 +88,7 @@ { daemonize("pdflush"); - current->flags |= PF_FLUSHER; + current->flags |= PF_FLUSHER | PF_IOTHREAD; my_work->fn = NULL; my_work->who = current; INIT_LIST_HEAD(&my_work->list); diff -u -r -N linux-p0/mm/vmscan.c linux-p1/mm/vmscan.c --- linux-p0/mm/vmscan.c Fri Jul 18 21:08:47 2003 +++ linux-p1/mm/vmscan.c Fri Jul 18 21:13:22 2003 @@ -921,7 +921,7 @@ if (i < ZONE_HIGHMEM) { reclaim_state->reclaimed_slab = 0; shrink_slab(max_scan + nr_mapped, GFP_KERNEL); - to_free += reclaim_state->reclaimed_slab; + to_free -= reclaim_state->reclaimed_slab; } if (zone->all_unreclaimable) continue; @@ -976,10 +976,11 @@ * us from recursively trying to free more memory as we're * trying to free the first piece of memory in the first place). */ - tsk->flags |= PF_MEMALLOC|PF_KSWAPD; + tsk->flags |= PF_MEMALLOC|PF_KSWAPD|PF_IOTHREAD; for ( ; ; ) { struct page_state ps; + int freed; if (current->flags & PF_FREEZE) refrigerator(PF_IOTHREAD); @@ -987,7 +988,8 @@ schedule(); finish_wait(&pgdat->kswapd_wait, &wait); get_page_state(&ps); - balance_pgdat(pgdat, 0, &ps); + freed = balance_pgdat(pgdat, 0, &ps); + printk("kswapd: freed %d pages\n", freed); } } -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 19:58 ` Peter Osterlund @ 2003-07-18 20:15 ` Andrew Morton 2003-07-18 22:13 ` Pavel Machek 2003-07-20 0:22 ` Peter Osterlund 0 siblings, 2 replies; 26+ messages in thread From: Andrew Morton @ 2003-07-18 20:15 UTC (permalink / raw) To: Peter Osterlund; +Cc: pavel, linux-kernel Peter Osterlund <petero2@telia.com> wrote: > > The patch below works, but doesn't really solve anything, because it > is just as slow as the original code. That's OK - it's a step in the right direction. > This is not surprising because > it is still balance_pgdat() that does the page freeing, the only > difference being that it is now called from kswapd which is woken up > by the alloc_page() call. Yup. Probaby we should not be throttling kswapd if it is making good progress, but this needs broad testing and thought. Something like this, untested: --- 25/mm/vmscan.c~less-kswapd-throttling Thu Jul 17 15:35:09 2003 +++ 25-akpm/mm/vmscan.c Thu Jul 17 15:35:12 2003 @@ -930,7 +930,8 @@ static int balance_pgdat(pg_data_t *pgda } if (all_zones_ok) break; - blk_congestion_wait(WRITE, HZ/10); + if (to_free) + blk_congestion_wait(WRITE, HZ/10); } return nr_pages - to_free; } > Note that I had to use HZ/5 in free_some_memory() or else the loop > would terminate too early. The main problem seems to be that > balance_pgdat() is too slow when freeing memory. The function can fail > to free memory in the inner loop either because the disk is congested > or because too few pages are scanned, but in both cases the function > calls blk_congestion_wait(), and in the second case the disk may be > idle and then blk_congestion_wait() doesn't return until the timeout > expires. hm, perhaps because kswapd is throttled rather than doing work. Try the above change, see if you still need the extended timeout in free_some_memory(). Also, play around with using __GFP_WAIT|__GFP_HIGH|__GFP_NORETRY rather than GFP_ATOMIC. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 20:15 ` Andrew Morton @ 2003-07-18 22:13 ` Pavel Machek 2003-07-20 0:22 ` Peter Osterlund 1 sibling, 0 replies; 26+ messages in thread From: Pavel Machek @ 2003-07-18 22:13 UTC (permalink / raw) To: Andrew Morton; +Cc: Peter Osterlund, linux-kernel Hi! > > The patch below works, but doesn't really solve anything, because it > > is just as slow as the original code. > > That's OK - it's a step in the right direction. Just beware: It is *not* okay (for production) to set kflushd as a PF_IOTHREAD. Its okay for testing through. Pavel -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 20:15 ` Andrew Morton 2003-07-18 22:13 ` Pavel Machek @ 2003-07-20 0:22 ` Peter Osterlund 2003-07-20 1:01 ` Andrew Morton 1 sibling, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-20 0:22 UTC (permalink / raw) To: Andrew Morton; +Cc: pavel, linux-kernel Andrew Morton <akpm@osdl.org> writes: > Yup. Probaby we should not be throttling kswapd if it is making good > progress, but this needs broad testing and thought. Something like this, > untested: > > --- 25/mm/vmscan.c~less-kswapd-throttling Thu Jul 17 15:35:09 2003 > +++ 25-akpm/mm/vmscan.c Thu Jul 17 15:35:12 2003 > @@ -930,7 +930,8 @@ static int balance_pgdat(pg_data_t *pgda > } > if (all_zones_ok) > break; > - blk_congestion_wait(WRITE, HZ/10); > + if (to_free) > + blk_congestion_wait(WRITE, HZ/10); > } > return nr_pages - to_free; > } > > > > Note that I had to use HZ/5 in free_some_memory() or else the loop > > would terminate too early. The main problem seems to be that > > balance_pgdat() is too slow when freeing memory. The function can fail > > to free memory in the inner loop either because the disk is congested > > or because too few pages are scanned, but in both cases the function > > calls blk_congestion_wait(), and in the second case the disk may be > > idle and then blk_congestion_wait() doesn't return until the timeout > > expires. > > hm, perhaps because kswapd is throttled rather than doing work. > > Try the above change, see if you still need the extended timeout in > free_some_memory(). I have tried the change, but the writeout is still very slow. (Maybe somewhat faster than the original code, but far from being limited by disk bandwidth.) It turns out the extended timeout in free_some_memory() is needed to handle things like disk spin up time and thermal recalibration. Ideally there should not be a timeout at all in that function. It should wait until the congestion goes away or until all I/O is finished. But that's probably much harder to implement. Is there some global counter somewhere that can be used to figure out how many pages are currently being on their way to the disk? > Also, play around with using __GFP_WAIT|__GFP_HIGH|__GFP_NORETRY rather > than GFP_ATOMIC. I did, without much success. One thing that does seem to work though, is to only throttle in balance_pgdat() if we have called writepage() since the last throttling, like in the patch below. (ugly, writepage_called should not be a static variable.) With that patch I get full disk bandwidth speed during page freeing. Another disturbing thing I've seen is that on one of my machines, the suspend process sometimes fails to write anything to disk. When this happens, the suspend will fail if not enough memory is free, and the machine hangs shortly afterwards. I've seen this with plain 2.6.0-test1 and with all patches I have tested, with and without preempt. On another laptop, I have never seen this problem. (I'll try to debug this further, but it is complicated by the fact that the problematic laptop doesn't have a serial port.) diff -u -r orig/linux-page/fs/jbd/journal.c linux-page/fs/jbd/journal.c --- orig/linux-page/fs/jbd/journal.c Fri Jul 18 21:07:12 2003 +++ linux-page/fs/jbd/journal.c Sun Jul 20 01:08:56 2003 @@ -132,6 +132,8 @@ daemonize("kjournald"); + current->flags |= PF_IOTHREAD; + /* Set up an interval timer which can be used to trigger a commit wakeup after the commit interval expires */ init_timer(&timer); diff -u -r orig/linux-page/kernel/suspend.c linux-page/kernel/suspend.c --- orig/linux-page/kernel/suspend.c Sat Jul 19 21:07:14 2003 +++ linux-page/kernel/suspend.c Sun Jul 20 01:17:33 2003 @@ -623,10 +623,32 @@ */ static void free_some_memory(void) { - printk("Freeing memory: "); - while (shrink_all_memory(10000)) - printk("."); + LIST_HEAD(list); + struct page *page, *tmp; + int sleep_count = 0; + int i = 0; + + while (sleep_count < 10) { + page = alloc_page(__GFP_HIGH | __GFP_NOWARN); + if (page) { + list_add(&page->list, &list); + sleep_count = 0; + } else { + blk_congestion_wait(WRITE, HZ/5); + sleep_count++; + } + i++; + if (!(i%1000)) + printk("."); + } printk("|\n"); + + i = 0; + list_for_each_entry_safe(page, tmp, &list, list) { + __free_page(page); + i++; + } + printk("%d pages freed\n", i); } /* Make disk drivers accept operations, again */ diff -u -r orig/linux-page/mm/pdflush.c linux-page/mm/pdflush.c --- orig/linux-page/mm/pdflush.c Fri Jul 18 21:08:47 2003 +++ linux-page/mm/pdflush.c Sun Jul 20 01:08:56 2003 @@ -88,7 +88,7 @@ { daemonize("pdflush"); - current->flags |= PF_FLUSHER; + current->flags |= PF_FLUSHER | PF_IOTHREAD; my_work->fn = NULL; my_work->who = current; INIT_LIST_HEAD(&my_work->list); diff -u -r orig/linux-page/mm/vmscan.c linux-page/mm/vmscan.c --- orig/linux-page/mm/vmscan.c Fri Jul 18 21:08:47 2003 +++ linux-page/mm/vmscan.c Sun Jul 20 01:40:24 2003 @@ -49,6 +49,8 @@ int vm_swappiness = 60; static long total_memory; +static int writepage_called; + #ifdef ARCH_HAS_PREFETCH #define prefetch_prev_lru_page(_page, _base, _field) \ do { \ @@ -339,6 +341,7 @@ SetPageReclaim(page); res = mapping->a_ops->writepage(page, &wbc); + writepage_called = 1; if (res == WRITEPAGE_ACTIVATE) { ClearPageReclaim(page); @@ -894,6 +897,8 @@ for (priority = DEF_PRIORITY; priority; priority--) { int all_zones_ok = 1; + writepage_called = 0; + for (i = 0; i < pgdat->nr_zones; i++) { struct zone *zone = pgdat->node_zones + i; int nr_mapped = 0; @@ -921,7 +926,7 @@ if (i < ZONE_HIGHMEM) { reclaim_state->reclaimed_slab = 0; shrink_slab(max_scan + nr_mapped, GFP_KERNEL); - to_free += reclaim_state->reclaimed_slab; + to_free -= reclaim_state->reclaimed_slab; } if (zone->all_unreclaimable) continue; @@ -930,7 +935,9 @@ } if (all_zones_ok) break; - blk_congestion_wait(WRITE, HZ/10); +// if (to_free) + if (writepage_called) + blk_congestion_wait(WRITE, HZ/10); } return nr_pages - to_free; } @@ -976,10 +983,11 @@ * us from recursively trying to free more memory as we're * trying to free the first piece of memory in the first place). */ - tsk->flags |= PF_MEMALLOC|PF_KSWAPD; + tsk->flags |= PF_MEMALLOC|PF_KSWAPD|PF_IOTHREAD; for ( ; ; ) { struct page_state ps; + int freed; if (current->flags & PF_FREEZE) refrigerator(PF_IOTHREAD); @@ -987,7 +995,8 @@ schedule(); finish_wait(&pgdat->kswapd_wait, &wait); get_page_state(&ps); - balance_pgdat(pgdat, 0, &ps); + freed = balance_pgdat(pgdat, 0, &ps); +// printk("kswapd: freed %d pages\n", freed); } } -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-20 0:22 ` Peter Osterlund @ 2003-07-20 1:01 ` Andrew Morton 2003-07-20 7:45 ` Peter Osterlund 0 siblings, 1 reply; 26+ messages in thread From: Andrew Morton @ 2003-07-20 1:01 UTC (permalink / raw) To: Peter Osterlund; +Cc: pavel, linux-kernel Peter Osterlund <petero2@telia.com> wrote: > > I have tried the change, but the writeout is still very slow. (Maybe > somewhat faster than the original code, but far from being limited by > disk bandwidth.) Did you fix swsusp to leave kswapd unrefrigerated during the shrink? If not, the change wouldn't make any difference. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-20 1:01 ` Andrew Morton @ 2003-07-20 7:45 ` Peter Osterlund 0 siblings, 0 replies; 26+ messages in thread From: Peter Osterlund @ 2003-07-20 7:45 UTC (permalink / raw) To: Andrew Morton; +Cc: pavel, linux-kernel Andrew Morton <akpm@osdl.org> writes: > Peter Osterlund <petero2@telia.com> wrote: > > > > I have tried the change, but the writeout is still very slow. (Maybe > > somewhat faster than the original code, but far from being limited by > > disk bandwidth.) > > Did you fix swsusp to leave kswapd unrefrigerated during the shrink? If > not, the change wouldn't make any difference. Yes, handled by this part of the patch: @@ -976,10 +983,11 @@ * us from recursively trying to free more memory as we're * trying to free the first piece of memory in the first place). */ - tsk->flags |= PF_MEMALLOC|PF_KSWAPD; + tsk->flags |= PF_MEMALLOC|PF_KSWAPD|PF_IOTHREAD; Without that change, nothing got swapped to disk. It looks like __alloc_pages(GFP_ATOMIC,...) only wakes up the kswapd threads. Is the pdflush threads needed during memory freeing? My patch leaves them unrefrigerated too, but Pavel said that wasn't safe for some reason. -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-18 16:45 ` Andrew Morton 2003-07-18 17:50 ` Pavel Machek 2003-07-18 19:58 ` Peter Osterlund @ 2003-07-18 21:05 ` Nigel Cunningham 2 siblings, 0 replies; 26+ messages in thread From: Nigel Cunningham @ 2003-07-18 21:05 UTC (permalink / raw) To: Andrew Morton; +Cc: Peter Osterlund, Pavel Machek, Linux Kernel Mailing List On Sat, 2003-07-19 at 04:45, Andrew Morton wrote: > Oh, we shouldn't be doing this sort of thing when the kernel threads are > refrigerated. We do need kswapd services for the trick you tried. > > And all flavours of ext3_writepage() can block on kjournald activity, so if > kjournald is refrigerated during the memory shrink the machine can deadlock. > > It would be much better to freeze kernel threads _after_ doing the big > memory shrink. Yes, that's what the 2.4 version does. And it freezes the kernel threads in a particular order to avoid deadlocking. Regards, Nigel -- Nigel Cunningham 495 St Georges Road South, Hastings 4201, New Zealand You see, at just the right time, when we were still powerless, Christ died for the ungodly. -- Romans 5:6, NIV. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-17 20:00 ` Pavel Machek 2003-07-17 20:09 ` Andrew Morton @ 2003-07-21 10:00 ` Peter Osterlund 2003-07-21 12:58 ` Pavel Machek 1 sibling, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-21 10:00 UTC (permalink / raw) To: Pavel Machek; +Cc: Kernel Mailing List Pavel Machek <pavel@ucw.cz> writes: > > I have done some testing of the software suspend function in > > 2.6.0-test1. It works mostly very well, but I have found two problems. > > > > The first problem is that software suspend fails if a process is > > stopped before you invoke suspend. (For example, by starting cat from > > the shell and pressing ctrl-z.) When the processes are woken up again, > > the cat process is stuck in the schedule loop in refrigerator(), > > sucking up all available cpu time. > > Thanks for a report. If it came with a patch it would be better > ;-). I'll take a look. Here is a patch to make software suspend correctly handle stopped processes. It should apply cleanly on top of the patch you sent to Linus to decouple SOFTWARE_SUSPEND and ACPI_SLEEP. diff -u -r orig/linux-stopped/kernel/suspend.c linux-stopped/kernel/suspend.c --- orig/linux-stopped/kernel/suspend.c Sun Jul 20 11:14:58 2003 +++ linux-stopped/kernel/suspend.c Mon Jul 21 11:51:51 2003 @@ -164,14 +164,18 @@ * Refrigerator and related stuff */ -#define INTERESTING(p) \ - /* We don't want to touch kernel_threads..*/ \ - if (p->flags & PF_IOTHREAD) \ - continue; \ - if (p == current) \ - continue; \ - if (p->state == TASK_ZOMBIE) \ - continue; +/* 0 = Ignore this process when freezing/thawing, 1 = freeze/thaw this process */ +static inline int interesting_process(struct task_struct *p) +{ + if (p->flags & PF_IOTHREAD) + return 0; + if (p == current) + return 0; + if ((p->state == TASK_ZOMBIE) || (p->state == TASK_DEAD)) + return 0; + + return 1; +} #define TIMEOUT (6 * HZ) /* Timeout for stopping processes */ @@ -214,9 +218,12 @@ read_lock(&tasklist_lock); do_each_thread(g, p) { unsigned long flags; - INTERESTING(p); + if (!interesting_process(p)) + continue; if (p->flags & PF_FROZEN) continue; + if (p->state == TASK_STOPPED) + continue; /* FIXME: smp problem here: we may not access other process' flags without locking */ @@ -247,12 +254,15 @@ printk( "Restarting tasks..." ); read_lock(&tasklist_lock); do_each_thread(g, p) { - INTERESTING(p); - - if (p->flags & PF_FROZEN) p->flags &= ~PF_FROZEN; - else - printk(KERN_INFO " Strange, %s not stopped\n", p->comm ); - wake_up_process(p); + if (!interesting_process(p)) + continue; + + p->flags &= ~PF_FREEZE; + if (p->flags & PF_FROZEN) { + p->flags &= ~PF_FROZEN; + wake_up_process(p); + } else + PRINTK(KERN_INFO " Strange, %s not frozen\n", p->comm ); } while_each_thread(g, p); read_unlock(&tasklist_lock); -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-21 10:00 ` Peter Osterlund @ 2003-07-21 12:58 ` Pavel Machek 2003-07-21 14:36 ` Peter Osterlund 0 siblings, 1 reply; 26+ messages in thread From: Pavel Machek @ 2003-07-21 12:58 UTC (permalink / raw) To: Peter Osterlund; +Cc: Pavel Machek, Kernel Mailing List Hi! > > Thanks for a report. If it came with a patch it would be better > > ;-). I'll take a look. > > Here is a patch to make software suspend correctly handle stopped > processes. It should apply cleanly on top of the patch you sent to > Linus to decouple SOFTWARE_SUSPEND and ACPI_SLEEP. > > @@ -164,14 +164,18 @@ > * Refrigerator and related stuff > */ > > -#define INTERESTING(p) > - /* We don't want to touch kernel_threads..*/ > - if (p->flags & PF_IOTHREAD) > - continue; > - if (p == current) > - continue; > - if (p->state == TASK_ZOMBIE) > - continue; > +/* 0 = Ignore this process when freezing/thawing, 1 = freeze/thaw this process */ > +static inline int interesting_process(struct task_struct *p) > +{ > + if (p->flags & PF_IOTHREAD) > + return 0; > + if (p == current) > + return 0; > + if ((p->state == TASK_ZOMBIE) || (p->state == TASK_DEAD)) > + return 0; > + > + return 1; > +} > Simple cleanup, okay. > @@ -214,9 +218,12 @@ > read_lock(&tasklist_lock); > do_each_thread(g, p) { > unsigned long flags; > - INTERESTING(p); > + if (!interesting_process(p)) > + continue; > if (p->flags & PF_FROZEN) > continue; > + if (p->state == TASK_STOPPED) > + continue; > > /* FIXME: smp problem here: we may not access other process' flags > without locking */ No need to handle TASK_STOPPED tasks, as they are "frozen" already. Ok. > read_lock(&tasklist_lock); > do_each_thread(g, p) { > - INTERESTING(p); > - > - if (p->flags & PF_FROZEN) p->flags &= ~PF_FROZEN; > - else > - printk(KERN_INFO " Strange, %s not stopped\n", p->comm ); > - wake_up_process(p); > + if (!interesting_process(p)) > + continue; > + > + p->flags &= ~PF_FREEZE; > + if (p->flags & PF_FROZEN) { > + p->flags &= ~PF_FROZEN; > + wake_up_process(p); > + } else > + But why do you touch PF_FROZEN here? Refrigerator should do that. And wake_up_process should not be needed... If it is in refrigerator, it polls PF_FREEZE... -- Pavel Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need... ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-21 12:58 ` Pavel Machek @ 2003-07-21 14:36 ` Peter Osterlund 2003-07-21 21:28 ` Pavel Machek 0 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-21 14:36 UTC (permalink / raw) To: Pavel Machek; +Cc: Kernel Mailing List Pavel Machek <pavel@ucw.cz> writes: > > @@ -214,9 +218,12 @@ > > read_lock(&tasklist_lock); > > do_each_thread(g, p) { > > unsigned long flags; > > - INTERESTING(p); > > + if (!interesting_process(p)) > > + continue; > > if (p->flags & PF_FROZEN) > > continue; > > + if (p->state == TASK_STOPPED) > > + continue; > > > > /* FIXME: smp problem here: we may not access other process' flags > > without locking */ > > No need to handle TASK_STOPPED tasks, as they are "frozen" already. Ok. If the process was stopped before swsusp was invoked, the process will never reach refrigerator() and therefore never set the PF_FROZEN flag. Without the test for TASK_STOPPED, swsusp gives up and claims that it can't stop all processes. > > read_lock(&tasklist_lock); > > do_each_thread(g, p) { > > - INTERESTING(p); > > - > > - if (p->flags & PF_FROZEN) p->flags &= ~PF_FROZEN; > > - else > > - printk(KERN_INFO " Strange, %s not stopped\n", p->comm ); > > - wake_up_process(p); > > + if (!interesting_process(p)) > > + continue; > > + > > + p->flags &= ~PF_FREEZE; > > + if (p->flags & PF_FROZEN) { > > + p->flags &= ~PF_FROZEN; > > + wake_up_process(p); > > + } else > > + > But why do you touch PF_FROZEN here? Refrigerator should do that. > And wake_up_process should not be needed... > If it is in refrigerator, it polls PF_FREEZE... Note that the old code always called wake_up_process(), which is necessary to make the process run one more iteration in refrigerator() and relize that it is time to unfreeze. The patch changes things so that wake_up_process() is NOT called if the process is stopped at some other place than in refrigerator(). This ensures that processes that were stopped before we invoked swsusp are still stopped after resume. I manually clear PF_FREEZE here in an attempt to handle a race condition, but I realize I need to understand more of the scheduler and signal code before I know for sure if this is necessary and/or sufficient. -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-21 14:36 ` Peter Osterlund @ 2003-07-21 21:28 ` Pavel Machek 2003-07-21 23:46 ` Peter Osterlund 0 siblings, 1 reply; 26+ messages in thread From: Pavel Machek @ 2003-07-21 21:28 UTC (permalink / raw) To: Peter Osterlund; +Cc: Kernel Mailing List Hi! > > But why do you touch PF_FROZEN here? Refrigerator should do that. > > And wake_up_process should not be needed... > > If it is in refrigerator, it polls PF_FREEZE... > > Note that the old code always called wake_up_process(), which is > necessary to make the process run one more iteration in refrigerator() > and relize that it is time to unfreeze. > > The patch changes things so that wake_up_process() is NOT called if > the process is stopped at some other place than in refrigerator(). > This ensures that processes that were stopped before we invoked swsusp > are still stopped after resume. Yes, but you still print warning for them. I hopefully killed that. > I manually clear PF_FREEZE here in an attempt to handle a race > condition, but I realize I need to understand more of the scheduler > and signal code before I know for sure if this is necessary and/or > sufficient. I do not see the race so I killed that... Here's what I applied. Pavel --- /usr/src/tmp/linux/kernel/suspend.c 2003-07-21 22:15:39.000000000 +0200 +++ /usr/src/linux/kernel/suspend.c 2003-07-21 22:11:28.000000000 +0200 @@ -5,7 +5,7 @@ * machine suspend feature using pretty near only high-level routines * * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu> - * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz> + * Copyright (C) 1998,2001-2003 Pavel Machek <pavel@suse.cz> * * I'd like to thank the following people for their work: * @@ -84,7 +84,6 @@ #define ADDRESS2(x) __ADDRESS(__pa(x)) /* Needed for x86-64 where some pages are in memory twice */ /* References to section boundaries */ -extern char _text, _etext, _edata, __bss_start, _end; extern char __nosave_begin, __nosave_end; extern int is_head_of_free_region(struct page *); @@ -164,14 +163,21 @@ * Refrigerator and related stuff */ -#define INTERESTING(p) \ - /* We don't want to touch kernel_threads..*/ \ - if (p->flags & PF_IOTHREAD) \ - continue; \ - if (p == current) \ - continue; \ - if (p->state == TASK_ZOMBIE) \ - continue; +/* 0 = Ignore this process when freezing/thawing, 1 = freeze/thaw this process */ +static inline int interesting_process(struct task_struct *p) +{ + if (p->flags & PF_IOTHREAD) + return 0; + if (p == current) + return 0; + if ((p->state == TASK_ZOMBIE) || (p->state == TASK_DEAD)) + return 0; + if (p->state == TASK_STOPPED) + return 0; + + + return 1; +} #define TIMEOUT (6 * HZ) /* Timeout for stopping processes */ @@ -214,7 +220,8 @@ read_lock(&tasklist_lock); do_each_thread(g, p) { unsigned long flags; - INTERESTING(p); + if (!interesting_process(p)) + continue; if (p->flags & PF_FROZEN) continue; @@ -247,13 +254,14 @@ printk( "Restarting tasks..." ); read_lock(&tasklist_lock); do_each_thread(g, p) { - INTERESTING(p); - - if (p->flags & PF_FROZEN) - p->flags &= ~PF_FROZEN; - else - printk(KERN_INFO " Strange, %s not stopped\n", p->comm ); - wake_up_process(p); + if (!interesting_process(p)) + continue; + + if (p->flags & PF_FROZEN) { + p->flags &= ~PF_FROZEN; + wake_up_process(p); + } else + PRINTK(KERN_ERR " Strange, %s not frozen\n", p->comm ); } while_each_thread(g, p); read_unlock(&tasklist_lock); -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-21 21:28 ` Pavel Machek @ 2003-07-21 23:46 ` Peter Osterlund 2003-07-22 11:04 ` Pavel Machek 0 siblings, 1 reply; 26+ messages in thread From: Peter Osterlund @ 2003-07-21 23:46 UTC (permalink / raw) To: Pavel Machek; +Cc: Kernel Mailing List Hi! Pavel Machek <pavel@ucw.cz> writes: > > > But why do you touch PF_FROZEN here? Refrigerator should do that. > > > And wake_up_process should not be needed... > > > If it is in refrigerator, it polls PF_FREEZE... > > > > Note that the old code always called wake_up_process(), which is > > necessary to make the process run one more iteration in refrigerator() > > and relize that it is time to unfreeze. > > > > The patch changes things so that wake_up_process() is NOT called if > > the process is stopped at some other place than in refrigerator(). > > This ensures that processes that were stopped before we invoked swsusp > > are still stopped after resume. > > Yes, but you still print warning for them. I hopefully killed that. ... > +static inline int interesting_process(struct task_struct *p) > +{ > + if (p->flags & PF_IOTHREAD) > + return 0; > + if (p == current) > + return 0; > + if ((p->state == TASK_ZOMBIE) || (p->state == TASK_DEAD)) > + return 0; > + if (p->state == TASK_STOPPED) > + return 0; > + > + > + return 1; > +} But this doesn't work. We can't skip stopped tasks in the thaw_processes() function, because frozen tasks are also stopped tasks. Therefore nothing will be woken up during resume. It's probably best to just delete the printk("strange,...") line. -- Peter Osterlund - petero2@telia.com http://w1.894.telia.com/~u89404340 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Software suspend testing in 2.6.0-test1 2003-07-21 23:46 ` Peter Osterlund @ 2003-07-22 11:04 ` Pavel Machek 0 siblings, 0 replies; 26+ messages in thread From: Pavel Machek @ 2003-07-22 11:04 UTC (permalink / raw) To: Peter Osterlund; +Cc: Pavel Machek, Kernel Mailing List Hi! > > > > But why do you touch PF_FROZEN here? Refrigerator should do that. > > > > And wake_up_process should not be needed... > > > > If it is in refrigerator, it polls PF_FREEZE... > > > > > > Note that the old code always called wake_up_process(), which is > > > necessary to make the process run one more iteration in refrigerator() > > > and relize that it is time to unfreeze. > > > > > > The patch changes things so that wake_up_process() is NOT called if > > > the process is stopped at some other place than in refrigerator(). > > > This ensures that processes that were stopped before we invoked swsusp > > > are still stopped after resume. > > > > Yes, but you still print warning for them. I hopefully killed that. > ... > > +static inline int interesting_process(struct task_struct *p) > > +{ > > + if (p->flags & PF_IOTHREAD) > > + return 0; > > + if (p == current) > > + return 0; > > + if ((p->state == TASK_ZOMBIE) || (p->state == TASK_DEAD)) > > + return 0; > > + if (p->state == TASK_STOPPED) > > + return 0; > > + > > + > > + return 1; > > +} > > But this doesn't work. We can't skip stopped tasks in the > thaw_processes() function, because frozen tasks are also stopped > tasks. Therefore nothing will be woken up during resume. > > It's probably best to just delete the printk("strange,...") line. Okay, you are right, I'll stick with your solution for now. I'd prefer to keep that "strange" line, so that we know if they are some problems. Pavel -- When do you have a heart between your knees? [Johanka's followup: and *two* hearts?] ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2003-07-22 10:49 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2003-07-17 19:46 Software suspend testing in 2.6.0-test1 Peter Osterlund 2003-07-17 20:00 ` Pavel Machek 2003-07-17 20:09 ` Andrew Morton 2003-07-18 9:59 ` Peter Osterlund 2003-07-18 10:24 ` Andrew Morton 2003-07-18 15:22 ` Pavel Machek 2003-07-18 15:55 ` Peter Osterlund 2003-07-18 16:45 ` Andrew Morton 2003-07-18 17:50 ` Pavel Machek 2003-07-18 18:02 ` Patrick Mochel 2003-07-18 18:04 ` Pavel Machek 2003-07-18 21:05 ` Nigel Cunningham 2003-07-18 19:37 ` Andrew Morton 2003-07-18 19:58 ` Peter Osterlund 2003-07-18 20:15 ` Andrew Morton 2003-07-18 22:13 ` Pavel Machek 2003-07-20 0:22 ` Peter Osterlund 2003-07-20 1:01 ` Andrew Morton 2003-07-20 7:45 ` Peter Osterlund 2003-07-18 21:05 ` Nigel Cunningham 2003-07-21 10:00 ` Peter Osterlund 2003-07-21 12:58 ` Pavel Machek 2003-07-21 14:36 ` Peter Osterlund 2003-07-21 21:28 ` Pavel Machek 2003-07-21 23:46 ` Peter Osterlund 2003-07-22 11:04 ` Pavel Machek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).