linux-kernel.vger.kernel.org archive mirror
* Infinite looping observed in __offline_pages
@ 2018-07-25 18:11 John Allen
  2018-07-25 20:03 ` Michal Hocko
  2018-08-01  1:37 ` Rashmica
  0 siblings, 2 replies; 12+ messages in thread
From: John Allen @ 2018-07-25 18:11 UTC
  To: linux-kernel, linuxppc-dev
  Cc: jallen, kamezawa.hiroyu, n-horiguchi, mgorman, mhocko

Hi All,

Under heavy stress with constant memory hot add/remove, I have observed 
the following loop occasionally spinning forever:

mm/memory_hotplug.c:__offline_pages

repeat:
        /* start memory hot removal */
        ret = -EINTR;
        if (signal_pending(current))
                goto failed_removal;

        cond_resched();
        lru_add_drain_all();
        drain_all_pages(zone);

        pfn = scan_movable_pages(start_pfn, end_pfn);
        if (pfn) { /* We have movable pages */
                ret = do_migrate_range(pfn, end_pfn);
                goto repeat;
        }

What appears to be happening in this case is that do_migrate_range 
returns a failure code, which is ignored. The failure stems from 
migrate_pages returning 1, which I'm guessing is the result of us 
hitting the following case:

mm/migrate.c: migrate_pages

	default:
		/*
		 * Permanent failure (-EBUSY, -ENOSYS, etc.):
		 * unlike -EAGAIN case, the failed page is
		 * removed from migration page list and not
		 * retried in the next outer loop.
		 */
		nr_failed++;
		break;
	}

Does a failure in do_migrate_range indicate that the range is 
unmigratable and the loop in __offline_pages should terminate and goto 
failed_removal? Or should we allow a certain number of retries before we
give up on migrating the range?
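
To make the second option concrete, here is a minimal user-space sketch 
of a bounded retry loop. It is a model only, not kernel code and not a 
proposed patch: struct mock_range, the mock_* helpers, 
offline_with_retry_limit and the MAX_MIGRATE_RETRIES value are all 
hypothetical names invented for illustration, and returning -EBUSY once 
the retries run out is an assumption.

```c
#include <errno.h>

/* Hypothetical cap on consecutive failed passes; not a kernel value. */
#define MAX_MIGRATE_RETRIES 5

/* Stand-in for the pfn range being offlined. */
struct mock_range {
        int movable;    /* pages that migrate on the next attempt */
        int stuck;      /* pages that fail permanently on every attempt */
};

/* Models scan_movable_pages(): nonzero while pages remain in the range. */
static int mock_scan_movable_pages(const struct mock_range *r)
{
        return r->movable + r->stuck;
}

/*
 * Models do_migrate_range(): migrates what it can and returns the
 * number of pages that could not be migrated (0 means full success).
 */
static int mock_do_migrate_range(struct mock_range *r)
{
        r->movable = 0;
        return r->stuck;
}

/* Returns 0 once the range is empty, -EBUSY after too many failed passes. */
int offline_with_retry_limit(struct mock_range *r)
{
        int retries = 0;

        while (mock_scan_movable_pages(r)) {
                int ret = mock_do_migrate_range(r);

                if (ret == 0) {
                        retries = 0;    /* progress was made, start over */
                        continue;
                }
                /* Unlike the loop quoted above, the failure is not ignored. */
                if (++retries >= MAX_MIGRATE_RETRIES)
                        return -EBUSY;
        }
        return 0;
}
```

With a range that contains one permanently stuck page, this loop gives 
up after MAX_MIGRATE_RETRIES passes instead of spinning. Whether a 
fixed cap is the right policy for the real code is exactly the open 
question.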

This issue was observed on a ppc64le lpar on a 4.18-rc6 kernel.

-John



Thread overview: 12+ messages
2018-07-25 18:11 Infinite looping observed in __offline_pages John Allen
2018-07-25 20:03 ` Michal Hocko
2018-07-27 17:32   ` John Allen
2018-07-30  9:16     ` Michal Hocko
2018-08-01 11:09   ` Michael Ellerman
2018-08-01 11:20     ` Michal Hocko
2018-08-22  9:30   ` Aneesh Kumar K.V
2018-08-22 10:53     ` Michal Hocko
2018-08-22 18:58     ` Mike Kravetz
2018-08-23  3:01       ` Aneesh Kumar K.V
2018-08-23  7:25       ` Michal Hocko
2018-08-01  1:37 ` Rashmica
