All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	Tyler Hicks <tyhicks@linux.microsoft.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	sthemmin@microsoft.com
Subject: Re: Pinning ZONE_MOVABLE pages
Date: Fri, 20 Nov 2020 21:17:07 +0000	[thread overview]
Message-ID: <20201120211707.GC4327@casper.infradead.org> (raw)
In-Reply-To: <9452B231-23F3-48F5-A0E2-D6C5603F87F1@redhat.com>

On Fri, Nov 20, 2020 at 09:59:24PM +0100, David Hildenbrand wrote:
> 
> > Am 20.11.2020 um 21:28 schrieb Pavel Tatashin <pasha.tatashin@soleen.com>:
> > 
> > Recently, I encountered a hang that is happening during memory hot
> > remove operation. It turns out that the hang is caused by pinned user
> > pages in ZONE_MOVABLE.
> > 
> > Kernel expects that all pages in ZONE_MOVABLE can be migrated, but
> > this is not the case if a user applications such as through dpdk
> > libraries pinned them via vfio dma map. Kernel keeps trying to
> > hot-remove them, but refcnt never gets to zero, so we are looping
> > until the hardware watchdog kicks in.
> > 
> > We cannot do dma unmaps before hot-remove, because hot-remove is a
> > slow operation, and we have thousands for network flows handled by
> > dpdk that we just cannot suspend for the duration of hot-remove
> > operation.
> > 
> 
> Hi!
> 
> It‘s a known problem also for VMs using vfio. I thought about this some while ago an came to the same conclusion: before performing long-term pinnings, we have to migrate pages off the movable zone. After that, it‘s too late.

We can't, though.  VMs using vfio pin their entire address space (right?)
so we end up with basically all of the !MOVABLE memory used for VMs and
the MOVABLE memory goes unused (I'm thinking about the case of a machine
which only hosts VMs and has nothing else to do with its memory).  In
that case, the sysadmin is going to reconfigure ZONE_MOVABLE away, and
now we just don't have any ZONE_MOVABLE.  So what's the point?

ZONE_MOVABLE can also be pinned by mlock() and other such system calls.
The kernel needs to understand that ZONE_MOVABLE memory may not actually
be movable, and skip the unmovable stuff.

  reply	other threads:[~2020-11-20 21:17 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20 20:27 Pinning ZONE_MOVABLE pages Pavel Tatashin
2020-11-20 20:27 ` Pavel Tatashin
2020-11-20 20:59 ` David Hildenbrand
2020-11-20 21:17   ` Matthew Wilcox [this message]
2020-11-20 21:34     ` David Hildenbrand
2020-11-20 21:53       ` Pavel Tatashin
2020-11-20 21:53         ` Pavel Tatashin
2020-11-20 21:58   ` Pavel Tatashin
2020-11-20 21:58     ` Pavel Tatashin
2020-11-20 22:06     ` David Hildenbrand
2020-11-22 21:06 ` David Rientjes
2020-11-22 21:06   ` David Rientjes
2020-11-23 15:31   ` Pavel Tatashin
2020-11-23 15:31     ` Pavel Tatashin
2020-11-23  9:01 ` Michal Hocko
2020-11-23 16:06   ` Pavel Tatashin
2020-11-23 16:06     ` Pavel Tatashin
2020-11-23 17:15     ` Jason Gunthorpe
2020-11-23 17:54       ` Pavel Tatashin
2020-11-23 17:54         ` Pavel Tatashin
2020-11-23 18:34         ` Jason Gunthorpe
2020-11-24  8:20     ` Michal Hocko
2020-11-23 15:04 ` Vlastimil Babka
2020-11-23 16:31   ` Pavel Tatashin
2020-11-23 16:31     ` Pavel Tatashin
2020-11-24  8:24     ` Michal Hocko
2020-11-24  8:43     ` Michal Hocko
2020-11-24  8:44       ` David Hildenbrand
2020-11-24  6:49 ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201120211707.GC4327@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=sashal@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=tyhicks@linux.microsoft.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.