linux-mm.kvack.org archive mirror
From: Jerome Glisse <jglisse@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	lsf-pc@lists.linux-foundation.org, Jens Axboe <axboe@kernel.dk>,
	Benjamin LaHaise <bcrl@kvack.org>
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Do not pin pages for various direct-io scheme
Date: Wed, 22 Jan 2020 09:02:25 -0800
Message-ID: <20200122170225.GB6009@redhat.com>
In-Reply-To: <CAPcyv4hF-bagqZk-n_2QyvG5zE=5uSWJnbkDsfY3FYHT0+F6FQ@mail.gmail.com>

On Wed, Jan 22, 2020 at 07:56:50AM -0800, Dan Williams wrote:
> On Tue, Jan 21, 2020 at 9:04 PM Jerome Glisse <jglisse@redhat.com> wrote:
> >
> > On Tue, Jan 21, 2020 at 08:19:54PM -0800, Dan Williams wrote:
> > > On Tue, Jan 21, 2020 at 6:34 PM <jglisse@redhat.com> wrote:
> > > >
> > > > From: Jérôme Glisse <jglisse@redhat.com>
> > > >
> > > > Direct I/O pins memory through GUP (get_user_pages), which
> > > > blocks several mm activities, such as:
> > > >     - compaction
> > > >     - NUMA
> > > >     - migration
> > > >     ...
> > > >
> > > > It is also troublesome if the pinned pages are actually file-backed
> > > > pages that might go under writeback, in which case the page cannot
> > > > be write-protected from the direct-io point of view (see the various
> > > > discussions about recent work on GUP [1]). This happens, for
> > > > instance, if the virtual address used as the buffer for a read
> > > > operation is the outcome of an mmap of a regular file.
> > > >
> > > >
> > > > With direct-io or aio (asynchronous io), pages are pinned until
> > > > syscall completion (which depends on many factors: io size,
> > > > block device speed, ...). For io-uring, pages can be pinned for an
> > > > indefinite amount of time.
> > > >
> > > >
> > > > So I would like to convert the direct io code (direct-io, aio and
> > > > io-uring) to obey mmu notifiers and thus allow memory management
> > > > and writeback to work and behave as they do for any other process
> > > > memory.
> > > >
> > > > For direct-io and aio this mostly gives a way to wait on syscall
> > > > completion. For io-uring this means that a buffer might need to be
> > > > re-validated (i.e. looking up pages again to get the new set of
> > > > pages for the buffer). The impact for io-uring is the delay needed
> > > > to look up new pages or wait on writeback (if necessary). This
> > > > would only happen _if_ an invalidation event happens, which
> > > > itself should only happen under memory pressure or for NUMA
> > > > activities.
> > >
> > > This seems to assume that memory pressure and NUMA migration are rare
> > > events. Some of the proposed hierarchical memory management schemes
> > > [1] might impact that assumption.
> > >
> > > [1]: http://lore.kernel.org/r/20191101075727.26683-1-ying.huang@intel.com/
> > >
> >
> > Yes, it is true that this will likely become more and more of an issue.
> > We are facing a tough choice here: pinning blocks NUMA or any other kind
> > of migration and thus might impede performance, while invalidating an
> > io-uring buffer will also cause a small latency burst. I do not think we
> > can make everyone happy, but at the very least we should avoid pinning and
> > provide knobs to let users decide what they care more about (i.e. I/O
> > without bursts or better NUMA locality).
> 
> It's a question of tradeoffs and this proposal seems to have already
> decided that the question should be answered in favor of a GPU/SVM-
> centric view of the world without presenting the alternative.
> Direct-I/O colliding with GPU operations might also be solved by
> always triggering a migration, and applications that care would avoid
> colliding operations that slow down their GPU workload. A slow compat
> fallback that applications can programmatically avoid is more flexible
> than an upfront knob.

To be clear, I do not care about direct I/O colliding with anything,
GPU or otherwise; that is up to the application programmer.

My sole interest is in page pinning, which blocks compaction and migration.
The former impedes the kernel's ability to materialize huge pages; the
latter can hurt performance badly, including for the direct I/O user.
For instance, if a process using io-uring gets migrated to a different node
after registering its buffer, it will keep using memory from a
different node, which in the end might be much worse than the one-time
extra latency spike the migration incurs.

Cheers,
Jérôme




Thread overview: 16+ messages
2020-01-22  2:31 [LSF/MM/BPF TOPIC] Do not pin pages for various direct-io scheme jglisse
2020-01-22  3:54 ` Jens Axboe
2020-01-22  4:57   ` Jerome Glisse
2020-01-22 11:59     ` Michal Hocko
2020-01-22 15:12       ` Jens Axboe
2020-01-22 16:54         ` Jerome Glisse
2020-01-22 17:04           ` Jens Axboe
2020-01-22 17:28             ` Jerome Glisse
2020-01-22 17:38               ` Jens Axboe
2020-01-22 17:40                 ` Jerome Glisse
2020-01-22 17:49                   ` Jens Axboe
2020-01-27 19:01   ` Jason Gunthorpe
2020-01-22  4:19 ` Dan Williams
2020-01-22  5:00   ` Jerome Glisse
2020-01-22 15:56     ` [Lsf-pc] " Dan Williams
2020-01-22 17:02       ` Jerome Glisse [this message]
