From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	virtio-dev@lists.oasis-open.org,
	Miklos Szeredi <mszeredi@redhat.com>,
	Sage Weil <sweil@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [virtio-dev] [PATCH v3 2/2] virtio-fs: add DAX window
Date: Wed, 17 Jul 2019 11:48:40 +0100
Message-ID: <20190717104840.GF7341@stefanha-x1.localdomain>
In-Reply-To: <20190627100346-mutt-send-email-mst@kernel.org>

On Thu, Jun 27, 2019 at 10:09:16AM -0400, Michael S. Tsirkin wrote:
> On Tue, Jun 25, 2019 at 10:55:15AM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > On Mon, Jun 24, 2019 at 02:58:08PM +0100, Stefan Hajnoczi wrote:
> > > > On Tue, Jun 18, 2019 at 09:41:25PM -0400, Michael S. Tsirkin wrote:
> > > > > On Wed, Feb 20, 2019 at 12:46:13PM +0000, Stefan Hajnoczi wrote:
> > > > > > +
> > > > > > +\devicenormative{\paragraph}{Device Operation: DAX Window}{Device Types / File System Device / Device Operation / Device Operation: DAX Window}
> > > > > > +
> > > > > > +The device MUST allow mappings that completely or partially overlap existing mappings within the DAX window.
> > > > > 
> > > > > 
> > > > > Any alignment requirements?
> > > > 
> > > > Good point.  There are alignment requirements and the driver has no way
> > > > of knowing what they are.  I'll find a way to communicate them into the
> > > > guest, either via virtio or via FUSE.
> > > > 
> > > > > Also, with no limit on mappings, it looks like guest can use up lots of
> > > > > host VMAs quickly. Shouldn't there be a limit on # of mappings?
> > > > 
> > > > The VM can only deteriorate its own performance, right?
> > > 
> > > Only if QEMU is put in a container where virtual memory is
> > > limited.
> > > It's generally not a good idea where the only way for
> > > host to make progress is to allocate more memory
> > > without any limit.
> > > 
> > > If we are in a situation where we need to either kill
> > > the guest or hit swap, none of the choices is good.
> > 
> > There is a bound; it's cache region size / page size - so
> > that's ~1M mappings worst case (e.g. 4GB cache, 4kB page size).
> > That limit can be brought down if we impose a larger granularity
> > somewhere (and the reality is our kernel uses 2MB mapping chunks I
> > think).
> > 
> > > > We haven't seen catastrophic problems that bring the system to its
> > > > knees.
> > > 
> > > Because you are not running malicious guests?
> > 
> > Hmm, I didn't realise a process having an excessive number of mappings
> > could harm any other process.
> > 
> > Dave
> 
> Well it allocates resources on the host. If you don't
> contain qemu then even just allocating virtual memory
> can make host swap, right? If you contain it then
> qemu will get killed instead but then you need to tell
> guest what not to do so as not to get qemu killed.

I investigated a little.  Linux has a per-process maximum VMA count,
set by a sysctl.  It is enforced in mmap and in any other code path that
adds or splits VMAs:

  vm.max_map_count = 65530

The value is tunable, and the default is kept below 65536 for legacy
reasons: ELF core dumps used to support only ~65536 sections.
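
For anyone who wants to check the numbers on their own host, here is a
minimal Python sketch.  It assumes the usual Linux procfs paths
(/proc/sys/vm/max_map_count, and /proc/<pid>/maps with roughly one line
per VMA).  The helper names are made up for illustration and are not
part of QEMU or the kernel.

```python
from pathlib import Path


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the vm.max_map_count limit as an integer."""
    return int(Path(path).read_text().strip())


def count_vmas(maps_path="/proc/self/maps"):
    """Count VMAs, assuming each non-empty line of a maps file is one VMA."""
    with open(maps_path) as f:
        return sum(1 for line in f if line.strip())


def vma_headroom(limit, used):
    """Number of VMAs that can still be created before hitting the limit."""
    return max(limit - used, 0)
```

On a Linux host, vma_headroom(read_max_map_count(),
count_vmas("/proc/<qemu pid>/maps")) gives a rough idea of how many more
VMAs a running QEMU process could create.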

The QEMU process needs its own VMAs for shared libraries and other
purposes, so each virtio-fs device should expose a significantly lower
DAX Window mapping limit to the driver.  Let's add a configuration space
field as Michael has suggested.
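
To make the arithmetic concrete, here is a rough Python sketch of how a
device might pick the limit it advertises.  The reserved_vmas default of
4096 is an arbitrary placeholder, not a measured value, and the function
names and the idea of splitting the budget evenly across devices are my
assumptions.  Nothing here is defined by the spec yet.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def worst_case_mappings(window_size, granularity):
    """Upper bound on distinct mappings: one per granule in the window."""
    return window_size // granularity


def device_mapping_limit(window_size, granularity,
                         max_map_count=65530, reserved_vmas=4096, devices=1):
    """Hypothetical per-device limit to expose to the driver.

    reserved_vmas is a placeholder for the VMAs the QEMU process needs
    for shared libraries, guest RAM and so on.  The remaining budget is
    split evenly across virtio-fs devices sharing the process.
    """
    budget = max(max_map_count - reserved_vmas, 0) // devices
    return min(worst_case_mappings(window_size, granularity), budget)
```

With a 4GB window, worst_case_mappings(4 * GiB, 4 * KiB) is 1048576 (the
~1M figure Dave mentioned), while worst_case_mappings(4 * GiB, 2 * MiB)
is only 2048, so a 2MB granularity alone would already keep a single
device well under the default vm.max_map_count.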

Regarding denial of service, the DAX Window size determines the overall
amount of host page cache that is accessible by the driver.  Together
with an enforced maximum map count we can allow the administrator to
configure devices so they only provide access to a fraction of the host
page cache, mitigating denial of service issues.
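
As a sketch of what "enforced" could look like on the device side, here
is a toy Python model.  It is not QEMU code: the class, its fixed
granularity, and the one-slot-per-granule accounting (a conservative
stand-in for host VMAs, ignoring VMA merging) are all assumptions made
for illustration.

```python
class DaxWindow:
    """Toy model of a DAX window that enforces a maximum mapping count."""

    def __init__(self, size, granularity, max_mappings):
        if size % granularity:
            raise ValueError("window size must be a multiple of granularity")
        self.size = size
        self.granularity = granularity
        self.max_mappings = max_mappings
        self.slots = {}  # slot index -> (file id, file offset)

    def _check(self, window_offset, length):
        g = self.granularity
        if length <= 0 or window_offset % g or length % g:
            raise ValueError("unaligned or empty request")
        if window_offset < 0 or window_offset + length > self.size:
            raise ValueError("request outside the DAX window")

    def map(self, window_offset, length, file_id, file_offset):
        """Map a file range; return False if the limit would be exceeded."""
        self._check(window_offset, length)
        g = self.granularity
        first = window_offset // g
        new = {first + i: (file_id, file_offset + i * g)
               for i in range(length // g)}
        # Overlapping an existing mapping replaces it, so only slots that
        # were previously empty count against the limit.
        added = sum(1 for slot in new if slot not in self.slots)
        if len(self.slots) + added > self.max_mappings:
            return False
        self.slots.update(new)
        return True

    def unmap(self, window_offset, length):
        """Remove any mappings in the given range."""
        self._check(window_offset, length)
        first = window_offset // self.granularity
        for i in range(length // self.granularity):
            self.slots.pop(first + i, None)
```

The window size bounds how much host page cache is reachable at once and
max_mappings bounds the bookkeeping cost, which are the two knobs an
administrator would configure per device.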

Stefan
