From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	David Hildenbrand <dhildenb@redhat.com>,
	Den Lunev <den@openvz.org>, Paolo Bonzini <pbonzini@redhat.com>,
	Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Subject: Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots
Date: Mon, 22 Feb 2021 19:11:30 +0100
Message-ID: <5471ac12-0dc5-1435-8fba-fad7b37bbcf1@redhat.com>
In-Reply-To: <20210222175440.GT6669@xz-x1>

On 22.02.21 18:54, Peter Xu wrote:
> On Mon, Feb 22, 2021 at 06:33:27PM +0100, David Hildenbrand wrote:
>> On 22.02.21 18:29, Peter Xu wrote:
>>> On Sat, Feb 20, 2021 at 02:59:42AM -0500, David Hildenbrand wrote:
>>>> Live snapshotting ends up reading all guest memory (dirty bitmap starts with all 1s), which is not what we want for virtio-mem - we don’t want to read and migrate memory that has been discarded and has no stable content.
>>>>
>>>> For ordinary migration we use the guest page hint API to clear bits in the dirty bitmap after dirty bitmap sync. Well, if we don't do bitmap syncs we'll never clear any dirty bits. That's the problem.
>>>
>>> Using dirty bitmap for that information is less efficient, because it's
>>> definitely a larger granularity information than PAGE_SIZE.  If the discarded
>>> ranges are always continuous and at the end of a memory region, we should have
>>> some parameter in the ramblock showing where we got shrunk, so that we don't
>>> check dirty bitmap at all, rather than always assuming used_length is the one.
>>
>> They are randomly scattered across the whole RAMBlock. Shrinking/growing
>> will be done to some degree in the future (but it won't get rid of the
>> general sparse layout we can produce).
> 
> OK. Btw I think currently live snapshot should still be reading dirty bitmap,
> so maybe it's still fine.  It's just that it's still not very clear to hide
> virtio-mem information into dirty bitmap, imho, since that's not how we
> interpret dirty bitmap - which is only for the sake of tracking page changes.

Well, currently it is "what do we have to migrate".

> 
> What's the granule of virtio-mem for this discard behavior?  Maybe we could

virtio-mem granularity is at least 1 MiB. With 4 KiB pages, this corresponds
to 256 bits (32 bytes) in the dirty bitmap, I think.
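
To spell out that arithmetic, here is a minimal sketch. It assumes one
dirty bit per 4 KiB page and a 1 MiB virtio-mem block; the SKETCH_* macros
and sketch_* helpers are hypothetical names for illustration, not QEMU
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for illustration; not taken from QEMU headers. */
#define SKETCH_PAGE_SIZE   ((size_t)4096)          /* 4 KiB page */
#define SKETCH_BLOCK_SIZE  ((size_t)1024 * 1024)   /* 1 MiB virtio-mem block */

/* Number of dirty-bitmap bits (one per page) that cover one block. */
static size_t sketch_bits_per_block(size_t block_size, size_t page_size)
{
    assert(page_size != 0 && block_size % page_size == 0);
    return block_size / page_size;
}

/* Number of bitmap bytes needed to hold nbits bits. */
static size_t sketch_bitmap_bytes(size_t nbits)
{
    return (nbits + 7) / 8;
}
```

With the assumed sizes, sketch_bits_per_block() gives 256 and
sketch_bitmap_bytes(256) gives 32, matching the numbers above.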

> decouple it with dirty bitmap some day; if the unit is big enough it's also a
> gain on efficiency so we skip in chunk rather than looping over tons of pages
> knowing that they're discarded.

Yeah, it's not optimal having to go over the dirty bitmap to cross off 
"discarded" parts and later having to find bits to migrate.
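
Roughly, "crossing off" a discarded range means clearing its bits in the
dirty bitmap. The following is only a sketch of that idea under the
assumption of a flat array of unsigned long with one bit per page;
sketch_bitmap_clear() is a hypothetical stand-in, not QEMU's bitmap code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Clear nbits bits starting at bit index start. Each iteration handles
 * the part of the range that falls into one word, so fully covered words
 * are cleared with a single store.
 */
static void sketch_bitmap_clear(unsigned long *map, size_t start, size_t nbits)
{
    size_t end = start + nbits;

    while (start < end) {
        size_t word = start / SKETCH_BITS_PER_LONG;
        size_t off = start % SKETCH_BITS_PER_LONG;
        size_t n = SKETCH_BITS_PER_LONG - off;   /* bits left in this word */
        unsigned long mask;

        if (n > end - start) {
            n = end - start;
        }
        /* n equals the word size only when off is 0; avoid a full-width shift. */
        mask = (n == SKETCH_BITS_PER_LONG) ? ~0UL : (((1UL << n) - 1) << off);
        map[word] &= ~mask;
        start += n;
    }
}
```

A discarded 1 MiB block would then be one call with nbits = 256, assuming
4 KiB pages.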

At least find_next_bit() can skip whole longs (8 bytes on 64-bit hosts) and
is fairly efficient. There is certainly room for improvement (the current
guest free page hinting API is a hack).
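
To illustrate the word-skipping, here is a simplified sketch of such a
search. It is not QEMU's find_next_bit() implementation;
sketch_find_next_bit() is a hypothetical stand-in that assumes a flat array
of unsigned long and returns size when no set bit is found.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the first set bit >= start, or size if there is none. */
static size_t sketch_find_next_bit(const unsigned long *map, size_t size,
                                   size_t start)
{
    size_t word, bit;
    unsigned long cur;

    if (start >= size) {
        return size;
    }

    word = start / SKETCH_BITS_PER_LONG;
    /* Ignore bits below start in the first word. */
    cur = map[word] & (~0UL << (start % SKETCH_BITS_PER_LONG));

    /* All-zero words (e.g. discarded ranges) cost one compare each. */
    while (cur == 0) {
        word++;
        if (word * SKETCH_BITS_PER_LONG >= size) {
            return size;
        }
        cur = map[word];
    }

    /* Locate the lowest set bit inside the non-zero word. */
    bit = word * SKETCH_BITS_PER_LONG;
    while ((cur & 1UL) == 0) {
        cur >>= 1;
        bit++;
    }
    return bit < size ? bit : size;
}
```

With 64-bit longs and 4 KiB pages, a fully cleared 1 MiB block is skipped in
four word compares instead of 256 bit tests.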

-- 
Thanks,

David / dhildenb
