All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Alexander Duyck <aduyck@mirantis.com>,
	kvm@vger.kernel.org,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	x86@kernel.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	qemu-devel@nongnu.org, Lan Tianyu <tianyu.lan@intel.com>,
	Yang Zhang <yang.zhang.wz@gmail.com>,
	konrad.wilk@oracle.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Alexander Graf <agraf@suse.de>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [RFC PATCH 3/3] x86: Create dma_mark_dirty to dirty pages used for DMA by VM guest
Date: Mon, 14 Dec 2015 19:20:14 +0200	[thread overview]
Message-ID: <20151214191303-mutt-send-email-mst@redhat.com> (raw)
In-Reply-To: <CAKgT0UdtY5stCPVA=-NoM+1C8Z3b+VXLL1OdrNth=dxvz0A9sQ@mail.gmail.com>

On Mon, Dec 14, 2015 at 08:34:00AM -0800, Alexander Duyck wrote:
> > This way distro can use a guest agent to disable
> > dirtying until before migration starts.
> 
> Right.  For a v2 version I would definitely want to have some way to
> limit the scope of this.  My main reason for putting this out here is
> to start altering the course of discussions since it seems like were
> weren't getting anywhere with the ixgbevf migration changes that were
> being proposed.

Absolutely, thanks for working on this.

> >> +     unsigned long pg_addr, start;
> >> +
> >> +     start = (unsigned long)addr;
> >> +     pg_addr = PAGE_ALIGN(start + size);
> >> +     start &= ~(sizeof(atomic_t) - 1);
> >> +
> >> +     /* trigger a write fault on each page, excluding first page */
> >> +     while ((pg_addr -= PAGE_SIZE) > start)
> >> +             atomic_add(0, (atomic_t *)pg_addr);
> >> +
> >> +     /* trigger a write fault on first word of DMA */
> >> +     atomic_add(0, (atomic_t *)start);
> >
> > start might not be aligned correctly for a cast to atomic_t.
> > It's harmless to do this for any memory, so I think you should
> > just do this for 1st byte of all pages including the first one.
> 
> You may not have noticed it but I actually aligned start in the line
> after pg_addr.

Yes you did. alignof would make it a bit more noticeable.

>  However instead of aligning to the start of the next
> atomic_t I just masked off the lower bits so that we start at the
> DWORD that contains the first byte of the starting address.  The
> assumption here is that I cannot trigger any sort of fault since if I
> have access to a given byte within a DWORD I will have access to the
> entire DWORD.

I'm curious where does this come from.  Isn't it true that access is
controlled at page granularity normally, so you can touch beginning of
page just as well?

>  I coded this up so that the spots where we touch the
> memory should match up with addresses provided by the hardware to
> perform the DMA over the PCI bus.

Yes but there's no requirement to do it like this from
virt POV. You just need to touch each page.

> Also I intentionally ran from highest address to lowest since that way
> we don't risk pushing the first cache line of the DMA buffer out of
> the L1 cache due to the PAGE_SIZE stride.
> 
> - Alex

Interesting. How does order of access help with this?

By the way, if you are into these micro-optimizations you might want to
limit prefetch, to this end you want to access the last line of the
page.  And it's probably worth benchmarking a bit and not doing it all just
based on theory, keep code simple in v1 otherwise.

-- 
MST

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Lan Tianyu <tianyu.lan@intel.com>,
	Yang Zhang <yang.zhang.wz@gmail.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	kvm@vger.kernel.org, konrad.wilk@oracle.com,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	x86@kernel.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>,
	Alexander Duyck <aduyck@mirantis.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH 3/3] x86: Create dma_mark_dirty to dirty pages used for DMA by VM guest
Date: Mon, 14 Dec 2015 19:20:14 +0200	[thread overview]
Message-ID: <20151214191303-mutt-send-email-mst@redhat.com> (raw)
In-Reply-To: <CAKgT0UdtY5stCPVA=-NoM+1C8Z3b+VXLL1OdrNth=dxvz0A9sQ@mail.gmail.com>

On Mon, Dec 14, 2015 at 08:34:00AM -0800, Alexander Duyck wrote:
> > This way distro can use a guest agent to disable
> > dirtying until before migration starts.
> 
> Right.  For a v2 version I would definitely want to have some way to
> limit the scope of this.  My main reason for putting this out here is
> to start altering the course of discussions since it seems like were
> weren't getting anywhere with the ixgbevf migration changes that were
> being proposed.

Absolutely, thanks for working on this.

> >> +     unsigned long pg_addr, start;
> >> +
> >> +     start = (unsigned long)addr;
> >> +     pg_addr = PAGE_ALIGN(start + size);
> >> +     start &= ~(sizeof(atomic_t) - 1);
> >> +
> >> +     /* trigger a write fault on each page, excluding first page */
> >> +     while ((pg_addr -= PAGE_SIZE) > start)
> >> +             atomic_add(0, (atomic_t *)pg_addr);
> >> +
> >> +     /* trigger a write fault on first word of DMA */
> >> +     atomic_add(0, (atomic_t *)start);
> >
> > start might not be aligned correctly for a cast to atomic_t.
> > It's harmless to do this for any memory, so I think you should
> > just do this for 1st byte of all pages including the first one.
> 
> You may not have noticed it but I actually aligned start in the line
> after pg_addr.

Yes you did. alignof would make it a bit more noticeable.

>  However instead of aligning to the start of the next
> atomic_t I just masked off the lower bits so that we start at the
> DWORD that contains the first byte of the starting address.  The
> assumption here is that I cannot trigger any sort of fault since if I
> have access to a given byte within a DWORD I will have access to the
> entire DWORD.

I'm curious where does this come from.  Isn't it true that access is
controlled at page granularity normally, so you can touch beginning of
page just as well?

>  I coded this up so that the spots where we touch the
> memory should match up with addresses provided by the hardware to
> perform the DMA over the PCI bus.

Yes but there's no requirement to do it like this from
virt POV. You just need to touch each page.

> Also I intentionally ran from highest address to lowest since that way
> we don't risk pushing the first cache line of the DMA buffer out of
> the L1 cache due to the PAGE_SIZE stride.
> 
> - Alex

Interesting. How does order of access help with this?

By the way, if you are into these micro-optimizations you might want to
limit prefetch, to this end you want to access the last line of the
page.  And it's probably worth benchmarking a bit and not doing it all just
based on theory, keep code simple in v1 otherwise.

-- 
MST

  reply	other threads:[~2015-12-14 17:20 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-13 21:28 [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking Alexander Duyck
2015-12-13 21:28 ` [Qemu-devel] " Alexander Duyck
2015-12-13 21:28 ` [RFC PATCH 1/3] swiotlb: Fold static unmap and sync calls into calling functions Alexander Duyck
2015-12-13 21:28   ` [Qemu-devel] " Alexander Duyck
2015-12-13 21:28 ` [RFC PATCH 2/3] xen/swiotlb: " Alexander Duyck
2015-12-13 21:28   ` [Qemu-devel] " Alexander Duyck
2015-12-13 21:28 ` [RFC PATCH 3/3] x86: Create dma_mark_dirty to dirty pages used for DMA by VM guest Alexander Duyck
2015-12-13 21:28   ` [Qemu-devel] " Alexander Duyck
2015-12-14 14:00   ` Michael S. Tsirkin
2015-12-14 14:00     ` [Qemu-devel] " Michael S. Tsirkin
2015-12-14 16:34     ` Alexander Duyck
2015-12-14 16:34       ` [Qemu-devel] " Alexander Duyck
2015-12-14 17:20       ` Michael S. Tsirkin [this message]
2015-12-14 17:20         ` Michael S. Tsirkin
2015-12-14 17:59         ` Alexander Duyck
2015-12-14 17:59           ` [Qemu-devel] " Alexander Duyck
2015-12-14 20:52           ` Michael S. Tsirkin
2015-12-14 20:52             ` [Qemu-devel] " Michael S. Tsirkin
2015-12-14 22:32             ` Alexander Duyck
2015-12-14 22:32               ` [Qemu-devel] " Alexander Duyck
2015-12-14  2:27 ` [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking Yang Zhang
2015-12-14  2:27   ` [Qemu-devel] " Yang Zhang
2015-12-14  2:27   ` Yang Zhang
2015-12-14  4:54   ` Alexander Duyck
2015-12-14  4:54     ` [Qemu-devel] " Alexander Duyck
2015-12-14  4:54     ` Alexander Duyck
2015-12-14  5:22     ` Yang Zhang
2015-12-14  5:22       ` [Qemu-devel] " Yang Zhang
2015-12-14  5:22       ` Yang Zhang
2015-12-14  5:46       ` Alexander Duyck
2015-12-14  5:46         ` [Qemu-devel] " Alexander Duyck
2015-12-14  5:46         ` Alexander Duyck
2015-12-14  7:20         ` Yang Zhang
2015-12-14  7:20           ` [Qemu-devel] " Yang Zhang
2015-12-14 14:02           ` Michael S. Tsirkin
2015-12-14 14:02             ` [Qemu-devel] " Michael S. Tsirkin
2015-12-14 14:02             ` Michael S. Tsirkin
2016-01-04 20:41 ` Konrad Rzeszutek Wilk
2016-01-04 20:41   ` [Qemu-devel] " Konrad Rzeszutek Wilk
2016-01-05  3:11   ` Alexander Duyck
2016-01-05  3:11     ` [Qemu-devel] " Alexander Duyck
2016-01-05  9:40     ` Michael S. Tsirkin
2016-01-05  9:40       ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 10:01       ` Dr. David Alan Gilbert
2016-01-05 10:01         ` [Qemu-devel] " Dr. David Alan Gilbert
2016-01-05 10:35         ` Michael S. Tsirkin
2016-01-05 10:35           ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 10:35           ` Michael S. Tsirkin
2016-01-05 10:45           ` Dr. David Alan Gilbert
2016-01-05 10:45             ` [Qemu-devel] " Dr. David Alan Gilbert
2016-01-05 10:59             ` Michael S. Tsirkin
2016-01-05 10:59               ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 11:03               ` Dr. David Alan Gilbert
2016-01-05 11:03                 ` [Qemu-devel] " Dr. David Alan Gilbert
2016-01-05 11:11                 ` Michael S. Tsirkin
2016-01-05 11:11                   ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 11:06               ` Michael S. Tsirkin
2016-01-05 11:06                 ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 11:05             ` Michael S. Tsirkin
2016-01-05 11:05               ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 12:43               ` Dr. David Alan Gilbert
2016-01-05 12:43                 ` [Qemu-devel] " Dr. David Alan Gilbert
2016-01-05 13:16                 ` Michael S. Tsirkin
2016-01-05 13:16                   ` [Qemu-devel] " Michael S. Tsirkin
2016-01-05 18:42                   ` Konrad Rzeszutek Wilk
2016-01-05 18:42                     ` [Qemu-devel] " Konrad Rzeszutek Wilk
2016-01-05 16:18       ` Alexander Duyck
2016-01-05 16:18         ` [Qemu-devel] " Alexander Duyck
2016-06-06  9:18         ` Zhou Jie
2016-06-06  9:18           ` Zhou Jie
2016-06-06 16:04           ` Alex Duyck
2016-06-06 16:04             ` Alex Duyck
2016-06-09 10:14             ` Zhou Jie
2016-06-09 15:39               ` Alexander Duyck
2016-06-12  3:03                 ` Zhou Jie
2016-06-13  1:28                   ` Alexander Duyck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151214191303-mutt-send-email-mst@redhat.com \
    --to=mst@redhat.com \
    --cc=aduyck@mirantis.com \
    --cc=agraf@suse.de \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=dgilbert@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=qemu-devel@nongnu.org \
    --cc=tianyu.lan@intel.com \
    --cc=x86@kernel.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.