From: Dan Williams <dan.j.williams@intel.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Pankaj Gupta <pagupta@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	KVM list <kvm@vger.kernel.org>,
	Qemu Developers <qemu-devel@nongnu.org>,
	linux-nvdimm <linux-nvdimm@ml01.01.org>,
	Linux MM <linux-mm@kvack.org>, Jan Kara <jack@suse.cz>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	haozhong zhang <haozhong.zhang@intel.com>,
	Nitesh Narayan Lal <nilal@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	ross zwisler <ross.zwisler@intel.com>,
	David Hildenbrand <david@redhat.com>,
	xiaoguangrong eric <xiaoguangrong.eric@gmail.com>
Subject: Re: [RFC 2/2] KVM: add virtio-pmem driver
Date: Mon, 16 Oct 2017 08:58:37 -0700
Message-ID: <CAPcyv4hffSdoONfFohKZzfB2gywGYG9MmDxC0H9+5R53w+ROVQ@mail.gmail.com>
In-Reply-To: <20171016144753.GB14135@stefanha-x1.localdomain>

On Mon, Oct 16, 2017 at 7:47 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Fri, Oct 13, 2017 at 06:48:15AM -0400, Pankaj Gupta wrote:
>> > On Thu, Oct 12, 2017 at 09:20:26PM +0530, Pankaj Gupta wrote:
>> > > +static blk_qc_t virtio_pmem_make_request(struct request_queue *q,
>> > > +                 struct bio *bio)
>> > > +{
>> > > + blk_status_t rc = 0;
>> > > + struct bio_vec bvec;
>> > > + struct bvec_iter iter;
>> > > + struct virtio_pmem *pmem = q->queuedata;
>> > > +
>> > > + if (bio->bi_opf & REQ_FLUSH)
>> > > +         //todo host flush command
>> >
>> > This detail is critical to the device design.  What is the plan?
>>
>> yes, this is good point.
>>
>> was thinking of guest sending a flush command to Qemu which
>> will do a fsync on file fd.
>
> Previously there was discussion about fsyncing a specific file range
> instead of the whole file.  This could perform better in cases where
> only a subset of dirty pages need to be flushed.
>
> One possibility is to design the virtio interface to communicate ranges
> but the emulation code simply fsyncs the fd for the time being.  Later
> on, if the necessary kernel and userspace interfaces are added, we can
> make use of the interface.

Range-based flushing is not a natural storage cache management
mechanism. All that is typically available is a full
write-cache-flush mechanism, and upper layers would need to be
customized for range-based flushing.
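
For illustration only, here is a minimal userspace sketch of the
compromise Stefan describes above: the request carries a range, but
the host side ignores it and flushes the whole backing file. Nothing
here is from the patch series; struct pmem_flush_req and
pmem_host_flush() are hypothetical names, and the request layout is an
assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical request layout: a flush command that carries a range. */
struct pmem_flush_req {
	uint64_t offset;	/* start of the dirty range, in bytes */
	uint64_t len;		/* length of the dirty range, in bytes */
};

/*
 * Hypothetical host-side handler. The range is accepted but ignored:
 * the whole backing file is flushed with fsync(). Returns 0 on
 * success and -1 on failure.
 */
static int pmem_host_flush(int backing_fd, const struct pmem_flush_req *req)
{
	assert(req != NULL);

	/* Range-based flushing is not implemented in this sketch. */
	(void)req->offset;
	(void)req->len;

	if (fsync(backing_fd) != 0) {
		perror("fsync");
		return -1;
	}
	return 0;
}
```

With this shape the guest-visible interface would not need to change
if a range-aware host flush were added later; only the body of the
handler would.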

>> If we do a async flush and move the task to wait queue till we receive
>> flush complete reply from host we can allow other tasks to execute
>> in current cpu.
>>
>> Any suggestions you have or anything I am not foreseeing here?
>
> My main thought about this patch series is whether pmem should be a
> virtio-blk feature bit instead of a whole new device.  There is quite a
> bit of overlap between the two.

I'd be open to that... there are already provisions in the pmem driver
for platforms where cpu caches are flushed on power-loss, so a virtio
mode for this shared-memory case seems reasonable.
