Date: Mon, 16 Oct 2017 15:47:53 +0100
From: Stefan Hajnoczi
To: Pankaj Gupta
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org,
    linux-nvdimm@ml01.01.org, linux-mm@kvack.org, jack@suse.cz,
    stefanha@redhat.com, dan j williams, riel@redhat.com, haozhong zhang,
    nilal@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, ross zwisler,
    david@redhat.com, xiaoguangrong eric
Subject: Re: [RFC 2/2] KVM: add virtio-pmem driver
Message-ID: <20171016144753.GB14135@stefanha-x1.localdomain>
In-Reply-To: <24301306.20068579.1507891695416.JavaMail.zimbra@redhat.com>
References: <20171012155027.3277-1-pagupta@redhat.com>
 <20171012155027.3277-3-pagupta@redhat.com>
 <20171013094431.GA27308@stefanha-x1.localdomain>
 <24301306.20068579.1507891695416.JavaMail.zimbra@redhat.com>

On Fri, Oct 13, 2017 at 06:48:15AM -0400, Pankaj Gupta wrote:
> > On Thu, Oct 12, 2017 at 09:20:26PM +0530, Pankaj Gupta wrote:
> > > +static blk_qc_t virtio_pmem_make_request(struct request_queue *q,
> > > +				struct bio *bio)
> > > +{
> > > +	blk_status_t rc = 0;
> > > +	struct bio_vec bvec;
> > > +	struct bvec_iter iter;
> > > +	struct virtio_pmem *pmem = q->queuedata;
> > > +
> > > +	if (bio->bi_opf & REQ_FLUSH)
> > > +		//todo host flush command
> >
> > This detail is critical to the device design. What is the plan?
>
> yes, this is a good point.
>
> was thinking of guest sending a flush command to Qemu which
> will do an fsync on the file fd.

Previously there was discussion about fsyncing a specific file range
instead of the whole file. This could perform better in cases where
only a subset of dirty pages needs to be flushed.

One possibility is to design the virtio interface to communicate ranges
but have the emulation code simply fsync the whole fd for the time
being. Later on, if the necessary kernel and userspace interfaces are
added, we can make use of them.

> If we do an async flush and move the task to a wait queue till we receive
> the flush complete reply from the host, we can allow other tasks to execute
> on the current cpu.
>
> Any suggestions you have or anything I am not foreseeing here?

My main thought about this patch series is whether pmem should be a
virtio-blk feature bit instead of a whole new device. There is quite a
bit of overlap between the two.
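
Coming back to the flush path: the host side could start out as nothing
more than an fsync(2) of the backing file when a guest flush request
arrives. Very rough sketch below - the function name, arguments, and the
idea of already carrying a (start, len) range in the request are made up
for illustration, this is not existing QEMU code:

    #include <errno.h>
    #include <stdint.h>
    #include <unistd.h>

    /*
     * Hypothetical host-side handling of a guest flush request.
     * memory_fd is the fd backing the guest's pmem region.  For now the
     * whole file is fsynced; the (start, len) range from the request is
     * accepted but ignored until a kernel interface with real durability
     * guarantees exists for range-based flushing.
     */
    static int virtio_pmem_host_flush(int memory_fd, uint64_t start,
                                      uint64_t len)
    {
        (void)start;
        (void)len;

        if (fsync(memory_fd) < 0)
            return -errno;      /* report the failure back to the guest */

        return 0;
    }

If a range-based interface with proper durability guarantees shows up
later, only the body of this helper needs to change; the virtio protocol
can already carry the range.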
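
For the guest-side async flush you mentioned, I imagine something along
these lines: add the request to the virtqueue, kick, and sleep on a
completion until the host acks it from the virtqueue interrupt handler.
Again only a sketch - struct virtio_pmem_req, the req_vq field, and the
host writing a status int back into the buffer are invented here, not
part of your patch:

    #include <linux/completion.h>
    #include <linux/gfp.h>
    #include <linux/scatterlist.h>
    #include <linux/virtio.h>

    /* One flush request; a real driver would allocate this properly
     * rather than using stack memory for the device-writable buffer. */
    struct virtio_pmem_req {
    	struct completion done;	/* signalled from the vq callback */
    	int ret;		/* status written back by the host */
    };

    /* Virtqueue callback: wake up whoever waits in virtio_pmem_flush(). */
    static void virtio_pmem_req_done(struct virtqueue *vq)
    {
    	struct virtio_pmem_req *req;
    	unsigned int len;

    	while ((req = virtqueue_get_buf(vq, &len)) != NULL)
    		complete(&req->done);
    }

    static int virtio_pmem_flush(struct virtio_pmem *pmem)
    {
    	struct virtio_pmem_req req;
    	struct scatterlist sg;
    	int err;

    	init_completion(&req.done);
    	sg_init_one(&sg, &req.ret, sizeof(req.ret));

    	/* Queue the flush request and notify the host. */
    	err = virtqueue_add_inbuf(pmem->req_vq, &sg, 1, &req, GFP_KERNEL);
    	if (err)
    		return err;
    	virtqueue_kick(pmem->req_vq);

    	/* Sleep until virtio_pmem_req_done() signals completion. */
    	wait_for_completion(&req.done);

    	return req.ret;
    }

That way the submitting task sleeps instead of spinning, and other tasks
can run on the CPU while the host does the fsync.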
Stefan