From: Kenneth Lee <liguozhu@hisilicon.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Leon Romanovsky" <leon@kernel.org>,
	"Kenneth Lee" <nek.in.cn@gmail.com>,
	"Tim Sell" <timothy.sell@unisys.com>,
	linux-doc@vger.kernel.org,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Zaibo Xu" <xuzaibo@huawei.com>,
	zhangfei.gao@foxmail.com, linuxarm@huawei.com,
	haojian.zhuang@linaro.org, "Christoph Lameter" <cl@linux.com>,
	"Hao Fang" <fanghao11@huawei.com>,
	"Gavin Schenk" <g.schenk@eckelmann.de>,
	"RDMA mailing list" <linux-rdma@vger.kernel.org>,
	"Zhou Wang" <wangzhou1@hisilicon.com>,
	"Doug Ledford" <dledford@redhat.com>,
	"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
	"David Kershner" <david.kershner@unisys.com>,
	"Johan Hovold" <johan@kernel.org>,
	"Cyrille Pitchen" <cyrille.pitchen@free-electrons.com>,
	"Sagar Dharia" <sdharia@codeaurora.org>,
	"Jens Axboe" <axboe@kernel.dk>,
	guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, "Vinod Koul" <vkoul@kernel.org>,
	linux-crypto@vger.kernel.org,
	"Philippe Ombredanne" <pombredanne@nexb.com>,
	"Sanyog Kale" <sanyog.r.kale@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-accelerators@lists.ozlabs.org
Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Date: Fri, 23 Nov 2018 16:02:42 +0800	[thread overview]
Message-ID: <20181123080242.GK157308@Turing-Arch-b> (raw)
In-Reply-To: <20181122025840.GB19938@ziepe.ca>

On Wed, Nov 21, 2018 at 07:58:40PM -0700, Jason Gunthorpe wrote:
> 
> On Wed, Nov 21, 2018 at 02:08:05PM +0800, Kenneth Lee wrote:
> 
> > > But considering Jean's SVA stuff seems based on mmu notifiers, I have
> > > a hard time believing that it has any different behavior from RDMA's
> > > ODP, and if it does have different behavior, then it is probably just
> > > a bug in the ODP implementation.
> > 
> > As Jean has explained, his solution is based on page table sharing. I think ODP
> > should also consider this new feature.
> 
> Shared page tables would require the HW to walk the page table format
> of the CPU directly, not sure how that would be possible for ODP?
> 
> Presumably the implementation for ARM relies on the IOMMU hardware
> doing this?

Yes, that is the idea. And since Jean is merging the AMD and Intel solutions
together, I assume they can do the same. This is also the reason I want to solve
my problem directly on top of the IOMMU. But anyway, let me see if I can
merge the logic with ODP.

> 
> > > > > If all your driver needs is to mmap some PCI bar space, route
> > > > > interrupts and do DMA mapping then mediated VFIO is probably a good
> > > > > choice. 
> > > > 
> > > > Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opinion and
> > > > try not to add complexity to the mm subsystem.
> > > 
> > > Why would a mediated VFIO driver touch the mm subsystem? Sounds like
> > > you don't have a VFIO driver if it needs to do stuff like that...
> > 
> > VFIO has no ODP-like solution, and if we want to solve the fork problem, we have
> > to make some change to iommu and the fork procedure. Further, VFIO takes every
> > queue as a independent device. This create a lot of trouble on resource
> > management. For example, you will need a manager process to withdraw the unused
> > device and you need to let the user process know about PASID of the queue, and
> > so on.
> 
> Well, I would think you'd add SVA support to the VFIO driver as a
> generic capability - it seems pretty useful for any VFIO user as it
> avoids all the kernel upcalls to do memory pinning and DMA address
> translation.

It is already part of Jean's patchset, and that is why I built my solution on
VFIO in the first place. But I think the concepts of SVA and PASID are not
compatible with the original VFIO concept space. You would not share your whole
address space with a device in a virtual machine manager, would you? And
if you can manage to have a separate mdev for your virtual machine, why bother
to set a PASID on it? The answer to those problems, I think, will be Intel's
Scalable IO Virtualization. For an accelerator, the requirement is simply:
getting a handle to the device, attaching the process's mm to the handle by
sharing the process's page table with its IOMMU, indexed by PASID, and starting
the communication...
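
The handle/attach step above can be modelled roughly as below. This is only a
user-space sketch and not code from the patchset or from any kernel API: the
names struct mm, struct iommu_ctx, struct handle, handle_bind_mm(),
handle_unbind() and iommu_lookup_pgd() are all hypothetical, the table size is
arbitrary, and reserving PASID 0 is an assumption of this model.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PASID 16            /* arbitrary size for this model */

/* Hypothetical stand-in for a process address space: just a page table root. */
struct mm {
    uintptr_t pgd;
};

/* Hypothetical IOMMU context: one shared page table root per PASID. */
struct iommu_ctx {
    const struct mm *pasid_table[MAX_PASID];
};

/* Hypothetical handle a process gets for the device. */
struct handle {
    struct iommu_ctx *iommu;
    int pasid;
};

/*
 * Attach a process's mm to the handle: pick a free PASID and point its
 * table entry at the process's own page table (shared, not copied).
 * PASID 0 is kept reserved in this model. Returns the PASID or -1.
 */
static int handle_bind_mm(struct handle *h, struct iommu_ctx *iommu,
                          const struct mm *mm)
{
    for (int pasid = 1; pasid < MAX_PASID; pasid++) {
        if (iommu->pasid_table[pasid] == NULL) {
            iommu->pasid_table[pasid] = mm;
            h->iommu = iommu;
            h->pasid = pasid;
            return pasid;
        }
    }
    return -1;                  /* no free PASID */
}

/* Detach: clear the PASID entry so the device can no longer use the mm. */
static void handle_unbind(struct handle *h)
{
    h->iommu->pasid_table[h->pasid] = NULL;
    h->iommu = NULL;
    h->pasid = -1;
}

/* What the IOMMU would do per transaction: PASID -> page table root. */
static uintptr_t iommu_lookup_pgd(const struct iommu_ctx *iommu, int pasid)
{
    if (pasid <= 0 || pasid >= MAX_PASID || iommu->pasid_table[pasid] == NULL)
        return 0;               /* unbound PASID: translation would fault */
    return iommu->pasid_table[pasid]->pgd;
}
```

The point of the model is that the table entry refers to the process's own
page table, so the device sees the same virtual addresses as the process and
no per-buffer mapping call is needed.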

> 
> Once the VFIO driver knows about this as a generic capability then the
> device it exposes to userspace would use CPU addresses instead of DMA
> addresses.
> 
> The question is if your driver needs much more than the device
> agnostic generic services VFIO provides.
> 
> I'm not sure what you have in mind with resource management.. It is
> hard to revoke resources from userspace, unless you are doing
> kernel syscalls, but then why do all this?

Say I have 1024 queues in my accelerator. I can get one by opening the device
and attaching it to the fd. If the process exits by any means, the queue is
returned when the fd is released. But if it is an mdev, it will still be there
and someone has to tell the allocator that it is available again. This is not
easy to design in user space.
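
A minimal sketch of that lifecycle, as a plain C model rather than real driver
code: struct accel, struct accel_file, accel_init(), accel_open() and
accel_release() are hypothetical names, and the first-free allocation policy is
an assumption. Only the 1024 figure comes from the example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_QUEUES 1024          /* queue count from the example above */

/* Hypothetical model of an accelerator with a fixed pool of queues. */
struct accel {
    bool in_use[NR_QUEUES];
    int nr_free;
};

/* Hypothetical per-open context, i.e. what the fd would refer to. */
struct accel_file {
    struct accel *dev;
    int queue;                  /* index of the queue owned by this fd */
};

static void accel_init(struct accel *dev)
{
    for (int i = 0; i < NR_QUEUES; i++)
        dev->in_use[i] = false;
    dev->nr_free = NR_QUEUES;
}

/* Models the open path: take the first free queue and bind it to the fd. */
static int accel_open(struct accel *dev, struct accel_file *f)
{
    for (int i = 0; i < NR_QUEUES; i++) {
        if (!dev->in_use[i]) {
            dev->in_use[i] = true;
            dev->nr_free--;
            f->dev = dev;
            f->queue = i;
            return 0;
        }
    }
    return -1;                  /* pool exhausted */
}

/*
 * Models the release path. For a real fd the kernel runs this when the
 * last reference goes away, including when the process dies, so the
 * queue returns to the pool without any manager process.
 */
static void accel_release(struct accel_file *f)
{
    assert(f->dev->in_use[f->queue]);
    f->dev->in_use[f->queue] = false;
    f->dev->nr_free++;
    f->dev = NULL;
    f->queue = -1;
}
```

With an mdev there is no equivalent of accel_release() tied to the process
lifetime, which is why some other party has to notice that the queue is unused.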

> 
> Jason

-- 
