linux-kernel.vger.kernel.org archive mirror
From: Jerome Glisse <jglisse@redhat.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Kenneth Lee" <liguozhu@hisilicon.com>,
	"Tim Sell" <timothy.sell@unisys.com>,
	linux-doc@vger.kernel.org,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Zaibo Xu" <xuzaibo@huawei.com>,
	zhangfei.gao@foxmail.com, linuxarm@huawei.com,
	haojian.zhuang@linaro.org, "Christoph Lameter" <cl@linux.com>,
	"Hao Fang" <fanghao11@huawei.com>,
	"Gavin Schenk" <g.schenk@eckelmann.de>,
	"RDMA mailing list" <linux-rdma@vger.kernel.org>,
	"Zhou Wang" <wangzhou1@hisilicon.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Doug Ledford" <dledford@redhat.com>,
	"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
	"David Kershner" <david.kershner@unisys.com>,
	"Kenneth Lee" <nek.in.cn@gmail.com>,
	"Johan Hovold" <johan@kernel.org>,
	"Cyrille Pitchen" <cyrille.pitchen@free-electrons.com>,
	"Sagar Dharia" <sdharia@codeaurora.org>,
	"Jens Axboe" <axboe@kernel.dk>,
	guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, "Vinod Koul" <vkoul@kernel.org>,
	linux-crypto@vger.kernel.org,
	"Philippe Ombredanne" <pombredanne@nexb.com>,
	"Sanyog Kale" <sanyog.r.kale@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	linux-accelerators@lists.ozlabs.org
Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Date: Mon, 19 Nov 2018 11:48:54 -0500
Message-ID: <20181119164853.GA4593@redhat.com>
In-Reply-To: <20181119104801.GF8268@mtr-leonro.mtl.com>

On Mon, Nov 19, 2018 at 12:48:01PM +0200, Leon Romanovsky wrote:
> On Mon, Nov 19, 2018 at 05:19:10PM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > > On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > > > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:

[...]

> > > > memory exposed to user is properly protected from security point of view.
> > > > 3. "stop using the page for a while for the copying" - I'm not fully
> > > > understand this claim, maybe this article will help you to better
> > > > describe : https://lwn.net/Articles/753027/
> > >
> > > This topic was being discussed in RFCv2. The key problem here is that:
> > >
> > > The device need to hold the memory for its own calculation, but the CPU/software
> > > want to stop it for a while for synchronizing with disk or COW.
> > >
> > > If the hardware support SVM/SVA (Shared Virtual Memory/Address), it is easy, the
> > > device share page table with CPU, the device will raise a page fault when the
> > > CPU downgrade the PTE to read-only.
> > >
> > > If the hardware cannot share page table with the CPU, we then need to have
> > > some way to change the device page table. This is what happen in ODP. It
> > > invalidates the page table in device upon mmu_notifier call back. But this cannot
> > > solve the COW problem: if the user process A share a page P with device, and A
> > > forks a new process B, and it continue to write to the page. By COW, the
> > > process B will keep the page P, while A will get a new page P'. But you have
> > > no way to let the device know it should use P' rather than P.
> 
> I didn't hear about such issue and we supported fork for a long time.
> 

Just to comment on this: any InfiniBand driver which uses umem and does
not have ODP (by ODP I mean listening to mmu notifiers, so every
InfiniBand driver except mlx5) will be affected by the same issue AFAICT.

AFAICT nothing special happens after fork() inside any of those
drivers. So if the parent creates a umem mr before fork() and programs
the hardware with it, then after fork() the parent might start using a
new page for the umem range while the old memory is used by the child.
The reverse is also true (parent using the old memory and the child the
new memory). Bottom line: you cannot predict which memory the child or
the parent will use for the range after fork().

So no matter which one you consider, the child or the parent, what the
hw will use for the mr is unlikely to match what the CPU uses for the
same virtual address. In other words:

Before fork:
    CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
    HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE

Case 1:
    CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
    CPU child:  virtual addr ptr1 -> physical address = 0xDEAD
    HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE

Case 2:
    CPU parent: virtual addr ptr1 -> physical address = 0xBEEF
    CPU child:  virtual addr ptr1 -> physical address = 0xCAFE
    HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE


This applies to every single page and is not predictable. It only
applies to private memory (mmap() with MAP_PRIVATE).
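
For anyone who wants to see the CPU side of this from user space, here
is a minimal sketch. It is only an illustration under my own
assumptions: the helper name parent_write_hidden_from_child() is made
up, and there is no device involved, so it only shows that after
fork() the parent and the child stop sharing the page behind the same
virtual address once one of them writes to a MAP_PRIVATE mapping. It
does not show which of the two keeps the original physical page.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of any kernel or RDMA API.
 *
 * Maps one anonymous page with the given sharing flag (MAP_PRIVATE or
 * MAP_SHARED), forks, then has the parent write to the page.
 *
 * Returns 1 if the child still sees the old value (parent and child no
 * longer use the same page for ptr1), 0 if the child sees the parent's
 * write, -1 on error.
 */
int parent_write_hidden_from_child(int map_flags)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *ptr1;
    int go[2];      /* parent -> child: "the write has happened" */
    int status, code;
    pid_t pid;

    ptr1 = mmap(NULL, page, PROT_READ | PROT_WRITE,
                map_flags | MAP_ANONYMOUS, -1, 0);
    if (ptr1 == MAP_FAILED)
        return -1;
    ptr1[0] = 0xCA;     /* populate the page before fork() */

    if (pipe(go) < 0) {
        munmap(ptr1, page);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(go[0]);
        close(go[1]);
        munmap(ptr1, page);
        return -1;
    }

    if (pid == 0) {
        char c;

        close(go[1]);
        /* Wait until the parent has written to the page. */
        if (read(go[0], &c, 1) != 1)
            _exit(2);
        /* 1: still the old value, 0: the parent's write is visible. */
        _exit(ptr1[0] == 0xCA ? 1 : 0);
    }

    close(go[0]);
    ptr1[0] = 0xBE;     /* write after fork(): breaks COW if private */
    if (write(go[1], "x", 1) != 1) {
        /* Child will see EOF on the pipe and exit with code 2. */
    }
    close(go[1]);

    if (waitpid(pid, &status, 0) < 0) {
        munmap(ptr1, page);
        return -1;
    }
    munmap(ptr1, page);

    if (!WIFEXITED(status))
        return -1;
    code = WEXITSTATUS(status);
    return code <= 1 ? code : -1;
}
```

Calling it with MAP_PRIVATE should return 1 and with MAP_SHARED should
return 0. A device that was programmed with the page before fork() has
no part in this: it keeps using whatever physical page it was given,
which is the mismatch shown in the two cases above.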

I am not familiar enough with the RDMA user space API contract to know
if this is an issue or not.

Note that this cannot be fixed; no one should have done umem without
ODP like mlx5. For this to work properly you need sane hardware like
mlx5.

Cheers,
Jérôme
