From: Alejandro Lucero <alejandro.lucero@netronome.com>
To: "Walker, Benjamin" <benjamin.walker@intel.com>
Cc: "thomas@monjalon.net" <thomas@monjalon.net>,
	"Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Tan, Jianfeng" <jianfeng.tan@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: Running DPDK as an unprivileged user
Date: Tue, 28 Nov 2017 14:16:14 +0000	[thread overview]
Message-ID: <CAD+H991d38o01tw4v32GunnLznUy=8y-DMKGWTwGG8+EDrmjKg@mail.gmail.com> (raw)
In-Reply-To: <1511805495.2692.82.camel@intel.com>

On Mon, Nov 27, 2017 at 5:58 PM, Walker, Benjamin <benjamin.walker@intel.com> wrote:

> On Sun, 2017-11-05 at 01:17 +0100, Thomas Monjalon wrote:
> > Hi, restarting an old topic,
> >
> > 05/01/2017 16:52, Tan, Jianfeng:
> > > On 1/5/2017 5:34 AM, Walker, Benjamin wrote:
> > > > > > Note that this
> > > > > > probably means that using uio on recent kernels is subtly
> > > > > > broken and cannot be supported going forward because there
> > > > > > is no uio mechanism to pin the memory.
> > > > > >
> > > > > > The first open question I have is whether DPDK should allow
> > > > > > uio at all on recent (4.x) kernels. My current understanding
> > > > > > is that there is no way to pin memory and hugepages can now
> > > > > > be moved around, so uio would be unsafe. What does the
> > > > > > community think here?
> > >
> > > Back to this question, removing uio support in DPDK seems a little
> > > overkill to me. Can we just document it instead? For example, first
> > > warn users not to invoke migrate_pages() or move_pages() on a DPDK
> > > process; and for the kcompactd daemon and the other cases (such as
> > > compaction triggered by alloc_pages()), could we just recommend
> > > disabling CONFIG_COMPACTION?
> >
> > We really need to better document the limitations of UIO.
> > May we have some suggestions here?
> >
> > > On another side, how does vfio pin that memory? Through memlock (from
> > > code in vfio_pin_pages())? So why not just mlock those hugepages?
> >
> > Good question. Why not mlock the hugepages?
>
> mlock just guarantees that a virtual page is always backed by *some*
> physical page of memory. It does not guarantee that over the lifetime of
> the process a virtual page is mapped to the *same* physical page. The
> kernel is free to transparently move memory around, compress it, dedupe
> it, etc.
>
> vfio is not pinning the memory, but instead is using the IOMMU (a piece
> of hardware) to participate in the memory management on the platform. If
> a device begins a DMA transfer to an I/O virtual address, the IOMMU will
> coordinate with the main MMU to make sure that the data ends up in the
> correct location, even as the virtual to physical mappings are being
> modified.
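(As an aside on the mlock point quoted above, here is a minimal sketch of
my own, not code from DPDK or the kernel: mlock() keeps the page resident,
but the physical frame number read back from /proc/self/pagemap is not
guaranteed to stay the same, so an address handed to a device for DMA can
go stale. It assumes a reserved 2MB hugepage and must run as root, since
unprivileged readers of pagemap see PFN 0.)

/* Sketch only: mlock() keeps the page resident, but the physical frame
 * number (PFN) from /proc/self/pagemap may still change later.
 * Error handling trimmed for brevity. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static uint64_t pfn_of(void *vaddr)
{
    uint64_t entry = 0;
    long psz = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);

    /* One 8-byte entry per virtual page; bits 0-54 hold the PFN. */
    pread(fd, &entry, sizeof(entry),
          ((uintptr_t)vaddr / psz) * sizeof(entry));
    close(fd);
    return entry & ((1ULL << 55) - 1);
}

int main(void)
{
    size_t len = 2UL * 1024 * 1024;   /* one 2MB hugepage, if reserved */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
        return 1;

    mlock(p, len);                    /* page is resident ...           */
    *(volatile char *)p = 1;          /* fault it in                    */
    printf("PFN now: 0x%llx\n", (unsigned long long)pfn_of(p));
    /* ... but nothing here prevents the kernel from later migrating the
     * page, so a device doing DMA to this PFN could hit the wrong memory. */
    return 0;
}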


This last comment about vfio not pinning the memory confused me, because in
your first email you said VFIO did the page pinning. I have been looking at
the kernel code, and the VFIO driver does pin the pages, at least in the
iommu type 1 backend.
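
For reference, this is roughly the user-space side of that path; a sketch
only, with the container/group/device setup omitted and container_fd
assumed to be an already configured VFIO container. The VFIO_IOMMU_MAP_DMA
ioctl is what makes the type 1 driver pin the backing pages (the
vfio_pin_pages() code mentioned above) for as long as the mapping exists:

/* Sketch only: user-space ioctl whose kernel path reaches the page
 * pinning in the VFIO type 1 IOMMU backend. */
#include <linux/vfio.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

static int map_for_dma(int container_fd, void *vaddr, size_t len,
                       uint64_t iova)
{
    struct vfio_iommu_type1_dma_map map;

    memset(&map, 0, sizeof(map));
    map.argsz = sizeof(map);
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (uintptr_t)vaddr;
    map.iova  = iova;
    map.size  = len;

    /* While this mapping exists, the backing pages stay pinned, so the
     * IOVA-to-physical translation remains valid for device DMA. */
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}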

I can see a problem with adding the same support to UIO, because it implies
a device doing DMA while being programmed from user space, which is
something the UIO maintainer is against. But since vfio-noiommu mode was
implemented precisely for this case, I guess the pinning could be added to
the VFIO driver instead. That still does not solve the problem for software
not using vfio, though.

Apart from improving the UIO documentation for use with DPDK, maybe some
sort of check could be done, with DPDK requiring an explicit parameter so
the user is aware of the potential risk when UIO is used and kernel page
migration is enabled. I am not sure whether that last condition can easily
be detected from user space.
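
One rough heuristic that might work (an assumption on my part, not
something DPDK does today): /proc/sys/vm/compact_memory is only registered
when the kernel is built with CONFIG_COMPACTION, so its presence could
serve as the hint for such a warning:

/* Rough heuristic, not an existing DPDK check: if the compaction knob is
 * present, the kernel can migrate/compact pages, which is the risky case
 * for hugepages mapped through UIO. */
#include <stdio.h>
#include <unistd.h>

int uio_migration_risk(void)
{
    if (access("/proc/sys/vm/compact_memory", F_OK) == 0) {
        fprintf(stderr,
                "warning: kernel page compaction is enabled; hugepages "
                "mapped through UIO may be migrated while a device is "
                "doing DMA to them\n");
        return 1;
    }
    return 0;
}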

On another note, we suffered a similar problem when VMs were using SR-IOV
and memory ballooning. The IOMMU mappings for the ballooned-out memory were
removed, but the kernel inside the VM did not get any notification, and the
device ended up performing invalid DMA operations.
