From: "Chen, Xiaoguang"
To: Alex Williamson
Cc: Gerd Hoffmann, "Tian, Kevin", linux-kernel@vger.kernel.org, zhenyuw@linux.intel.com, "Lv, Zhiyuan", intel-gvt-dev@lists.freedesktop.org, "Wang, Zhi A"
Subject: RE: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
Date: Thu, 18 May 2017 01:51:38 +0000
Hi Alex,

>-----Original Message-----
>From: Alex Williamson [mailto:alex.williamson@redhat.com]
>Sent: Thursday, May 18, 2017 5:44 AM
>To: Chen, Xiaoguang
>Cc: Gerd Hoffmann; Tian, Kevin; linux-kernel@vger.kernel.org;
>zhenyuw@linux.intel.com; Lv, Zhiyuan; intel-gvt-dev@lists.freedesktop.org;
>Wang, Zhi A
>Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
>
>On Tue, 16 May 2017 10:16:28 +0000 "Chen, Xiaoguang" wrote:
>
>> Hi Alex,
>>
>> >-----Original Message-----
>> >From: Alex Williamson [mailto:alex.williamson@redhat.com]
>> >Sent: Tuesday, May 16, 2017 1:44 AM
>> >To: Chen, Xiaoguang
>> >Cc: Gerd Hoffmann; Tian, Kevin; intel-gfx@lists.freedesktop.org;
>> >linux-kernel@vger.kernel.org; zhenyuw@linux.intel.com; Lv, Zhiyuan;
>> >intel-gvt-dev@lists.freedesktop.org; Wang, Zhi A
>> >Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the
>> >dmabuf
>> >
>> >On Mon, 15 May 2017 03:36:50 +0000 "Chen, Xiaoguang" wrote:
>> >
>> >> Hi Alex and Gerd,
>> >>
>> >> >-----Original Message-----
>> >> >From: Alex Williamson [mailto:alex.williamson@redhat.com]
>> >> >Sent: Saturday, May 13, 2017 12:38 AM
>> >> >To: Gerd Hoffmann
>> >> >Cc: Chen, Xiaoguang; Tian, Kevin; intel-gfx@lists.freedesktop.org;
>> >> >linux-kernel@vger.kernel.org; zhenyuw@linux.intel.com; Lv, Zhiyuan;
>> >> >intel-gvt-dev@lists.freedesktop.org; Wang, Zhi A
>> >> >Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting
>> >> >the dmabuf
>> >> >
>> >> >On Fri, 12 May 2017 11:12:05 +0200 Gerd Hoffmann wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> > If the contents of the framebuffer change or if the parameters
>> >> >> > of the framebuffer change? I can't imagine that creating a new
>> >> >> > dmabuf fd for every visual change within the framebuffer would
>> >> >> > be efficient, but I don't have any concept of what a dmabuf
>> >> >> > actually does.
>> >> >>
>> >> >> Ok, some background:
>> >> >>
>> >> >> The drm subsystem has the concept of planes. The most important
>> >> >> plane is the primary framebuffer (i.e. what gets scanned out to
>> >> >> the physical display). The cursor is a plane too, and there can
>> >> >> be additional overlay planes for stuff like video playback.
>> >> >>
>> >> >> Typically there are multiple planes in a system and only one of
>> >> >> them gets scanned out to the crtc, i.e. the fbdev emulation
>> >> >> creates one plane for the framebuffer console. The X-Server
>> >> >> creates a plane too, and when you switch between X-Server and
>> >> >> framebuffer console via ctrl-alt-fn the intel driver just
>> >> >> reprograms the encoder to scan out one or the other plane to
>> >> >> the crtc.
>> >> >>
>> >> >> The dma-buf handed out by gvt is a reference to a plane. I
>> >> >> think on the host side gvt can see only the active plane (from
>> >> >> encoder/crtc register programming), not the inactive ones.
>> >> >>
>> >> >> The dma-buf can be imported as an opengl texture and then be
>> >> >> used to render the guest display to a host window. I think it
>> >> >> is even possible to use the dma-buf as a plane in the host drm
>> >> >> driver and scan it out directly to a physical display. The
>> >> >> actual framebuffer content stays in gpu memory all the time,
>> >> >> the cpu never has to touch it.
>> >> >>
>> >> >> It is possible to cache the dma-buf handles, i.e. when the guest
>> >> >> boots you'll get the first for the fbcon plane, when the
>> >> >> x-server starts the second for the x-server framebuffer, and
>> >> >> when the user switches to the text console via ctrl-alt-fn you
>> >> >> can re-use the fbcon dma-buf you already have.
>> >> >>
>> >> >> The caching becomes more important for good performance when the
>> >> >> guest uses pageflipping (wayland does): define two planes,
>> >> >> render into one while displaying the other, then flip the two
>> >> >> for an atomic display update.
>> >> >>
>> >> >> The caching also makes it a bit difficult to create a good
>> >> >> interface. So, the current patch set creates:
>> >> >>
>> >> >> (a) A way to query the active planes (ioctl
>> >> >> INTEL_VGPU_QUERY_DMABUF, added by patch 5/6 of this series).
>> >> >> (b) A way to create a dma-buf for the active plane (ioctl
>> >> >> INTEL_VGPU_GENERATE_DMABUF).
>> >> >>
>> >> >> Typical userspace workflow is to first query the plane, then
>> >> >> check if it already has a dma-buf for it, and if not create one.
>> >> >
>> >> >Thank you!  This is immensely helpful!
>> >> >
>> >> >> > What changes to the framebuffer require a new dmabuf fd?
>> >> >> > Shouldn't the user query the parameters of the framebuffer
>> >> >> > through a dmabuf fd, and shouldn't the dmabuf fd have some
>> >> >> > signaling mechanism to the user (eventfd perhaps) to notify
>> >> >> > the user to re-evaluate the parameters?
>> >> >>
>> >> >> dma-bufs don't support that, they are really just a handle to a
>> >> >> piece of memory; all metadata (format, size) must be
>> >> >> communicated by other means.
>> >> >>
>> >> >> > Otherwise are you imagining that the user polls the vfio region?
>> >> >>
>> >> >> Hmm, notification support would probably be a good reason to
>> >> >> have a separate file handle to manage the dma-bufs (instead of
>> >> >> using driver-specific ioctls on the vfio fd), because the driver
>> >> >> could also use the management fd for notifications then.
>> >> >
>> >> >I like this idea of a separate control fd for dmabufs, it provides
>> >> >not only a central management point, but also a nice abstraction
>> >> >for the vfio device specific interface.  We potentially only need
>> >> >a single VFIO_DEVICE_GET_DMABUF_MGR_FD() ioctl to get a dmabuf
>> >> >management fd (perhaps with a type parameter, ex.
>> >> >GFX) where maybe we could have vfio-core incorporate this
>> >> >reference into the group lifecycle, so the vendor driver only
>> >> >needs to fdget/put this manager fd for the various plane dmabuf
>> >> >fds spawned in order to get core-level reference counting.
>> >> Following is my understanding of the management fd idea:
>> >> 1) QEMU will call the VFIO_DEVICE_GET_DMABUF_MGR_FD() ioctl to
>> >> create a fd and save the fd in the vfio group while initializing
>> >> vfio.
>> >
>> >Ideally there'd be kernel work here too if we want vfio-core to
>> >incorporate the lifecycle of this fd into the device/group/container
>> >lifecycle.  Maybe we even want to generalize it further to something
>> >like VFIO_DEVICE_GET_FD which takes a parameter of what type of FD
>> >to get, GFX_DMABUF_MGR_FD in this case.  vfio-core would probably
>> >allocate the fd, tap into the release hook for reference counting,
>> >and pass it to the vfio_device_ops (mdev vendor driver in this case)
>> >to attach further.
>> I tried to implement this today and it now works functionally.
>> I am still a little confused about how to tap the fd into the release
>> hook of the device/group/container.
>> I tried to create the fd in vfio core but found it is difficult to
>> get the file operations for the fd; the file operations should be
>> supplied by vendor drivers.
>
>I'm not fully convinced there's benefit to having vfio-core attempt to
>do this, I was just hoping to avoid each vendor driver needing to
>implement their own reference counting.  I don't think vfio-core wants
>to get into tracking each ioctl for each vendor specific fd type to
>know which create new references and which don't.  Perhaps we'll come
>up with easier ways to do this as we go.

Got it.

>
>> So the fd is created in kvmgt.c for now.
>> Below is part of the code:
>>
>> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c
>> b/drivers/gpu/drm/i915/gvt/kvmgt.c
>> index 389f072..d0649ba 100644
>> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
>> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
>> @@ -41,6 +41,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include "i915_drv.h"
>>  #include "gvt.h"
>> @@ -524,6 +525,63 @@ static int intel_vgpu_reg_init_opregion(struct intel_vgpu *vgpu)
>>  	return ret;
>>  }
>>
>> +static int intel_vgpu_dmabuf_mgr_fd_mmap(struct file *file,
>> +		struct vm_area_struct *vma)
>> +{
>> +	WARN_ON(1);
>
>A user can abuse this, simply return an error.

OK.

>
>> +
>> +	return 0;
>> +}
>> +
>> +static int intel_vgpu_dmabuf_mgr_fd_release(struct inode *inode,
>> +		struct file *filp)
>> +{
>> +	struct intel_vgpu *vgpu = filp->private_data;
>> +
>> +	if (vgpu->vdev.vfio_device != NULL)
>> +		vfio_device_put(vgpu->vdev.vfio_device);
>
>When does the case occur where we don't have a vfio_device?  This
>looks a bit like a warning flag that reference counting isn't handled
>properly.

This situation happens only when the anonymous fd has been created
successfully but an error occurs while trying to get the vfio_device.
We should return an error when user space tries to create the
management fd in that case, and print an error message when the kernel
releases this fd.
>
>> +
>> +	return 0;
>> +}
>> +
>> +static long intel_vgpu_dmabuf_mgr_fd_ioctl(struct file *filp,
>> +		unsigned int ioctl, unsigned long arg)
>> +{
>> +	struct intel_vgpu *vgpu = filp->private_data;
>> +	int minsz;
>> +	struct intel_vgpu_dmabuf dmabuf;
>> +	int ret;
>> +	struct fd f;
>> +
>> +	minsz = offsetofend(struct intel_vgpu_dmabuf, tiled);
>> +	if (copy_from_user(&dmabuf, (void __user *)arg, minsz))
>> +		return -EFAULT;
>> +
>> +	f = fdget(dmabuf.fd);
>> +	if (ioctl == INTEL_VGPU_QUERY_DMABUF)
>> +		ret = intel_gvt_ops->vgpu_query_dmabuf(vgpu, &dmabuf);
>> +	else if (ioctl == INTEL_VGPU_GENERATE_DMABUF)
>> +		ret = intel_gvt_ops->vgpu_generate_dmabuf(vgpu, &dmabuf);
>
>Why do we need vendor specific ioctls here?  Aren't querying the
>current plane and getting an fd for that plane very generic concepts?
>Is the resulting dmabuf Intel specific?

No, not Intel specific. Like Gerd said, "Typical userspace workflow is
to first query the plane, then check if it already has a dma-buf for
it, and if not create one".
We first query the plane info (WITHOUT creating a fd). User space
needs to check whether there is already a dmabuf for the plane (user
space usually caches two or three dmabufs to handle double-buffer or
triple-buffer situations); only if there is no dmabuf for the plane do
we create one (another ioctl).

>
>> +	else {
>> +		fdput(f);
>> +		gvt_vgpu_err("unsupported dmabuf operation\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (ret != 0) {
>> +		fdput(f);
>> +		gvt_vgpu_err("gvt-g get dmabuf failed:%d\n", ret);
>> +		return -EINVAL;
>> +	}
>> +	fdput(f);
>> +
>> +	return copy_to_user((void __user *)arg, &dmabuf, minsz) ?
>> +		-EFAULT : 0;
>> +}
>> +
>> +static const struct file_operations intel_vgpu_dmabuf_mgr_fd_ops = {
>> +	.release	= intel_vgpu_dmabuf_mgr_fd_release,
>> +	.unlocked_ioctl	= intel_vgpu_dmabuf_mgr_fd_ioctl,
>> +	.mmap		= intel_vgpu_dmabuf_mgr_fd_mmap,
>> +	.llseek		= noop_llseek,
>> +};
>> +
>>  static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev)
>>  {
>>  	struct intel_vgpu *vgpu = NULL;
>> @@ -1259,6 +1317,31 @@ static long intel_vgpu_ioctl(struct mdev_device *mdev, unsigned int cmd,
>>  	} else if (cmd == VFIO_DEVICE_RESET) {
>>  		intel_gvt_ops->vgpu_reset(vgpu);
>>  		return 0;
>> +	} else if (cmd == VFIO_DEVICE_GET_FD) {
>> +		struct vfio_fd vfio_fd;
>> +		int fd;
>> +		struct vfio_device *device;
>> +
>> +		minsz = offsetofend(struct vfio_fd, fd);
>> +		if (copy_from_user(&vfio_fd, (void __user *)arg, minsz))
>> +			return -EFAULT;
>> +
>> +		if (vfio_fd.argsz < minsz)
>> +			return -EINVAL;
>> +
>> +		fd = anon_inode_getfd("vfio_dmabuf_mgr_fd",
>> +				&intel_vgpu_dmabuf_mgr_fd_ops,
>> +				vgpu, O_RDWR | O_CLOEXEC);
>> +		if (fd < 0)
>> +			return fd;
>> +
>> +		vfio_fd.fd = fd;
>> +		device = vfio_device_get_from_dev(mdev_dev(mdev));
>> +		if (device == NULL)
>> +			gvt_vgpu_err("kvmgt: vfio device is null\n");
>> +		else
>> +			vgpu->vdev.vfio_device = device;
>> +
>> +		return copy_to_user((void __user *)arg, &vfio_fd, minsz) ?
>> +			-EFAULT : 0;
>>  	}
>>
>>  	return 0;
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 519eff3..98be2e0 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -484,6 +485,20 @@ struct vfio_pci_hot_reset {
>>
>>  #define VFIO_DEVICE_PCI_HOT_RESET	_IO(VFIO_TYPE, VFIO_BASE + 13)
>>
>> +/**
>> + * VFIO_DEVICE_GET_FD - _IO(VFIO_TYPE, VFIO_BASE + 14, struct vfio_fd)
>> + *
>> + * Create a fd for a vfio device.
>> + * This fd can be used for various purposes.
>> + */
>> +struct vfio_fd {
>> +	__u32 argsz;
>> +	__u32 flags;
>> +	/* out */
>> +	__u32 fd;
>> +};
>> +#define VFIO_DEVICE_GET_FD	_IO(VFIO_TYPE, VFIO_BASE + 14)


The idea was that we pass some sort of type to VFIO_DEVICE_GET_FD, for
instance we might ask for a DEVICE_FD_GRAPHICS_DMABUF and the vfio bus
driver (mdev vendor driver) would test whether it supports that type
of thing and either return an fd or error.  We can return the fd the
same way we do for VFIO_GROUP_GET_DEVICE_FD.  For instance the user
should do something like:

	dmabuf_fd = ioctl(device_fd,
			  VFIO_DEVICE_GET_FD, DEVICE_FD_GRAPHICS_DMABUF);
	if (dmabuf_fd < 0)
		/* not supported... */
	else
		/* do stuff */

OK. Got it.

>
>Thanks,
>Alex