On Tue, Sep 14, 2021 at 10:53 AM Chia-I Wu wrote:

> On Mon, Sep 13, 2021 at 6:57 PM Gurchetan Singh wrote:
> >
> > On Mon, Sep 13, 2021 at 11:52 AM Chia-I Wu wrote:
> >>
> >> On Mon, Sep 13, 2021 at 10:48 AM Gurchetan Singh wrote:
> >> >
> >> > On Fri, Sep 10, 2021 at 12:33 PM Chia-I Wu wrote:
> >> >>
> >> >> On Wed, Sep 8, 2021 at 6:37 PM Gurchetan Singh wrote:
> >> >> >
> >> >> > We don't want fences from different 3D contexts (virgl, gfxstream,
> >> >> > venus) to be on the same timeline.  With explicit context creation,
> >> >> > we can specify the number of rings each context wants.
> >> >> >
> >> >> > Execbuffer can specify which ring to use.
> >> >> >
> >> >> > Signed-off-by: Gurchetan Singh
> >> >> > Acked-by: Lingfeng Yang
> >> >> > ---
> >> >> >  drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 +++
> >> >> >  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 ++++++++++++++++++++++++--
> >> >> >  2 files changed, 35 insertions(+), 2 deletions(-)
> >> >> >
> >> >> > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> >> >> > index a5142d60c2fa..cca9ab505deb 100644
> >> >> > --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> >> >> > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> >> >> > @@ -56,6 +56,7 @@
> >> >> >  #define STATE_ERR 2
> >> >> >
> >> >> >  #define MAX_CAPSET_ID 63
> >> >> > +#define MAX_RINGS 64
> >> >> >
> >> >> >  struct virtio_gpu_object_params {
> >> >> >  	unsigned long size;
> >> >> > @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv {
> >> >> >  	uint32_t ctx_id;
> >> >> >  	uint32_t context_init;
> >> >> >  	bool context_created;
> >> >> > +	uint32_t num_rings;
> >> >> > +	uint64_t base_fence_ctx;
> >> >> >  	struct mutex context_lock;
> >> >> >  };
> >> >> >
> >> >> > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> >> >> > index f51f3393a194..262f79210283 100644
> >> >> > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> >> >> > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> >> >> > @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> >> >> >  	int in_fence_fd = exbuf->fence_fd;
> >> >> >  	int out_fence_fd = -1;
> >> >> >  	void *buf;
> >> >> > +	uint64_t fence_ctx;
> >> >> > +	uint32_t ring_idx;
> >> >> > +
> >> >> > +	fence_ctx = vgdev->fence_drv.context;
> >> >> > +	ring_idx = 0;
> >> >> >
> >> >> >  	if (vgdev->has_virgl_3d == false)
> >> >> >  		return -ENOSYS;
> >> >> > @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> >> >> >  	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
> >> >> >  		return -EINVAL;
> >> >> >
> >> >> > +	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
> >> >> > +		if (exbuf->ring_idx >= vfpriv->num_rings)
> >> >> > +			return -EINVAL;
> >> >> > +
> >> >> > +		if (!vfpriv->base_fence_ctx)
> >> >> > +			return -EINVAL;
> >> >> > +
> >> >> > +		fence_ctx = vfpriv->base_fence_ctx;
> >> >> > +		ring_idx = exbuf->ring_idx;
> >> >> > +	}
> >> >> > +
> >> >> >  	exbuf->fence_fd = -1;
> >> >> >
> >> >> >  	virtio_gpu_create_context(dev, file);
> >> >> > @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> >> >> >  			goto out_memdup;
> >> >> >  	}
> >> >> >
> >> >> > -	out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
> >> >> > +	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
> >> >> >  	if(!out_fence) {
> >> >> >  		ret = -ENOMEM;
> >> >> >  		goto out_unresv;
> >> >> > @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
> >> >> >  		return -EINVAL;
> >> >> >
> >> >> >  	/* Number of unique parameters supported at this time. */
> >> >> > -	if (num_params > 1)
> >> >> > +	if (num_params > 2)
> >> >> >  		return -EINVAL;
> >> >> >
> >> >> >  	ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params),
> >> >> > @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
> >> >> >
> >> >> >  			vfpriv->context_init |= value;
> >> >> >  			break;
> >> >> > +		case VIRTGPU_CONTEXT_PARAM_NUM_RINGS:
> >> >> > +			if (vfpriv->base_fence_ctx) {
> >> >> > +				ret = -EINVAL;
> >> >> > +				goto out_unlock;
> >> >> > +			}
> >> >> > +
> >> >> > +			if (value > MAX_RINGS) {
> >> >> > +				ret = -EINVAL;
> >> >> > +				goto out_unlock;
> >> >> > +			}
> >> >> > +
> >> >> > +			vfpriv->base_fence_ctx = dma_fence_context_alloc(value);
> >> >>
> >> >> With multiple fence contexts, we should do something about implicit fencing.
> >> >>
> >> >> The classic example is Mesa and X server.  When both use virgl and the
> >> >> global fence context, no dma_fence_wait is fine.  But when Mesa uses
> >> >> venus and the ring fence context, dma_fence_wait should be inserted.
> >> >
> >> > If I read your comment correctly, the use case is:
> >> >
> >> > context A (venus)
> >> >
> >> > sharing a render target with
> >> >
> >> > context B (Xserver backed virgl)
> >> >
> >> > ?
> >> >
> >> > In which function do you envisage dma_fence_wait(...) to be inserted?  Doesn't implicit synchronization mean there's no fence to share between contexts (only buffer objects)?
> >>
> >> Fences can be implicitly shared via reservation objects associated
> >> with buffer objects.
> >>
> >> > It may be possible to wait on the reservation object associated with a buffer object from a different context (userspace can also do DRM_IOCTL_VIRTGPU_WAIT), but not sure if that's what you're looking for.
> >>
> >> Right, that's what I am looking for.  Userspace expects implicit
> >> fencing to work.
> >> While there is work underway to move userspace to
> >> explicit fencing, it is not there yet in general and we can't require
> >> the userspace to do explicit fencing or DRM_IOCTL_VIRTGPU_WAIT.
> >
> > Another option would be to use the upcoming DMA_BUF_IOCTL_EXPORT_SYNC_FILE + VIRTGPU_EXECBUF_FENCE_FD_IN (which checks the dma_fence context).
>
> That requires the X server / compositors to be modified.  For example,
> venus works under Android (where there is explicit fencing) or under a
> modified compositor (which does DMA_BUF_IOCTL_EXPORT_SYNC_FILE or
> DRM_IOCTL_VIRTGPU_WAIT).  But it does not work too well with an
> unmodified X server.

Some semi-recent virgl modifications will be needed regardless for interop, such as VIRGL_CAP_V2_UNTYPED_RESOURCE (?).  Not sure; there aren't too many virgl users (mostly developers).

Does Xserver just pick up the latest Mesa release (including virgl/venus)?  Suppose context types land in 5.16 and the userspace changes (both Venus/Virgl) land in 21.2 stable releases.

https://docs.mesa3d.org/release-calendar.html

> > Generally, if it only requires virgl changes, userspace changes are fine since OpenGL drivers implement implicit sync in many ways.  Waiting on the reservation object in the kernel is fine too though.
>
> I don't think we want to assume virgl to be the only consumer of
> dma-bufs, despite that it is the most common use case.
>
> > Though venus doesn't use the NUM_RINGS param yet.  Getting all permutations of context type + display integration working would take some time (patchset mostly tested with wayland + gfxstream/Android [no implicit sync]).
> >
> > WDYT of someone figuring out virgl/venus interop later, independently of this patchset?
>
> I think we should understand the implications of multiple fence
> contexts better, even if some changes are not included in this
> patchset.
>
> From my view, we don't need implicit fencing in most cases and
> implicit fencing should be considered a legacy path.
> But X server / compositors today happen to require it.  Other drivers
> seem to use a flag to control whether implicit fences are set up or
> waited on (e.g., AMDGPU_GEM_CREATE_EXPLICIT_SYNC, MSM_SUBMIT_NO_IMPLICIT,
> or EXEC_OBJECT_WRITE).  It seems to be the least surprising thing to do.

IMO, the easiest way is just to limit the change to userspace if possible, since implicit sync is legacy/something we want to deprecate over time.

Another option is to add something like VIRTGPU_EXECBUF_EXPLICIT_SYNC (similar to MSM_SUBMIT_NO_IMPLICIT), where the reservation objects are waited on / added to without that flag.  Since explicit sync will need new hypercalls/params and is a major change, that feature is expected to be independent of context types.

With that option, waiting on the reservation object would just be another bug fix + addition to 5.16 (perhaps by you), so we can proceed in parallel faster.  VIRTGPU_EXECBUF_EXPLICIT_SYNC (or an equivalent) would be added later.

> >> >> > +			vfpriv->num_rings = value;
> >> >> > +			break;
> >> >> >  		default:
> >> >> >  			ret = -EINVAL;
> >> >> >  			goto out_unlock;
> >> >> > --
> >> >> > 2.33.0.153.gba50c8fa24-goog
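
For reference, the ring selection that the quoted patch adds can be modeled in plain userspace C. This is only a sketch for following the discussion, not the driver code: the `model_*` names and the `MODEL_*` constants are hypothetical stand-ins (the flag value in particular is illustrative), and `new_base_ctx` stands in for whatever `dma_fence_context_alloc()` would return in the kernel.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define MODEL_MAX_RINGS 64u
#define MODEL_EXECBUF_RING_IDX 0x04u

struct model_fpriv {
	uint32_t num_rings;      /* set once by the NUM_RINGS context param */
	uint64_t base_fence_ctx; /* 0 until rings have been configured */
};

/* Models the NUM_RINGS case: settable once, at most MODEL_MAX_RINGS. */
static int model_set_num_rings(struct model_fpriv *fp, uint64_t value,
			       uint64_t new_base_ctx)
{
	if (fp->base_fence_ctx)
		return -EINVAL;
	if (value > MODEL_MAX_RINGS)
		return -EINVAL;

	fp->base_fence_ctx = new_base_ctx;
	fp->num_rings = (uint32_t)value;
	return 0;
}

/*
 * Models the execbuffer check: default to the global fence context and
 * ring 0, and switch to the per-context base only when the ring flag is
 * set and the requested ring index is valid.
 */
static int model_pick_fence_ctx(const struct model_fpriv *fp, uint32_t flags,
				uint32_t ring_idx, uint64_t global_ctx,
				uint64_t *fence_ctx, uint32_t *ring_out)
{
	*fence_ctx = global_ctx;
	*ring_out = 0;

	if (flags & MODEL_EXECBUF_RING_IDX) {
		if (ring_idx >= fp->num_rings)
			return -EINVAL;
		if (!fp->base_fence_ctx)
			return -EINVAL;

		*fence_ctx = fp->base_fence_ctx;
		*ring_out = ring_idx;
	}
	return 0;
}
```

The point relevant to the implicit-fencing question is visible in the second function: submissions without the ring flag stay on the single global timeline, while submissions with it land on a per-context timeline, so fences from the two paths are no longer ordered relative to each other by construction.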