From: Marek Olšák
Date: Sat, 1 May 2021 18:27:18 -0400
Subject: Re: [Mesa-dev] [RFC] Linux Graphics Next: Explicit fences everywhere and no BO fences - initial proposal
To: Michel Dänzer
Cc: Christian König, dri-devel, ML Mesa-dev
List-Id: Direct Rendering Infrastructure - Development

On Wed, Apr 28, 2021 at 5:07 AM Michel Dänzer <michel@daenzer.net> wrote:

> On 2021-04-28 8:59 a.m., Christian König wrote:
> > Hi Dave,
> >
> > On 27.04.21 at 21:23, Marek Olšák wrote:
> >> Supporting interop with any device is always possible. It depends on which drivers we need to interoperate with and update them. We've already found the path forward for amdgpu. We just need to find out how many other drivers need to be updated and evaluate the cost/benefit aspect.
> >>
> >> Marek
> >>
> >> On Tue, Apr 27, 2021 at 2:38 PM Dave Airlie <airlied@gmail.com> wrote:
> >>
> >>     On Tue, 27 Apr 2021 at 22:06, Christian König
> >>     <ckoenig.leichtzumerken@gmail.com> wrote:
> >>     >
> >>     > Correct, we wouldn't have synchronization between devices with and without user queues any more.
> >>     >
> >>     > That could only be a problem for A+I Laptops.
> >>
> >>     Since I think you mentioned you'd only be enabling this on newer
> >>     chipsets, won't it be a problem for A+A where one A is a generation
> >>     behind the other?
> >>
> >
> > Crap, that is a good point as well.
> >
> >>
> >>     I'm not really liking where this is going btw, seems like an ill
> >>     thought out concept, if AMD is really going down the road of designing
> >>     hw that is currently Linux incompatible, you are going to have to
> >>     accept a big part of the burden in bringing this support in to more
> >>     than just amd drivers for upcoming generations of gpu.
> >>
> >
> > Well we don't really like that either, but we have no other option as far as I can see.
>
> I don't really understand what "future hw may remove support for kernel queues" means exactly. While the per-context queues can be mapped to userspace directly, they don't *have* to be, do they? I.e. the kernel driver should be able to either intercept userspace access to the queues, or in the worst case do it all itself, and provide the existing synchronization semantics as needed?
>
> Surely there are resource limits for the per-context queues, so the kernel driver needs to do some kind of virtualization / multi-plexing anyway, or we'll get sad user faces when there's no queue available for <current hot game>.
>
> I'm probably missing something though, awaiting enlightenment. :)

The hw interface for userspace is that the ring buffer is mapped into the process address space alongside a doorbell aperture (a 4K page). The doorbell isn't real memory: when the CPU writes into it, the write tells the hw scheduler that there are new GPU commands in the ring buffer. Userspace inserts all the wait, draw, and signal commands into the ring buffer and then "rings" the doorbell. It's my understanding that the ring buffer and the doorbell are always mapped in the same GPU address space as the process, which makes it very difficult to emulate the current protected ring buffers in the kernel. The VMID of the ring buffer is also not changeable.
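
To make the submission flow above concrete, here is a toy Python model of it. It is only a sketch, not driver code: the names (UserQueue, emit, ring_doorbell, hw_fetch, submit_frame) are made up for illustration, and the idea that the doorbell write carries the new write pointer is an assumption of this model, not something stated above.

```python
from dataclasses import dataclass, field


@dataclass
class UserQueue:
    """Toy user-mode queue: a ring buffer plus a doorbell (all names hypothetical)."""
    size: int = 8
    ring: list = field(default_factory=list)
    wptr: int = 0        # next slot userspace will write
    doorbell: int = 0    # last write pointer published to the "hw" (assumed semantics)
    rptr: int = 0        # next slot the "hw" will read

    def __post_init__(self):
        self.ring = [None] * self.size

    def emit(self, packet):
        """Userspace writes a packet into the ring; the hw does not see it yet."""
        if self.wptr - self.rptr >= self.size:
            raise BufferError("ring buffer full")
        self.ring[self.wptr % self.size] = packet
        self.wptr += 1

    def ring_doorbell(self):
        """Publish the write pointer; stands in for the CPU write to the doorbell page."""
        self.doorbell = self.wptr

    def hw_fetch(self):
        """Packets the "hw" would consume: only those below the doorbell value."""
        fetched = []
        while self.rptr < self.doorbell:
            fetched.append(self.ring[self.rptr % self.size])
            self.rptr += 1
        return fetched


def submit_frame(queue):
    """Userspace inserts wait, draw, and signal packets itself, then rings the doorbell."""
    queue.emit(("wait", "fence_in"))
    queue.emit(("draw", "frame"))
    queue.emit(("signal", "fence_out"))
    before = queue.hw_fetch()   # nothing is visible before the doorbell write
    queue.ring_doorbell()
    after = queue.hw_fetch()
    return before, after
```

In this model the kernel never appears on the submission path: userspace writes the wait and signal packets itself, so there is no point where a kernel driver could insert its own.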

The hw scheduler doesn't do any synchronization and doesn't see any dependencies. It only chooses which queue to execute, so it's really just a simple queue manager that handles the virtualization aspect and not much else.
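
As a rough illustration of "just a queue manager", the sketch below picks the next non-empty queue and never looks at what the packets mean. The round-robin policy and the names ToyHwScheduler, pick, and run are assumptions for the example; the real hw policy is not described here.

```python
from collections import deque


class ToyHwScheduler:
    """Round-robin queue picker (the policy is an assumption of this sketch)."""

    def __init__(self, queues):
        # queues: dict of queue name -> deque of pending packets
        self.queues = queues
        self.order = deque(queues)

    def pick(self):
        """Return the name of the next non-empty queue, or None if all are idle.

        The packets themselves are never inspected: a queue whose head is a
        "wait" is just as eligible as one whose head is a "draw".
        """
        for _ in range(len(self.order)):
            name = self.order[0]
            self.order.rotate(-1)
            if self.queues[name]:
                return name
        return None


def run(scheduler, steps):
    """Pop one packet from each picked queue and record the execution order."""
    trace = []
    for _ in range(steps):
        name = scheduler.pick()
        if name is None:
            break
        trace.append((name, scheduler.queues[name].popleft()))
    return trace
```

The scheduler here only multiplexes queues. Any ordering between a "signal" in one queue and a "wait" in another has to come from the packets userspace wrote, not from the scheduler.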

Marek