From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 107403] Quadratic behavior due to leaking fence contexts in reservation objects
Date: Fri, 27 Jul 2018 12:20:54 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=107403

Bug ID: 107403
Summary: Quadratic behavior due to leaking fence contexts in reservation objects
Product: DRI
Version: XOrg git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: bas@basnieuwenhuizen.nl

As part of the Vulkan CTS, radv creates about 30k AMDGPU contexts (though only
about 1-20 are live at the same time).

Each of those creates a bunch of fence contexts, one for each ring, to use for
fences created from submitted jobs.

However, as part of running jobs, fences from those contexts get attached to
vm->root.base.bo->tbo.resv of the corresponding VM. Since they never get
removed, at some point we have tens of thousands of fences attached to it.
A fence is only ever replaced by a later fence from the same fence context,
so fences from destroyed contexts stay attached forever.
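
To make the accumulation concrete, here is a toy Python model of the behaviour
described above. It is not the kernel code: `Fence`, `ToyResv` and
`run_contexts` are hypothetical names, and for simplicity each AMDGPU context
is assumed to own a single fence context instead of one per ring. The only
rule taken from the report is that a new fence replaces an older one from the
same fence context, and nothing else is ever dropped.

```python
from dataclasses import dataclass


@dataclass
class Fence:
    context: int            # fence context id (simplified: one per AMDGPU context)
    seqno: int              # increases with every job submitted on that context
    signaled: bool = False


class ToyResv:
    """Toy stand-in for the shared-fence list of a reservation object."""

    def __init__(self):
        self.shared = {}    # fence context id -> latest fence from that context

    def add_shared_fence(self, fence):
        # Only an older fence from the same context is ever replaced.
        self.shared[fence.context] = fence


def run_contexts(resv, n_contexts, jobs_per_context):
    """Create n_contexts short-lived contexts, each submitting a few jobs."""
    for ctx in range(n_contexts):
        for seqno in range(jobs_per_context):
            fence = Fence(context=ctx, seqno=seqno)
            resv.add_shared_fence(fence)
            fence.signaled = True   # the job finishes, the fence stays attached
        # The context is destroyed here; nothing removes its last fence.


resv = ToyResv()
run_contexts(resv, n_contexts=30_000, jobs_per_context=3)
print(len(resv.shared))     # number of fences still attached to the resv
```

In this model every destroyed context leaves exactly one signaled fence
behind, so the list grows with the total number of contexts ever created
rather than with the 1-20 that are live at any time.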

Then in amdgpu_gem_va_ioctl -> amdgpu_vm_clear_freed ->
amdgpu_vm_bo_update_mapping we do an amdgpu_sync_resv, which tries to add all
of those fences to an amdgpu_sync object. That object only has a 16-entry
hashtable, so adding the fences to it results in quadratic behavior.
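
A rough way to see the quadratic cost is to count comparisons in a toy chained
hash table. The sketch below is an illustration in Python, not the amdgpu_sync
implementation: `count_probes` is a hypothetical helper, and it assumes that
each insert walks its bucket's chain looking for an entry with the same fence
context before appending, with 16 buckets as stated in the report.

```python
def count_probes(n_fences, n_buckets=16):
    """Count chain comparisons needed to insert n_fences fences, each from a
    distinct fence context, into a fixed-size chained hash table."""
    buckets = [[] for _ in range(n_buckets)]
    probes = 0
    for context in range(n_fences):
        chain = buckets[context % n_buckets]
        # Walk the chain looking for an existing entry with the same context.
        for existing in chain:
            probes += 1
            if existing == context:
                break
        else:
            # No match found: the chain grows by one entry.
            chain.append(context)
    return probes


for n in (1_600, 3_200, 6_400):
    print(n, count_probes(n))
```

With a fixed number of buckets each chain ends up holding about n/16 entries,
so the total number of comparisons grows roughly as n^2/32: doubling the
number of fences roughly quadruples the work.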

Combine this with the sparse buffer tests at the end, which do lots of VA
operations, and this results in tests taking 20+ minutes.

So I could reduce the number of amdgpu contexts a bit in radv, but the bigger
issue in my opinion is that we are pretty much leaking and never reclaiming
the fences.

Any idea how to best remove some signalled fences?

