From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrey Grodzovsky
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for
 signaled process.
Date: Tue, 22 May 2018 11:49:02 -0400
References: <87muxsbmkp.fsf@xmission.com>
 <8840ac96-50c4-f94d-eb7c-f007940163f3@amd.com>
 <877eowa5qh.fsf@xmission.com>
 <20180425135552.GD7592@redhat.com>
 <20180425171757.GA10441@redhat.com>
 <874ljyu98e.fsf@xmission.com>
 <20180430160006.GB10583@redhat.com>
 <79b2ce10-2cd7-b6f2-551e-0b4ae21072af@amd.com>
 <28de0150-0a31-f51a-4f56-0a71f741e07e@amd.com>
 <3ff3a5f4-c109-bf86-2772-9d88abc419df@amd.com>
 <662c84bf-ac38-db28-1a11-b17719c9b8d0@daenzer.net>
 <12c806f9-f283-5bed-d137-7719ba73205a@daenzer.net>
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------46AF11A61433C2D9CAB4AF13"
Received: from NAM03-DM3-obe.outbound.protection.outlook.com
 (mail-dm3nam03on0058.outbound.protection.outlook.com [104.47.41.58])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 3CAAD6E2D5
 for ; Tue, 22 May 2018 15:49:11 +0000 (UTC)
In-Reply-To: <12c806f9-f283-5bed-d137-7719ba73205a@daenzer.net>
Content-Language: en-US
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: =?UTF-8?Q?Michel_D=c3=a4nzer?=
Cc: "Koenig, Christian", ML dri-devel
List-Id: dri-devel@lists.freedesktop.org

This is a multi-part message in MIME format.
--------------46AF11A61433C2D9CAB4AF13
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

On 05/18/2018 04:46 AM, Michel Dänzer wrote:
> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>> Hi Michel and others, I am trying to implement the approach below to
>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>> exit.
>>>>
>>>> I noticed that once I implemented the file_operations.flush callback,
>>>> then during a run of X I see the flush callback get called not only
>>>> for the Xorg process but for other processes such as 'xkbcomp' and
>>>> even 'sh'. It seems like Xorg passes its FDs to children. Christian
>>>> mentioned he remembered a discussion to always set the FD_CLOEXEC
>>>> flag when opening the hardware device file, so we suspect a bug in
>>>> Xorg with regard to this behavior.
>>> Try the libdrm patch below.
>>>
>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>> Tried it, didn't help. I still see other processes calling .flush for
>> /dev/dri/card0
> Try the attached xserver patch on top. With these patches, I no longer
> see any DRM file descriptors being opened without O_CLOEXEC running Xorg
> -pogo in strace.

Thanks for the patch. Unfortunately this is my first time building Xorg
from source and I hit some blockers with dependencies. I wonder if you
could quickly apply the attached small patch to amdgpu and run xinit from
the command line. If the FD is no longer passed, you will only see Xorg
printed in dmesg afterwards; otherwise 'sh' and 'xkbcomp' will also get
printed.

Andrey

>
> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
> callback being called from multiple processes is an issue, maybe the
> flush callback isn't appropriate after all.

--------------46AF11A61433C2D9CAB4AF13
Content-Type: text/x-patch;
 name="test_flush.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="test_flush.patch"

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b0bf2f2..1f63712 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -855,9 +855,18 @@ static const struct dev_pm_ops amdgpu_pm_ops = {
 	.runtime_idle = amdgpu_pmops_runtime_idle,
 };
 
+static int amdgpu_flush(struct file *f, fl_owner_t id) {
+
+	DRM_ERROR("%s\n", current->comm);
+
+	return 0;
+}
+
+
 static const struct file_operations amdgpu_driver_kms_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
+	.flush = amdgpu_flush,
 	.release = drm_release,
 	.unlocked_ioctl = amdgpu_drm_ioctl,
 	.mmap = amdgpu_mmap,

--------------46AF11A61433C2D9CAB4AF13--