From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 20 May 2021 16:11:07 +0200
From: Daniel Vetter
To: "Nieto, David M"
Cc: Tvrtko Ursulin, Simon Ser, "nouveau@lists.freedesktop.org", Intel Graphics Development, Daniel Stone, Maling list - DRI developers, Daniel Vetter, "Koenig, Christian"
Subject: Re: [Nouveau] [Intel-gfx] [PATCH 0/7] Per client engine busyness
References: <7f8fc38a-cd25-aa1f-fa2d-5d3334edb3d2@linux.intel.com> <71428a10-4b2f-dbbf-7678-7487f9eda6a5@linux.intel.com>
List-Id: Nouveau development list

On Wed, May 19, 2021 at 11:17:24PM +0000, Nieto, David M wrote:
> [AMD Official Use Only]
>
> Parsing over 550 processes for fdinfo is taking between 40-100ms single
> threaded in a 2GHz skylake IBRS within a VM using simple string
> comparisons and DIRent parsing. And that is pretty much the worst case
> scenario with some more optimized implementations.

I think this is plenty ok, and if it's not you could probably make this
massively faster with io_uring for all the fs operations and whack a
parser-generator on top for real parsing speed.

So imo we shouldn't worry about algorithmic inefficiency of the fdinfo
approach at all, and focus more on trying to reasonably (but not too
much, this is still drm render stuff after all) standardize how it works
and how we'll extend it all. I think there's tons of good suggestions in
this thread on this topic already.

/me out
-Daniel

>
> David
> ________________________________
> From: Daniel Vetter
> Sent: Wednesday, May 19, 2021 11:23 AM
> To: Tvrtko Ursulin
> Cc: Daniel Stone; jhubbard@nvidia.com; nouveau@lists.freedesktop.org; Intel Graphics Development; Maling list - DRI developers; Simon Ser; Koenig, Christian; aritger@nvidia.com; Nieto, David M
> Subject: Re: [Intel-gfx] [PATCH 0/7] Per client engine busyness
>
> On Wed, May 19, 2021 at 6:16 PM Tvrtko Ursulin wrote:
> >
> >
> > On 18/05/2021 10:40, Tvrtko Ursulin wrote:
> > >
> > > On 18/05/2021 10:16, Daniel Stone wrote:
> > >> Hi,
> > >>
> > >> On Tue, 18 May 2021 at 10:09, Tvrtko Ursulin wrote:
> > >>> I was just wondering if stat(2) and a chrdev major check would be a
> > >>> solid criteria to more efficiently (compared to parsing the text
> > >>> content) detect drm files while walking procfs.
> > >>
> > >> Maybe I'm missing something, but is the per-PID walk actually a
> > >> measurable performance issue rather than just a bit unpleasant?
> > >
> > > Per pid and per each open fd.
> > >
> > > As said in the other thread what bothers me a bit in this scheme is that
> > > the cost of obtaining GPU usage scales based on non-GPU criteria.
> > >
> > > For use case of a top-like tool which shows all processes this is a
> > > smaller additional cost, but then for a gpu-top like tool it is somewhat
> > > higher.
> >
> > To further expand, not only cost would scale per pid multiplies per open
> > fd, but to detect which of the fds are DRM I see these three options:
> >
> > 1) Open and parse fdinfo.
> > 2) Name based matching ie /dev/dri/.. something.
> > 3) Stat the symlink target and check for DRM major.
>
> stat with symlink following should be plenty fast.
>
> > All sound quite sub-optimal to me.
> >
> > Name based matching is probably the least evil on system resource usage
> > (Keeping the dentry cache too hot? Too many syscalls?), even though
> > fundamentally I don't think it is the right approach.
> >
> > What happens with dup(2) is another question.
>
> We need benchmark numbers showing that on anything remotely realistic
> it's an actual problem. Until we've demonstrated it's a real problem
> we don't need to solve it.
>
> E.g. top with any sorting enabled also parses way more than it
> displays on every update. It seems to be doing Just Fine (tm).
>
> > Does anyone have any feedback on the /proc//gpu idea at all?
>
> When we know we have a problem to solve we can take a look at solutions.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch/

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
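
Illustrative note: option 3 from the quoted list (stat the symlink target and check for the DRM major) could be sketched roughly as below. This is only a sketch under stated assumptions, not code from the patch series. The function names and the proc_root parameter are hypothetical, Python is used purely for brevity, and 226 is the character-device major number Linux assigns to DRM. Reading /proc/PID/fd for processes owned by other users normally needs extra privileges, so unreadable entries are skipped.

```python
import os
import stat

DRM_MAJOR = 226  # Linux character-device major number assigned to DRM


def is_drm_node(mode: int, rdev: int) -> bool:
    """True if (st_mode, st_rdev) describe a DRM character device."""
    return stat.S_ISCHR(mode) and os.major(rdev) == DRM_MAJOR


def is_drm_fd(fd_path: str) -> bool:
    """stat(2) one /proc/PID/fd/N entry, following the symlink."""
    try:
        st = os.stat(fd_path)  # os.stat follows symlinks by default
    except OSError:
        return False  # fd closed, process gone, or no permission
    return is_drm_node(st.st_mode, st.st_rdev)


def iter_drm_fds(proc_root: str = "/proc"):
    """Yield (pid, fd) for every DRM fd the caller is allowed to see.

    proc_root is a hypothetical parameter so the walk can be pointed
    at a fake tree when testing.
    """
    with os.scandir(proc_root) as entries:
        for entry in entries:
            if not entry.name.isdigit():
                continue  # not a PID directory
            fd_dir = os.path.join(entry.path, "fd")
            try:
                fds = os.listdir(fd_dir)
            except OSError:
                continue  # process exited or fd dir not readable
            for fd in fds:
                if is_drm_fd(os.path.join(fd_dir, fd)):
                    yield int(entry.name), int(fd)


if __name__ == "__main__":
    for pid, fd in iter_drm_fds():
        print(pid, fd)
```

The cost is one stat(2) per open fd with no file contents read, which is the trade-off the thread weighs against opening and parsing every fdinfo file. Descriptors shared through dup(2) would still show up once per fd number, so this sketch does not address that open question.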