From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.9 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02957C10DCE for ; Wed, 18 Mar 2020 19:42:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BBF4120752 for ; Wed, 18 Mar 2020 19:42:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=riseup.net header.i=@riseup.net header.b="hXhiz5aJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726631AbgCRTmK (ORCPT ); Wed, 18 Mar 2020 15:42:10 -0400 Received: from mx1.riseup.net ([198.252.153.129]:48168 "EHLO mx1.riseup.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726619AbgCRTmK (ORCPT ); Wed, 18 Mar 2020 15:42:10 -0400 Received: from capuchin.riseup.net (unknown [10.0.1.176]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS id 48jL5s3WBmzFfYM; Wed, 18 Mar 2020 12:42:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak; t=1584560529; bh=a5nYf4uJw7aYCyEYlZi0u83zqkdSfNAtDLlEBJVc2Ro=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=hXhiz5aJJY9j4uFI3UeyxV2cfoXs3OaO2P/Rb7QdZa1xS3dZLAIdURebHeezemFIP j6gkLQ8yZp7M5wwrXX5V9MUr+HX7snsQpmjo7xb6lbqGs5HuKRduZwZPySU11Ag07w RP4Esq6hlHJWO6GkAMmcQdEVp8c5g4Y1xXugR0yE= X-Riseup-User-ID: 6DF06374A87D2B38EFD187F6AD1B7F97D48C7066E73C4B6BA964BE507E2E536A Received: from [127.0.0.1] (localhost [127.0.0.1]) by capuchin.riseup.net (Postfix) with ESMTPSA id 48jL5s027dz8wXk; Wed, 18 Mar 2020 12:42:08 -0700 (PDT) From: Francisco Jerez To: chris.p.wilson@intel.com, intel-gfx@lists.freedesktop.org, linux-pm@vger.kernel.org Cc: Peter Zijlstra , "Rafael J. Wysocki" , "Pandruvada\, Srinivas" Subject: Re: [Intel-gfx] [PATCH 02/10] drm/i915: Adjust PM QoS response frequency based on GPU load. In-Reply-To: <87r1xzafwn.fsf@riseup.net> References: <20200310214203.26459-1-currojerez@riseup.net> <20200310214203.26459-3-currojerez@riseup.net> <158387916218.28297.4489489879582782488@build.alporthouse.com> <87r1xzafwn.fsf@riseup.net> Date: Wed, 18 Mar 2020 12:42:11 -0700 Message-ID: <87k13h78mk.fsf@riseup.net> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="==-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org --==-=-= Content-Type: multipart/mixed; boundary="=-=-=" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Francisco Jerez writes: > Chris Wilson writes: > >> Quoting Francisco Jerez (2020-03-10 21:41:55) >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915= /gt/intel_lrc.c >>> index b9b3f78f1324..a5d7a80b826d 100644 >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c >>> @@ -1577,6 +1577,11 @@ static void execlists_submit_ports(struct intel_= engine_cs *engine) >>> /* we need to manually load the submit queue */ >>> if (execlists->ctrl_reg) >>> writel(EL_CTRL_LOAD, execlists->ctrl_reg); >>> + >>> + if (execlists_num_ports(execlists) > 1 && >> pending[1] is always defined, the minimum submission is one slot, with >> pending[1] as the sentinel NULL. >> >>> + execlists->pending[1] && >>> + !atomic_xchg(&execlists->overload, 1)) >>> + intel_gt_pm_active_begin(&engine->i915->gt); >> >> engine->gt >> > > Applied your suggestions above locally, will probably wait to have a few > more changes batched up before sending a v2. > >>> } >>>=20=20 >>> static bool ctx_single_port_submission(const struct intel_context *ce) >>> @@ -2213,6 +2218,12 @@ cancel_port_requests(struct intel_engine_execlis= ts * const execlists) >>> clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight= )); >>>=20=20 >>> WRITE_ONCE(execlists->active, execlists->inflight); >>> + >>> + if (atomic_xchg(&execlists->overload, 0)) { >>> + struct intel_engine_cs *engine =3D >>> + container_of(execlists, typeof(*engine), execli= sts); >>> + intel_gt_pm_active_end(&engine->i915->gt); >>> + } >>> } >>>=20=20 >>> static inline void >>> @@ -2386,6 +2397,9 @@ static void process_csb(struct intel_engine_cs *e= ngine) >>> /* port0 completed, advanced to port1 */ >>> trace_ports(execlists, "completed", execlists->= active); >>>=20=20 >>> + if (atomic_xchg(&execlists->overload, 0)) >>> + intel_gt_pm_active_end(&engine->i915->g= t); >> >> So this looses track if we preempt a dual-ELSP submission with a >> single-ELSP submission (and never go back to dual). >> > > Yes, good point. You're right that if a dual-ELSP submission gets > preempted by a single-ELSP submission "overload" will remain signaled > until the first completion interrupt arrives (e.g. from the preempting > submission). > >> If you move this to the end of the loop and check >> >> if (!execlists->active[1] && atomic_xchg(&execlists->overload, 0)) >> intel_gt_pm_active_end(engine->gt); >> >> so that it covers both preemption/promotion and completion. >> > > That sounds reasonable. > >> However, that will fluctuate quite rapidly. (And runs the risk of >> exceeding the sentinel.) >> >> An alternative approach would be to couple along >> schedule_in/schedule_out >> >> atomic_set(overload, -1); >> >> __execlists_schedule_in: >> if (!atomic_fetch_inc(overload) >> intel_gt_pm_active_begin(engine->gt); >> __execlists_schedule_out: >> if (!atomic_dec_return(overload) >> intel_gt_pm_active_end(engine->gt); >> >> which would mean we are overloaded as soon as we try to submit an >> overlapping ELSP. >> > > That sounds good to me too, and AFAICT would have roughly the same > behavior as this metric except for the preemption corner case you > mention above. I'll try this and verify that I get approximately the > same performance numbers. > This suggestion seems to lead to some minor regressions, I'm investigating the issue. Will send a v2 as soon as I have something along the lines of what you suggested running with equivalent performance to v1. --=-=-=-- --==-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iHUEAREIAB0WIQST8OekYz69PM20/4aDmTidfVK/WwUCXnJ5kwAKCRCDmTidfVK/ W8RGAP0fnP5bgq8VI2HwdNgG26U1KneF9BgkFiRqan1wgKlaTgD/W7IGl6KnT8Xh OeVjlFfQFrRzOU86JZfmyCb7sXAtNFk= =JAxw -----END PGP SIGNATURE----- --==-=-=--