All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Teres Alexis, Alan Previn" <alan.previn.teres.alexis@intel.com>
To: "tvrtko.ursulin@linux.intel.com" <tvrtko.ursulin@linux.intel.com>,
	"Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis
Date: Tue, 20 Sep 2022 06:22:48 +0000	[thread overview]
Message-ID: <d69e9fd9b3df695779adc2cb94b96bf44ff59878.camel@intel.com> (raw)
In-Reply-To: <c6e968c8-aaa0-949c-61c9-857ef5d9f617@intel.com>


On Fri, 2022-09-16 at 08:36 -0700, Ceraolo Spurio, Daniele wrote:
> 
> On 9/16/2022 1:58 AM, Tvrtko Ursulin wrote:
> > On 16/09/2022 08:53, Teres Alexis, Alan Previn wrote:
> > > On Thu, 2022-09-15 at 09:48 +0100, Tvrtko Ursulin wrote:
> > > > On 15/09/2022 03:12, Alan Previn wrote:
> > > > > From: Matthew Brost <matthew.brost@intel.com>
> > > > > 
> > > > > +static void guc_context_sched_disable(struct intel_context *ce)
> > > > > +{
> > > > > +    struct intel_guc *guc = ce_to_guc(ce);
> > > > > +    u64 delay = guc->submission_state.sched_disable_delay_ms;
> > > > > +    unsigned long flags;
> > > > > +
> > > > > +    spin_lock_irqsave(&ce->guc_state.lock, flags);
> > > > > +
> > > > > +    if (bypass_sched_disable(guc, ce)) {
> > > > > +        spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > +        intel_context_sched_disable_unpin(ce);
> > > > > +    } else if (!intel_context_is_closed(ce) && 
> > > > > !guc_id_pressure(guc, ce) &&
> > > > > +           delay) {
> > > > > spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > -        goto unpin;
> > > > > +        mod_delayed_work(system_unbound_wq,
> > > > > +                 &ce->guc_state.sched_disable_delay,
> > > > > +                 msecs_to_jiffies(delay));
> > > > > +    } else {
> > > > > +        do_sched_disable(guc, ce, flags);
> > > > >        }
> > > > 
> > > > lock
> > > > if
> > > >     unlock
> > > >     do stuff
> > > > else if
> > > >     unlock
> > > >     do stuff
> > > > else
> > > >     do_sched_disable - which unlocks inside
> > > > 
> > > > IMO it creates less readable code for the benefit of not repeating
> > > > with_intel_runtime_pm -> __guc_context_sched_disable two times. Dunno..
> > > > it's ugly but I have no suggestions. Hm does it have to send using the
> > > > busy loop? It couldn't just queue the request and then wait for 
> > > > reply if
> > > > disable message was emitted?
> > > > 
> > > I agree that the above code lacks readability - will see if i can 
> > > break it down to smaller
> > > functions with cleaner in-function lock/unlock pairs. I agree that a 
> > > little code duplication
> > > is better than less readable code. It was inherited code i didn't 
> > > want to modify but I'll
> > > go ahead and refactor this.
> > > 
> > > On the busy loop - im assuming you are refering to the actual ct 
> > > sending. I'll consult my
> > > team if i am missing anything more but based on comments, I believe 
> > > callers must use that
> > > function to guarantee reservation of space in the G2H CTB to always 
> > > have space to capture
> > > responses for actions that MUST be acknowledged from GuC 
> > > (acknowledged by either replying
> > > with a success or failure). This is necessary for coherent guc-id 
> > > state machine (because the
> > > GuC firmware will drop requests for guc-id's that are not registered 
> > > or not in a
> > > 'sched-enabled' state).
> > 
> > Maybe to explain what I meant a bit better. I thought that the reason 
> > for strange unlock pattern is the with_runtime_pm which needs to 
> > happen for the CT send/receive loop. What I meant was would it be 
> > possible to do it like this:
> > 
Alan: I assumed the strange lock-unlock was for no other reason than to ensure that
once the delayed task begins executing, we dont let go of the lock until after
we complete the call prep_context_pending_disable (which modifies the state-machine
for the context's guc-id). And as we expect, the runtime_pm is about trying to send
the H2G signal to the GuC which touches register. (i.e. lock was on ce-gucid state
and power was for guc-interaction via ct). If my assumption are correct, then I
can refactor the functions a bit to remove that weird unlock flow while keeping the
lock held from start until prep_context_pending_disable, else i could
still go ahead and do that (with some minimal duplication).


> > state lock
> > ...
> > sent = queue_msg_2_guc (this would be i915 only, no runtime pm needed)
> > ...
> > state unlock
> > 
> > if (sent)
> >   with_runtime_pm:
> >      send/receive queued guc messages (only this would talk to guc)
> > 
> > But I have truly no idea if the CT messaging infrastructure supports 
> > something like this.
> > 
Alan: I'll look around and see if we have something that provides that flow
but i don't think I should pursue such a redesign as part of this series
if none exists.


> > Anyway, see what it is possible and if it is not or too hard for now 
> > leave it be.
> > 
> > > > > -    guc_id = prep_context_pending_disable(ce);
> > > > > +}
> > > > >    -    spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > > > +static void guc_flush_all_delayed_disable_sched_contexts(struct 
> > > > > intel_guc *guc)
> > > > > +{
> > > > > +    struct intel_context *ce;
> > > > > +    unsigned long index;
> > > > > +    unsigned long flags;
> > > > > +    unsigned long ceflags;
> > > > >    -    with_intel_runtime_pm(runtime_pm, wakeref)
> > > > > -        __guc_context_sched_disable(guc, ce, guc_id);
> > > > > +    xa_lock_irqsave(&guc->context_lookup, flags);
> > > > > +    xa_for_each(&guc->context_lookup, index, ce) {
> > > > > +        if (!kref_get_unless_zero(&ce->ref))
> > > > > +            continue;
> > > > > +        xa_unlock(&guc->context_lookup);
> > > > 
> > > > So this whole loop _needs_ to run with interrupts disabled? Explaining
> > > > why in a comment would be good.
> > > > 
> > > Being mid-reset, the locking mode is consistent with other functions 
> > > also used
> > > as part of the reset preparation that parses and potentially modifies 
> > > contexts.
> > > I believe the goal is to handle all of this parsing without getting 
> > > conflicting
> > > latent G2H replies that breaks the preparation flow (that herds 
> > > active contexts
> > > into a fewer set of states as part of reset) - but i will double check
> > > with my colleagues.
> > > 
> > > > > +        if (test_bit(CONTEXT_GUC_INIT, &ce->flags) &&
> > > > > + cancel_delayed_work(&ce->guc_state.sched_disable_delay)) {
> > > > > +            spin_lock_irqsave(&ce->guc_state.lock, ceflags);
> > > > > + spin_unlock_irqrestore(&ce->guc_state.lock, ceflags);
> > > > 
> > > > This deserves a comment about what lock toggling wants to ensure.
> > > > 
> > > I realize this might have been my local rebasing mistake, the 
> > > intention was to
> > > handle cases where sched_disable_delay wasn't pending but potentially 
> > > still
> > > executing do_sched_disable. I believe I could try 
> > > cancel_delayed_work_sync (but
> > > not sure if i can call that might-sleep funtion mid reset while not-
> > > interruptible). Else, i would move that lock-unlock to if the
> > > cancel_delayed_work did not return true (as per original intent 
> > > before my
> > > rebase error).
> > 
> > Okay a comment like "flush any currently executing do_sched_disable" 
> > above the lock toggling would do then. Even better if you add what 
> > happens if that flushing isn't done.
> > 
got it. will do.


> > > > Also, if the loops runs with interrupts disabled what is the point of
> > > > irqsave variant in here??
> > > Yes - its redundant, let me fix that, apologies for that.
> > > 
> > > > Also2, what is the reason for dropping the lock? intel_context_put?
> > > Being consistent with other reset preparation code that closes contexts,
> > > the lock is dropped before the intel_context_put.
> > > (I hope i am not misunderstanding your question).
> > 
> > Right, okay.. so for locking inversion issues - intel_context_put must 
> > not be called with guc_state.lock held, yes?
> > 
> > Not a mandatory request, but if you want you could look into the 
> > option of avoiding lock dropping and instead doing xa_erase and 
> > recording the list of contexts for sched disable or put in a local 
> > list, and doing them all in one block outside the locked/irq disabled 
> > section. Although tbh I am not sure if that would improve anything. 
> > Probably not in this case of a reset path, but if there are other 
> > places in GuC code which do this it may be beneficial for less 
> > hammering on the CPU and core synchronisation for atomics.
> > 
> > > > > +    /*
> > > > > +     * If the context gets closed while the execbuf is ongoing, 
> > > > > the context
> > > > > +     * close code will race with the below code to cancel the 
> > > > > delayed work.
> > > > > +     * If the context close wins the race and cancels the work, it 
> > > > > will
> > > > > +     * immediately call the sched disable (see guc_context_close), 
> > > > > so there
> > > > > +     * is a chance we can get past this check while the 
> > > > > sched_disable code
> > > > > +     * is being executed. To make sure that code completes before 
> > > > > we check
> > > > > +     * the status further down, we wait for the close process to 
> > > > > complete.
> > > > > +     */
> > > > > +    if (cancel_delayed_work_sync(&ce->guc_state.sched_disable_delay))
> > > > > +        intel_context_sched_disable_unpin(ce);
> > > > > +    else if (intel_context_is_closed(ce))
> > > > > +        wait_for(context_close_done(ce), 1);
> > > > 
> > > > Comment makes it sounds important to handle the race, althought it
> > > > doesn't really explain the consequences. But most importantly, what if
> > > > close doesn't complete in 1ms?
> > > 
> > > will add the consequence (i believe the consequence is that we could 
> > > prematurely
> > > read context flags bit indicating its gucid is still registered and 
> > > after skipping
> > > re-registration, find that context gets closed and guc-id freed).
> > > 
> > > Yes the 1 second is arbitrary and unnervingly short. Just spent 
> > > sometime trying to
> > 
> > One millisecond even. :)
> 
hehe - typo :)

> Normally 1ms is not a slow time for this. We can only hit the wait if 
> the context_close call has already called the cancel_delayed_work, so 
> the only thing left to do in that function is to send the H2G, which is 
> relatively fast. However, I've just realized that if the H2G buffer is 
> full the H2G will stall, so in that case it can take longer (maximum 
> stall time before a hang is declared is 1.5s).
> 
Daniele, so increase to 1500 milisecs?

> > What would be the consequence of prematurely reading the still 
> > registered context flag? Execbuf fails? GuC hangs? Kernel crashes? 
> > Something else?
> 
> i915 thinks that a request pending with the GuC, while GuC thinks we 
> disabled it (because the completion of the delayed_disable happens after 
> the new request has been submitted). The heartbeat will still go 
> through, so no reset will be triggered, the request is just lost. A new 
> request submission on the same context should be able to recover it, but 
> we didn't test that.
> 
> 
> > And why cant' this race with context close happen at any other point 
> > than this particular spot in guc_request_alloc? Immediately after the 
> > added checks? After atomic_add_unless?
> 
> The race is tied to the canceling of the delayed work. context_close 
> only does something if it cancels the delayed work, so if the 
> cancel_delayed_work_sync here does the cancelling then context_close is 
> a no-op.
> 
> > > figure out portions of the SCHED_foo state machine bits and believe 
> > > that its possible
> > > for guc_request_alloc to just force context_close to be done from 
> > > here as it would
> > > force it into a state requiring re-registration and would close that 
> > > a few lines
> > > below. I will however verify with my team mates as i am new to these 
> > > SCHED_foo state
> > > machine bits.
> 
> I'm not sure we want to force context_close unconditionally here, that'd 
> be a big overhead. Doing it only if there is a close pending is also 
> potentially an issue, the whole point is that the close can race in. The 
> close also has to work on its own, because in the normal use-cases we 
> don't get a context_close while a request submission is ongoing.
> Unless you meant something different and I completely misunderstood.
> 
> Daniele
> 
> > Yes it always looked complicated and perhaps it has even grown more so 
> > - I'm afraid I cannot be of any help there.
> > 
> > Regards,
> > 
> > Tvrtko


  reply	other threads:[~2022-09-20  6:22 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15  2:12 [Intel-gfx] [PATCH 0/1] Delay disabling GuC scheduling of an idle context Alan Previn
2022-09-15  2:12 ` [Intel-gfx] [PATCH 1/1] drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis Alan Previn
2022-09-15  8:48   ` Tvrtko Ursulin
2022-09-16  7:53     ` Teres Alexis, Alan Previn
2022-09-16  8:58       ` Tvrtko Ursulin
2022-09-16 15:36         ` Ceraolo Spurio, Daniele
2022-09-20  6:22           ` Teres Alexis, Alan Previn [this message]
2022-09-20 14:33           ` Tvrtko Ursulin
2022-09-21  7:59           ` Teres Alexis, Alan Previn
2022-09-21 22:30           ` Teres Alexis, Alan Previn
2022-09-21 22:30         ` Teres Alexis, Alan Previn
2022-09-15  3:22 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Delay disabling GuC scheduling of an idle context Patchwork
2022-09-15  3:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-09-15 20:41 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d69e9fd9b3df695779adc2cb94b96bf44ff59878.camel@intel.com \
    --to=alan.previn.teres.alexis@intel.com \
    --cc=daniele.ceraolospurio@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.