From: George Dunlap <dunlapg@umich.edu>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: "Jan Beulich" <jbeulich@suse.com>,
	"Juergen Gross" <jgross@suse.de>,
	"Dario Faggioli" <dfaggioli@suse.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	"Jürgen Groß" <jgross@suse.com>
Subject: Re: [Xen-devel] Xen crash after S3 suspend - Xen 4.13 and newer
Date: Fri, 14 Oct 2022 17:42:40 +0100
Message-ID: <CAFLBxZZUknp80CQx9rhQhD=hzqV+eOR5Me=eyU1sJUHMaMUybg@mail.gmail.com>
In-Reply-To: <YymUZCfLZRWl6xr5@mail-itl>

On Tue, Sep 20, 2022 at 11:23 AM Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> wrote:

>
> I have two (non-exclusive) ideas here:
> 1. If old_cpu is actually still available, do not move it at all.
> 2. Use sched_migrate() instead of sched_set_res().
>

Other possibilities:

3.  Make sure that svc->rqd is set to NULL when the affinity is broken.

Currently, on vcpu creation, sched_init_vcpu() expects to set the pcpu; and
it looks like for credit2, svc->rqd may not be set until the first time
the unit is woken up (that's the 'if' part of the 'if/else' clause whose
'else' contains the ASSERT() you're hitting).  If we set the runqueues to
NULL when we break the CPU affinity on suspend, then on wake the unit
would "take" the runqueue assigned by restore_vcpu_affinity().
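
For reference, the wake path in question looks roughly like this --
paraphrased from memory, so helper names and signatures may not match the
current tree exactly:

/* Rough shape of csched2_unit_wake() in xen/common/sched/credit2.c;
 * paraphrased, not the exact code. */
static void csched2_unit_wake(const struct scheduler *ops,
                              struct sched_unit *unit)
{
    struct csched2_unit *svc = csched2_unit(unit);

    /* ... early-exit and delayed-runq-add handling elided ... */

    /* Add into the new runqueue if necessary */
    if ( svc->rqd == NULL )
        runq_assign(ops, unit);  /* First wakeup: bind to this pcpu's runqueue */
    else
        /* Otherwise the recorded runqueue must match the unit's current
         * pcpu -- this is the ASSERT() that fires after resume. */
        ASSERT(c2rqd(ops, sched_unit_master(unit)) == svc->rqd);

    /* ... runq_insert() / runq_tickle() ... */
}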

4. Make csched2_unit_wake() tolerant of pcpus changing under its feet.

#3 would potentially make things more robust, but would require adding some
sort of callback to notify schedulers that affinity had been broken.  ATM
this might only be used by credit2.
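
Concretely, something like the following -- the hook name and the call site
are entirely hypothetical, but runq_deassign() already exists in credit2 to
do the heavy lifting, if I remember the helpers right:

/* HYPOTHETICAL: a new per-scheduler hook, invoked from the suspend /
 * cpu-offline path once a unit's affinity has been broken. */
static void csched2_unit_affinity_broken(const struct scheduler *ops,
                                         struct sched_unit *unit)
{
    struct csched2_unit *svc = csched2_unit(unit);

    /*
     * Drop the stale runqueue binding.  The next csched2_unit_wake()
     * then takes the "svc->rqd == NULL" path above and binds to the
     * runqueue implied by whatever resource restore_vcpu_affinity()
     * assigned.
     */
    if ( svc->rqd != NULL )
        runq_deassign(ops, unit);
}

Schedulers that don't cache a runqueue binding could simply leave the hook
NULL.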

#4 would potentially be dangerous: some other bit of credit2 code may
assume that svc->rqd is still valid.


> Here is the patch that fixes it for me:
> ---8<---
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 83455fbde1c8..dcf202d8b307 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1246,19 +1246,29 @@ void restore_vcpu_affinity(struct domain *d)
>              }
>          }
>
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> +        /* Prefer old cpu if available. */
> +        if ( cpumask_test_cpu(old_cpu, cpumask_scratch_cpu(cpu)) )
> +            res = get_sched_res(old_cpu);
> +        else
> +            res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
>          sched_set_res(unit, res);
>
>          spin_unlock_irq(lock);
>
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> +        /*
> +         * If different cpu was chosen, it was random, let scheduler do
> +         * proper decision.
> +         */
>          if ( old_cpu != sched_unit_master(unit) )
> +        {
> +            /* v->processor might have changed, so reacquire the lock. */
> +            lock = unit_schedule_lock_irq(unit);
> +            res = sched_pick_resource(unit_scheduler(unit), unit);
> +            sched_migrate(unit_scheduler(unit), unit, res->master_cpu);
> +            spin_unlock_irq(lock);
> +
>              sched_move_irqs(unit);
> +        }
>      }
>
>      rcu_read_unlock(&sched_res_rculock);
> ---8<---
>
> I have several doubts here:
>
> 1. If old_cpu is available, is sched_set_res() needed at all?
> 2. Should both calls be changed to sched_migrate()? Currently I changed
>    only the second one, in case the scheduler could be confused about
>    old_cpu not being available anymore.
> 3. Are there any extra locking requirements for sched_migrate() at this
>    stage? The long comment above sched_unit_migrate_start() suggests
>    there might be, but I'm not sure if that's really the case during
>    resume.
> 4. Related to the above - should thaw_domains() be modified to call
>    restore_vcpu_affinity() for all domains first, and unpause only
>    later? That could reduce locking requirements, I guess.
>

Unfortunately this code has had a lot of churn since the last time I really
engaged with it; I'm going to have to come back to this on Monday.

Jürgen / Dario, any thoughts?

 -George

