qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-ppc@nongnu.org, Greg Kurz <groug@kaod.org>, qemu-devel@nongnu.org
Subject: Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter
Date: Thu, 3 Oct 2019 08:37:41 +1000	[thread overview]
Message-ID: <20191002223741.GC11105@umbus.fritz.box> (raw)
In-Reply-To: <3db4ae61-a4f0-3ddf-e734-5795c7176559@kaod.org>

[-- Attachment #1: Type: text/plain, Size: 5023 bytes --]

On Wed, Oct 02, 2019 at 04:47:56PM +0200, Cédric Le Goater wrote:
> On 02/10/2019 16:21, Greg Kurz wrote:
> > On Wed, 2 Oct 2019 11:02:45 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> >> On Tue, Oct 01, 2019 at 06:56:28PM +0200, Greg Kurz wrote:
> >>> On Tue, 1 Oct 2019 13:56:10 +0200
> >>> Cédric Le Goater <clg@kaod.org> wrote:
> >>>
> >>>> On 01/10/2019 13:06, Greg Kurz wrote:
> >>>>> On Tue,  1 Oct 2019 10:57:22 +0200
> >>>>> Cédric Le Goater <clg@kaod.org> wrote:
> >>>>>
> >>>>>> When vCPUs are hotplugged, they are added to the QEMU CPU list before
> >>>>>> being fully realized. This can crash the XIVE presenter because the
> >>>>>> 'tctx' pointer is not necessarily initialized when looking for a
> >>>>>> matching target.
> >>>>>>
> >>>>>
> >>>>> Ouch... :-\
> >>>>>
> >>>>>> These vCPUs are not valid targets for the presenter. Skip them.
> >>>>>>
> >>>>>
> >>>>> This likely fixes this specific issue, but more generally, this
> >>>>> seems to indicate that using CPU_FOREACH() is rather fragile.
> >>>>>
> >>>>> What about tracking XIVE TM contexts with a QLIST in xive.c ?
> >>>>
> >>>> This is a good idea.  
> >>>>
> >>>> On HW, the thread interrupt contexts belong to the XIVE presenter 
> >>>> subengine. This is the logic doing the CAM line matching to find
> >>>> a target for an event notification. So we should model the context 
> >>>> list below the XiveRouter in QEMU which models both router and 
> >>>> presenter subengines. We have done without a presenter model for 
> >>>> the moment and I don't think we will need to introduce one.  
> >>>>
> >>>> This would be a nice improvements of my patchset adding support
> >>>> for xive escalations and better support of multi chip systems. 
> >>>> I have introduced a PNV_CHIP_CPU_FOREACH in this patchset which 
> >>>> would become useless with a list of tctx under the XIVE interrupt
> >>>> controller, XiveRouter, SpaprXive, PnvXive.
> >>>>
> >>>
> >>> I agree. It makes more sense to have the list below the XiveRouter,
> >>> rather than relying on CPU_FOREACH(), which looks a bit weird from
> >>> a device emulation code perspective.
> >>
> >> That does sound like a better idea long term.  However, for now, I
> >> think the NULL check is a reasonable way of fixing the real error
> >> we're hitting, so I've applied the patch here.
> >>
> > 
> > Fair enough.
> > 
> > Reviewed-by: Greg Kurz <groug@kaod.org>
> > 
> >> Future cleanups to a different approach remain welcome, of course.
> >>
> > 
> > I've started to look. This should simplify Cedric's "add XIVE support
> > for KVM guests" series, but I'll wait for your massive cleanup series
> > to get merged first.
> 
> 
> This is a combo patchset. 
> 
> 
> These are prereqs fixes related to the presenter and how CAM lines
> are scanned. This is in direct relation with the CPU_FOREACH() issue.
> 
>   ppc/xive: Introduce a XivePresenter interface
>   ppc/xive: Implement the XivePresenter interface
>   ppc/pnv: Introduce a PNV_CHIP_CPU_FOREACH() helper
>   ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper
>   ppc/xive: Introduce a XiveFabric interface
>   ppc/pnv: Implement the XiveFabric interface
>   ppc/spapr: Implement the XiveFabric interface
>   ppc/xive: Use the XiveFabric and XivePresenter interfaces
> 
> These are for interrupt resend (escalation). To be noted, the removal 
> of the get_tctx() XiveRouter handler which has some relation with 
> the previous subserie.
> 
>   ppc/xive: Extend the TIMA operation with a XivePresenter parameter
>   ppc/pnv: Clarify how the TIMA is accessed on a multichip system
>   ppc/xive: Move the TIMA operations to the controller model
>   ppc/xive: Remove the get_tctx() XiveRouter handler
>   ppc/xive: Introduce a xive_tctx_ipb_update() helper
>   ppc/xive: Introduce helpers for the NVT id
>   ppc/xive: Synthesize interrupt from the saved IPB in the NVT
> 
> This is a bug :
> 
>   ppc/pnv: Remove pnv_xive_vst_size() routine
>   ppc/pnv: Dump the XIVE NVT table
>   ppc/pnv: Skip empty slots of the XIVE NVT table
> 
> This is a model for the block id configuration and better support
> for multichip systems : 
> 
>   ppc/pnv: Introduce a pnv_xive_block_id() helper
>   ppc/pnv: Extend XiveRouter with a get_block_id() handler
> 
> Misc improvements : 
> 
>   ppc/pnv: Quiesce some XIVE errors
>   ppc/xive: Introduce a xive_os_cam_decode() helper
>   ppc/xive: Check V bit in TM_PULL_POOL_CTX
>   ppc/pnv: Improve trigger data definition
>   ppc/pnv: Use the EAS trigger bit when triggering an interrupt from PSI
> 
> 
> I should move some in front to have them merged before hand if 
> possible. They have been on the list for 3 months.

Sorry :(.  I've been kind of overwhelmed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2019-10-02 22:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-01  8:57 [PATCH] spapr/xive: skip partially initialized vCPUs in presenter Cédric Le Goater
2019-10-01 11:06 ` Greg Kurz
2019-10-01 11:56   ` Cédric Le Goater
2019-10-01 16:56     ` Greg Kurz
2019-10-02  1:02       ` David Gibson
2019-10-02 14:21         ` Greg Kurz
2019-10-02 14:47           ` Cédric Le Goater
2019-10-02 22:37             ` David Gibson [this message]
2019-10-03  8:02               ` Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191002223741.GC11105@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=clg@kaod.org \
    --cc=groug@kaod.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).