From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42648)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1gSB3z-0002Mu-37 for qemu-devel@nongnu.org; Wed, 28 Nov 2018 20:24:21 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gSB3v-0001NL-F3 for qemu-devel@nongnu.org; Wed, 28 Nov 2018 20:24:18 -0500
Date: Thu, 29 Nov 2018 11:47:18 +1100
From: David Gibson
Message-ID: <20181129004718.GJ2251@umbus.fritz.box>
References: <20181116105729.23240-1-clg@kaod.org> <20181116105729.23240-9-clg@kaod.org> <20181127234956.GR2251@umbus.fritz.box>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="mjUDp/cLGeqUhYyE"
Content-Disposition: inline
In-Reply-To:
Subject: Re: [Qemu-devel] [PATCH v5 08/36] ppc/xive: introduce a simplified XIVE presenter
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Cédric Le Goater
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Benjamin Herrenschmidt

--mjUDp/cLGeqUhYyE
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Nov 28, 2018 at 11:59:58AM +0100, Cédric Le Goater wrote:
> On 11/28/18 12:49 AM, David Gibson wrote:
> > On Fri, Nov 16, 2018 at 11:57:01AM +0100, Cédric Le Goater wrote:
> >> The last sub-engine of the XIVE architecture is the Interrupt
> >> Virtualization Presentation Engine (IVPE). On HW, they share elements,
> >> the Power Bus interface (CQ), the routing table descriptors, and they
> >> can be combined in the same HW logic. We do the same in QEMU and
> >> combine both engines in the XiveRouter for simplicity.
> >
> > Ok, I'm not entirely convinced combining the IVPE and IVRE into a
> > single object is a good idea, but we can probably discuss that once
> > I've read further.
>
> We could introduce a simplified presenter for sPAPR but I am not even
> sure of that as it will get more complex if we support the EBB one day.

I wasn't really thinking about PAPR for this comment.

> >> When the IVRE has completed its job of matching an event source with a
> >> Notification Virtual Target (NVT) to notify, it forwards the event
> >> notification to the IVPE sub-engine. The IVPE scans the thread
> >> interrupt contexts of the Notification Virtual Targets (NVT)
> >> dispatched on the HW processor threads and if a match is found, it
> >> signals the thread. If not, the IVPE escalates the notification to
> >> some other targets and records the notification in a backlog queue.
> >>
> >> The IVPE maintains the thread interrupt context state for each of its
> >> NVTs not dispatched on HW processor threads in the Notification
> >> Virtual Target table (NVTT).
> >>
> >> The model currently only supports single NVT notifications.
> >>
> >> Signed-off-by: Cédric Le Goater
> >> ---
> >>  include/hw/ppc/xive.h      |  13 +++
> >>  include/hw/ppc/xive_regs.h |  22 ++++
> >>  hw/intc/xive.c             | 223 +++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 258 insertions(+)
> >>
> >> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> >> index 5987f26ddb98..e715a6c6923d 100644
> >> --- a/include/hw/ppc/xive.h
> >> +++ b/include/hw/ppc/xive.h
> >> @@ -197,6 +197,10 @@ typedef struct XiveRouterClass {
> >>                     XiveEND *end);
> >>      int (*set_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                     XiveEND *end);
> >> +    int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                   XiveNVT *nvt);
> >> +    int (*set_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                   XiveNVT *nvt);
> >
> > As with the ENDs, I don't think get/set is a good interface for a
> > bigger-than-word-size object.
>
> We need to agree on this interface before I respin. So you would like
> to add a extra argument specifying the word being accessed ?

Yes. Ok, 3 options I can see at this point:

1) read/write accessors which take a word number

2) A "get" accessor which copies the whole structure, but "write"
   accessor which takes a word number. The asymmetry is a bit ugly,
   but it's the non-atomic writeback of the whole structure which I'm
   most uncomfortable with.

3) A map/unmap interface which gives you / releases a pointer to the
   "live" structure. For powernv that would become
   address_space_map()/unmap(). For PAPR it would just be return
   pointer / no-op.

>
> >
> >>  } XiveRouterClass;
> >>
> >>  void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
> >> @@ -207,6 +211,10 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                          XiveEND *end);
> >>  int xive_router_set_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>                          XiveEND *end);
> >> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                        XiveNVT *nvt);
> >> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                        XiveNVT *nvt);
> >>
> >>  /*
> >>   * XIVE END ESBs
> >> @@ -274,4 +282,9 @@ extern const MemoryRegionOps xive_tm_ops;
> >>
> >>  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
> >>
> >> +static inline uint32_t xive_tctx_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
> >> +{
> >> +    return (nvt_blk << 19) | nvt_idx;
> >
> > I'm guessing this formula is the standard way of combining the NVT
> > block and index into a single word?
>
> That number is the VP/NVT identifier which is written in the CAM value.
> The index is on 19 bits because of the NVT definition in the END
> structure. It is being increased to 24 bits on Power10
>
> > If so, I think we should
> > standardize on passing a single word "nvt_id" around and only
> > splitting it when we need to use the block separately.
>
> This is really the only place where we concatenate the two NVT values,
> block and index.

Hm, ok. I know we don't model them (yet, maybe ever) but could
combined values appear in the PowerBUS messages that handle remote
notifications?

> > Same goes for
> > the end_id, assuming there's a standard way of putting that into a
> > single word. That will address the point I raised earlier about lisn
> > being passed around as a single word, but these later stage ids being
> > split.
>
> Hmm, I am not sure this is a good option. It is not how the PowerNV
> model would use it, skiboot is very much aware of these blocks and
> indexes and for remote accesses chips are identified using the block.
> I will take a look at it but I am not found of it. I can add helpers
> in some places though.

Hm, ok. Do the block and index appear as an (effectively) single
field in the EAS?

> I agree we have some kind of issue linking the HW model with the sPAPR
> machine. The guest interface is only about IRQ numbers, priorities and
> cpu numbers. We really don't care about XIVE blocks and indexes in that
> case. we can clarify the code by bypassing the XiveRouter interfaces
> to the table and directly use the sPAPR interrupt controller. That
> should help a bit for the hcalls but we would still have to fill in
> the EAT and the END with some index values if we want to use the router
> algorithm.

I don't think this is too much of a problem. These are essentially
machine internal details so we can choose an allocation to suit us.
The obvious one is to put everything in a single block, at least as
long as that won't limit our numbers too much.

> > We'll probably want some inlines or macros to build an
> > nvt/end/lisn/whatever id from block and index as well.
> >
> >> +}
> >> +
> >>  #endif /* PPC_XIVE_H */
> >> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
> >> index 2e3d6cb507da..05cb992d2815 100644
> >> --- a/include/hw/ppc/xive_regs.h
> >> +++ b/include/hw/ppc/xive_regs.h
> >> @@ -158,4 +158,26 @@ typedef struct XiveEND {
> >>  #define END_W7_F1_LOG_SERVER_ID PPC_BITMASK32(1, 31)
> >>  } XiveEND;
> >>
> >> +/* Notification Virtual Target (NVT) */
> >> +typedef struct XiveNVT {
> >> +    uint32_t w0;
> >> +#define NVT_W0_VALID PPC_BIT32(0)
> >> +    uint32_t w1;
> >> +    uint32_t w2;
> >> +    uint32_t w3;
> >> +    uint32_t w4;
> >> +    uint32_t w5;
> >> +    uint32_t w6;
> >> +    uint32_t w7;
> >> +    uint32_t w8;
> >> +#define NVT_W8_GRP_VALID PPC_BIT32(0)
> >> +    uint32_t w9;
> >> +    uint32_t wa;
> >> +    uint32_t wb;
> >> +    uint32_t wc;
> >> +    uint32_t wd;
> >> +    uint32_t we;
> >> +    uint32_t wf;
> >> +} XiveNVT;
> >> +
> >>  #endif /* PPC_XIVE_REGS_H */
> >> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> >> index 4c6cb5d52975..5ba3b06e6e25 100644
> >> --- a/hw/intc/xive.c
> >> +++ b/hw/intc/xive.c
> >> @@ -373,6 +373,32 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
> >>      }
> >>  }
> >>
> >> +/* The HW CAM (23bits) is hardwired to :
> >> + *
> >> + *   0x000||0b1||4Bit chip number||7Bit Thread number.
> >> + *
> >> + * and when the block grouping extension is enabled :
> >> + *
> >> + *   4Bit chip number||0x001||7Bit Thread number.
> >> + */
> >> +static uint32_t tctx_hw_cam_line(bool block_group, uint8_t chip_id, uint8_t tid)
> >> +{
> >> +    if (block_group) {
> >> +        return 1 << 11 | (chip_id & 0xf) << 7 | (tid & 0x7f);
> >> +    } else {
> >> +        return (chip_id & 0xf) << 11 | 1 << 7 | (tid & 0x7f);
> >> +    }
> >> +}
> >> +
> >> +static uint32_t xive_tctx_hw_cam_line(XiveTCTX *tctx, bool block_group)
> >> +{
> >> +    PowerPCCPU *cpu = POWERPC_CPU(tctx->cs);
> >> +    CPUPPCState *env = &cpu->env;
> >> +    uint32_t pir = env->spr_cb[SPR_PIR].default_value;
> >
> > I don't much like reaching into the cpu state itself. I think a
> > better idea would be to have the TCTX have its HW CAM id set during
> > initialization (via a property) and then use that. This will mean
> > less mucking about if future cpu revisions don't split the PIR into
> > chip and tid ids in the same way.
>
> yes good idea. I will see how to handle the block_group boolean. may be we
> can leave it out of the model for now as it is not used.

Yes, it would be nice to leave the block_group stuff as a later
extension when/if we need it. If we put it in as a stub and nothing
is using/testing it, it's likely it will be broken if we ever do
actually try to use it.

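To make the property idea a bit more concrete, here is a rough,
untested sketch. SketchTCTX, sketch_hw_cam_from_pir() and
sketch_tctx_init() are names made up for illustration, not code from
the series. The CAM layout is the non-grouped form given in the quoted
comment (0b1 || 4-bit chip || 7-bit thread), the PIR split is the one
the patch uses today, and block_group is deliberately absent.

```c
#include <stdint.h>

/* Hypothetical stand-in for XiveTCTX, reduced to the one field of interest. */
typedef struct SketchTCTX {
    uint32_t hw_cam;    /* fixed at init time, never re-derived from CPU state */
} SketchTCTX;

/*
 * Machine-side helper (hypothetical): the only place that knows how this
 * CPU revision splits the PIR. Chip id is taken from bits 8..11 and the
 * thread id from bits 0..6, as in the quoted patch.
 * Layout: 0b1 || 4-bit chip || 7-bit thread (non-grouped form).
 */
static uint32_t sketch_hw_cam_from_pir(uint32_t pir)
{
    uint32_t chip_id = (pir >> 8) & 0xf;
    uint32_t tid = pir & 0x7f;

    return 1u << 11 | chip_id << 7 | tid;
}

/* Stands in for setting a property when the TCTX is created. */
static void sketch_tctx_init(SketchTCTX *tctx, uint32_t hw_cam)
{
    tctx->hw_cam = hw_cam;
}

/* The presenter only ever reads the stored value. */
static uint32_t sketch_tctx_hw_cam(const SketchTCTX *tctx)
{
    return tctx->hw_cam;
}
```

With this shape, a future CPU revision that lays out the PIR
differently would only need a different machine-side helper; the
matching code would not change.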
>
> >
> >> +    return tctx_hw_cam_line(block_group, (pir >> 8) & 0xf, pir & 0x7f);
> >> +}
> >> +
> >>  static void xive_tctx_reset(void *dev)
> >>  {
> >>      XiveTCTX *tctx = XIVE_TCTX(dev);
> >> @@ -1013,6 +1039,195 @@ int xive_router_set_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
> >>      return xrc->set_end(xrtr, end_blk, end_idx, end);
> >>  }
> >>
> >> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                        XiveNVT *nvt)
> >> +{
> >> +    XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
> >> +
> >> +    return xrc->get_nvt(xrtr, nvt_blk, nvt_idx, nvt);
> >> +}
> >> +
> >> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                        XiveNVT *nvt)
> >> +{
> >> +    XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
> >> +
> >> +    return xrc->set_nvt(xrtr, nvt_blk, nvt_idx, nvt);
> >> +}
> >> +
> >> +static bool xive_tctx_ring_match(XiveTCTX *tctx, uint8_t ring,
> >> +                                 uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                                 bool cam_ignore, uint32_t logic_serv)
> >> +{
> >> +    uint8_t *regs = &tctx->regs[ring];
> >> +    uint32_t w2 = be32_to_cpu(*((uint32_t *) &regs[TM_WORD2]));
> >> +    uint32_t cam = xive_tctx_cam_line(nvt_blk, nvt_idx);
> >> +    bool block_group = false; /* TODO (PowerNV) */
> >> +
> >> +    /* TODO (PowerNV): ignore low order bits of nvt id */
> >> +
> >> +    switch (ring) {
> >> +    case TM_QW3_HV_PHYS:
> >> +        return (w2 & TM_QW3W2_VT) && xive_tctx_hw_cam_line(tctx, block_group) ==
> >> +            tctx_hw_cam_line(block_group, nvt_blk, nvt_idx);
> >
> > The difference between "xive_tctx_hw_cam_line" and "tctx_hw_cam_line"
> > here is far from obvious.
>
> yes. I lacked inspiration ...

I'd suggest that the one which takes the tctx as a parameter be
tctx_hw_cam_line() and the other be nvt_hw_cam_line() or similar. The
crucial difference here is that one is what the thread is looking for,
the other is what the NVT is advertising.

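Something like the following, purely as an untested sketch of the
naming. SketchTCTX and hv_phys_cam_match() are invented for this
example, the stored hw_cam field assumes the TCTX gets its HW CAM id
at initialization as suggested earlier, and the layout is the
non-grouped one from the quoted comment. The valid-bit check on word 2
is left out to keep it short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XiveTCTX with a pre-computed HW CAM id. */
typedef struct SketchTCTX {
    uint32_t hw_cam;
} SketchTCTX;

/* What the thread is looking for: its own hardwired CAM line. */
static uint32_t tctx_hw_cam_line(const SketchTCTX *tctx)
{
    return tctx->hw_cam;
}

/*
 * What the NVT is advertising: the CAM line built from its block and
 * index, using 0b1 || 4-bit block || 7-bit index.
 */
static uint32_t nvt_hw_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
{
    return 1u << 11 | (uint32_t)(nvt_blk & 0xf) << 7 | (nvt_idx & 0x7f);
}

/* HV physical ring compare, minus the valid bit test (illustration only). */
static bool hv_phys_cam_match(const SketchTCTX *tctx,
                              uint8_t nvt_blk, uint32_t nvt_idx)
{
    return tctx_hw_cam_line(tctx) == nvt_hw_cam_line(nvt_blk, nvt_idx);
}
```

The two sides of the comparison then say what they are at the call
site, without needing the xive_ prefix on either.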
> > Remember that namespacing prefixes aren't
> > necessary for static functions, which can let you give more
> > descriptive names without getting excessively long.
>
> OK.
>
> >> +    case TM_QW2_HV_POOL:
> >> +        return (w2 & TM_QW2W2_VP) && (cam == GETFIELD(TM_QW2W2_POOL_CAM, w2));
> >> +
> >> +    case TM_QW1_OS:
> >> +        return (w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, w2));
> >> +
> >> +    case TM_QW0_USER:
> >> +        return ((w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, w2)) &&
> >> +                (w2 & TM_QW0W2_VU) &&
> >> +                (logic_serv == GETFIELD(TM_QW0W2_LOGIC_SERV, w2)));
> >> +
> >> +    default:
> >> +        g_assert_not_reached();
> >> +    }
> >> +}
> >> +
> >> +static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format,
> >> +                                     uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                                     bool cam_ignore, uint32_t logic_serv)
> >> +{
> >> +    if (format == 0) {
> >> +        /* F=0 & i=1: Logical server notification */
> >> +        if (cam_ignore == true) {
> >> +            qemu_log_mask(LOG_GUEST_ERROR, "XIVE: no support for LS "
> >> +                          "NVT %x/%x\n", nvt_blk, nvt_idx);
> >> +            return -1;
> >> +        }
> >> +
> >> +        /* F=0 & i=0: Specific NVT notification */
> >> +        if (xive_tctx_ring_match(tctx, TM_QW3_HV_PHYS,
> >> +                                 nvt_blk, nvt_idx, false, 0)) {
> >> +            return TM_QW3_HV_PHYS;
> >> +        }
> >> +        if (xive_tctx_ring_match(tctx, TM_QW2_HV_POOL,
> >> +                                 nvt_blk, nvt_idx, false, 0)) {
> >> +            return TM_QW2_HV_POOL;
> >> +        }
> >> +        if (xive_tctx_ring_match(tctx, TM_QW1_OS,
> >> +                                 nvt_blk, nvt_idx, false, 0)) {
> >> +            return TM_QW1_OS;
> >> +        }
> >
> > Hm. It's a bit pointless to iterate through each ring calling a
> > common function, when that "common" function consists entirely of a
> > switch which makes it not really common at all.
> >
> > So I think you want separate helper functions for each ring's match,
> > or even just fold the previous function into this one.
>
> yes. It can be improved. I did try different layouts. I might just fold
> both routine in one as you propose.
>
> >> +    } else {
> >> +        /* F=1 : User level Event-Based Branch (EBB) notification */
> >> +        if (xive_tctx_ring_match(tctx, TM_QW0_USER,
> >> +                                 nvt_blk, nvt_idx, false, logic_serv)) {
> >> +            return TM_QW0_USER;
> >> +        }
> >> +    }
> >> +    return -1;
> >> +}
> >> +
> >> +typedef struct XiveTCTXMatch {
> >> +    XiveTCTX *tctx;
> >> +    uint8_t ring;
> >> +} XiveTCTXMatch;
> >> +
> >> +static bool xive_presenter_match(XiveRouter *xrtr, uint8_t format,
> >> +                                 uint8_t nvt_blk, uint32_t nvt_idx,
> >> +                                 bool cam_ignore, uint8_t priority,
> >> +                                 uint32_t logic_serv, XiveTCTXMatch *match)
> >> +{
> >> +    CPUState *cs;
> >> +
> >> +    /* TODO (PowerNV): handle chip_id overwrite of block field for
> >> +     * hardwired CAM compares */
> >> +
> >> +    CPU_FOREACH(cs) {
> >> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> >> +        XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
> >> +        int ring;
> >> +
> >> +        /*
> >> +         * HW checks that the CPU is enabled in the Physical Thread
> >> +         * Enable Register (PTER).
> >> +         */
> >> +
> >> +        /*
> >> +         * Check the thread context CAM lines and record matches. We
> >> +         * will handle CPU exception delivery later
> >> +         */
> >> +        ring = xive_presenter_tctx_match(tctx, format, nvt_blk, nvt_idx,
> >> +                                         cam_ignore, logic_serv);
> >> +        /*
> >> +         * Save the context and follow on to catch duplicates, that we
> >> +         * don't support yet.
> >> +         */
> >> +        if (ring != -1) {
> >> +            if (match->tctx) {
> >> +                qemu_log_mask(LOG_GUEST_ERROR, "XIVE: already found a thread "
> >> +                              "context NVT %x/%x\n", nvt_blk, nvt_idx);
> >> +                return false;
> >> +            }
> >> +
> >> +            match->ring = ring;
> >> +            match->tctx = tctx;
> >> +        }
> >> +    }
> >> +
> >> +    if (!match->tctx) {
> >> +        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: NVT %x/%x is not dispatched\n",
> >> +                      nvt_blk, nvt_idx);
> >> +        return false;
> >
> > Hmm.. this isn't actually an error isn't it? At least not for powernv
>
> It is on sPAPR, it would mean the END was configured with an unknow CPU.

Right.

> It is not error on PowerNV, when we support escalations.
>
> > - that just means the NVT isn't currently dispatched, so we'll need to
> > trigger the escalation interrupt.
>
> Yes.
>
> > Does this get changed later in the series?
>
> No.

But this code is common to PAPR and powernv, yes, so it will need to?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--mjUDp/cLGeqUhYyE
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAlv/NxQACgkQbDjKyiDZ
s5IT3xAAil1SZfWxSg2Clk52haP/kIXVubpbC11b/OFxPKaaazGbYxXnqeMp35a9
tVjqT543rdqT/110DFXI0caZl/ecJLOELRYZRBS7ftOVVSXHYhsXnDDXFcWvnNYU
sP2P3A1xRmd3O/W4Qto1EzmeAum4qK0P1nYPj5w5EJA+IS3avspfUR8bst6gObpi
lUq8NJlbedM9ZvTLIFaI98dMZtO2k9BL6IvtD0IIcSJHbmY3M2G6aOnxue4jjwgH
NNvgL4Fx/ndbx2j7y4co2UGS3K1+FCc/bnp9MkJtDljQJwKE/RUXNV9AnaxvYI3u
FWroWpXBYwm2Rh827AsTvpmXNLUHQkwgYMjHbAk6cwub6XKMjRFUVIkUOze87UHz
pk25gi6WSq7U8Xp1NBcIkgu8s8ClA+e62H0VFW+hQSD5cGWfKd2FiKQTldu0zrWC
2QNiLoAEIul+sVnlswBvUATQX0JcUvcsPwlc8vW+G6KRe4IrtHu4YWqUiSjf4YdU
vz3e1PcWZ2SRMLkZZa5Te+9txoNSvn/HXdP62Hf/XHis6SPg0LhxyH6fJiEThOO7
aQg+HPng+3LWwZJ4uZz2r0TxxRUZG172pbJd5jW66pc29BUTTalK+vSSBdL/gLyS
LItlKffX0gmpQ3T8ZshV8DGSqKKAJ0pJgDVErFZ1QUO3zHkWhNE=
=lC+M
-----END PGP SIGNATURE-----

--mjUDp/cLGeqUhYyE--