LKML Archive on
From: Marc Zyngier <>
To: Andre Przywara <>
Cc: Zenghui Yu <>,
	"Raslan, KarimAllah" <>
Subject: Re: [RFC PATCH] KVM: arm/arm64: Enable direct irqfd MSI injection
Date: Thu, 16 May 2019 08:21:21 +0100
Message-ID: <> (raw)
In-Reply-To: <>

Hi Andre,

On Wed, 15 May 2019 17:38:32 +0100,
Andre Przywara <> wrote:
> On Mon, 18 Mar 2019 13:30:40 +0000
> Marc Zyngier <> wrote:
> Hi,
> > On Sun, 17 Mar 2019 19:35:48 +0000
> > Marc Zyngier <> wrote:
> > 
> > [...]
> > 
> > > A first approach would be to keep a small cache of the last few
> > > successful translations for this ITS, cache that could be looked-up by
> > > holding a spinlock instead. A hit in this cache could directly be
> > > injected. Any command that invalidates or changes anything (DISCARD,
> > > INV, INVALL, MAPC with V=0, MAPD with V=0, MOVALL, MOVI) should nuke
> > > the cache altogether.  
> > 
> > And to explain what I meant with this, I've pushed a branch[1] with a
> > basic prototype. It is good enough to get a VM to boot, but I wouldn't
> > trust it for anything serious just yet.
> > 
> > If anyone feels like giving it a go and checking whether it has any
> > benefit performance-wise, please do so.
>
> So I took a stab at the performance aspect, and it took me a while to find
> something where it actually makes a difference. The trick is to create *a
> lot* of interrupts. This is my setup now:
> - GICv3 and ITS
> - 5.1.0 kernel vs. 5.1.0 plus Marc's rebased "ITS cache" patches on top
> - 4 VCPU guest on a 4 core machine
> - passing through a M.2 NVMe SSD (or a USB3 controller) to the guest
> - running FIO in the guest, with:
>   - 4K block size, random reads, queue depth 16, 4 jobs (small)
>   - 1M block size, sequential reads, QD 1, 1 job (big)
> For the NVMe disk I see a whopping 19% performance improvement with Marc's
> series (for the small blocks). For a SATA SSD connected via USB3.0 I still
> see 6% improvement. For NVMe there were 50,000 interrupts per second on
> the host, while the USB3 setup only reached 10,000/s. For big blocks (with
> IRQs in the low thousands/s) the win is smaller, but still a measurable
> 3%.

Thanks for having a go at this, and identifying the case where it
actually matters (I would have hoped that the original reporter would
have helped with this, but hey, never mind). The results are pretty
impressive (more so than I anticipated), and I wonder whether we could
improve things further (50k interrupts/s is not that high -- I get
more than 100k on some machines just by playing with their sdcard...).

Could you describe how many interrupt sources each device has? The
reason I'm asking is that the cache size is pretty much hardcoded at
the moment (4 entries per vcpu), and that could have an impact on
performance if we keep evicting entries in the cache (note to self:
add some statistics for that).
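
To make the sizing question concrete, here is a minimal userspace sketch
of a fixed-size translation cache that counts hits, misses and
evictions. It is not the code from the prototype branch: the structure
and function names, the (devid, eventid) key, the round-robin
replacement and the total of 16 entries (4 per vcpu on a 4 vcpu guest)
are all assumptions made for illustration, and locking is left to the
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 entries per vcpu on a 4 vcpu guest. */
#define XLATE_CACHE_SIZE 16

struct xlate_entry {
	bool valid;
	uint32_t devid;
	uint32_t eventid;
	uint32_t intid;		/* stand-in for the cached translation result */
};

struct xlate_cache {
	struct xlate_entry entry[XLATE_CACHE_SIZE];
	size_t next;		/* round-robin victim index (an assumption) */
	uint64_t hits;
	uint64_t misses;
	uint64_t evictions;
};

/* Look up a translation; the caller is assumed to hold the cache lock. */
static bool xlate_cache_lookup(struct xlate_cache *c, uint32_t devid,
			       uint32_t eventid, uint32_t *intid)
{
	assert(intid != NULL);

	for (size_t i = 0; i < XLATE_CACHE_SIZE; i++) {
		struct xlate_entry *e = &c->entry[i];

		if (e->valid && e->devid == devid && e->eventid == eventid) {
			c->hits++;
			*intid = e->intid;
			return true;
		}
	}
	c->misses++;
	return false;
}

/* Record a translation after a miss, counting overwritten valid entries. */
static void xlate_cache_insert(struct xlate_cache *c, uint32_t devid,
			       uint32_t eventid, uint32_t intid)
{
	struct xlate_entry *e = &c->entry[c->next];

	if (e->valid)
		c->evictions++;
	e->valid = true;
	e->devid = devid;
	e->eventid = eventid;
	e->intid = intid;
	c->next = (c->next + 1) % XLATE_CACHE_SIZE;
}

/* Drop every entry, as an invalidating ITS command would. */
static void xlate_cache_invalidate_all(struct xlate_cache *c)
{
	for (size_t i = 0; i < XLATE_CACHE_SIZE; i++)
		c->entry[i].valid = false;
	c->next = 0;
}
```

An eviction count that keeps growing while the guest issues no
invalidating commands would suggest that the device has more active
interrupt sources than the cache has entries.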

Another area where we can improve things is that I think the
invalidation mechanism is pretty trigger-happy (MOVI really doesn't
need to invalidate the cache). On the other hand, I'm not sure your
guest does too much of that.
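
The invalidation policy can be written down as a small predicate. The
sketch below uses the command list from the original proposal (DISCARD,
INV, INVALL, MAPC with V=0, MAPD with V=0, MOVALL, MOVI) and makes the
MOVI case switchable. The enum, the function name and the
movi_invalidates knob are hypothetical and only illustrate the idea;
they are not taken from the prototype.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command identifiers, named after the ITS commands. */
enum its_cmd {
	ITS_CMD_DISCARD,
	ITS_CMD_INV,
	ITS_CMD_INVALL,
	ITS_CMD_MAPC,
	ITS_CMD_MAPD,
	ITS_CMD_MAPTI,
	ITS_CMD_MOVALL,
	ITS_CMD_MOVI,
	ITS_CMD_INT,
};

/*
 * Decide whether a command should drop the whole translation cache.
 * 'valid' is the V bit of MAPC/MAPD and is ignored for other commands.
 * 'movi_invalidates' selects between the conservative policy (true)
 * and a relaxed one where MOVI leaves the cache alone (false).
 */
static bool its_cmd_nukes_cache(enum its_cmd cmd, bool valid,
				bool movi_invalidates)
{
	switch (cmd) {
	case ITS_CMD_DISCARD:
	case ITS_CMD_INV:
	case ITS_CMD_INVALL:
	case ITS_CMD_MOVALL:
		return true;
	case ITS_CMD_MAPC:
	case ITS_CMD_MAPD:
		return !valid;	/* only unmapping (V=0) invalidates */
	case ITS_CMD_MOVI:
		return movi_invalidates;
	default:
		return false;
	}
}
```

Counting how often this returns true per command type would show
whether the guest issues enough invalidating commands for the policy to
matter.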

Finally, the single cache spin-lock is bound to be a bottleneck of its
own at high interrupt rates, and I wonder whether we should move the
whole thing over to an RCU-friendly data structure (the vgic_irq
structure really isn't that friendly). It'd be good to find out how
contended that spinlock is on your system.
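
In the kernel, lock statistics (CONFIG_LOCK_STAT and /proc/lock_stat)
would be the natural tool for this. To show what "contended" means
here, the following is a minimal userspace sketch of a spinlock that
counts how often an acquisition found the lock already held. It uses
C11 atomics, and all names in it are hypothetical; it is not how the
kernel spinlock or the prototype is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct counted_lock {
	atomic_flag held;
	atomic_ulong acquisitions;	/* successful acquisitions */
	atomic_ulong contended;		/* attempts that found the lock held */
};

/* Try once; a failed attempt is recorded as contention. */
static bool counted_lock_try(struct counted_lock *l)
{
	if (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
		atomic_fetch_add_explicit(&l->contended, 1, memory_order_relaxed);
		return false;
	}
	atomic_fetch_add_explicit(&l->acquisitions, 1, memory_order_relaxed);
	return true;
}

/* Spin until the lock is taken; contention is counted once per call. */
static void counted_lock_acquire(struct counted_lock *l)
{
	if (counted_lock_try(l))
		return;
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* busy-wait */
	atomic_fetch_add_explicit(&l->acquisitions, 1, memory_order_relaxed);
}

static void counted_lock_release(struct counted_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The ratio of contended to acquisitions gives a rough idea of how often
an injection path had to wait for another one.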

> Now that I have the setup, I can rerun experiments very quickly (given I
> don't lose access to the machine), so let me know if someone needs
> further tests.

Another useful data point would be the delta with bare-metal: how much
overhead do we have with KVM, with and without this patch series? Oh,
and for easier comparison, please write it as a table that we can dump
in the cover letter when I actually post the series! ;-)
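
For the table itself, the overhead against bare-metal is simple
arithmetic: the throughput lost relative to the bare-metal figure, in
percent. A minimal sketch of a helper that computes it and formats one
row follows. The column layout, the function names and the choice of
throughput as the metric are assumptions; no measured numbers are
implied.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput lost by 'measured' relative to 'baremetal', in percent. */
static double overhead_pct(double baremetal, double measured)
{
	assert(baremetal > 0.0);
	return 100.0 * (baremetal - measured) / baremetal;
}

/*
 * Format one row: workload, bare-metal, KVM without the series and its
 * overhead, KVM with the series and its overhead. Returns the
 * snprintf() result.
 */
static int format_row(char *buf, size_t len, const char *workload,
		      double baremetal, double kvm, double kvm_cache)
{
	return snprintf(buf, len,
			"%-12s %10.0f %10.0f %7.1f%% %10.0f %7.1f%%",
			workload, baremetal,
			kvm, overhead_pct(baremetal, kvm),
			kvm_cache, overhead_pct(baremetal, kvm_cache));
}
```

A negative overhead would mean the virtualised run was faster than
bare-metal, which usually points at a measurement problem.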



Jazz is not dead, it just smell funny.

