From: Jan Beulich <jbeulich@suse.com>
To: Elliott Mitchell <ehem+xen@m5p.com>
Cc: xen-devel@lists.xenproject.org,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>,
	"Kelly Choi" <kelly.choi@cloud.com>
Subject: Re: Serious AMD-Vi(?) issue
Date: Thu, 18 Apr 2024 09:09:51 +0200
Message-ID: <f0bdb386-0870-4468-846c-6c8a91eaf806@suse.com>
In-Reply-To: <ZiDBc3ye2wqmBAfq@mattapan.m5p.com>

On 18.04.2024 08:45, Elliott Mitchell wrote:
> On Wed, Apr 17, 2024 at 02:40:09PM +0200, Jan Beulich wrote:
>> On 11.04.2024 04:41, Elliott Mitchell wrote:
>>> On Thu, Mar 28, 2024 at 07:25:02AM +0100, Jan Beulich wrote:
>>>> On 27.03.2024 18:27, Elliott Mitchell wrote:
>>>>> On Mon, Mar 25, 2024 at 02:43:44PM -0700, Elliott Mitchell wrote:
>>>>>> On Mon, Mar 25, 2024 at 08:55:56AM +0100, Jan Beulich wrote:
>>>>>>>
>>>>>>> In fact when running into trouble, the usual course of action would be to
>>>>>>> increase verbosity in both hypervisor and kernel, just to make sure no
>>>>>>> potentially relevant message is missed.
>>>>>>
>>>>>> More/better information might have been obtained if I'd been engaged
>>>>>> earlier.
>>>>>
>>>>> This is still true; things are in full mitigation mode and I'll be
>>>>> quite unhappy to go back to experimenting at this point.
>>>>
>>>> Well, it very likely won't work without further experimenting by someone
>>>> able to observe the bad behavior. Recall we're on xen-devel here; it is
>>>> kind of expected that without clear (and practical) repro instructions
>>>> experimenting as well as info collection will remain with the reporter.
>>>
>>> After looking at the situation and considering the issues, I /may/ be
>>> able to set up for doing more testing.  I guess I should confirm: which
>>> of those criteria do you think the currently provided information fails?
>>>
>>> AMD-IOMMU + Linux MD RAID1 + dual Samsung SATA (or various NVMe) +
>>> dbench; seems a pretty specific setup.
>>
>> Indeed. If that's the only way to observe the issue, it suggests to me
>> that it'll need to be mainly you to do further testing, and perhaps even
>> debugging. Which isn't to say we're not available to help, but from all
>> I have gathered so far we're pretty much in the dark even as to which
>> component(s) may be to blame. As can still be seen at the top in reply
>> context, some suggestions were given as to obtaining possible further
>> information (or confirming the absence thereof).
> 
> There may be other ways which haven't yet been found.
> 
> I've been left with the suspicion that AMD was to some degree sponsoring
> work to ensure Xen works on their hardware.  Given the severity of this
> problem I would kind of expect them not to want to gain a reputation for
> having data loss issues.  Assuming a suitable pair of devices wasn't
> already on hand, I would kind of expect acquiring one to be well within
> their budget.

You've got to talk to AMD then. Plus I assume it's clear to you that
even if the (presumably) necessary hardware were available, it would
still require the corresponding setup, leaving open whether the issue
could then indeed be reproduced.

>> I'd also like to come back to the vague theory you did voice, in that
>> you're suspecting flushes to take too long. I continue to have trouble
>> with this, and I would therefore like to ask that you put this down in
>> more technical terms, making connections to actual actions taken by
>> software / hardware.
> 
> I'm trying to figure out a pattern.
> 
> Nominally all the devices are roughly on par (only a very cheap flash
> device will be unable to overwhelm SATA's bandwidth).  Yet why did the
> Crucial SATA device /seem/ not to have the issue?  Why did a Crucial NVMe
> device demonstrate the issue?
> 
> My guess is the flash controllers Samsung uses may be able to start
> executing commands faster than the ones Crucial uses.  Meanwhile NVMe
> has lower overhead and latency than SATA (SATA's overhead isn't an issue
> for actual spinning disks).  Perhaps the IOMMU is still flushing its TLB,
> or hasn't loaded the new tables yet.

Which would be an IOMMU issue then, that software at best may be able to
work around.
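
To make the theory concrete: on AMD-Vi, IOTLB invalidations are queued
into the IOMMU's command buffer and only take effect once the IOMMU has
processed them, which software confirms with a COMPLETION_WAIT command.
A hypothetical sketch of that ordering (illustrative names only, not
Xen's actual code; queue_command()/completion_seen() are made-up helpers
standing in for the real command-buffer plumbing):

/* Conceptual unmap-side flush for a range of device addresses. */
static void flush_iotlb_pages(struct amd_iommu *iommu, uint16_t domid,
                              uint64_t dfn, unsigned int order)
{
    /* Ask the IOMMU to drop cached translations for the range. */
    queue_command(iommu, CMD_INVALIDATE_IOMMU_PAGES, domid, dfn, order);

    /* Ask the IOMMU to signal once all prior commands have completed. */
    queue_command(iommu, CMD_COMPLETION_WAIT, 0, 0, 0);

    /* If the unmap path returned before this poll, a device issuing
     * DMA quickly enough could still hit the stale translation -
     * exactly the window the theory above depends on. */
    while ( !completion_seen(iommu) )
        cpu_relax();
}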

Jan

> I suspect when MD-RAID1 issues block requests to a pair of devices,
> it likely sends the block to one device and then reuses most/all of the
> structures for the second device.  As a result the second request would
> likely get a command to the device rather faster than the first request.
> 
> Perhaps look into which structures the MD-RAID1 subsystem reuses.
> Then see whether doing early setup of those structures triggers the
> issue?
> 
> (okay I'm deep into speculation here, but this seems the simplest
> explanation for what could be occurring)
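
For reference, the fan-out the above speculation describes looks
conceptually like the following (a simplified sketch only loosely
modelled on drivers/md/raid1.c; the r1conf/mirrors fields here mimic
the real structures loosely, and the real code also handles barriers,
bad blocks, write-behind, etc.).  Each clone shares the master bio's
data pages, and the second submission follows the first with minimal
delay, so the two commands can reach the devices nearly back to back:

#include <linux/bio.h>
#include <linux/blkdev.h>

/* Hypothetical, stripped-down RAID1 write fan-out. */
static void raid1_write_fanout(struct r1conf *conf, struct bio *master)
{
    int i;

    for (i = 0; i < conf->raid_disks; i++) {
        /* The clone shares the master's data pages; only per-device
         * fields (target bdev, sector) differ. */
        struct bio *clone =
            bio_alloc_clone(conf->mirrors[i].rdev->bdev, master,
                            GFP_NOIO, &conf->bio_set);

        /* Second iteration submits almost immediately after the
         * first, with most structures already set up. */
        submit_bio_noacct(clone);
    }
}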
