Date: Mon, 4 Mar 2024 11:56:11 -0800
From: Elliott Mitchell
To: xen-devel@lists.xenproject.org, Jan Beulich, Andrew Cooper
Subject: Re: Serious AMD-Vi issue

On Mon, Feb 12, 2024 at 03:23:00PM -0800, Elliott Mitchell wrote:
> On Thu, Jan 25, 2024 at 12:24:53PM -0800, Elliott Mitchell wrote:
> > Apparently this was first noticed with 4.14, but more recently I've
> > been able to reproduce the issue:
> >
> > https://bugs.debian.org/988477
> >
> > The original observation features MD-RAID1 using a pair of Samsung
> > SATA-attached flash devices. The key line shows up in `xl dmesg`:
> >
> > (XEN) AMD-Vi: IO_PAGE_FAULT: DDDD:bb:dd.f d0 addr ffffff???????000 flags 0x8 I
> >
> > Here the device points at the SATA controller. I've ended up
> > reproducing this with some noticeable differences.
> >
> > A major goal of RAID is to have different devices fail at different
> > times. Hence my initial run had a Samsung device plus a device from
> > another reputable flash manufacturer.
> >
> > I initially noticed this due to messages in domain 0's dmesg about
> > errors from the SATA device. It wasn't until rather later that I
> > noticed the IOMMU warnings in Xen's dmesg (perhaps Xen messages
> > logged after domain 0 starts should be duplicated into domain 0's
> > dmesg?).
> >
> > All of the failures consistently pointed at the Samsung device. Due
> > to the expectation it would fail first (a lower-quality offering with
> > lesser guarantees), I proceeded to replace it with an NVMe device.
> >
> > With some monitoring I discovered the NVMe device was now triggering
> > IOMMU errors, though not nearly as many as the Samsung SATA device
> > did.
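For anyone who wants to watch for these, the fault lines can be pulled out of `xl dmesg` and the PCI seg:bus:dev.fn ("BDF") mapped back to a device. A rough sketch, assuming the log line format quoted above (the sample values below are placeholders, not taken from my logs):

```shell
#!/bin/sh
# Sketch: extract the PCI BDF from AMD-Vi IO_PAGE_FAULT lines so the
# faulting device can be looked up with lspci.  Assumes the line format
# quoted above; the example values are placeholders.
extract_bdf() {
    sed -n 's/.*IO_PAGE_FAULT: *\([0-9a-fA-F]*:[0-9a-fA-F]*:[0-9a-fA-F]*\.[0-9a-fA-F]\).*/\1/p'
}

# Example run against a line shaped like the reported one:
echo '(XEN) AMD-Vi: IO_PAGE_FAULT: 0000:01:00.0 d0 addr fffffffdf8000000 flags 0x8 I' \
    | extract_bdf
# On a real host: xl dmesg | extract_bdf | sort -u | xargs -rn1 lspci -s
```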
> > As such, AMD-Vi plus MD-RAID1 appears to be exposing some sort of
> > IOMMU issue with Xen.
> >
> >
> > All I can do is offer speculation about the underlying cause. There
> > does seem to be a pattern of higher-performance flash storage devices
> > being more severely affected.
> >
> > I was speculating about the issue being the MD-RAID1 driver abusing
> > Linux's DMA infrastructure in some fashion.
> >
> > Upon further consideration, I'm wondering whether this is a latency
> > issue. I imagine there is some sort of flush after the IOMMU tables
> > are modified. Perhaps the Samsung SATA (and all NVMe) devices were
> > trying to execute commands before the reload of the IOMMU tables was
> > complete.
>
> Ping!
>
> The recipe seems to be Linux MD-RAID1, plus a Samsung SATA device or
> any NVMe device.
>
> To make it explicit: when I tried Crucial SATA plus Samsung SATA, the
> IOMMU errors matched the Samsung SATA device (the SATA driver
> complained a number of times).
>
> As stated, my speculation is that lower-latency devices start executing
> commands before the IOMMU tables have finished reloading. When this was
> originally implemented, fast flash devices were rare.

I guess I'm lucky I ended up with some slightly higher-latency hardware.

This is a very serious issue, as data loss can occur. AMD needs to fund
their Xen engineers more; otherwise AMD hardware may soon no longer be
viable with Xen.

-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include O-  _____                     |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445