From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.server123.net (Postfix) with ESMTPS
 for ; Wed, 22 May 2019 00:15:43 +0200 (CEST)
Date: Tue, 21 May 2019 18:07:36 -0400
From: Mike Snitzer
Message-ID: <20190521220736.GB30736@redhat.com>
References: <5D8A23C5-B6AD-48EA-B0AD-AD1BD1A2B97B@gmail.com>
 <9d19e5b1-b76f-27da-fa4a-f3a83e6e2791@gmail.com>
 <2a12ef24-ab21-a9bb-af40-3743d0b8e2c7@knorrie.org>
 <33d155ac-9b09-c8b0-3df1-88063dac964f@knorrie.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <33d155ac-9b09-c8b0-3df1-88063dac964f@knorrie.org>
Subject: Re: [dm-crypt] Dm-integrity freeze
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Hans van Kranenburg
Cc: Milan Broz , Victor Helmholtz , dm-crypt@saout.de, device-mapper development , Mikulas Patocka

On Tue, May 21 2019 at 4:33pm -0400,
Hans van Kranenburg wrote:

> Hi,
>
> On 5/21/19 10:43 AM, Hans van Kranenburg wrote:
> > Hi,
> >
> > I'm seeing the same lockup, also 4.19. This is mdadm RAID10 on top of 4x
> > a partition with only dm-integrity.
> >
> > It just happened out of the blue, no heavy load or anything. All IO to
> > it is frozen now.
> >
> > [...]
>
> There it is again... dmesg dump below. All cpus on 100% iowait.
>
> It's triggered after a few minutes by running some Windows 2019 server
> install (ugh, don't ask) in a Xen HVM domU, which writes into a raw
> sparse file on a btrfs filesystem on LVM on mdadm RAID10 on 4x
> dm-integrity (wheeee!!)...
>
> This morning it was triggered a few minutes after starting an old
> windows 2008 server image that I copied to this machine.
>
> When running only other Linux vms, and when copying data onto
> filesystems that live in LVM logical volumes I haven't seen this problem
> yet, at all, in the last few weeks that this machine is running.
>
> I noticed there's a "dm integrity: fix deadlock with overlapping I/O"
> fix in a later 4.19. Is there any chance this is related? I have no
> idea, but any hints or suggestions about what to try would be appreciated.

Yes, all your hung tasks are hung in wait_and_add_new_range().

Please use that later 4.19 or apply commit 4ed319c6ac08 ("dm integrity:
fix deadlock with overlapping I/O")

Mike