From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Snitzer
Subject: Re: block: fix blk_queue_split() resource exhaustion
Date: Fri, 24 Jun 2016 11:15:47 -0400
Message-ID: <20160624151547.GA13898@redhat.com>
References: <1466583730-28595-1-git-send-email-lars.ellenberg@linbit.com>
 <20160624142711.GF3239@soda.linbit>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <20160624142711.GF3239@soda.linbit>
Sender: linux-raid-owner@vger.kernel.org
To: Lars Ellenberg
Cc: Ming Lei, linux-block@vger.kernel.org, Roland Kammerer, Jens Axboe,
 NeilBrown, Kent Overstreet, Shaohua Li, Alasdair Kergon,
 "open list:DEVICE-MAPPER (LVM)", Ingo Molnar, Peter Zijlstra,
 Takashi Iwai, Jiri Kosina, Zheng Liu, Keith Busch,
 "Martin K. Petersen", "Kirill A. Shutemov",
 Linux Kernel Mailing List, "open list:BCACHE (BLOCK LAYER CACHE)",
 "open list:SOFTWARE RAID (Multiple Disks) SUPPORT"
List-Id: linux-raid.ids

On Fri, Jun 24 2016 at 10:27am -0400,
Lars Ellenberg wrote:

> On Fri, Jun 24, 2016 at 07:36:57PM +0800, Ming Lei wrote:
> > >
> > > This is not a theoretical problem.
> > > At least in DRBD, with an unfortunately high IO concurrency wrt. the
> > > "max-buffers" setting, without this patch we have a reproducible deadlock.
> >
> > Is there any log about the deadlock? And is there any lockdep warning
> > if it is enabled?
>
> In DRBD, to avoid potentially very long internal queues as we wait for
> our replication peer device and local backend, we limit the number of
> in-flight bios we accept, and block in our ->make_request_fn() if that
> number exceeds a configured watermark ("max-buffers").
>
> Works fine, as long as we can assume that once our make_request_fn()
> returns, any bios we "recursively" submitted against the local backend
> will be dispatched. Which used to be the case.

It'd be useful to know whether this patch fixes your issue:
https://patchwork.kernel.org/patch/7398411/

Ming Lei didn't like it due to concerns about I/O contexts changing
(thereby breaking the merging that occurs via plugging). But if it
_does_ fix your issue, then the case for the change is stronger, and we
just need to focus on addressing Ming's concerns (Mikulas has some
ideas).

Conversely, and in parallel, Mikulas can check whether your approach
fixes the observed dm-snapshot deadlock that he set out to fix.
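
For context, the throttling Lars describes looks roughly like the sketch
below: a bio-based driver counts the bios it has handed to its local backend
and sleeps in its ->make_request_fn() once a configured watermark is
exceeded, relying on the completions of those re-submitted bios to wake it
up again. This is a minimal sketch against the block layer of this era
(circa v4.7), not DRBD's actual code; my_device, my_make_request,
my_clone_endio, the max_buffers field and the assumption that the gendisk's
private_data points at the driver state are all invented for illustration,
and error handling is omitted.

    #include <linux/bio.h>
    #include <linux/blkdev.h>
    #include <linux/wait.h>
    #include <linux/atomic.h>

    /* Invented driver state for the sketch; not DRBD's structures. */
    struct my_device {
            struct block_device *backing_bdev;  /* local backend */
            struct bio_set *bs;                 /* for bio_clone_fast() */
            atomic_t in_flight;                 /* bios handed to the backend */
            unsigned int max_buffers;           /* configured watermark */
            wait_queue_head_t wait;
    };

    static void my_clone_endio(struct bio *clone)
    {
            struct bio *orig = clone->bi_private;
            /* Assumes gendisk->private_data was set to our my_device. */
            struct my_device *dev = orig->bi_bdev->bd_disk->private_data;

            orig->bi_error = clone->bi_error;
            bio_put(clone);
            bio_endio(orig);

            /* A completion frees one "buffer"; wake any throttled submitter. */
            if (atomic_dec_return(&dev->in_flight) < dev->max_buffers)
                    wake_up(&dev->wait);
    }

    static blk_qc_t my_make_request(struct request_queue *q, struct bio *bio)
    {
            struct my_device *dev = q->queuedata;
            struct bio *clone;

            /* Throttle: sleep until we are below the watermark again. */
            wait_event(dev->wait,
                       atomic_read(&dev->in_flight) < dev->max_buffers);
            atomic_inc(&dev->in_flight);

            /* Redirect a clone of the bio to the local backend. */
            clone = bio_clone_fast(bio, GFP_NOIO, dev->bs);
            clone->bi_bdev = dev->backing_bdev;
            clone->bi_private = bio;
            clone->bi_end_io = my_clone_endio;
            generic_make_request(clone);

            return BLK_QC_T_NONE;
    }

The wait_event() here only makes progress if the bios submitted via
generic_make_request() actually reach the backend and complete while the
caller sleeps.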
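The assumption that "once our make_request_fn() returns, any bios we
recursively submitted would be dispatched" is exactly what changed when
generic_make_request() was made non-recursive: a bio submitted from inside
a ->make_request_fn() is no longer dispatched immediately, but queued on
current->bio_list and only processed after the outermost ->make_request_fn()
returns. Below is a heavily simplified paraphrase of that loop as it looks
in this era's block/blk-core.c; blk_queue_enter(), error paths and sanity
checks are omitted.

    #include <linux/bio.h>
    #include <linux/blkdev.h>
    #include <linux/sched.h>

    /* Simplified paraphrase of generic_make_request(); see block/blk-core.c. */
    blk_qc_t generic_make_request(struct bio *bio)
    {
            struct bio_list bio_list_on_stack;
            blk_qc_t ret = BLK_QC_T_NONE;

            /*
             * Already inside some ->make_request_fn() on this task?  Then
             * just queue the bio; the outer invocation will get to it only
             * *after* the currently running ->make_request_fn() returns.
             */
            if (current->bio_list) {
                    bio_list_add(current->bio_list, bio);
                    return BLK_QC_T_NONE;
            }

            bio_list_init(&bio_list_on_stack);
            current->bio_list = &bio_list_on_stack;
            do {
                    struct request_queue *q = bdev_get_queue(bio->bi_bdev);

                    ret = q->make_request_fn(q, bio);
                    /* Pick up whatever the driver queued while in there. */
                    bio = bio_list_pop(current->bio_list);
            } while (bio);
            current->bio_list = NULL;

            return ret;
    }

If a ->make_request_fn() blocks, as in the throttling sketch above, while
the bios that would eventually unblock it are still parked on
bio_list_on_stack, nothing ever dispatches them and the task deadlocks;
that is the kind of dependency the patch under discussion is meant to break.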