Subject: Re: stalling IO regression since linux 5.12, through 5.18
From: Holger Hoffstätte
Organization: Applied Asynchrony, Inc.
To: Chris Murphy, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente
Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik
References: <2220d403-e443-4e60-b7c3-d149e402c13e@www.fastmail.com> <61e5ccda-a527-4fea-9850-91095ffa91c4@www.fastmail.com> <4995baed-c561-421d-ba3e-3a75d6a738a3@www.fastmail.com> <2b8a38fa-f15f-45e8-8caa-61c5f8cd52de@www.fastmail.com> <7c830487-95a6-b008-920b-8bc4a318f10a@applied-asynchrony.com> <8b361f8e-cc4f-466c-90f0-031a43436af2@www.fastmail.com>
Message-ID: <6eece869-5cab-57b6-6f8f-98eaf65a742f@applied-asynchrony.com>
Date: Wed, 17 Aug 2022 14:31:09 +0200
In-Reply-To: <8b361f8e-cc4f-466c-90f0-031a43436af2@www.fastmail.com>
X-Mailing-List: linux-block@vger.kernel.org

On 2022-08-17 13:57, Chris Murphy wrote:
>
>
> On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote:
>> On 2022-08-16 17:34, Chris Murphy wrote:
>>>
>>> On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote:
>>>> How about changing the scheduler either mq-deadline or noop, just
>>>> to see if this is also reproducible with a different scheduler. I
>>>> guess noop would imply the blk cgroup controller is going to be
>>>> disabled
>>>
>>> I already reported on that: always happens with bfq within an hour or
>>> less. Doesn't happen with mq-deadline for ~25+ hours. Does happen
>>> with bfq with the above patches removed. Does happen with
>>> cgroup.disabled=io set.
>>>
>>> Sounds to me like it's something bfq depends on and is somehow
>>> becoming perturbed in a way that mq-deadline does not, and has
>>> changed between 5.11 and 5.12. I have no idea what's under bfq that
>>> matches this description.
>>
>> Chris, just a shot in the dark but can you try the patch from
>>
>> https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/
>>
>> on top of something more recent than 5.12? Ideally 5.19 where it applies
>> cleanly.
>
> The problem doesn't reliably reproduce on 5.19. A patch for 5.12..5.18 would be much more testable.

If you look at the changes to sbitmap at:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/lib/sbitmap.c

you'll find that they are relatively recent, so Yu Kuai's patch will probably
also apply to 5.18 - I don't know. Also look at the most recent commit, which
mentions "Checking free bits when setting the target bits. Otherwise, it may
reuse the busying bits." Reusing the busy bits sounds "not great" either and
(AFAIU) may also be a cause of lost wakeups, but I'm sure Jan and Ming know
all that better than me.

In particular, Jan's suggestion re. disabling BFQ cgroup support is probably
the easiest thing to try first. What you're observing may not have a single
root cause, and even if it does, it might not be where we suspect.

-h