From: Ilya Dryomov
Date: Tue, 6 Jul 2021 19:21:41 +0200
Subject: Re: All RBD IO stuck after flapping OSD's
To: Robin Geuze
Cc: Ceph Development
X-Mailing-List: ceph-devel@vger.kernel.org

On Tue, Jun 29, 2021 at 12:07 PM Ilya Dryomov wrote:
>
> On Tue, Jun 29, 2021 at 10:39 AM Robin Geuze wrote:
> >
> > Hey Ilya,
> >
> > Do you have any idea on the cause of this bug yet? I tried to dig around a bit myself in the source, but the logic around this locking is very complex, so I couldn't figure out where the problem is.
>
> I do.  The proper fix would indeed be large and not backportable but
> I have a workaround in mind that should be simple enough to backport
> all the way to 5.4.  The trick is making sure that the workaround is
> fine from the exclusive lock protocol POV.
>
> I'll try to flesh it out by the end of this week and report back
> early next week.

Hi Robin,

I CCed you on the patches.  They should apply to 5.4 cleanly.  You
mentioned you have a build farm set up, please take them for a spin.

Thanks,

                Ilya