References: <4eca86cf-65c3-5aba-d0fd-466d779614e6@toxicpanda.com> <20191211155553.GP3929@twin.jikos.cz> <20191211155931.GQ3929@twin.jikos.cz>
In-Reply-To: <20191211155931.GQ3929@twin.jikos.cz>
From: Chris Murphy
Date: Mon, 23 Dec 2019 19:24:06 -0700
Subject: Re: 5.5.0-0.rc1 hang, could be zstd compression related
To: David Sterba, Josef Bacik, Chris Murphy, Btrfs BTRFS, Linux FS Devel, David Sterba
X-Mailing-List: linux-btrfs@vger.kernel.org

Applied that single line on top of 5.5.0-rc3

fs/btrfs/compression.c:449:17: error: implicit declaration of function
‘bio_set_bev’; did you mean ‘bio_set_dev’?
[-Werror=implicit-function-declaration]

If I use bio_set_dev

...
  CC [M]  fs/btrfs/compression.o
fs/btrfs/compression.o: warning: objtool:
end_compressed_bio_read.cold()+0x11: unreachable instruction
  LD [M]  fs/btrfs/btrfs.o
  GEN     .version
...

Despite that, it seems to work, and no crash with the reproducer.

On Wed, Dec 11, 2019 at 8:59 AM David Sterba wrote:
>
> On Wed, Dec 11, 2019 at 04:55:53PM +0100, David Sterba wrote:
> > On Wed, Dec 11, 2019 at 09:58:45AM -0500, Josef Bacik wrote:
> > > On 12/10/19 11:00 PM, Chris Murphy wrote:
> > > > Could continue to chat in one application, the desktop environment was
> > > > responsive, but no shells worked and I couldn't get to a tty and I
> > > > couldn't ssh into remotely. Looks like the journal has everything up
> > > > until I pressed and held down the power button.
> > > >
> > > >
> > > > /dev/nvme0n1p7 on / type btrfs
> > > > (rw,noatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=274,subvol=/root)
> > > >
> > > > dmesg pretty
> > > > https://pastebin.com/pvG3ERnd
> > > >
> > > > dmesg (likely MUA stomped)
> > > > [10224.184137] flap.local kernel: perf: interrupt took too long (2522
> > > >> 2500), lowering kernel.perf_event_max_sample_rate to 79000
> > > > [14712.698184] flap.local kernel: perf: interrupt took too long (3153
> > > >> 3152), lowering kernel.perf_event_max_sample_rate to 63000
> > > > [17903.211976] flap.local kernel: Lockdown: systemd-logind:
> > > > hibernation is restricted; see man kernel_lockdown.7
> > > > [22877.667177] flap.local kernel: BUG: kernel NULL pointer
> > > > dereference, address: 00000000000006c8
> > > > [22877.667182] flap.local kernel: #PF: supervisor read access in kernel mode
> > > > [22877.667184] flap.local kernel: #PF: error_code(0x0000) - not-present page
> > > > [22877.667187] flap.local kernel: PGD 0 P4D 0
> > > > [22877.667191] flap.local kernel: Oops: 0000 [#1] SMP PTI
> > > > [22877.667194] flap.local kernel: CPU: 2 PID: 14747 Comm: kworker/u8:7
> > > > Not tainted 5.5.0-0.rc1.git0.1.fc32.x86_64+debug #1
> > > > [22877.667196] flap.local kernel: Hardware name: HP HP Spectre
> > > > Notebook/81A0, BIOS F.43 04/16/2019
> > > > [22877.667226] flap.local kernel: Workqueue: btrfs-delalloc
> > > > btrfs_work_helper [btrfs]
> > > > [22877.667233] flap.local kernel: RIP:
> > > > 0010:bio_associate_blkg_from_css+0x1c/0x3b0
> > > >
> > > This looks like the extent_map bdev cleanup thing that was supposed to be fixed,
> > > did you send the patch without the fix for it Dave? Thanks,
> >
> > The fix for NULL bdev was added in 429aebc0a9a063667dba21 (and tested
> > with cgroups v2) and it's in a different function than the one that
> > appears on the stacktrace.
> >
> > This seems to be another instance where the bdev is needed right after
> > the bio is created but way earlier than it's actually known for real,
> > yet still needed for the blkcg thing.
> >
> >  443         bio = btrfs_bio_alloc(first_byte);
> >  444         bio->bi_opf = REQ_OP_WRITE | write_flags;
> >  445         bio->bi_private = cb;
> >  446         bio->bi_end_io = end_compressed_bio_write;
> >  447
> >  448         if (blkcg_css) {
> >  449                 bio->bi_opf |= REQ_CGROUP_PUNT;
> >  450                 bio_associate_blkg_from_css(bio, blkcg_css);
> >  451         }
> >
> > Strange that it takes so long to reproduce, meaning the 'if' branch is
> > not taken often.
>
> Compile tested only:
>
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -446,6 +446,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
>         bio->bi_end_io = end_compressed_bio_write;
>
>         if (blkcg_css) {
> +               bio_set_bev(bio, fs_info->fs_devices->latest_bdev);
>                 bio->bi_opf |= REQ_CGROUP_PUNT;
>                 bio_associate_blkg_from_css(bio, blkcg_css);
>         }
>

-- 
Chris Murphy
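
Aside, for anyone following the thread: a minimal userspace sketch of why
the ordering in the quoted patch seems to matter. It assumes the cgroup
association step needs the bio's device, which is what the NULL dereference
in bio_associate_blkg_from_css suggests. Every mock_* name below is a
hypothetical stand-in invented for illustration, not the kernel's struct bio
or block-layer API, and the mock returns -1 where the real code would oops.

```c
/* Userspace sketch only: hypothetical stand-ins, not kernel code. */
#include <assert.h>
#include <stddef.h>

struct mock_queue { int id; };
struct mock_disk  { struct mock_queue *queue; };

struct mock_bio {
    struct mock_disk  *disk;    /* NULL until a device is assigned */
    struct mock_queue *blkg_q;  /* queue the bio ended up associated with */
};

/* Stand-in for bio_set_dev(): attach the device to the bio. */
void mock_bio_set_dev(struct mock_bio *bio, struct mock_disk *disk)
{
    bio->disk = disk;
}

/*
 * Stand-in for the cgroup association step. It needs the device's queue;
 * the mock reports -1 where real code would dereference a NULL pointer.
 */
int mock_associate_blkg(struct mock_bio *bio)
{
    if (bio->disk == NULL)
        return -1;
    bio->blkg_q = bio->disk->queue;
    return 0;
}

/* Build a bio, optionally assign the device first, then associate. */
int mock_submit(struct mock_disk *disk, int set_dev_first)
{
    struct mock_bio bio = { NULL, NULL };

    if (set_dev_first)
        mock_bio_set_dev(&bio, disk);
    return mock_associate_blkg(&bio);
}
```

Calling mock_submit(&disk, 0) models the unpatched order and fails, while
mock_submit(&disk, 1) models the patched order (with bio_set_dev spelled
correctly) and succeeds.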