From: Chris Murphy
Date: Mon, 29 Jul 2019 13:20:32 -0600
Subject: Re: Massive filesystem corruption since kernel 5.2 (ARCH)
To: Swâmi Petaramesh
Cc: Qu Wenruo, Btrfs BTRFS
X-Mailing-List: linux-btrfs@vger.kernel.org

On Mon, Jul 29, 2019 at 9:05 AM Swâmi Petaramesh wrote:
>
> On 7/29/19 4:55 PM, Swâmi Petaramesh wrote:
> > Well All the errors I detailed today happen on the SAME FS, and this fs
> > is a BTRFS that was created on a new HD
> > with a recent kernel (surely >= 4.19) only a few months ago.
> >
> > And the errors I have on this one, as far as I can tell, look exactly
> > like what happened on the same machine's SSD as soon as I installed a
> > 5.2 kernel...
>
> Plus I just decided to “btrfs check” the SSD FS from my machine (not yet
> showing errors), which I completely reformatted using 5.2 3 days ago
> (after having fully tested the SSD error-free itself)...
>
> And btrfs check tells me that this FS is now completely corrupt as well
> :-(((
>
> The list of files in error has been scrolling for five minutes now :-(((

Without both dmesg and btrfs check output it's not very useful. I've
got a case where a file system scrubs fine and btrfs check complains,
but it turns out that's because of nocow files that are compressed via
the defrag path. The files are fine, there is no corruption, it's just
noise. But the only way to know that is to always include the full
dmesg and check output - I personally think snippets and trimmed logs
are annoying. In this case we don't actually have anything to go on,
therefore the problem could be anything, therefore we need all the
information available.

It seems unlikely to be drive related, as so many drives are involved.
Same for logic board or RAM. I think these days, with ext4 and XFS
using checksumming for metadata, if it were device-mapper or blk-mq
related, they too would see errors. And yet of course many people are
using kernel 5.2 with Btrfs and aren't having problems. So, it's just
inherently tedious work to narrow down what's causing the problem.

I think it's totally reasonable to go back to 5.1 for a while and make
certain the problems aren't happening there. If they are, then I start
to wonder about noisy power since you have so many different drives
and setups affected. Some of the strangest problems I have ever seen
in computing were directly attributed to noise on the power line.

-- 
Chris Murphy