From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:48441 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751816AbdGGXSD (ORCPT ); Fri, 7 Jul 2017 19:18:03 -0400
From: Nick Terrell
To: Adam Borowski
CC: Kernel Team, Chris Mason, Yann Collet, David Sterba, "linux-btrfs@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH v2 3/4] btrfs: Add zstd support
Date: Fri, 7 Jul 2017 23:17:49 +0000
Message-ID:
References: <20170629194108.1674498-1-terrelln@fb.com> <20170629194108.1674498-4-terrelln@fb.com> <20170706163225.xbluc2gi2nlaafzo@angband.pl>
In-Reply-To: <20170706163225.xbluc2gi2nlaafzo@angband.pl>
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 7/6/17, 9:32 AM, "Adam Borowski" wrote:
> On Thu, Jun 29, 2017 at 12:41:07PM -0700, Nick Terrell wrote:
>> Add zstd compression and decompression support to BtrFS. zstd at its
>> fastest level compresses almost as well as zlib, while offering much
>> faster compression and decompression, approaching lzo speeds.
>
> Got a reproducible crash on amd64:
>
> [98235.266511] BUG: unable to handle kernel paging request at ffffc90001251000
> [98235.267485] IP: ZSTD_storeSeq.constprop.24+0x67/0xe0
> [98235.269395] PGD 227034067
> [98235.269397] P4D 227034067
> [98235.271587] PUD 227035067
> [98235.273657] PMD 223323067
> [98235.275744] PTE 0
>
> [98235.281545] Oops: 0002 [#1] SMP
> [98235.283353] Modules linked in: loop veth tun fuse arc4 rtl8xxxu mac80211 cfg80211 cp210x pl2303 rfkill usbserial nouveau video mxm_wmi ttm
> [98235.285203] CPU: 0 PID: 10850 Comm: kworker/u12:9 Not tainted 4.12.0+ #1
> [98235.287070] Hardware name: System manufacturer System Product Name/M4A77T, BIOS 2401 05/18/2011
> [98235.288964] Workqueue: btrfs-delalloc btrfs_delalloc_helper
> [98235.290934] task: ffff880224984140 task.stack: ffffc90007e5c000
> [98235.292731] RIP: 0010:ZSTD_storeSeq.constprop.24+0x67/0xe0
> [98235.294579] RSP: 0018:ffffc90007e5fa68 EFLAGS: 00010282
> [98235.296395] RAX: ffffc90001251001 RBX: 0000000000000094 RCX: ffffc9000118f930
> [98235.298380] RDX: 0000000000000006 RSI: ffffc900011b06b0 RDI: ffffc9000118d1e0
> [98235.300321] RBP: 000000000000009f R08: 1fffffffffffbe58 R09: 0000000000000000
> [98235.302282] R10: ffffc9000118f970 R11: 0000000000000005 R12: ffffc9000118f878
> [98235.304221] R13: 000000000000005b R14: ffffc9000118f915 R15: ffffc900011cfe88
> [98235.306147] FS: 0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
> [98235.308162] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [98235.310129] CR2: ffffc90001251000 CR3: 000000021018d000 CR4: 00000000000006f0
> [98235.312095] Call Trace:
> [98235.314008] ? ZSTD_compressBlock_fast+0x94b/0xb30
> [98235.315975] ? ZSTD_compressContinue_internal+0x1a0/0x580
> [98235.317938] ? ZSTD_compressStream_generic+0x248/0x2f0
> [98235.319877] ? ZSTD_compressStream+0x41/0x60
> [98235.321821] ? zstd_compress_pages+0x236/0x5d0
> [98235.323724] ? btrfs_compress_pages+0x5e/0x80
> [98235.325684] ? compress_file_range.constprop.79+0x1eb/0x750
> [98235.327668] ? async_cow_start+0x2e/0x50
> [98235.329594] ? btrfs_worker_helper+0x1b9/0x1d0
> [98235.331486] ? process_one_work+0x158/0x2f0
> [98235.333361] ? worker_thread+0x45/0x3a0
> [98235.335253] ? process_one_work+0x2f0/0x2f0
> [98235.337189] ? kthread+0x10e/0x130
> [98235.339020] ? kthread_park+0x60/0x60
> [98235.340819] ? ret_from_fork+0x22/0x30
> [98235.342637] Code: 8b 4e d0 4c 89 48 d0 4c 8b 4e d8 4c 89 48 d8 4c 8b 4e e0 4c 89 48 e0 4c 8b 4e e8 4c 89 48 e8 4c 8b 4e f0 4c 89 48 f0 4c 8b 4e f8 <4c> 89 48 f8 48 39 f1 75 a2 4e 8d 04 c0 48 8b 31 48 83 c0 08 48
> [98235.346773] RIP: ZSTD_storeSeq.constprop.24+0x67/0xe0 RSP: ffffc90007e5fa68
> [98235.348809] CR2: ffffc90001251000
> [98235.363216] ---[ end trace 5fb3ad0f2aec0605 ]---
> [98235.363218] BUG: unable to handle kernel paging request at ffffc9000393a000
> [98235.363239] IP: ZSTD_storeSeq.constprop.24+0x67/0xe0
> [98235.363241] PGD 227034067
> [98235.363242] P4D 227034067
> [98235.363243] PUD 227035067
> [98235.363244] PMD 21edec067
> [98235.363245] PTE 0
> (More of the above follows.)
>
> My reproducer copies an uncompressed tarball onto a fresh filesystem:
> .----
> #!/bin/sh
> set -e
>
> losetup -D; umount /mnt/vol1 ||:
> dd if=/dev/zero of=/tmp/disk bs=2048 seek=1048575 count=1
> mkfs.btrfs -msingle /tmp/disk
> losetup -f /tmp/disk
> sleep 1 # yay udev races
> mount -onoatime,compress=$1 /dev/loop0 /mnt/vol1
> time sh -c 'cp -p ~kilobyte/tmp/kernel.tar /mnt/vol1 && umount /mnt/vol1'
> losetup -D
> `----
> (run it with arg of "zstd")
>
> Kernel is 4.12.0 + btrfs-for-4.13 + v4 of Qu's chunk check + some unrelated
> stuff + zstd; in case it matters I've pushed my tree to
> https://github.com/kilobyte/linux/tree/zstd-crash
>
> The payload is a tarball of the above, but, for debugging compression you
> need the exact byte stream. https://angband.pl/tmp/kernel.tar.xz --
> without xz, I compressed it for transport.

Thanks for the bug report Adam!
I'm looking into the failure, and haven't been able to reproduce it yet.
I've built my kernel from your tree, and I ran your script with the
kernel.tar tarball 100 times, but haven't gotten a failure yet. I have a
few questions to guide my debugging.

- How many cores are you running with? I’ve run the script with 1, 2,
  and 4 cores.
- Which version of gcc are you using to compile the kernel? I’m using
  gcc-6.2.0-5ubuntu12.
- Are the failures always in exactly the same place, and does it fail
  100% of the time or just regularly?
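
One possible way to answer the last question from a saved kernel log is to count the distinct faulting locations. The sketch below is only an illustration and not something posted in this thread: the function name fault_sites and the log file name oops.log are hypothetical, and it assumes the log uses the same "IP: symbol+offset" line format as the oops quoted above.

```sh
#!/bin/sh
# Hypothetical helper (not from the original thread): summarize where the
# oopses recorded in a kernel log happened.

# fault_sites: read kernel log text on stdin and print each distinct
# "IP: <symbol+offset>" location, prefixed by how many times it occurs.
# Lines such as "RIP: 0010:..." are ignored because the pattern requires
# "] IP: " directly after the timestamp.
fault_sites() {
    sed -n 's/.*\] IP: \(.*\)$/\1/p' | sort | uniq -c | sed 's/^ *//'
}

# Example use, assuming the log was saved to a file named oops.log:
#   dmesg > oops.log
#   fault_sites < oops.log
```

A single output line would suggest the failures are always in the same place. For the first two questions, `nproc` reports the number of available CPUs and `cat /proc/version` typically includes the compiler version used to build the running kernel.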