From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from userp2120.oracle.com ([156.151.31.85]:42894 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725779AbeISF3H (ORCPT ); Wed, 19 Sep 2018 01:29:07 -0400
Subject: Re: btrfs panic problem
To: Duncan <1i5t5.duncan@cox.net>, linux-btrfs@vger.kernel.org
References: <2cce0d8b-0958-9fb9-bb88-09fbfbf94c9e@oracle.com>
From: "sunny.s.zhang"
Message-ID: <8f6641aa-fc2e-a7b2-4dee-d69706ed8801@oracle.com>
Date: Wed, 19 Sep 2018 07:53:35 +0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi Duncan,

Thank you for your advice. I understand what you mean. But I have
reviewed the latest btrfs code, and I think the issue still exists.

Suppose btrfs_get_delayed_node executes line 71 and the CPU then
switches to another process, which runs through line 1282 and releases
the delayed node at the end. When execution switches back to
btrfs_get_delayed_node, it finds that node is not NULL and uses it as
normal. That means we are using freed memory, and at some point this
memory will be freed again.

The latest code is below.
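
To make the ordering concrete, here is a minimal user-space sketch of the interleaving I describe above. It is only an illustration, not kernel code: struct node, struct inode, release_node, replay_race and replay_no_race are hypothetical names, the refcount is a plain int, and a "freed" flag stands in for the real free so the replay can inspect the node safely. It runs the steps one after another in a single thread, and it assumes the pointer cached in the inode holds the only remaining reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct node {
        int refs;       /* plain int here; the kernel uses refcount_t */
        bool freed;     /* set instead of calling free(), so it can be inspected */
};

struct inode {
        struct node *delayed_node;
};

/* Drop one reference; mark the node freed when the count reaches zero. */
static void release_node(struct node *n)
{
        assert(n->refs > 0);
        if (--n->refs == 0)
                n->freed = true;
}

/* Same shape as btrfs_remove_delayed_node(): clear the cached pointer,
 * then drop the reference it held. */
static void remove_delayed_node(struct inode *ino)
{
        struct node *n = ino->delayed_node;

        if (!n)
                return;
        ino->delayed_node = NULL;
        release_node(n);
}

/* Replays the bad ordering: the getter loads the pointer, the remover runs
 * to completion, then the getter takes its reference.
 * Returns true if the getter incremented a node already marked freed. */
bool replay_race(void)
{
        struct node n = { .refs = 1, .freed = false };
        struct inode ino = { .delayed_node = &n };
        struct node *seen;
        bool stale;

        seen = ino.delayed_node;        /* getter: line 71 */
        remove_delayed_node(&ino);      /* remover: lines 1282-1287 */
        if (!seen)
                return false;
        stale = seen->freed;            /* getter: line 72 passes on a stale pointer */
        seen->refs++;                   /* getter: line 73 */
        return stale;
}

/* Same steps without the switch in between: the getter finishes first,
 * so the remover only drops the count from 2 to 1. */
bool replay_no_race(void)
{
        struct node n = { .refs = 1, .freed = false };
        struct inode ino = { .delayed_node = &n };
        struct node *seen;

        seen = ino.delayed_node;
        if (seen)
                seen->refs++;
        remove_delayed_node(&ino);
        return n.freed;
}
```

In this sketch replay_race() returns true, because the getter increments a node that the remover has already marked freed, and replay_no_race() returns false. The kernel functions themselves follow.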
1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
1279 {
1280         struct btrfs_delayed_node *delayed_node;
1281
1282         delayed_node = READ_ONCE(inode->delayed_node);
1283         if (!delayed_node)
1284                 return;
1285
1286         inode->delayed_node = NULL;
1287         btrfs_release_delayed_node(delayed_node);
1288 }

  64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
  65                 struct btrfs_inode *btrfs_inode)
  66 {
  67         struct btrfs_root *root = btrfs_inode->root;
  68         u64 ino = btrfs_ino(btrfs_inode);
  69         struct btrfs_delayed_node *node;
  70
  71         node = READ_ONCE(btrfs_inode->delayed_node);
  72         if (node) {
  73                 refcount_inc(&node->refs);
  74                 return node;
  75         }
  76
  77         spin_lock(&root->inode_lock);
  78         node = radix_tree_lookup(&root->delayed_nodes_tree, ino);

On 2018-09-18 13:05, Duncan wrote:
> sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:
>
>> My OS(4.1.12) panic in kmem_cache_alloc, which is called by
>> btrfs_get_or_create_delayed_node.
>>
>> I found that the freelist of the slub is wrong.
> [Not a dev, just a btrfs list regular and user, myself. But here's a
> general btrfs list recommendations reply...]
>
> You appear to mean kernel 4.1.12 -- confirmed by the version reported in
> the posted dump: 4.1.12-112.14.13.el6uek.x86_64
>
> OK, so from the perspective of this forward-development-focused list,
> kernel 4.1 is pretty ancient history, but you do have a number of options.
>
> First let's consider the general situation. Most people choose an
> enterprise distro for supported stability, and that's certainly a valid
> thing to want.
> However, btrfs, while now reaching early maturity for the basics
> (single device in single or dup mode, and multi-device in
> single/raid0/1/10 modes, note that raid56 mode is newer and less mature),
> remains under quite heavy development, and keeping reasonably current is
> recommended for that reason.
>
> So you chose an enterprise distro presumably to lock in supported
> stability for several years, but you chose a filesystem, btrfs, that's
> still under heavy development, with reasonably current kernels and
> userspace recommended as tending to have the known bugs fixed. There's a
> bit of a conflict there, and the /general/ recommendation would thus be
> to consider whether one or the other of those choices is inappropriate
> for your use-case, because it's really quite likely that if you really
> want the stability of an enterprise distro and kernel, that btrfs isn't
> as stable a filesystem as you're likely to want to match with it.
> Alternatively, if you want something newer to match the still under heavy
> development btrfs, you very likely want a distro that's not focused on
> years-old stability just for the sake of it. One or the other is likely
> to be a poor match for your needs, and choosing something else that's a
> better match is likely to be a much better experience for you.
>
> But perhaps you do have reason to want to run the newer and not quite to
> traditional enterprise-distro level stability btrfs, on an otherwise
> older and very stable enterprise distro. That's fine, provided you know
> what you're getting yourself into, and are prepared to deal with it.
>
> In that case, for best support from the list, we'd recommend running one
> of the latest two kernels in either the current or mainline LTS tracks.
>
> For current track, with 4.18 being the latest kernel, that'd be 4.18 or
> 4.17, as available on kernel.org (tho 4.17 is already EOL, no further
> releases, at 4.17.19).
>
> For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series
> kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18
> and it's just not out of normal stable range yet so not yet marked LTS?),
> so it'll be coming up soon and 4.9 will then be dropping to third LTS
> series and thus out of our best recommended range. 4.4 was the previous
> LTS and while still in LTS support, is outside the two newest LTS series
> that this list recommends.
>
> And of course 4.1 is older than 4.4, so as I said, in btrfs development
> terms, it's quite ancient indeed... quite out of practical support range
> here, tho of course we'll still try, but in many cases the first question
> when any problem's reported is going to be whether it's reproducible on
> something closer to current.
>
> But... you ARE on an enterprise kernel, likely on an enterprise distro,
> and very possibly actually paying /them/ for support. So you're not
> without options if you prefer to stay with your supported enterprise
> kernel. If you're paying them for support, you might as well use it, and
> of course of the very many fixes since 4.1, they know what they've
> backported and what they haven't, so they're far better placed to provide
> that support in any case.
>
> Or, given what you posted, you appear to be reasonably able to do at
> least limited kernel-dev-level analysis yourself. Given that, you're
> already reasonably well placed to simply decide to stick with what you
> have and take the support you can get, diving into things yourself if
> necessary.
>
>
> So those are your kernel options. What about userspace btrfs-progs?
>
> Generally speaking, while the filesystem's running, it's the kernel code
> doing most of the work. If you have old userspace, it simply means you
> can't take advantage of some of the newer features as the old userspace
> doesn't know how to call for them.
>
> But the situation changes as soon as you have problems and can't mount,
> because it's userspace code that runs to try to fix that sort of problem,
> or failing that, it's userspace code that btrfs restore runs to try to
> grab what files can be grabbed off of the unmountable filesystem.
>
> So for routine operation, it's no big deal if userspace is a bit old, at
> least as long as it's new enough to have all the newer command formats,
> etc, that you need, and for comparing against others when posted. But
> once things go bad on you, you really want the newest btrfs-progs in
> order to give you the best chance at either fixing things, or
> worst-case, at least retrieving the files off the dead filesystem. So
> using the older distro btrfs-progs for routine running should be fine, but
> unless your backups are complete and frequent enough that if something
> goes wrong it's easiest to simply blow the bad version away with a fresh
> mkfs and start over, you'll probably want at least a reasonably current
> btrfs-progs on your rescue media. Since the userspace version
> numbers are synced to the kernel cycle, a good rule of thumb is keep your
> btrfs-progs version to at least that of the oldest recommended LTS kernel
> version, as well, so you'd want at least btrfs-progs 4.9 on your rescue
> media, for now, and 4.14, coming up, since when the new kernel goes LTS
> that'll displace 4.9 and 4.14 will then be the second-back LTS.
>