All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhao Lei <zhaolei@cn.fujitsu.com>
To: <fdmanana@gmail.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: RE: [PATCH] btrfs: Fix no space bug caused by removing bg
Date: Tue, 22 Sep 2015 19:24:53 +0800	[thread overview]
Message-ID: <010401d0f529$4db27670$e9176350$@cn.fujitsu.com> (raw)
In-Reply-To: <CAL3q7H5=ZnEav6CHAGW+oUwNJAuoYcxcBmHQG9Ssa_gaMDTh7w@mail.gmail.com>

Hi, Filipe David Manana

> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org
> [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Filipe David Manana
> Sent: Tuesday, September 22, 2015 6:22 PM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH] btrfs: Fix no space bug caused by removing bg
> 
> On Tue, Sep 22, 2015 at 11:06 AM, Zhao Lei <zhaolei@cn.fujitsu.com> wrote:
> > Hi, Filipe David Manana
> >
> > Thanks for review this patch.
> >
> >> -----Original Message-----
> >> From: Filipe David Manana [mailto:fdmanana@gmail.com]
> >> Sent: Monday, September 21, 2015 9:27 PM
> >> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> >> Cc: linux-btrfs@vger.kernel.org
> >> Subject: Re: [PATCH] btrfs: Fix no space bug caused by removing bg
> >>
> >> On Mon, Sep 21, 2015 at 1:59 PM, Zhao Lei <zhaolei@cn.fujitsu.com>
> wrote:
> >> > btrfs in v4.3-rc1 failed many xfstests items with '-o nospace_cache'
> >> > mount option.
> >> >
> >> > Failed cases are:
> >> >
> >> > btrfs/008,016,019,020,026,027,028,029,031,041,046,048,050,051,053,0
> >> > 54,
> >> >  077,083,084,087,092,094
> >>
> >> Hi Zhao,
> >>
> >> So far I tried a few of those against Chris' integration-4.3 branch
> >> (same btrfs code as 4.3-rc1):
> >>
> >> MOUNT_OPTIONS="-o nospace_cache" ./check btrfs/008 btrfs/016
> >> btrfs/019
> >> btrfs/020
> >> FSTYP         -- btrfs
> >> PLATFORM      -- Linux/x86_64 debian3 4.2.0-rc5-btrfs-next-12+
> >> MKFS_OPTIONS  -- /dev/sdc
> >> MOUNT_OPTIONS -- -o nospace_cache /dev/sdc
> >> /home/fdmanana/btrfs-tests/scratch_1
> >>
> >> btrfs/008 2s ... 1s
> >> btrfs/016 4s ... 3s
> >> btrfs/019 4s ... 2s
> >> btrfs/020 2s ... 1s
> >> Ran: btrfs/008 btrfs/016 btrfs/019 btrfs/020 Passed all 4 tests
> >>
> >> And none of the tests failed...
> >>
> > Sorry I hadn't paste detail of my test command.
> >
> > It is from a coincidence operation which is some different with
> > standard steps(as yours), I mount fs with -o no_space_cache manually
> > without set MOUNT_OPT, then xfstests entered into a special path, and
> triggered the bug:
> >   export TEST_DEV='/dev/sdb5'
> >   export TEST_DIR='/var/ltf/tester/mnt'
> >   mkdir -p '/var/ltf/tester/mnt'
> >
> >   export SCRATCH_DEV_POOL='/dev/sdb6 /dev/sdb7 /dev/sdb8 /dev/sdb9
> /dev/sdb10 /dev/sdb11'
> >   export SCRATCH_MNT='/var/ltf/tester/scratch_mnt'
> >   mkdir -p '/var/ltf/tester/scratch_mnt'
> >
> >   export DIFF_LENGTH=0
> >
> >   mkfs.btrfs -f "$TEST_DEV"
> >   mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
> >
> >   ./check generic/014
> >
> > Result:
> >   FSTYP         -- btrfs
> >   PLATFORM      -- Linux/x86_64 lenovo
> 4.3.0-rc2_HEAD_1f93e4a96c9109378204c147b3eec0d0e8100fde_
> >   MKFS_OPTIONS  -- /dev/sdb6
> >   MOUNT_OPTIONS -- /dev/sdb6 /var/ltf/tester/scratch_mnt
> >
> >   generic/014 0s ... - output mismatch (see
> /var/lib/xfstests/results//generic/014.out.bad)
> >       --- tests/generic/014.out   2015-09-22 17:46:13.855391451 +0800
> >       +++ /var/lib/xfstests/results//generic/014.out.bad  2015-09-22
> 17:57:06.446095748 +0800
> >       @@ -3,4 +3,5 @@
> >        ------
> >        test 1
> >        ------
> >       -OK
> >       +truncfile returned 1 : "write: No space left on device
> >       +Seed = 1442915826 (use "-s 1442915826" to re-execute this test)"
> >   Ran: generic/014
> >   Failures: generic/014
> >   Failed 1 of 1 tests
> >
> > And following script is from trace result of above test.
> > Maybe I can remove the xfstest description because it is not standard steps.
> >
> >> >
> >> > generic/004,010,014,023,024,074,075,080,086,087,089,091,092,100,112
> >> > ,12
> >> > 3,
> >> > 124,125,126,127,131,133,192,193,198,207,208,209,213,214,215,228,239
> >> > ,24
> >> > 0,
> >> >  246,247,248,255,263,285,306,313,316,323
> >> >
> >> > We can reproduce this bug with following simple command:
> >> >  TEST_DEV=/dev/vdh
> >> >  TEST_DIR=/mnt/tmp
> >> >
> >> >  umount "$TEST_DEV" >/dev/null
> >> >  mkfs.btrfs -f "$TEST_DEV"
> >> >  mount "$TEST_DEV" "$TEST_DIR"
> >> >
> >> >  umount "$TEST_DEV"
> >> >  mount "$TEST_DEV" "$TEST_DIR"
> >> >
> >> >  cp /bin/bash $TEST_DIR
> >> >
> >> > Result is:
> >> >  (omit previous commands)
> >> >  # cp /bin/bash $TEST_DIR
> >> >  cp: writing `/mnt/tmp/bash': No space left on device
> >> >
> >> > By bisect, we can see it is triggered by patch titled:
> >> >  commit e44163e17796
> >> >  ("btrfs: explictly delete unused block groups in close_ctree and
> >> > ro-remount")
> >> >
> >> > But the wrong code is not in above patch, btrfs delete all chunks
> >> > if no data in filesystem, and above patch just make it obviously.
> >> >
> >> > Detail reason:
> >> >  1: mkfs a blank filesystem, or delete everything in filesystem
> >> >  2: umount fs
> >> >     (current code will delete all data chunks)
> >> >  3: mount fs
> >> >     Because no any data chunks, data's space_cache have no chance
> >> >     to init, it means: space_info->total_bytes == 0, and
> >> >     space_info->full == 1.
> >>
> >> Right, and that's the problem. When the space_info is initialized it
> >> should never be flagged as full, otherwise any buffered write
> >> attempts fail immediately with enospc instead of trying to allocate a
> >> data block group (at extent-tree.c:btrfs_check_data_free_space()).
> >>
> >> That was fixed recently by:
> >>
> >> https://patchwork.kernel.org/patch/7133451/
> >>
> >> (with a respective test too,
> >> https://patchwork.kernel.org/patch/7133471/)
> >>
> >
> > It can fix problem in mount, but can not fix problem of "raid-level
> > change", please see below.
> >
> >> >  4: do some write
> >> >     Current code will ignore chunk allocate because space_info->full,
> >> >     and return -ENOSPC.
> >> >
> >> > Fix:
> >> >  Don't auto-delete last blockgroup for a raid type.
> >> >  If we delete all blockgroup for a raidtype, it not only cause
> >> > above bug,  but also may change filesystem to all-single in some case.
> >>
> >> I don't get this. Can you mention in which cases that happens and why
> >> (in the commit message)?
> >>
> >> It isn't clear when reading the patch why we need to keep at least
> >> one block of each type/profile, and seems to be a workaround for a different
> problem.
> >>
> > Simply speaking, if we run following command after apply your patch:
> >
> >   TEST_DEV=(/dev/vdg /dev/vdh)
> >   TEST_DIR=/mnt/tmp
> >
> >   umount "$TEST_DEV" >/dev/null
> >   mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}"
> >   mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
> >
> >   umount "$TEST_DEV"
> >   mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
> >
> >   btrfs filesystem usage $TEST_DIR
> >
> > The result is:
> >   # btrfs filesystem usage $TEST_DIR
> >   (omit)
> >   Data,single: Size:8.00MiB, Used:0.00B
> >      /dev/vdg        8.00MiB
> >   ...
> >
> > We can see data chunk is changed from raid1 to single, because if we
> > delete all data chunks before mount, there are raid-type information
> > in filesystem, and btrfs will use raid-type of "0x0" for new data
> > chunk after your patch.
> >
> > So, leave at least one data chunk is a simple workaround for above two
> > bug.
> 
> Well it's a hack that besides not being very logical it's also incomplete and
> unreliable: preventing the cleaner kthread (automatic removal of unused block
> groups) from deleting the last block group of a kind doesn't prevent the block
> group from being deleted by a balance, just as the new regression xfstest
> does...
> 

What this patch fixed is regression caused by patch
of "auto-remove-bg-at-umount".

And the balance problem is another bug, which should be fixed by
another patch.

> In other words, the problem existed before we had automatic removal of
> unused block groups, but just less likely for someone to run into it.
> 
> So please lets have a real fix for this problem. Seems like we're losing raid
> profile type somewhere when allocating a new block group.
> 
I think way of "remain last bg" is not a hack way, just like we create
one blank data chunk in mkfs.

In detail, maybe "writing raidtype in disk" is more direct way,
but it will change our disk data format(as superblock),
and need change lots of code as chunk-allocate, balance, mkfs, check, or more.
plus of compatibility with old filesystem.

So the way in this patch maybe the most suit for above bug.

Thanks
Zhaolei

> thanks
> 
> 
> >
> > Thanks
> > Zhaolei
> >
> >> thanks
> >>
> >> >
> >> > Test:
> >> >  Test by above script, and confirmed the logic by debug output.
> >> >
> >> > Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> >> > ---
> >> >  fs/btrfs/extent-tree.c | 3 ++-
> >> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index
> >> > 5411f0a..35cf7eb 100644
> >> > --- a/fs/btrfs/extent-tree.c
> >> > +++ b/fs/btrfs/extent-tree.c
> >> > @@ -10012,7 +10012,8 @@ void btrfs_delete_unused_bgs(struct
> >> btrfs_fs_info *fs_info)
> >> >                                                bg_list);
> >> >                 space_info = block_group->space_info;
> >> >                 list_del_init(&block_group->bg_list);
> >> > -               if (ret || btrfs_mixed_space_info(space_info)) {
> >> > +               if (ret || btrfs_mixed_space_info(space_info) ||
> >> > +                   block_group->list.next ==
> >> > + block_group->list.prev) {
> >> >                         btrfs_put_block_group(block_group);
> >> >                         continue;
> >> >                 }
> >> > --
> >> > 1.8.5.1
> >> >
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
> >> > in the body of a message to majordomo@vger.kernel.org More
> >> > majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >>
> >>
> >> --
> >> Filipe David Manana,
> >>
> >> "Reasonable men adapt themselves to the world.
> >>  Unreasonable men adapt the world to themselves.
> >>  That's why all progress depends on unreasonable men."
> >
> 
> 
> 
> --
> Filipe David Manana,
> 
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body
> of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html


  reply	other threads:[~2015-09-22 11:25 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-21 12:59 [PATCH] btrfs: Fix no space bug caused by removing bg Zhao Lei
2015-09-21 13:27 ` Filipe David Manana
2015-09-21 13:37   ` Filipe David Manana
2015-09-22 10:06   ` Zhao Lei
2015-09-22 10:22     ` Filipe David Manana
2015-09-22 11:24       ` Zhao Lei [this message]
2015-09-22 12:45         ` Filipe David Manana
2015-09-23  1:59           ` Zhao Lei
2015-09-22 10:22     ` Zhao Lei
2015-09-22 12:59 ` Jeff Mahoney
2015-09-22 13:28   ` Hugo Mills
2015-09-22 13:36   ` Holger Hoffstätte
2015-09-22 13:41     ` Hugo Mills
2015-09-22 14:23       ` David Sterba
2015-09-22 14:36         ` Hugo Mills
2015-09-22 14:54           ` Austin S Hemmelgarn
2015-09-22 15:39             ` Hugo Mills
2015-09-22 17:32               ` Austin S Hemmelgarn
2015-09-22 17:37                 ` Austin S Hemmelgarn
2015-09-23  4:49                 ` Duncan
2015-09-23 13:28               ` David Sterba
2015-09-23 13:57                 ` Austin S Hemmelgarn
2015-09-23 14:05                 ` Hugo Mills
2015-09-23 13:12           ` David Sterba
2015-09-23 13:19             ` Qu Wenruo
2015-09-23 13:32               ` Austin S Hemmelgarn
2015-09-23 14:00                 ` Qu Wenruo
2015-09-23 17:28                   ` David Sterba
2015-09-23 13:37               ` David Sterba
2015-09-23 13:45               ` Hugo Mills
2015-09-23 13:28             ` Hugo Mills
2015-09-22 16:23     ` Jeff Mahoney
2015-09-23  2:14   ` Zhao Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='010401d0f529$4db27670$e9176350$@cn.fujitsu.com' \
    --to=zhaolei@cn.fujitsu.com \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.