From: Pete <pete@petezilla.co.uk>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: ENOSPC on file system with nearly empty 12TB drive
Date: Sun, 16 Jan 2022 12:27:54 +0000
Message-ID: <706262eb-ee79-5dc7-4aae-132e750b2625@petezilla.co.uk>
In-Reply-To: <CAJCQCtS8DDGXOjZdzvkaMSbVyuG-x1Z5o_fO_u8rOGtE2zKSfA@mail.gmail.com>

On 1/15/22 20:38, Chris Murphy wrote:
> On Fri, Jan 14, 2022 at 4:09 PM Pete <pete@petezilla.co.uk> wrote:
> 
>> Any suggestions?  Just be patient, and hope the balance finishes without
>> ENOSPC?  Go for the remove again.  I'd like to remove a 4TB drive if I
>> can without adding a 6th HD to the system.  Still don't understand why I
>> might need more than one practically empty drive for raid1?
> 
> When bg profile is raid1, any time the file system wants to add
> another block group, it must create a chunk on 2 devices at the same
> time for it to succeed or else you get ENOSPC. The question is why two
> chunks can't be created given all the unallocated space you have even
> without the empty drive.

So although in % terms the drives were very full, the > 30GB of free 
space on each ought to have been more than sufficient?

A little care is needed, as the ENOSPC did not occur at exactly the 
time I ran btrfs fi show and btrfs fi usage, so perhaps my post is not 
the most rock-solid evidence if there is an issue.  However, I had 
multiple instances of ENOSPC at the time, even after it had spent a 
few hours balancing.
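
For reference, the commands I used to inspect the allocation state 
were roughly these (mount point hypothetical):

  # per-device totals and unallocated space
  btrfs filesystem show /mnt/pool
  # overall usage; -T gives a per-device table
  btrfs filesystem usage -T /mnt/pool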


> 
> Ordinarily you want to avoid doing metadata balance. Balancing data is
> ok, it amounts to defragmenting free space, and returning excess into
> unallocated space which can then be used to create any type of block
> group.
> 

OK, but various resources show metadata balance being used when trying 
to sort out an ENOSPC issue, e.g.
https://www.suse.com/support/kb/doc/?id=000019789
But there are others that I don't seem to be able to find again.  I 
should have noted the URLs when I was trying to sort the issue...
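
The sort of invocation those guides suggest is along these lines (the 
usage threshold is illustrative, and the mount point hypothetical):

  # compact metadata block groups that are at most 10% used,
  # returning the reclaimed space to the unallocated pool
  btrfs balance start -musage=10 /mnt/pool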


> 
> I don't think the large image file is the problem.
> 
> In my opinion, you've hit a bug. There's plenty of unallocated space
> on multiple drives. I think what's going on is an old bug that might
> not be fixed yet where metadata overcommit is allowed to overestimate
> due to the single large empty drive with a ton of space on it. But
> since the overcommit can't be fulfilled on at least two drives, you
> get ENOSPC even though it's not actually trying to create that many
> block groups at once.
> 

My naive but logical expectation was that balance would start 
aggressively moving data to the empty drive.  Most of the guidance I 
see for recovering from an ENOSPC indicates that adding a single 
device would be sufficient.  I do note, however, that if you scroll 
down and read it all, some sites do point this out:

https://wiki.tnonline.net/w/Btrfs/ENOSPC

I also note that the various guides / howtos online are not 
necessarily within the control of people who are on this mailing list 
- I'm not implying that the active maintainers are or should be 
responsible for everything written about btrfs online.

My strong recollection from reading this mailing list, and from 
searching online, was that adding only one device was mentioned for 
managing an ENOSPC issue.  However, from a limited amount of searching 
I'm not sure that I can back this up with references.  Perhaps that is 
just my perception?

Adding the new drive plus a loop device allowed me to progressively 
rebalance using dusage and musage filters until I could start a full 
rebalance without the loop device.
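
Roughly what that looked like, for anyone hitting the same thing 
(paths, sizes and device names are illustrative):

  # back a temporary btrfs device with a sparse file
  truncate -s 32G /var/tmp/btrfs-spill.img
  losetup --find --show /var/tmp/btrfs-spill.img  # prints e.g. /dev/loop0
  btrfs device add /dev/loop0 /mnt/pool

  # reclaim nearly-empty block groups first, raising the threshold
  # on each pass as unallocated space comes back
  btrfs balance start -dusage=5 -musage=5 /mnt/pool
  btrfs balance start -dusage=25 -musage=25 /mnt/pool

  # once there is breathing room, migrate off and drop the loop device
  btrfs device remove /dev/loop0 /mnt/pool
  losetup -d /dev/loop0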

Interestingly, though this array had been running as raid1 for a while 
with four drives, without rebalancing, metadata was only stored on two 
of the drives.
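
That split is visible in the per-device breakdown, e.g. (mount point 
hypothetical):

  # lists Data, Metadata and System chunk allocation per device
  btrfs device usage /mnt/pool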


> So what I suggest doing is removing the mostly empty device, and only
> add devices in pairs when using raid1.

Should there be a caveat added to this: "For very full btrfs raid1 
file systems, only add devices in at least pairs"?  Being able to add 
devices in a fairly ad hoc manner is a great strength of btrfs.
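
i.e. on a nearly full raid1 array, something like (device names 
hypothetical):

  # add two devices in one command so every new raid1 chunk
  # has a pair of devices with unallocated space
  btrfs device add /dev/sdx /dev/sdy /mnt/pool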

It seems that I am past the point of removing the larger device now, 
as I have > 500 GB free on the smaller drives, have removed the loop 
device, and am no longer hitting ENOSPC.  I would probably have to 
start deleting some of my backups to remove the new larger device, 
just so that the original 4-device array was not so cripplingly full.

Thank you for your help.


