* Btrfs Raid10 eating all Ram on Mount
From: Robert Krig @ 2023-03-15  7:26 UTC
  To: linux-btrfs

Hi,


I've got a bit of a strange situation here.  I've got a server with 
4x16TB Drives in a RAID10 for data and a Raid1C4 for metadata 
configuration.
I'm currently retiring that server so I've been transferring and 
deleting snapshots from it.

For some reason, this server (Debian with kernel 6.2.1) suddenly starts 
eating all of my ram (64GB). Even if completely idle. I see that there 
is a btrfs-transaction process and a btrfs-cleaner process that are 
running and using quite a bit of cpu.

Basically, even after a fresh reboot. Once I mount the array, the memory 
usage will slowly start to creep up, until it reaches OOM and the system 
freezes.

I'm currently running a read-only check on the system and as far as I 
recall, I've never enabled Quotas on that system.

Does anyone have any idea what's causing this, or how I can fix it?



* Re: Btrfs Raid10 eating all Ram on Mount
From: Goffredo Baroncelli @ 2023-03-15 18:48 UTC
  To: Robert Krig, linux-btrfs

On 15/03/2023 08.26, Robert Krig wrote:
> Hi,
> 
> 
> I've got a bit of a strange situation here.  I've got a server with 4x16TB Drives in a RAID10 for data and a Raid1C4 for metadata configuration.
> I'm currently retiring that server so I've been transferring and deleting snapshots from it.

Deleting a snapshot requires a background process to release all the resources allocated to it on the filesystem.

> 
> For some reason, this server (Debian with kernel 6.2.1) suddenly starts eating all of my ram (64GB). Even if completely idle. I see that there is a btrfs-transaction process and a btrfs-cleaner process that are running and using quite a bit of cpu.
> 
> Basically, even after a fresh reboot. Once I mount the array, the memory usage will slowly start to creep up, until it reaches OOM and the system freezes.

Could you share some numbers about the filesystem, such as the number of snapshots deleted, the number of files in each snapshot, and the kind of workload on the filesystem? That would help to understand whether 'btrfs-cleaner' is busy unlinking the shared references between the files or not.

Unfortunately, btrfs-cleaner restarts at the next mount, even if it was interrupted by an unmount.

Hoping that you have hit a bug in the new 6.2.x series, maybe downgrading the kernel could help. But before doing that, wait for comments from other developers...
> 
> I'm currently running a read-only check on the system and as far as I recall, I've never enabled Quotas on that system.
> 
> Does anyone have any idea what's causing this, or how I can fix it?
> 

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5



* Re: Btrfs Raid10 eating all Ram on Mount
From: Robert Krig @ 2023-03-16  7:02 UTC
  To: linux-btrfs

There were quite a few snapshots that I deleted on that system. But 
those snapshots were probably heavily de-duplicated since I was using 
the beesd tool to deduplicate the filesystem while in use.
At the moment, I just want to copy off some data from that filesystem, 
since that server is going to be cancelled.

Could I just mount the FS read-only? Would that prevent btrfs-cleaner
from running and eating all my RAM?





* Re: Btrfs Raid10 eating all Ram on Mount
From: Robert Krig @ 2023-03-16  7:57 UTC
  To: linux-btrfs

Update:

Ok, mounting as read-only seems to have done the trick (for now). At 
least it looks like I'm able to btrfs send snapshots to my new server 
without the RAM increasing.
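
For anyone following along, the read-only workaround could be sketched roughly as below. This is only an illustration: the function name `rescue_commands` and all of its arguments (device, mount point, snapshot path, destination host and directory) are placeholders that do not come from this thread, and the function only prints the commands rather than running them.

```shell
# Hypothetical helper (not from the thread): print the commands for copying
# one snapshot off a filesystem that is mounted read-only.
# Usage: rescue_commands <device> <mountpoint> <snapshot> <host> <dest_dir>
rescue_commands() {
    dev=$1; mnt=$2; snap=$3; host=$4; dest=$5
    # In this thread, a read-only mount seemed to avoid the memory growth
    printf 'mount -o ro %s %s\n' "$dev" "$mnt"
    # btrfs send only accepts snapshots that are already read-only
    printf 'btrfs send %s/%s | ssh %s btrfs receive %s\n' \
        "$mnt" "$snap" "$host" "$dest"
}
```

Calling `rescue_commands /dev/sdX /mnt snaps/a newhost /backup` prints the two commands so they can be reviewed before being run by hand as root.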

How can I avoid this sort of thing in the future? By not using
deduplication tools on snapshots? By only deleting one snapshot at a time
and waiting until I no longer see a btrfs-cleaner process?
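
The "one snapshot at a time" idea could be sketched as follows, assuming `btrfs subvolume sync` is used to wait for the cleanup instead of watching for the btrfs-cleaner process. The helper names `run` and `delete_snapshots_serially` and the `DRY_RUN` variable are made up for this illustration. By default the commands are only printed.

```shell
# Hypothetical helpers (not from the thread).
# run: print the command when DRY_RUN is unset or 1, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: delete_snapshots_serially <mountpoint> <snapshot>...
delete_snapshots_serially() {
    mnt=$1
    shift
    for snap in "$@"; do
        run btrfs subvolume delete "$snap" || return 1
        # Wait until the pending deletions on this filesystem are fully
        # cleaned up before queueing the next one
        run btrfs subvolume sync "$mnt" || return 1
    done
}
```

With `DRY_RUN=0` the commands are actually executed. Whether serializing deletions would have prevented the memory growth described here is not established in this thread.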





* Re: Btrfs Raid10 eating all Ram on Mount
From: Qu Wenruo @ 2023-03-16  9:01 UTC
  To: Robert Krig, linux-btrfs



On 2023/3/15 15:26, Robert Krig wrote:
> Hi,
> 
> 
> I've got a bit of a strange situation here.  I've got a server with 
> 4x16TB Drives in a RAID10 for data and a Raid1C4 for metadata 
> configuration.
> I'm currently retiring that server so I've been transferring and 
> deleting snapshots from it.
> 
> For some reason, this server (Debian with kernel 6.2.1) suddenly starts 
> eating all of my ram (64GB). Even if completely idle. I see that there 
> is a btrfs-transaction process and a btrfs-cleaner process that are 
> running and using quite a bit of cpu.
> 
> Basically, even after a fresh reboot. Once I mount the array, the memory 
> usage will slowly start to creep up, until it reaches OOM and the system 
> freezes.
> 
> I'm currently running a read-only check on the system and as far as I 
> recall, I've never enabled Quotas on that system.

Just in case, since you can already mount the fs RO, does "btrfs qgroup
show -prce <mnt>" show anything?

Quotas can be enabled by tools like snapper, to get an accurate
number of exclusively owned bytes.

The symptom itself looks exactly like quota.
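
A rough sketch of that check, plus the follow-up step of turning quotas off if they do turn out to be enabled. The helper names `run`, `check_quota` and `disable_quota` and the `DRY_RUN` variable are invented for this example. `btrfs quota disable` is not mentioned above and is added here as an assumption about the likely next step.

```shell
# Hypothetical helpers (not from the thread).
# run: print the command when DRY_RUN is unset or 1, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List qgroups on the given mount point (works on a read-only mount).
check_quota() {
    run btrfs qgroup show -prce "$1"
}

# Assumed follow-up: turn quotas off. This modifies the filesystem,
# so it needs a read-write mount.
disable_quota() {
    run btrfs quota disable "$1"
}
```

Since disabling quotas needs a read-write mount, background cleanup would presumably resume at the same time, so it may be worth watching memory usage while doing so.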

Thanks,
Qu

* Re: Btrfs Raid10 eating all Ram on Mount
From: Goffredo Baroncelli @ 2023-03-16 19:07 UTC
  To: Robert Krig, linux-btrfs

On 16/03/2023 08.57, Robert Krig wrote:
> Update:
> 
> Ok, mounting as read-only seems to have done the trick (for now). At least it looks like I'm able to btrfs send snapshots to my new server without the RAM increasing.
> 
> How can I avoid this sort of thing in the future? Not using deduplication tools on snapshots? Only deleting one snapshot at a time and wait until I no longer see a btrfs-cleaner process?

I am not sure how well a deduplication tool gets along with multiple snapshots.
In theory, both snapshots and deduplication create multiple references to the same pieces of data.
However, a snapshot does the same for metadata (and this is good), whereas dedup does the opposite: it unshares the metadata to improve the sharing of the same pieces of data between files (bad). This means that you get a small reduction in data at the cost of increased metadata.

It would be useful if you shared the information about the metadata/data usage:

# btrfs fi us <mnt>

I suspect that you have a lot of metadata, which is creating the problem you are experiencing.

Even so, it should not degenerate into an OOM situation.
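
To put a number on "a lot of metadata", one option is to compare the used metadata bytes with the used data bytes. The sketch below uses `btrfs filesystem df -b` rather than `btrfs fi us`, only because its one-line-per-profile output is easier to parse. The helper name `meta_ratio` is made up, and the parsing assumes lines of the form `Data, RAID10: total=..., used=...`.

```shell
# Hypothetical helper (not from the thread): read `btrfs filesystem df -b`
# output on stdin and print used metadata as a percentage of used data.
meta_ratio() {
    awk -F'used=' '
        /^Data/     { data += $2 }   # sum "used" over all Data lines
        /^Metadata/ { meta += $2 }   # sum "used" over all Metadata lines
        END {
            if (data > 0) printf "%.2f\n", 100 * meta / data
            else print "n/a"
        }'
}
```

It would be used as `btrfs filesystem df -b <mnt> | meta_ratio`. What counts as an unusually high percentage depends on the dataset, so the result is only a starting point for discussion.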

BR

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5



* RE: Btrfs Raid10 eating all Ram on Mount
From: Paul Jones @ 2023-03-17  0:29 UTC
  To: kreijack, Robert Krig, linux-btrfs

> -----Original Message-----
> From: Goffredo Baroncelli <kreijack@libero.it>
> Sent: Friday, 17 March 2023 6:08 AM
> To: Robert Krig <robert.krig@render-wahnsinn.de>; linux-
> btrfs@vger.kernel.org
> Subject: Re: Btrfs Raid10 eating all Ram on Mount
> 
> On 16/03/2023 08.57, Robert Krig wrote:
> > Update:
> >
> > Ok, mounting as read-only seems to have done the trick (for now). At least it
> looks like I'm able to btrfs send snapshots to my new server without the RAM
> increasing.
> >
> > How can I avoid this sort of thing in the future? Not using deduplication
> tools on snapshots? Only deleting one snapshot at a time and wait until I no
> longer see a btrfs-cleaner process?
> 
> I am not sure how a deduplication tool is compatible with multiple snapshot.
> In theory both the snapshot and the deduplication create multiple reference to
> the same pieces of data.
> However snapshot does the same for metadata (and this is good); instead
> dedup does the opposite: unshare the metadata to improve the sharing
> between the files of the same pieces of data (bad); this means that you have
> small reduction of the data, at the cost of increasing the metadata.

There can be multiple snapshots of multiple systems that share the same/similar data, or there can be large files that have been moved.

But you are right in general. I used to use dedupe with snapshots, but found that it didn't help in any significant way. Overall bytes used (total disk usage) was a bit less, but not enough to justify the cost for my particular dataset.


Paul.

Thread overview: 7+ messages
2023-03-15  7:26 Btrfs Raid10 eating all Ram on Mount Robert Krig
2023-03-15 18:48 ` Goffredo Baroncelli
2023-03-16  7:02   ` Robert Krig
2023-03-16  7:57     ` Robert Krig
2023-03-16 19:07       ` Goffredo Baroncelli
2023-03-17  0:29         ` Paul Jones
2023-03-16  9:01 ` Qu Wenruo
