* cancel btrfs delete job
@ 2014-06-24  7:50 Franziska Näpelt
  2014-06-26  7:17 ` Satoru Takeuchi
  2014-06-26 11:06 ` Franziska Näpelt
  0 siblings, 2 replies; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-24  7:50 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I want to expand a btrfs filesystem. At the moment I have four 2 TB HDDs
in one btrfs pool.
I want to replace two of the HDDs with 3 TB devices. To do this, I first
added one 3 TB device. After that, I started the btrfs delete job.
This job has been running for 4 days now and it seems to hang.

Is it possible to cancel a btrfs delete job?
And if so, how can I do that?

Best regards,
Franziska
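For readers following the thread, the add-then-delete workflow described above comes down to two commands; the device names and mount point below are illustrative, not taken from this message:

```shell
# Add the new, larger disk to the mounted filesystem
btrfs device add /dev/sde /mnt/pool

# Remove one of the old disks; btrfs relocates all of its data onto
# the remaining devices before returning, which can take many hours
# on terabyte-scale filesystems
btrfs device delete /dev/sdb /mnt/pool
```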




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
  2014-06-24  7:50 cancel btrfs delete job Franziska Näpelt
@ 2014-06-26  7:17 ` Satoru Takeuchi
       [not found]   ` <1403777123.7657.5.camel@hsew-frn.HIPERSCAN>
  2014-06-26 11:06 ` Franziska Näpelt
  1 sibling, 1 reply; 15+ messages in thread
From: Satoru Takeuchi @ 2014-06-26  7:17 UTC (permalink / raw)
  To: franziska.naepelt, linux-btrfs

Hi Franziska,

(2014/06/24 16:50), Franziska Näpelt wrote:
> Hi,
>
> I want to expand a btrfs filesystem. At the moment I have four 2 TB HDDs
> in one btrfs pool.
> I want to replace two of the HDDs with 3 TB devices. To do this, I first
> added one 3 TB device. After that, I started the btrfs delete job.
> This job has been running for 4 days now and it seems to hang.
>
> Is it possible to cancel a btrfs delete job?
> And if so, how can I do that?

Unfortunately, you can't cancel a btrfs delete.

Q1. Are your environment and operations as follows?

Environment:
  - You have six disks.
    - four 2TB disks: /dev/sd[a-d]
    - two 3TB disks: /dev/sd[ef]
  - Currently your btrfs filesystem consists of /dev/sd[a-d].
  - You want to replace /dev/sd[cd] with /dev/sd[ef].

Your operation:
  1. Add one 3TB disk, /dev/sde.     # This works correctly.
  2. Delete one 2TB disk, /dev/sdc.  # The hang happens here!

Q2. Is the I/O for the delete still in progress, or is there no I/O?

Thanks,
Satoru

>
> Best regards,
> Franziska
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



* Re: cancel btrfs delete job
  2014-06-24  7:50 cancel btrfs delete job Franziska Näpelt
  2014-06-26  7:17 ` Satoru Takeuchi
@ 2014-06-26 11:06 ` Franziska Näpelt
  2014-06-26 11:30   ` Duncan
  1 sibling, 1 reply; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-26 11:06 UTC (permalink / raw)
  To: linux-btrfs

Hello Satoru,

here is the information you requested:

environment:

- four 2 TB disks: /dev/sd[c-f]
- two 3 TB disks: /dev/sdg (but only one is connected so far)

The filesystem consists of /dev/sd[c-f].

I wanted to replace /dev/sdc with /dev/sdg (with the add command
followed by delete).
In a second step, I wanted to replace the next disk.

But it hung during the btrfs delete command (after a successful add).
The delete process was still in progress, but according to iotop there
was no data transfer.


This morning the whole computer hung and there was no other option
than a reset :(

It has been trying to boot ever since, with a lot of errors. But I can
see that there is file activity on the hard drive.

There are many messages like the following:
btrfs free space inode generation (0) did not match free space cache
generation


best regards,
Franziska



* Re: cancel btrfs delete job
  2014-06-26 11:06 ` Franziska Näpelt
@ 2014-06-26 11:30   ` Duncan
  0 siblings, 0 replies; 15+ messages in thread
From: Duncan @ 2014-06-26 11:30 UTC (permalink / raw)
  To: linux-btrfs

Franziska Näpelt posted on Thu, 26 Jun 2014 13:06:35 +0200 as excerpted:

> There are many messages like the following:
> btrfs free space inode generation (0) did not match free space cache
> generation

Well, that set of messages at least is somewhat expected after a hard 
shutdown, and shouldn't be a problem, as the filesystem should rebuild the 
cache.  In fact, that space_cache rebuild might be why you were seeing 
such high I/O after the reboot and fresh mount.

If the space-cache rebuild is the only problem you see, you may be lucky, 
and the hard-terminated delete and reboot might not have resulted in any 
permanent damage.  OTOH, if it seems the space_cache rebuild is 
interfering with further activity for too long and you end up doing 
another hard reset anyway, there's the nospace_cache mount option to turn 
it off.
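On the command line that would look like the following; the device name and mount point are illustrative:

```shell
# Mount without using (or rebuilding) the free-space cache
mount -o nospace_cache /dev/sdc /mnt/btrfs
```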

(Just a user and list regular with that single comment... I'll let you 
get back to the expert help now.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: cancel btrfs delete job
       [not found]     ` <53AC013E.5000702@jp.fujitsu.com>
@ 2014-06-26 11:34       ` Franziska Näpelt
  2014-06-26 23:29         ` Satoru Takeuchi
  0 siblings, 1 reply; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-26 11:34 UTC (permalink / raw)
  To: Satoru Takeuchi; +Cc: linux-btrfs

Hi Satoru,

I'm sorry, but the boot process is still running (I hope so); I haven't
been able to log in yet, so I currently have no logs.
I don't want to interrupt this process, because there is a lot of file
activity on the hard drive (the LED is blinking).

I'm not sure about the mkfs.btrfs options, because the system was set up
more than a year ago.

mount options in fstab:
LABEL=btrfs-pool /mnt/btrfs btrfs compress=lzo,degraded 0 1

The kernel version is 3.15 on Debian Wheezy with the current btrfs-tools
installed.

Can you estimate how long the boot process (repairing btrfs?) could
take? It has been running for five hours now.

best regards,
Franziska



Hi Franziska,

(2014/06/26 19:05), Franziska Näpelt wrote:
> Hello Satoru,
>
> here is the information you requested:
>
> environment:
>
> - four 2 TB disks: /dev/sd[c-f]
> - two 3 TB disks: /dev/sdg (but only one is connected so far)
>
> The filesystem consists of /dev/sd[c-f].
>
> I wanted to replace /dev/sdc with /dev/sdg (with the add command
> followed by delete).
> In a second step, I wanted to replace the next disk.
>
> But it hung during the btrfs delete command (after a successful add).
> The delete process was still in progress, but according to iotop there was no data transfer.

Hmm, then something bad must have happened in Btrfs.

> This morning the whole computer hung and there was no other option than a reset :(

So, unfortunately, debug info such as sysrq-w output can't be obtained.

>
> It has been trying to boot ever since, with a lot of errors. But I can see that there is file activity on the hard drive.
>
> There are many messages like the following:
> btrfs free space inode generation (0) did not match free space cache generation

And the mount process doesn't finish?

Your filesystem is in an inconsistent state since you
reset the machine while it was rebalancing the filesystem,
which was triggered by the device deletion.

The following link may help you, but I'm not sure whether
your data can be restored or not.

https://btrfs.wiki.kernel.org/index.php/Btrfsck

Could you tell me your mkfs.btrfs options, mount options,
and kernel version, if possible? I'd like to try to
reproduce your problem anyway.

Thanks,
Satoru

>
>
> best regards,
> Franziska
>





* Re: cancel btrfs delete job
  2014-06-26 11:34       ` Franziska Näpelt
@ 2014-06-26 23:29         ` Satoru Takeuchi
  2014-06-27  8:55           ` Satoru Takeuchi
  0 siblings, 1 reply; 15+ messages in thread
From: Satoru Takeuchi @ 2014-06-26 23:29 UTC (permalink / raw)
  To: franziska.naepelt; +Cc: linux-btrfs

Hi Franziska,

(2014/06/26 20:34), Franziska Näpelt wrote:
> Hi Satoru,
>
> I'm sorry, but the boot process is still running (I hope so); I haven't
> been able to log in yet, so I currently have no logs.
> I don't want to interrupt this process, because there is a lot of file
> activity on the hard drive (the LED is blinking).
>
> I'm not sure about the mkfs.btrfs options, because the system was set up
> more than a year ago.
>
> mount options in fstab:
> LABEL=btrfs-pool /mnt/btrfs btrfs compress=lzo,degraded 0 1
>
> The kernel version is 3.15 on Debian Wheezy with the current btrfs-tools
> installed.
>
> Can you estimate how long the boot process (repairing btrfs?) could
> take? It has been running for five hours now.

To estimate that, I'll try to follow your steps on a system as similar
to your environment as possible. Unfortunately, I don't have plenty
of disks.

Thanks,
Satoru

>
> best regards,
> Franziska



* Re: cancel btrfs delete job
  2014-06-26 23:29         ` Satoru Takeuchi
@ 2014-06-27  8:55           ` Satoru Takeuchi
  0 siblings, 0 replies; 15+ messages in thread
From: Satoru Takeuchi @ 2014-06-27  8:55 UTC (permalink / raw)
  To: franziska.naepelt; +Cc: linux-btrfs

Hi Franziska,

> (2014/06/26 20:34), Franziska Näpelt wrote:
>> Hi Satoru,
>>
>> I'm sorry, but the boot process is still running (I hope so); I haven't
>> been able to log in yet, so I currently have no logs.
>> I don't want to interrupt this process, because there is a lot of file
>> activity on the hard drive (the LED is blinking).
>>
>> I'm not sure about the mkfs.btrfs options, because the system was set up
>> more than a year ago.
>>
>> mount options in fstab:
>> LABEL=btrfs-pool /mnt/btrfs btrfs compress=lzo,degraded 0 1
>>
>> The kernel version is 3.15 on Debian Wheezy with the current btrfs-tools
>> installed.
>>
>> Can you estimate how long the boot process (repairing btrfs?) could
>> take? It has been running for five hours now.
>
> To estimate that, I'll try to follow your steps on a system as similar
> to your environment as possible. Unfortunately, I don't have plenty
> of disks.

Although you've already managed to mount your btrfs filesystem,
I'll share my measurement of "how long do Franziska's operations
take" anyway.

Please note that I measured not "balance after a reset during a
balance triggered by delete", but simply "balance triggered by
delete". Since most of the time required by both is the balance
itself, the result of the former should be similar to the latter.



Environment:
- x86_64 fedora20 KVM guest on x86_64 fedora20 host
- RAM: 4GiB
- kernel: 3.15.2
- Storage: 50GB virt-io disk
   - small devices: /dev/vd[d-g]
   - large devices: /dev/vd[hi]

   # All of these virtual devices are backed by
   # files on a real HDD in the host.

Operations:
  1. Make a Btrfs filesystem
  2. Make a junk file in the filesystem.
  3. Add a large device
  4. Remove a small device and measure how long it takes.

Script:
===============================================================================
#!/bin/bash

MOUNTPOINT=/home/sat/mnt

MEGABYTES=4096

mkfs.btrfs -f /dev/vdd /dev/vde /dev/vdf /dev/vdg
mount -o compress=lzo /dev/vdd "$MOUNTPOINT"
dd if=/dev/urandom of="$MOUNTPOINT"/junk oflag=direct bs=1MiB count="$MEGABYTES"
btrfs dev add -f /dev/vdh "$MOUNTPOINT"
time btrfs dev del /dev/vdg "$MOUNTPOINT"
umount "$MOUNTPOINT"
===============================================================================

Test factors:
   - Device size
      - small: 2GB, large: 3GB
      - small: 4GB, large: 6GB
   - The size of junk file # MEGABYTES parameter of script
     - 1/2 GB
     - 1 GB
     - 2 GB
     - 4 GB

Result (*1):

 small  | large  | junk file |  time   | time/junk file
 dev[GB]| dev[GB]| size [GB] |   [s]   | size [s/GB]
========+========+===========+=========+===============
      2 |      3 |       1/2 |     5.3 |   10.6
        |        |         1 |     9.6 |    9.6
        |        |         2 |    19.0 |    9.5
--------+--------+-----------+---------+---------------
      4 |      6 |       1/2 |     5.1 |   10.2
        |        |         1 |     9.4 |    9.4
        |        |         2 |    17.0 |    8.5
        |        |         4 |    39.3 |    9.8

*1) This data is the average of three tries.

So, it seems that how long a delete (and the balance it
triggers) takes is proportional to the used size (the size
of the junk file here). In my case, the delete work seems
to take about 10 [s/GB]. If the device sizes
were 2TB for the small devices and 3TB for the large devices,
and the junk file size were 2TB, this operation would take
about 5.4 hours.

Of course, this is an oversimplified case and it wouldn't
apply to your case cleanly. However, this kind of
measurement can help to estimate the time required
for your next balance operation.
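As a quick sanity check, the rule of thumb above can be scripted. This is just the measured ~10 s/GB rate applied to a hypothetical 2 TB of used data; with a flat rate of 10 it comes out near 5.6 hours rather than 5.4, because the per-run rates in the table sit slightly below 10:

```shell
# Rule-of-thumb estimate: delete time ~= used_GB * rate_s_per_GB,
# where 10 s/GB is the rate measured in the KVM test above.
estimate_delete_seconds() {
    used_gb=$1
    rate=${2:-10}
    echo $((used_gb * rate))
}

seconds=$(estimate_delete_seconds 2000)   # 2 TB taken as 2000 GB
echo "$seconds seconds"                   # 20000 seconds
awk -v s="$seconds" 'BEGIN { printf "about %.1f hours\n", s / 3600 }'
```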

Thanks,
Satoru

>
>>
>> best regards,
>> Franziska



* Re: cancel btrfs delete job
  2014-06-27  5:58 ` Satoru Takeuchi
@ 2014-06-27  8:09   ` Satoru Takeuchi
  0 siblings, 0 replies; 15+ messages in thread
From: Satoru Takeuchi @ 2014-06-27  8:09 UTC (permalink / raw)
  To: franziska.naepelt, linux-btrfs

Hi Franziska,

> (2014/06/27 14:00), Franziska Näpelt wrote:
>> Hi!
>>
>> After about 12 hours of booting, the system is running now.
>
> Congratulations!
>
>> The fifth hard drive is still in the btrfs pool.
>>
>> Here is the log from the crash while the btrfs delete job was running:
>>
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957248] ------------[ cut
>> here ]------------
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957268] WARNING: CPU: 3 PID:
>> 31131 at fs/btrfs/super.c:259 __btrfs_abort_transaction+0x46/0xf8 [
>> btrfs]()
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957270] Modules linked in:
>> xts gf128mul tun parport_pc ppdev lp parport bnep rfcomm bluetooth rf
>> kill pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) cpufreq_powersave
>> cpufreq_userspace cpufreq_stats cpufreq_conservative vboxdrv(O) binfm
>> t_misc fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache
>> sunrpc ext2 dm_crypt hwmon_vid loop firewire_sbp2 snd_hda_codec_hdmi snd
>> _hda_intel joydev radeon ttm drm_kms_helper iTCO_wdt iTCO_vendor_support
>> snd_hda_controller drm i2c_algo_bit snd_hda_codec snd_hwdep snd_pcm
>>   i7core_edac snd_timer edac_core snd soundcore psmouse acpi_cpufreq
>> coretemp processor kvm_intel kvm microcode lpc_ich mfd_core pcspkr asus_
>> atk0110 ehci_pci mxm_wmi i2c_i801 i2c_core wmi serio_raw thermal_sys
>> evdev button ext4 crc16 jbd2 mbcache btrfs xor raid6_pq dm_mod raid1 md
>> _mod sg sd_mod crct10dif_generic crc_t10dif crct10dif_common hid_generic
>> usbhid hid crc32c_intel firewire_ohci r8169 firewire_core mii crc_i
>> tu_t sata_sil ahci libahci sata_mv uhci_hcd ehci_hcd
>> Jun 25 20:34:59 hsad-srv-03 kernel: libata xhci_hcd scsi_mod usbcore
>> usb_common
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957309] CPU: 3 PID: 31131
>> Comm: find Tainted: G           O  3.15.0 #1
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957310] Hardware name:
>> System manufacturer System Product Name/SABERTOOTH X58, BIOS 1304
>> 08/02/2011
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957311]  0000000000000000
>> 0000000000000009 ffffffff8138b54a ffff880001593b58
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957313]  ffffffff81039583
>> 0000000000000000 ffffffffa01c8123 00000000000000b0
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957315]  00000000ffffffe4
>> ffff880625fb9000 ffff8801077e8e80 ffffffffa0247ac0
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957317] Call Trace:
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957321]
>> [<ffffffff8138b54a>] ? dump_stack+0x41/0x51
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957324]
>> [<ffffffff81039583>] ? warn_slowpath_common+0x78/0x90
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957331]
>> [<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957333]
>> [<ffffffff81039633>] ? warn_slowpath_fmt+0x45/0x4a
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957340]
>> [<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957348]
>> [<ffffffffa01d648a>] ? __btrfs_free_extent+0x80a/0x84d [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957351]
>> [<ffffffff8138db1c>] ? mutex_trylock+0x10/0x29
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957359]
>> [<ffffffffa01dabfe>] ? __btrfs_run_delayed_refs+0xae4/0xc2b [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957368]
>> [<ffffffffa01dc86c>] ? btrfs_run_delayed_refs+0x7b/0x17e [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957378]
>> [<ffffffffa01ea1d7>] ? __btrfs_end_transaction+0xe5/0x2c0 [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957389]
>> [<ffffffffa01ee9bb>] ? btrfs_dirty_inode+0x8c/0xa7 [btrfs]
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957391]
>> [<ffffffff8111f12d>] ? touch_atime+0xe3/0x11c
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957393]
>> [<ffffffff81119843>] ? iterate_dir+0x7c/0xa2
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957395]
>> [<ffffffff81119949>] ? SyS_getdents+0x74/0xca
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957397]
>> [<ffffffff811196ee>] ? filldir64+0xdd/0xdd
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957399]
>> [<ffffffff81394522>] ? system_call_fastpath+0x16/0x1b
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957400] ---[ end trace
>> 8392ac15dafb7de4 ]---
>> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957422] BTRFS info (device
>> sdh): forced readonly
>
> Your delete job seems to have failed at __btrfs_abort_transaction(),
> which resulted in a read-only remount at that time.

In addition, if you encounter this kind of situation, setting the
skip_balance mount option can help you. It skips resuming the
balance at mount time. Please also see the following thread.

It's about a case in which Marc tried to balance btrfs and
a hang-up happened.

http://comments.gmane.org/gmane.comp.file-systems.btrfs/35791
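Concretely, the option is passed at mount time; the device name and mount point below are illustrative:

```shell
# Mount without resuming the interrupted balance; the paused balance
# can then be resumed or canceled with 'btrfs balance resume/cancel'
mount -o skip_balance /dev/sdc /mnt/btrfs
```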

Thanks,
Satoru

>
>>
>> After that event, there are further entries in the messages log, but
>> nothing interesting, only some DHCP info. Three minutes later, the log
>> stopped without any message.
>>
>> Does anyone need further logs?
>
> I have no idea.
>
> Thanks,
> Satoru
>
>>
>> best regards,
>> Franziska
>>



* Re: cancel btrfs delete job
  2014-06-27  5:00 Franziska Näpelt
@ 2014-06-27  5:58 ` Satoru Takeuchi
  2014-06-27  8:09   ` Satoru Takeuchi
  0 siblings, 1 reply; 15+ messages in thread
From: Satoru Takeuchi @ 2014-06-27  5:58 UTC (permalink / raw)
  To: franziska.naepelt, linux-btrfs

Hi Franziska,

(2014/06/27 14:00), Franziska Näpelt wrote:
> Hi!
>
> After about 12 hours of booting, the system is running now.

Congratulations!

> The fifth hard drive is still in the btrfs pool.
>
> Here is the log from the crash while the btrfs delete job was running:
>
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957248] ------------[ cut
> here ]------------
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957268] WARNING: CPU: 3 PID:
> 31131 at fs/btrfs/super.c:259 __btrfs_abort_transaction+0x46/0xf8 [
> btrfs]()
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957270] Modules linked in:
> xts gf128mul tun parport_pc ppdev lp parport bnep rfcomm bluetooth rf
> kill pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) cpufreq_powersave
> cpufreq_userspace cpufreq_stats cpufreq_conservative vboxdrv(O) binfm
> t_misc fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache
> sunrpc ext2 dm_crypt hwmon_vid loop firewire_sbp2 snd_hda_codec_hdmi snd
> _hda_intel joydev radeon ttm drm_kms_helper iTCO_wdt iTCO_vendor_support
> snd_hda_controller drm i2c_algo_bit snd_hda_codec snd_hwdep snd_pcm
>   i7core_edac snd_timer edac_core snd soundcore psmouse acpi_cpufreq
> coretemp processor kvm_intel kvm microcode lpc_ich mfd_core pcspkr asus_
> atk0110 ehci_pci mxm_wmi i2c_i801 i2c_core wmi serio_raw thermal_sys
> evdev button ext4 crc16 jbd2 mbcache btrfs xor raid6_pq dm_mod raid1 md
> _mod sg sd_mod crct10dif_generic crc_t10dif crct10dif_common hid_generic
> usbhid hid crc32c_intel firewire_ohci r8169 firewire_core mii crc_i
> tu_t sata_sil ahci libahci sata_mv uhci_hcd ehci_hcd
> Jun 25 20:34:59 hsad-srv-03 kernel: libata xhci_hcd scsi_mod usbcore
> usb_common
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957309] CPU: 3 PID: 31131
> Comm: find Tainted: G           O  3.15.0 #1
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957310] Hardware name:
> System manufacturer System Product Name/SABERTOOTH X58, BIOS 1304
> 08/02/2011
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957311]  0000000000000000
> 0000000000000009 ffffffff8138b54a ffff880001593b58
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957313]  ffffffff81039583
> 0000000000000000 ffffffffa01c8123 00000000000000b0
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957315]  00000000ffffffe4
> ffff880625fb9000 ffff8801077e8e80 ffffffffa0247ac0
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957317] Call Trace:
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957321]
> [<ffffffff8138b54a>] ? dump_stack+0x41/0x51
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957324]
> [<ffffffff81039583>] ? warn_slowpath_common+0x78/0x90
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957331]
> [<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957333]
> [<ffffffff81039633>] ? warn_slowpath_fmt+0x45/0x4a
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957340]
> [<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957348]
> [<ffffffffa01d648a>] ? __btrfs_free_extent+0x80a/0x84d [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957351]
> [<ffffffff8138db1c>] ? mutex_trylock+0x10/0x29
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957359]
> [<ffffffffa01dabfe>] ? __btrfs_run_delayed_refs+0xae4/0xc2b [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957368]
> [<ffffffffa01dc86c>] ? btrfs_run_delayed_refs+0x7b/0x17e [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957378]
> [<ffffffffa01ea1d7>] ? __btrfs_end_transaction+0xe5/0x2c0 [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957389]
> [<ffffffffa01ee9bb>] ? btrfs_dirty_inode+0x8c/0xa7 [btrfs]
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957391]
> [<ffffffff8111f12d>] ? touch_atime+0xe3/0x11c
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957393]
> [<ffffffff81119843>] ? iterate_dir+0x7c/0xa2
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957395]
> [<ffffffff81119949>] ? SyS_getdents+0x74/0xca
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957397]
> [<ffffffff811196ee>] ? filldir64+0xdd/0xdd
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957399]
> [<ffffffff81394522>] ? system_call_fastpath+0x16/0x1b
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957400] ---[ end trace
> 8392ac15dafb7de4 ]---
> Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957422] BTRFS info (device
> sdh): forced readonly

Your delete job seems to have failed in __btrfs_abort_transaction(),
which resulted in a read-only remount at that time.

>
> After that event, there are further entries in the messages log, but
> nothing interesting, only some DHCP infos. Three minutes later, the log
> stopped without any message.
>
> Does anyone need further logs?

I have no idea.

Thanks,
Satoru

>
> best regards,
> Franziska
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
  2014-06-27  0:49   ` Russell Coker
@ 2014-06-27  5:26     ` Duncan
  0 siblings, 0 replies; 15+ messages in thread
From: Duncan @ 2014-06-27  5:26 UTC (permalink / raw)
  To: linux-btrfs

Russell Coker posted on Fri, 27 Jun 2014 10:49:40 +1000 as excerpted:

> On Thu, 26 Jun 2014 13:25:09 Duncan wrote:
>> properly.  And if you do automated snapshots, use a good thinning
>> script to thin them down so you're well under 500. (I've posted figures
>> where even starting with per-minute snapshots, thinning down to 10
>> minute, then to half hour within the day, then to say 4/day after two
>> days, 1/day after a week, one a week after four weeks, one every 13
>> weeks aka quarterly after say six months, and clearing them all and
>> relying on off-
>> machine backups after a year or 18 months, runs only 250-ish or so,
>> under 300.)
> 
> What scripts are people using for this?

I don't do much with snapshots here, and definitely don't do auto-
snapshotting, but snapper's a popular one, there's Marc Merlin's script,
and there's several other related links I've not looked closely at on the
wiki, as well.

The use-cases page has a section on time-machine snapshots, with several
links including snapper:

https://btrfs.wiki.kernel.org/index.php/UseCases#How_can_I_use_btrfs_for_backups.2Ftime-machine.3F

I don't see Marc Merlin's script linked there, but it's linked from the
FAQ, snapshot example section:

https://btrfs.wiki.kernel.org/index.php/FAQ#snapshot_example

(Someone with a wiki login could probably add Marc's script to the use-
cases list, and then reword the FAQ entry appropriately and point it to
the use-cases discussion instead of just linking Marc's script.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
@ 2014-06-27  5:00 Franziska Näpelt
  2014-06-27  5:58 ` Satoru Takeuchi
  0 siblings, 1 reply; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-27  5:00 UTC (permalink / raw)
  To: linux-btrfs

Hi!

After about 12 hours of booting, the system is running again :)
The fifth hard drive is still in the btrfs pool.

Here is the log from the crash, while the btrfs delete job runs:

Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957248] ------------[ cut
here ]------------
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957268] WARNING: CPU: 3 PID:
31131 at fs/btrfs/super.c:259 __btrfs_abort_transaction+0x46/0xf8 [
btrfs]()
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957270] Modules linked in:
xts gf128mul tun parport_pc ppdev lp parport bnep rfcomm bluetooth rf
kill pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) cpufreq_powersave
cpufreq_userspace cpufreq_stats cpufreq_conservative vboxdrv(O) binfm
t_misc fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache
sunrpc ext2 dm_crypt hwmon_vid loop firewire_sbp2 snd_hda_codec_hdmi snd
_hda_intel joydev radeon ttm drm_kms_helper iTCO_wdt iTCO_vendor_support
snd_hda_controller drm i2c_algo_bit snd_hda_codec snd_hwdep snd_pcm
 i7core_edac snd_timer edac_core snd soundcore psmouse acpi_cpufreq
coretemp processor kvm_intel kvm microcode lpc_ich mfd_core pcspkr asus_
atk0110 ehci_pci mxm_wmi i2c_i801 i2c_core wmi serio_raw thermal_sys
evdev button ext4 crc16 jbd2 mbcache btrfs xor raid6_pq dm_mod raid1 md
_mod sg sd_mod crct10dif_generic crc_t10dif crct10dif_common hid_generic
usbhid hid crc32c_intel firewire_ohci r8169 firewire_core mii crc_i
tu_t sata_sil ahci libahci sata_mv uhci_hcd ehci_hcd
Jun 25 20:34:59 hsad-srv-03 kernel: libata xhci_hcd scsi_mod usbcore
usb_common
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957309] CPU: 3 PID: 31131
Comm: find Tainted: G           O  3.15.0 #1
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957310] Hardware name:
System manufacturer System Product Name/SABERTOOTH X58, BIOS 1304
08/02/2011
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957311]  0000000000000000
0000000000000009 ffffffff8138b54a ffff880001593b58
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957313]  ffffffff81039583
0000000000000000 ffffffffa01c8123 00000000000000b0
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957315]  00000000ffffffe4
ffff880625fb9000 ffff8801077e8e80 ffffffffa0247ac0
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957317] Call Trace:
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957321]
[<ffffffff8138b54a>] ? dump_stack+0x41/0x51
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957324]
[<ffffffff81039583>] ? warn_slowpath_common+0x78/0x90
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957331]
[<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957333]
[<ffffffff81039633>] ? warn_slowpath_fmt+0x45/0x4a
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957340]
[<ffffffffa01c8123>] ? __btrfs_abort_transaction+0x46/0xf8 [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957348]
[<ffffffffa01d648a>] ? __btrfs_free_extent+0x80a/0x84d [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957351]
[<ffffffff8138db1c>] ? mutex_trylock+0x10/0x29
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957359]
[<ffffffffa01dabfe>] ? __btrfs_run_delayed_refs+0xae4/0xc2b [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957368]
[<ffffffffa01dc86c>] ? btrfs_run_delayed_refs+0x7b/0x17e [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957378]
[<ffffffffa01ea1d7>] ? __btrfs_end_transaction+0xe5/0x2c0 [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957389]
[<ffffffffa01ee9bb>] ? btrfs_dirty_inode+0x8c/0xa7 [btrfs]
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957391]
[<ffffffff8111f12d>] ? touch_atime+0xe3/0x11c
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957393]
[<ffffffff81119843>] ? iterate_dir+0x7c/0xa2
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957395]
[<ffffffff81119949>] ? SyS_getdents+0x74/0xca
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957397]
[<ffffffff811196ee>] ? filldir64+0xdd/0xdd
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957399]
[<ffffffff81394522>] ? system_call_fastpath+0x16/0x1b
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957400] ---[ end trace
8392ac15dafb7de4 ]---
Jun 25 20:34:59 hsad-srv-03 kernel: [614028.957422] BTRFS info (device
sdh): forced readonly

After that event, there are further entries in the messages log, but
nothing interesting, only some DHCP infos. Three minutes later, the log
stopped without any message.

Does anyone need further logs?

best regards,
Franziska



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
  2014-06-26 13:25 ` Duncan
  2014-06-26 13:47   ` Franziska Näpelt
@ 2014-06-27  0:49   ` Russell Coker
  2014-06-27  5:26     ` Duncan
  1 sibling, 1 reply; 15+ messages in thread
From: Russell Coker @ 2014-06-27  0:49 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

On Thu, 26 Jun 2014 13:25:09 Duncan wrote:
> properly.  And if you do automated snapshots, use a good thinning script 
> to thin them down so you're well under 500. (I've posted figures where 
> even starting with per-minute snapshots, thinning down to 10 minute, then 
> to half hour within the day, then to say 4/day after two days, 1/day 
> after a week, one a week after four weeks, one every 13 weeks aka 
> quarterly after say six months, and clearing them all and relying on off-
> machine backups after a year or 18 months, runs only 250-ish or so, under 
> 300.)

What scripts are people using for this?

I've attached the latest scripts I use for managing BTRFS, they assume that 
the filesystem is mounted on root.  They use /backup for snapshots of root and 
/backup-$DIR for snapshots of /$DIR.  I've also attached the scrub and 
rebalance script I use.

These aren't the greatest scripts, I'll probably write something better 
eventually if no-one else has done so.
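Since the attachments aren't inlined in the archive, here is a minimal
sketch of what such a snapshot helper might look like, following Russell's
description of the /backup-$DIR layout. This is NOT his actual script;
function names and paths are illustrative:

```shell
#!/bin/sh
# Minimal read-only snapshot helper, assuming btrfs mounted on / and
# a /backup-$DIR subvolume already created for each directory.

snapshot_name() {
    # Timestamped name, e.g. 2014-06-27_054012
    date +%Y-%m-%d_%H%M%S
}

make_snapshot() {
    # Read-only snapshot of /$1 into /backup-$1/<timestamp>
    dir="$1"
    btrfs subvolume snapshot -r "/$dir" "/backup-$dir/$(snapshot_name)"
}

# Example (needs root):
# make_snapshot home
```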

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

[-- Attachment #2: btrfs-make-snapshot --]
[-- Type: application/x-shellscript, Size: 516 bytes --]

[-- Attachment #3: btrfs-remove-snapshots --]
[-- Type: application/x-shellscript, Size: 705 bytes --]

[-- Attachment #4: btrfs-scrub --]
[-- Type: application/x-shellscript, Size: 102 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
  2014-06-26 13:25 ` Duncan
@ 2014-06-26 13:47   ` Franziska Näpelt
  2014-06-27  0:49   ` Russell Coker
  1 sibling, 0 replies; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-26 13:47 UTC (permalink / raw)
  To: linux-btrfs

Hi Duncan,
Thank you very much for your detailed description. Based on that, I
will wait now and hope that everything ends well.

best regards,
Franziska


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
  2014-06-26 11:47 Franziska Näpelt
@ 2014-06-26 13:25 ` Duncan
  2014-06-26 13:47   ` Franziska Näpelt
  2014-06-27  0:49   ` Russell Coker
  0 siblings, 2 replies; 15+ messages in thread
From: Duncan @ 2014-06-26 13:25 UTC (permalink / raw)
  To: linux-btrfs

Franziska Näpelt posted on Thu, 26 Jun 2014 13:47:12 +0200 as excerpted:

> What do you mean by "if it seems the space_cache rebuild is
> interfering with further activity for too long"?
> 
> The boot process has been running for five hours now. How long should I
> wait? What would you recommend?

Well, you have TiBs of capacity to work thru, and your drives will be 
doing a lot of seeking so they won't be running at anything like full 
rated speed.  Multiple TiB at say 10 MiB/sec progress... ~100 seconds/
gig, couple thousand gig... 55 hours?  That's the couple TB drive you 
mentioned, if it was near full.  I suspect it's doing something else too, 
hopefully finishing the delete, but if the I/O for that is fighting with 
the I/O for space_cache rebuild, given the size and the two I/O heavy 
tasks at once, it could take awhile.  Tho hopefully the space_cache 
rebuild should be done in a few hours and the delete will go faster 
after that.

Anyway, when you're talking 2 TB, even at a relatively brisk 100 MB/sec 
you're looking at five hours, so if it /is/ actually completing the 
delete, that's about the /minimum/ I'd expect.
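Duncan's back-of-the-envelope arithmetic can be sketched like this; the
throughput figures (10 MiB/s seek-bound worst case, 100 MiB/s sequential
best case) and the 2 TiB capacity are assumptions taken from his rough
numbers above, not measurements:

```python
# Rough estimate of how long moving a device's worth of data could take.
def hours_to_move(capacity_gib, mib_per_sec):
    """Hours to shuffle capacity_gib GiB at mib_per_sec MiB/s."""
    seconds = capacity_gib * 1024 / mib_per_sec
    return seconds / 3600

# Seek-bound worst case: ~2 TiB at ~10 MiB/s
print(round(hours_to_move(2048, 10)))    # -> 58 (Duncan's "55 hours" ballpark)

# Sequential best case: ~100 MiB/s
print(round(hours_to_move(2048, 100)))   # -> 6 (his "five hours" minimum)
```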

As long as you see drive activity I'd not bother it, even if it's a day 
or two... or even more... I'd be evaluating whether to give up at the 
week-point, tho.

Note that we've had cases reported on-list where a resumed balance or the 
like can take a week, but at some point the I/O quit and it was 
apparently CPU-bound.  At that point you gotta guess whether it's looping or 
the logic is just taking time but making (some) progress, and evaluate 
whether it's time to simply give up and restore from backup and eat the 
loss on anything not backed up.

One of the reasons snapshot-aware-defrag was disabled was because it 
simply didn't scale to thousands of snapshots well at all, and as long as 
it didn't run out of memory, it wasn't exactly locked up, but forward 
progress was close to zero and it would literally take over a week in 
some cases.  There's similar issues with the old quota code, tho there's 
patches reworking that but I'm not sure they're actually in or ready for 
mainline yet.  So I've been recommending not using quotas on btrfs -- if 
you NEED them, use a more mature filesystem where they actually work 
properly.  And if you do automated snapshots, use a good thinning script 
to thin them down so you're well under 500. (I've posted figures where 
even starting with per-minute snapshots, thinning down to 10 minute, then 
to half hour within the day, then to say 4/day after two days, 1/day 
after a week, one a week after four weeks, one every 13 weeks aka 
quarterly after say six months, and clearing them all and relying on off-
machine backups after a year or 18 months, runs only 250-ish or so, under 
300.)
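Duncan's thinning schedule can be checked numerically. The exact tier
boundaries below are assumptions (his post only gives approximate
intervals), but with any reasonable reading of them the total stays
well under his ~300 cap:

```python
# Count retained snapshots under a tiered thinning schedule.
HOUR = 1.0
DAY = 24 * HOUR
WEEK = 7 * DAY

# (age cutoff in hours, snapshot interval in hours), youngest tier first.
# Boundaries are assumed, loosely following Duncan's figures.
tiers = [
    (1 * HOUR,  HOUR / 60),  # per-minute for the last hour
    (6 * HOUR,  HOUR / 6),   # every 10 minutes up to 6 hours
    (1 * DAY,   HOUR / 2),   # every half hour for the rest of the day
    (1 * WEEK,  6 * HOUR),   # 4/day out to a week
    (4 * WEEK,  1 * DAY),    # 1/day out to four weeks
    (26 * WEEK, 1 * WEEK),   # 1/week out to ~six months
    (78 * WEEK, 13 * WEEK),  # quarterly, cleared after ~18 months
]

def snapshot_count(tiers):
    total, prev_cutoff = 0, 0.0
    for cutoff, interval in tiers:
        total += int((cutoff - prev_cutoff) / interval)
        prev_cutoff = cutoff
    return total

print(snapshot_count(tiers))  # -> 197, comfortably under 300
```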

Etc.  But of course even if you were doing all the wrong things it's a 
bit late to worry about it now, until you're back up and running.

But as I said, drive activity is a good sign.  I'd leave it alone as long 
as that's happening -- with that much data it could literally take days.  
If the drive activity stops, tho, that's when you gotta reevaluate 
whether it's worth waiting or not.

Meanwhile, if the drive activity /does/ stop, consider doing an alt-sysrq-
w, to get a trace of what's blocking.  Then wait say a half an hour and 
do another, and compare.  People report that sometimes you can see if 
it's making forward progress from that (if the blocked tasks seem to be 
in the same spot), or at least post it for the devs to look at -- that's 
actually one of their most requested things, tho I'm not sure how easy 
it'll be to capture without being able to get at the logs.
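The capture Duncan suggests could look something like the sketch below.
It assumes root and a working console; writing 'w' to /proc/sysrq-trigger
has the same effect as alt-sysrq-w on the keyboard. The function name and
file paths are made up for illustration:

```shell
# Dump blocked (uninterruptible) tasks to the kernel log, then save the
# tail of that log so two dumps can be compared later.
capture_blocked_tasks() {
    out="$1"
    echo w > /proc/sysrq-trigger    # same as pressing alt-sysrq-w
    dmesg | tail -n 200 > "$out"    # keep the most recent kernel messages
}

# Usage (as root):
# capture_blocked_tasks /tmp/blocked-1.txt
# sleep 1800                                  # wait ~half an hour
# capture_blocked_tasks /tmp/blocked-2.txt
# diff /tmp/blocked-1.txt /tmp/blocked-2.txt  # same spots => likely stuck
```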

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: cancel btrfs delete job
@ 2014-06-26 11:47 Franziska Näpelt
  2014-06-26 13:25 ` Duncan
  0 siblings, 1 reply; 15+ messages in thread
From: Franziska Näpelt @ 2014-06-26 11:47 UTC (permalink / raw)
  To: linux-btrfs

@Duncan

What do you mean by
"if it seems the space_cache rebuild is
interfering with further activity for too long"?

The boot process has been running for five hours now. How long should I wait? What would you recommend?

best regards,
Franziska


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-06-27  8:57 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-24  7:50 cancel btrfs delete job Franziska Näpelt
2014-06-26  7:17 ` Satoru Takeuchi
     [not found]   ` <1403777123.7657.5.camel@hsew-frn.HIPERSCAN>
     [not found]     ` <53AC013E.5000702@jp.fujitsu.com>
2014-06-26 11:34       ` Franziska Näpelt
2014-06-26 23:29         ` Satoru Takeuchi
2014-06-27  8:55           ` Satoru Takeuchi
2014-06-26 11:06 ` Franziska Näpelt
2014-06-26 11:30   ` Duncan
2014-06-26 11:47 Franziska Näpelt
2014-06-26 13:25 ` Duncan
2014-06-26 13:47   ` Franziska Näpelt
2014-06-27  0:49   ` Russell Coker
2014-06-27  5:26     ` Duncan
2014-06-27  5:00 Franziska Näpelt
2014-06-27  5:58 ` Satoru Takeuchi
2014-06-27  8:09   ` Satoru Takeuchi
