On 2019/9/9 11:29 PM, Graham Cobb wrote:
> On 09/09/2019 13:18, Qu Wenruo wrote:
>>
>>
>> On 2019/9/9 7:25 PM, zedlryqc@server53.web-hosting.com wrote:
>>> What I am complaining about is that at one point in time, after issuing
>>> the command:
>>>     btrfs balance start -dconvert=single -mconvert=single
>>> and before issuing the 'btrfs delete', the system could be in too
>>> fragile a state, with extents unnecessarily spread out over two drives,
>>> which is both a completely unnecessary operation, and it also seems to
>>> me that it could be dangerous in some situations involving potentially
>>> malfunctioning drives.
>>
>> In that case, you just need to replace that malfunctioning device rather
>> than fall back to SINGLE.
> 
> Actually, this case is the (only) one of the three that I think would be
> very useful (backup is better handled by having a choice of userspace
> tools - I use btrbk - and does anyone really care about
> defrag any more?).
> 
> I did, recently, have a case where I had started to move my main data
> disk to a raid1 setup but my new disk started reporting errors. I didn't
> have a spare disk (and didn't have a spare SCSI slot for another disk
> anyway). So, I wanted to stop using the new disk and revert to my former
> (m=dup, d=single) setup as quickly as possible.
> 
> I spent time trying to find a way to do that balance without risking the
> single copy of some of the data being stored on the failing disk between
> starting the balance and completing the remove. That has two problems:
> obviously having the single copy on the failing disk is bad news but,
> also, it increases the time taken for the subsequent remove which has to
> copy that data back to the remaining disk (where there used to be a
> perfectly good copy which was subsequently thrown away during the balance).
> 
> In the end, I took the risk and the time of the two steps. In my case, I
> had good backups, and actually most of my data was still in a single
> profile on the old disk (because the errors started happening before I
> had done the balance to change the profile of all the old data from
> single to raid1).
> 
> But a balance -dconvert=single-but-force-it-to-go-on-disk-1 would have
> been useful. (Actually a "btrfs device mark-for-removal" command would
> be better - allow a failing device to be retained for a while, and used
> to provide data, but ignore it when looking to store data).

Indeed, it makes sense.

That would amount to user-defined chunk allocation behavior, so we
need to think carefully about the interface first.
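
To make the idea concrete, here is a minimal sketch in Python of what
such an allocation policy might look like. It is only a toy model and
not how the btrfs chunk allocator is implemented: the `Device` class,
the `no_alloc` flag and `pick_devices()` are hypothetical names made up
for illustration, and the "most free space first" ordering is a
simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    free_bytes: int
    # Hypothetical flag: the device may still serve reads, but must not
    # receive new chunks (the "mark-for-removal" idea from this thread).
    no_alloc: bool = False


def pick_devices(devices, copies, chunk_size):
    """Return the devices a new chunk would be placed on (toy model)."""
    # Skip flagged devices and devices without room for the chunk.
    candidates = [
        d for d in devices
        if not d.no_alloc and d.free_bytes >= chunk_size
    ]
    # Assumption: prefer the devices with the most free space.
    candidates.sort(key=lambda d: d.free_bytes, reverse=True)
    if len(candidates) < copies:
        raise RuntimeError("not enough writable devices for this profile")
    return candidates[:copies]


GIB = 1024 ** 3
old_disk = Device("disk1", free_bytes=500 * GIB)
failing_disk = Device("disk2", free_bytes=900 * GIB, no_alloc=True)

# A single-copy chunk lands on disk1 even though disk2 has more free space.
chosen = pick_devices([old_disk, failing_disk], copies=1, chunk_size=GIB)
print([d.name for d in chosen])
```

In this model a convert to single never places data on the flagged
disk, while asking for two copies fails outright, which is one of the
interface questions that would need an answer.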

BTW, have you tried to mark the malfunctioning disk RO and mount it?

Thanks,
Qu
> 
> Graham
>