On 2020/5/5 8:41 PM, Jeff Mahoney wrote:
> On 5/5/20 8:39 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/5/5 8:36 PM, Jeff Mahoney wrote:
>>> On 5/5/20 3:55 AM, Johannes Thumshirn wrote:
>>>> On 04/05/2020 23:59, Richard Weinberger wrote:
>>>>> Eric already raised doubts, let me ask more directly.
>>>>> Does the checksum tree really cover all moving parts of BTRFS?
>>>>>
>>>>> I'm a little surprised how small your patch is.
>>>>> Getting all this done for UBIFS was not easy and given that UBIFS is truly
>>>>> copy-on-write it was still less work than it would be for other filesystems.
>>>>>
>>>>> If I understand the checksum tree correctly, the main purpose is protecting
>>>>> you from flipping bits.
>>>>> An attacker will perform much more sophisticated attacks.
>>>>
>>>> [ Adding Jeff with whom I did the design work ]
>>>>
>>>> The checksum tree only covers the file-system payload. But combined with 
>>>> the checksum field, which is the start of every on-disk structure, we 
>>>> have all parts of the filesystem checksummed.
>>>
>>> That the checksums were originally intended for bitflip protection isn't
>>> really relevant.  Using a different algorithm doesn't change the
>>> fundamentals and the disk format was designed to use larger checksums
>>> than crc32c.  The checksum tree covers file data.  The contextual
>>> information is in the metadata describing the disk blocks and all the
>>> metadata blocks have internal checksums that would also be
>>> authenticated.  The only weak spot is that there has been a historical
>>> race where a user submits a write using direct i/o and modifies the data
>>> while in flight.  This will cause CRC failures already and that would
>>> still happen with this.
>>>
>>> All that said, the biggest weak spot I see in the design was mentioned
>>> on LWN: We require the key to mount the file system at all and there's
>>> no way to have a read-only but still verifiable file system.  That's
>>> worth examining further.
>>
>> That can be done easily, with something like an ignore_auth mount option
>> that completely skips the hmac csum check (of course, it needs a full RO
>> mount: no log replay, no way to remount rw) and relies entirely on
>> bytenr/gen/first_key and tree-checker to verify the fs.
> 
> But then you lose even bitflip protection.

That's why we have tree-checker for metadata.

Most detected bitflips appear to be caught by the read-time tree-checker,
as most of them are bitflips in memory.
Bitflips on the block device seem to be less common, as most physical
block devices have internal checksums. A bitflip there tends to cause
EIO rather than bad data.
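
To illustrate the kind of structural check meant here, below is a toy
sketch in Python. It is not the kernel's tree-checker; the Block layout
and check_block() are made-up names, and it only models the
bytenr/gen/first_key comparison plus a key-order check.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy key: (objectid, type, offset), compared in that order.
Key = Tuple[int, int, int]

@dataclass
class Block:
    bytenr: int       # logical address recorded in the block header
    generation: int   # transaction id recorded in the block header
    keys: List[Key]   # item keys, expected in strictly ascending order

def check_block(blk: Block, want_bytenr: int, want_gen: int,
                want_first_key: Key) -> List[str]:
    """Return a list of problems; an empty list means the block looks sane."""
    problems = []
    # Values the parent node told us to expect for this child.
    if blk.bytenr != want_bytenr:
        problems.append("bytenr mismatch")
    if blk.generation != want_gen:
        problems.append("generation mismatch")
    if not blk.keys:
        problems.append("empty block")
    elif blk.keys[0] != want_first_key:
        problems.append("first_key mismatch")
    # Structural check that needs no checksum at all.
    for prev, cur in zip(blk.keys, blk.keys[1:]):
        if prev >= cur:
            problems.append("bad key order")
            break
    return problems

good = Block(bytenr=4096, generation=7,
             keys=[(256, 1, 0), (256, 12, 0), (257, 1, 0)])
# Flip one bit in the objectid of the middle key: 256 ^ 0x100 == 0.
flipped = Block(bytenr=4096, generation=7,
                keys=[(256, 1, 0), (256 ^ 0x100, 12, 0), (257, 1, 0)])
```

A flip that leaves the block structurally valid (e.g. inside an inline
data payload) would pass such a check, which is why this only helps for
metadata.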

For the data part, I have to admit that we lose the check completely, but
a read-only mount is still better than being unable to mount at all.

However, such an ignore_auth option may need extra attention in the device
assembly part, as it can be another attack vector (e.g. creating an extra
device with a higher generation to override the genuine device), so it
will not be as easy as I thought.
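
A toy model of that concern, again in Python and purely illustrative:
Super, sign() and pick_device() are hypothetical names, and "pick the
highest generation" is a simplification of device assembly, not the
actual btrfs code. Passing key=None stands in for the proposed
ignore_auth mount.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Super:
    name: str
    generation: int
    mac: bytes  # HMAC over the generation, standing in for an authenticated superblock

def sign(key: bytes, generation: int) -> bytes:
    """Compute the toy superblock MAC."""
    return hmac.new(key, generation.to_bytes(8, "little"), hashlib.sha256).digest()

def pick_device(devs: List[Super], key: Optional[bytes]) -> Super:
    """Pick the superblock with the highest generation.

    key=None models the hypothetical ignore_auth mount: no MAC check at all.
    """
    if key is not None:
        # Drop every device whose MAC does not verify under our key.
        devs = [d for d in devs
                if hmac.compare_digest(d.mac, sign(key, d.generation))]
    if not devs:
        raise ValueError("no usable device")
    return max(devs, key=lambda d: d.generation)

secret = b"fs-key"
genuine = Super("genuine", 100, sign(secret, 100))
# The attacker does not know the key but can claim a newer generation.
forged = Super("forged", 101, sign(b"attacker", 101))
```

With the key, pick_device([genuine, forged], secret) returns the genuine
device; with key=None the forged one wins because nothing but the
generation is compared.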

Thanks,
Qu

> 
> -Jeff
>