On 2019/8/20 10:24 AM, Chao Yu wrote:
> On 2019/8/20 8:55, Qu Wenruo wrote:
>> [...]
>>>>> I have made a simple fuzzer to inject messy in inode metadata,
>>>>> dir data, compressed indexes and super block,
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/commit/?h=experimental-fuzzer
>>>>>
>>>>> I am testing with some given dirs and the following script.
>>>>> Does it look reasonable?
>>>>>
>>>>> #!/bin/bash
>>>>>
>>>>> mkdir -p mntdir
>>>>>
>>>>> for ((i=0; i<1000; ++i)); do
>>>>> mkfs/mkfs.erofs -F$i testdir_fsl.fuzz.img testdir_fsl > /dev/null 2>&1
>>>>
>>>> mkfs fuzzes the image? Er....
>>>
>>> Thanks for your reply.
>>>
>>> First, this is just the first step of erofs fuzzer I wrote yesterday night...
>>>
>>>>
>>>> Over in XFS land we have an xfs debugging tool (xfs_db) that knows how
>>>> to dump (and write!) most every field of every metadata type. This
>>>> makes it fairly easy to write systematic level 0 fuzzing tests that
>>>> check how well the filesystem reacts to garbage data (zeroing,
>>>> randomizing, oneing, adding and subtracting small integers) in a field.
>>>> (It also knows how to trash entire blocks.)
>>
>> The same tool exists for btrfs, although lacks the write ability, but
>> that dump is more comprehensive and a great tool to learn the on-disk
>> format.
>>
>>
>> And for the fuzzing defending part, just a few kernel releases ago,
>> there is none for btrfs, and now we have a full static verification
>> layer to cover (almost) all on-disk data at read and write time.
>> (Along with enhanced runtime check)
>>
>> We have covered from vague values inside tree blocks and invalid/missing
>> cross-ref find at runtime.
>>
>> Currently the two layered check works pretty fine (well, sometimes too
>> good to detect older, improper behaved kernel).
>> - Tree blocks with vague data just get rejected by verification layer
>>   So that all members should fit on-disk format, from alignment to
>>   generation to inode mode.
>>
>>   The error will trigger a good enough (TM) error message for developer
>>   to read, and if we have other copies, we retry other copies just as
>>   we hit a bad copy.
>>
>> - At runtime, we have much less to check
>>   Only cross-ref related things can be wrong now, since everything
>>   inside a single tree block has already been checked.
>>
>> In fact, from my point of view, such read time check should be there
>> from the very beginning.
>> It acts kind of like an on-disk format spec. (In fact, by implementing the
>> verification layer itself, it already exposes a lot of btrfs design
>> trade-offs)
>>
>> Even for a fs as complex (buggy) as btrfs, we only take 1K lines to
>> implement the verification layer.
>> So I'd like to see every new mainlined fs to have such ability.
>
> Out of curiosity, it looks like every mainstream filesystem has its own
> fuzz/injection tool in their tool-set, if it's really such a generic
> requirement, why shouldn't there be a common tool to handle that, let specified
> filesystem fill the tool's callback to seek a node/block and supported fields
> can be fuzzed in inode.

It could be possible for XFS/EXT* to share the same infrastructure
without much hassle.
(If not considering external journal)

But btrfs is like a regular fs on a super large dm-linear, which
further builds its chunks on different dm-raid1/dm-linear/dm-raid56.

So I'm not sure if it's possible for btrfs, as it contains its logical
address layer bytenr (the most common one) along with per-chunk physical
mapping bytenr (in another tree).

It may depend on the granularity. But it is definitely a good idea to do so
in a generic way.
Currently we depend on super kind student developers/reporters for such
fuzzed images, and developers sometimes get inspired by real world
corruption (or his/her mood) to add some valid but hard-to-hit corner
case check.

Thanks,
Qu

> It can help to avoid redundant work whenever Linux
> welcomes a new filesystem....
>
> Thanks,
>
>>
>>>
>>> Actually, compared with XFS, EROFS has rather simple on-disk format.
>>> What we inject one time is quite deterministic.
>>>
>>> The first step just purposely writes some random fuzzed data to
>>> the base inode metadata, compressed indexes, or dir data field
>>> (one round one field) to make it validity and coverability.
>>>
>>>>
>>>> You might want to write such a debugging tool for erofs so that you can
>>>> take apart crashed images to get a better idea of what went wrong, and
>>>> to write easy fuzzing tests.
>>>
>>> Yes, we will do such a debugging tool of course. Actually Li Guifu is now
>>> developing an erofs-fuse to support old linux versions or other OSes for
>>> archiving only use, we will base on that code to develop a better fuzzer
>>> tool as well.
>>
>> Personally speaking, debugging tool is way more important than a running
>> kernel module/fuse.
>> It's human trying to write the code, most of time is spent educating
>> code readers, thus debugging tool is way more important than dead cold code.
>>
>> Thanks,
>> Qu
>>>
>>> Thanks,
>>> Gao Xiang
>>>
>>>>
>>>> --D
>>>>
>>>>> umount mntdir
>>>>> mount -t erofs -o loop testdir_fsl.fuzz.img mntdir
>>>>> for j in `find mntdir -type f`; do
>>>>> md5sum $j > /dev/null
>>>>> done
>>>>> done
>>>>>
>>>>> Thanks,
>>>>> Gao Xiang
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Gao Xiang
>>>>>>
>>
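
For illustration only, the field-level fuzzing described in the quoted text
(zeroing, oneing, or randomizing a single field) could be sketched roughly
as below. This is not xfs_db and not the erofs fuzzer; fuzz_field is a
hypothetical helper, and the offset and size of a field are assumed to be
known from the on-disk format of the filesystem being tested.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of any existing tool):
# overwrite SIZE bytes at byte OFFSET of IMAGE in place.
# usage: fuzz_field IMAGE OFFSET SIZE MODE    (MODE: zero | ones | random)
fuzz_field() {
    local img=$1 off=$2 size=$3 mode=$4 src

    # Pick the byte source first, so an unknown mode fails before any write.
    case "$mode" in
        zero|ones) src=/dev/zero ;;
        random)    src=/dev/urandom ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac

    # Generate SIZE bytes, turn them into 0xff for "ones", then patch them
    # into the image. conv=notrunc keeps the rest of the image untouched.
    head -c "$size" "$src" |
        if [ "$mode" = ones ]; then LC_ALL=C tr '\0' '\377'; else cat; fi |
        dd of="$img" bs=1 seek="$off" conv=notrunc 2>/dev/null
}
```

A test loop could then copy a known-good image, call something like
"fuzz_field copy.img 1024 4 random" on one field per round, and mount the
copy to see how the kernel reacts. The offset 1024 and size 4 here are
placeholders, not real EROFS field locations.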