On 2018-02-13 20:26, Holger Hoffstätte wrote:
> On 02/13/18 12:40, Qu Wenruo wrote:
>>>> The problem is not about how much space it takes, but how many extents
>>>> are here in the filesystem.
>
> I have no idea why btrfs' mount even needs to touch all block groups to
> get going (which seems to be the root of the problem), but here's a
> not so crazy idea for more "mechanical sympathy". Feel free to mock
> me if this is terribly wrong or not possible. ;)
>
> Mounting of even large filesystems (with many extents) seems to be fine
> on SSDs, but not so fine on rotational storage. We've heard that from
> several people with large (multi-TB) filesystems, and obviously it's
> even more terrible on 5400RPM drives because their seeks are sooo sloow.
>
> If the problem is that the bgs are touched/iterated in "tree order",
> would it then not be possible to sort the block groups in physical order
> before trying to load whatever mount needs to load?

This is in fact a good idea.
Move the block group items into their own tree.

But it will take a lot of work, since we are modifying the on-disk
format.

In that case, a leaf with the default leaf size (16K) can store 678 block
group items.
And that many block groups can cover between 169G (256M metadata chunk
size) and 6.6T (10G max data chunk size).

And even for tens of terabytes, a level-2 tree should handle it without
problem, and searching it should be quite fast.

The only problem is, I'm not sure there will be enough developers
interested in this idea, and it may have extra problems hidden.

Thanks,
Qu

> That way the entire
> process would involve less seeking (no backward seeks for one) and the
> drive could very likely get more done during a rotation before stepping
> further.
>
> cheers,
> Holger
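
The leaf-capacity arithmetic in the reply can be checked with a small sketch. It is only a back-of-the-envelope estimate, not code from btrfs: the 16K leaf size and the 256M and 10G chunk sizes come from the mail, while the 101-byte leaf header, the 24-byte block group item and the 25-byte per-item header are assumptions about the on-disk layout. The function names are hypothetical.

```python
# Rough capacity estimate for a dedicated block group tree.
# Every constant below is an assumption made for illustration.

LEAF_SIZE = 16 * 1024      # default leaf size (16K), as stated in the mail
LEAF_HEADER_SIZE = 101     # assumed size of the header at the start of a leaf
BG_ITEM_SIZE = 24          # assumed payload size of one block group item
ITEM_HEADER_SIZE = 25      # assumed per-item header (key, offset, size)

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def items_per_leaf(per_item_overhead: int = 0) -> int:
    """Return how many block group items fit into a single leaf."""
    usable = LEAF_SIZE - LEAF_HEADER_SIZE
    return usable // (BG_ITEM_SIZE + per_item_overhead)


def covered_bytes(block_groups: int, chunk_size: int) -> int:
    """Return the space described by `block_groups` chunks of `chunk_size`."""
    return block_groups * chunk_size


if __name__ == "__main__":
    n = items_per_leaf()
    low = covered_bytes(n, 256 * MIB)
    high = covered_bytes(n, 10 * GIB)
    print(f"items per leaf (payload only): {n}")
    print(f"with 256M chunks: {low / GIB:.1f}G")
    print(f"with 10G chunks:  {high / TIB:.1f}T")
    print(f"items per leaf (with item headers): {items_per_leaf(ITEM_HEADER_SIZE)}")
```

Counting payload only, this gives 678 items per leaf, roughly 169G with 256M chunks and roughly 6.6T with 10G chunks. If each item also needs its own item header, as assumed in the last line, the per-leaf count drops to 332, so the payload-only figures are best read as an upper bound.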