* btrfs mount takes too long time @ 2021-08-28 11:58 Jingyun He 2021-08-28 23:47 ` Anand Jain 0 siblings, 1 reply; 16+ messages in thread From: Jingyun He @ 2021-08-28 11:58 UTC (permalink / raw) To: linux-btrfs Hello, all I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to btrfs to store the files. When the device is almost full, it needs about 5 mins to mount the device. Is it normal? is there any mount option that I can use to reduce the mount time? Thank you. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-08-28 11:58 btrfs mount takes too long time Jingyun He @ 2021-08-28 23:47 ` Anand Jain [not found] ` <CAHQ7scVkHp8Lcfxx2QZXv2ghkW-nYpiFGntyZa0Toz2hU3S-tQ@mail.gmail.com> 0 siblings, 1 reply; 16+ messages in thread From: Anand Jain @ 2021-08-28 23:47 UTC (permalink / raw) To: Jingyun He, linux-btrfs On 28/08/2021 19:58, Jingyun He wrote: > Hello, all > I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to > btrfs to store the files. > > When the device is almost full, it needs about 5 mins to mount the device. > > Is it normal? is there any mount option that I can use to reduce the mount time? > We need to figure out the function taking a longer time (maybe it is read-block-groups). I have similar reports on the non zoned device as well (with a few TB full of data). But there is no good data yet to analyse. Could you please collect the trace data from the ftracegraph from here [1] (It needs trace-cmd). [1] https://github.com/asj/btrfstrace.git Run it as in the example below: umount /btrfs; ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" cat /tmp/ftracegraph.out Thanks, Anand > Thank you. > ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAHQ7scVkHp8Lcfxx2QZXv2ghkW-nYpiFGntyZa0Toz2hU3S-tQ@mail.gmail.com>]
* Re: btrfs mount takes too long time [not found] ` <CAHQ7scVkHp8Lcfxx2QZXv2ghkW-nYpiFGntyZa0Toz2hU3S-tQ@mail.gmail.com> @ 2021-08-30 1:27 ` Anand Jain [not found] ` <CAHQ7scUiNLVD_4---ArBet-0DqzfmmH5Y9JgQY0grYrUv8yhiQ@mail.gmail.com> 0 siblings, 1 reply; 16+ messages in thread From: Anand Jain @ 2021-08-30 1:27 UTC (permalink / raw) To: Jingyun He; +Cc: linux-btrfs Our open_ctree() took around ~223secs (~3.7mins) 1) $ 223375750 us | } /* open_ctree [btrfs] */ Unfortunately, the default trace buffer per CPU (4K) wasn't sufficient and, the trace-buffer rolled over. So we still don't know how long we spent in btrfs_read_block_groups(). Sorry for my mistake we should go 1 step at a time and, we have to do this until we narrow it down to a specific function. Could you please run with the depth = 2 (instead of 3) and use the time command prefix. Also, pull a new ftracegraph as I have updated it to display a proper time output. $ ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/vg/scratch0 /btrfs" Thanks, Anand On 29/08/2021 17:42, Jingyun He wrote: > Hi, Anand > > I have attached the file. > Could you kindly check this? > > Thank you. > > On Sun, Aug 29, 2021 at 7:47 AM Anand Jain <anand.jain@oracle.com> wrote: >> >> On 28/08/2021 19:58, Jingyun He wrote: >>> Hello, all >>> I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to >>> btrfs to store the files. >>> >>> When the device is almost full, it needs about 5 mins to mount the device. >>> >>> Is it normal? is there any mount option that I can use to reduce the mount time? >>> >> >> >> We need to figure out the function taking a longer time (maybe it is >> read-block-groups). I have similar reports on the non zoned device >> as well (with a few TB full of data). But there is no good data yet >> to analyse. >> >> Could you please collect the trace data from the ftracegraph >> from here [1] (It needs trace-cmd). >> >> [1] https://github.com/asj/btrfstrace.git >> >> Run it as in the example below: >> >> umount /btrfs; >> >> ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" >> >> cat /tmp/ftracegraph.out >> >> >> Thanks, Anand >> >> >> >>> Thank you. >>> >> ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAHQ7scUiNLVD_4---ArBet-0DqzfmmH5Y9JgQY0grYrUv8yhiQ@mail.gmail.com>]
* Re: btrfs mount takes too long time [not found] ` <CAHQ7scUiNLVD_4---ArBet-0DqzfmmH5Y9JgQY0grYrUv8yhiQ@mail.gmail.com> @ 2021-08-30 13:04 ` Anand Jain 2021-08-30 13:04 ` Anand Jain 1 sibling, 0 replies; 16+ messages in thread From: Anand Jain @ 2021-08-30 13:04 UTC (permalink / raw) To: Jingyun He; +Cc: linux-btrfs open_ctree() took 228254398 us. And 98% of it that is 225418272 us was taken by btrfs_read_block_groups(). ------------------- 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); 0) 0.865 us | btrfs_discard_resume [btrfs](); 0) $ 228254398 us | } /* open_ctree [btrfs] */ ------------------- Now we need to run the same thing on btrfs_read_block_groups(), could you please run.. [1] (no need of the time). [1] $ umount /btrfs; $./ftracegraph btrfs_read_block_groups 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" Thanks, Anand On 30/08/2021 14:44, Jingyun He wrote: > Hi, Anand, > I have attached the new result. > Kindly check. > > Thank you. > > On Mon, Aug 30, 2021 at 9:27 AM Anand Jain <anand.jain@oracle.com> wrote: >> >> >> Our open_ctree() took around ~223secs (~3.7mins) >> >> 1) $ 223375750 us | } /* open_ctree [btrfs] */ >> >> Unfortunately, the default trace buffer per CPU (4K) wasn't sufficient >> and, the trace-buffer rolled over. >> So we still don't know how long we spent in btrfs_read_block_groups(). >> Sorry for my mistake we should go 1 step at a time and, we have to do >> this until we narrow it down to a specific function. >> >> Could you please run with the depth = 2 (instead of 3) and use the time >> command prefix. Also, pull a new ftracegraph as I have updated it to >> display a proper time output. >> >> $ ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/vg/scratch0 >> /btrfs" >> >> Thanks, Anand >> >> On 29/08/2021 17:42, Jingyun He wrote: >>> Hi, Anand >>> >>> I have attached the file. >>> Could you kindly check this? >>> >>> Thank you. >>> >>> On Sun, Aug 29, 2021 at 7:47 AM Anand Jain <anand.jain@oracle.com> wrote: >>>> >>>> On 28/08/2021 19:58, Jingyun He wrote: >>>>> Hello, all >>>>> I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to >>>>> btrfs to store the files. >>>>> >>>>> When the device is almost full, it needs about 5 mins to mount the device. >>>>> >>>>> Is it normal? is there any mount option that I can use to reduce the mount time? >>>>> >>>> >>>> >>>> We need to figure out the function taking a longer time (maybe it is >>>> read-block-groups). I have similar reports on the non zoned device >>>> as well (with a few TB full of data). But there is no good data yet >>>> to analyse. >>>> >>>> Could you please collect the trace data from the ftracegraph >>>> from here [1] (It needs trace-cmd). >>>> >>>> [1] https://github.com/asj/btrfstrace.git >>>> >>>> Run it as in the example below: >>>> >>>> umount /btrfs; >>>> >>>> ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" >>>> >>>> cat /tmp/ftracegraph.out >>>> >>>> >>>> Thanks, Anand >>>> >>>> >>>> >>>>> Thank you. >>>>> >>>> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time [not found] ` <CAHQ7scUiNLVD_4---ArBet-0DqzfmmH5Y9JgQY0grYrUv8yhiQ@mail.gmail.com> 2021-08-30 13:04 ` Anand Jain @ 2021-08-30 13:04 ` Anand Jain [not found] ` <CAHQ7scVOYuzz58KcO_fo2rph44CCC46ef=LFVZF8XzoKYq27ig@mail.gmail.com> 2021-09-01 16:11 ` Dāvis Mosāns 1 sibling, 2 replies; 16+ messages in thread From: Anand Jain @ 2021-08-30 13:04 UTC (permalink / raw) To: Jingyun He; +Cc: linux-btrfs open_ctree() took 228254398 us. And 98% of it that is 225418272 us was taken by btrfs_read_block_groups(). ------------------- 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); 0) 0.865 us | btrfs_discard_resume [btrfs](); 0) $ 228254398 us | } /* open_ctree [btrfs] */ ------------------- Now we need to run the same thing on btrfs_read_block_groups(), could you please run.. [1] (no need of the time). [1] $ umount /btrfs; $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" Thanks, Anand On 30/08/2021 14:44, Jingyun He wrote: > Hi, Anand, > I have attached the new result. > Kindly check. > > Thank you. > > On Mon, Aug 30, 2021 at 9:27 AM Anand Jain <anand.jain@oracle.com> wrote: >> >> >> Our open_ctree() took around ~223secs (~3.7mins) >> >> 1) $ 223375750 us | } /* open_ctree [btrfs] */ >> >> Unfortunately, the default trace buffer per CPU (4K) wasn't sufficient >> and, the trace-buffer rolled over. >> So we still don't know how long we spent in btrfs_read_block_groups(). >> Sorry for my mistake we should go 1 step at a time and, we have to do >> this until we narrow it down to a specific function. >> >> Could you please run with the depth = 2 (instead of 3) and use the time >> command prefix. Also, pull a new ftracegraph as I have updated it to >> display a proper time output. >> >> $ ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/vg/scratch0 >> /btrfs" >> >> Thanks, Anand >> >> On 29/08/2021 17:42, Jingyun He wrote: >>> Hi, Anand >>> >>> I have attached the file. >>> Could you kindly check this? >>> >>> Thank you. >>> >>> On Sun, Aug 29, 2021 at 7:47 AM Anand Jain <anand.jain@oracle.com> wrote: >>>> >>>> On 28/08/2021 19:58, Jingyun He wrote: >>>>> Hello, all >>>>> I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to >>>>> btrfs to store the files. >>>>> >>>>> When the device is almost full, it needs about 5 mins to mount the device. >>>>> >>>>> Is it normal? is there any mount option that I can use to reduce the mount time? >>>>> >>>> >>>> >>>> We need to figure out the function taking a longer time (maybe it is >>>> read-block-groups). I have similar reports on the non zoned device >>>> as well (with a few TB full of data). But there is no good data yet >>>> to analyse. >>>> >>>> Could you please collect the trace data from the ftracegraph >>>> from here [1] (It needs trace-cmd). >>>> >>>> [1] https://github.com/asj/btrfstrace.git >>>> >>>> Run it as in the example below: >>>> >>>> umount /btrfs; >>>> >>>> ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" >>>> >>>> cat /tmp/ftracegraph.out >>>> >>>> >>>> Thanks, Anand >>>> >>>> >>>> >>>>> Thank you. >>>>> >>>> ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAHQ7scVOYuzz58KcO_fo2rph44CCC46ef=LFVZF8XzoKYq27ig@mail.gmail.com>]
* Re: btrfs mount takes too long time [not found] ` <CAHQ7scVOYuzz58KcO_fo2rph44CCC46ef=LFVZF8XzoKYq27ig@mail.gmail.com> @ 2021-08-31 11:09 ` Anand Jain 2021-08-31 13:10 ` Johannes Thumshirn [not found] ` <CAHQ7scXp9gTANB9gWHqQgXUsnF-YvTOKrK1waUScmqvHU_0ifw@mail.gmail.com> 0 siblings, 2 replies; 16+ messages in thread From: Anand Jain @ 2021-08-31 11:09 UTC (permalink / raw) To: Jingyun He; +Cc: linux-btrfs btrfs_load_block_group_zone_info () and btrfs_search_slot() were the slowest and, they are called more than 2127 times each (trace buffer has rolled over there isn't all the data). Sorted in the ascending order of time spent.. and bottom few... <snip> 19472.64 us | btrfs_search_slot [btrfs](); 23294.26 us | btrfs_load_block_group_zone_info [btrfs](); 23315.21 us | btrfs_load_block_group_zone_info [btrfs](); 224534963 us | } / btrfs_read_block_groups [btrfs] / Most likely because the zoned devices are slow, and we have to read all the block groups during mount, so there is not much that can be done about it. But let's see what part in btrfs_load_block_group_zone_info() is taking most of the time. Could you please help get this output? $./ftracegraph btrfs_load_block_group_zone_info 2 "*:mod:btrfs" "mount > /dev/vg/scratch0 /btrfs" Thx. On 31/08/2021 01:17, Jingyun He wrote: > Hi, > I have attached the result of > $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount > /dev/vg/scratch0 /btrfs" > > Please kindly check. > Thank you very much. > > On Mon, Aug 30, 2021 at 9:05 PM Anand Jain <anand.jain@oracle.com> wrote: >> >> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >> was taken by btrfs_read_block_groups(). >> >> ------------------- >> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ >> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >> 0) 0.865 us | btrfs_discard_resume [btrfs](); >> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >> ------------------- >> >> Now we need to run the same thing on btrfs_read_block_groups(), >> could you please run.. [1] (no need of the time). >> >> [1] >> $ umount /btrfs; >> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >> /dev/vg/scratch0 /btrfs" >> >> Thanks, Anand >> >> >> On 30/08/2021 14:44, Jingyun He wrote: >>> Hi, Anand, >>> I have attached the new result. >>> Kindly check. >>> >>> Thank you. >>> >>> On Mon, Aug 30, 2021 at 9:27 AM Anand Jain <anand.jain@oracle.com> wrote: >>>> >>>> >>>> Our open_ctree() took around ~223secs (~3.7mins) >>>> >>>> 1) $ 223375750 us | } /* open_ctree [btrfs] */ >>>> >>>> Unfortunately, the default trace buffer per CPU (4K) wasn't sufficient >>>> and, the trace-buffer rolled over. >>>> So we still don't know how long we spent in btrfs_read_block_groups(). >>>> Sorry for my mistake we should go 1 step at a time and, we have to do >>>> this until we narrow it down to a specific function. >>>> >>>> Could you please run with the depth = 2 (instead of 3) and use the time >>>> command prefix. Also, pull a new ftracegraph as I have updated it to >>>> display a proper time output. >>>> >>>> $ ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/vg/scratch0 >>>> /btrfs" >>>> >>>> Thanks, Anand >>>> >>>> On 29/08/2021 17:42, Jingyun He wrote: >>>>> Hi, Anand >>>>> >>>>> I have attached the file. >>>>> Could you kindly check this? >>>>> >>>>> Thank you. >>>>> >>>>> On Sun, Aug 29, 2021 at 7:47 AM Anand Jain <anand.jain@oracle.com> wrote: >>>>>> >>>>>> On 28/08/2021 19:58, Jingyun He wrote: >>>>>>> Hello, all >>>>>>> I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to >>>>>>> btrfs to store the files. >>>>>>> >>>>>>> When the device is almost full, it needs about 5 mins to mount the device. >>>>>>> >>>>>>> Is it normal? is there any mount option that I can use to reduce the mount time? >>>>>>> >>>>>> >>>>>> >>>>>> We need to figure out the function taking a longer time (maybe it is >>>>>> read-block-groups). I have similar reports on the non zoned device >>>>>> as well (with a few TB full of data). But there is no good data yet >>>>>> to analyse. >>>>>> >>>>>> Could you please collect the trace data from the ftracegraph >>>>>> from here [1] (It needs trace-cmd). >>>>>> >>>>>> [1] https://github.com/asj/btrfstrace.git >>>>>> >>>>>> Run it as in the example below: >>>>>> >>>>>> umount /btrfs; >>>>>> >>>>>> ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" >>>>>> >>>>>> cat /tmp/ftracegraph.out >>>>>> >>>>>> >>>>>> Thanks, Anand >>>>>> >>>>>> >>>>>> >>>>>>> Thank you. >>>>>>> >>>>>> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-08-31 11:09 ` Anand Jain @ 2021-08-31 13:10 ` Johannes Thumshirn 2021-08-31 16:47 ` David Sterba [not found] ` <CAHQ7scXp9gTANB9gWHqQgXUsnF-YvTOKrK1waUScmqvHU_0ifw@mail.gmail.com> 1 sibling, 1 reply; 16+ messages in thread From: Johannes Thumshirn @ 2021-08-31 13:10 UTC (permalink / raw) To: Anand Jain, Jingyun He; +Cc: linux-btrfs, Naohiro Aota [ +Cc Naohiro ] On 31/08/2021 14:11, Anand Jain wrote: > But let's see what part in btrfs_load_block_group_zone_info() is taking > most of the time. Could you please help get this output? I'd suspect it's the btrfs_get_dev_zone() call that we need to do for each block group in order to determine the current write pointer to set block_group->alloc_offset. We could speed up the mount process by caching the device's REPORT ZONES response, as we're doing a REPORT ZONES once to get all zones and then again per block group load. On a 14TB SMR drive this results in (14 * 1024 * 1024) / 256 + 1 = 57345 REPORT ZONES calls. OTOH struct blk_zone is 64 bytes per zone resulting in 64 * 57344 = 3584kB data to be cached. So the solution should probably somewhere in between, aka caching X REPORT ZONES responses before we need to do a new one. Byte, Johannes ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-08-31 13:10 ` Johannes Thumshirn @ 2021-08-31 16:47 ` David Sterba 2021-09-07 5:29 ` Naohiro Aota 0 siblings, 1 reply; 16+ messages in thread From: David Sterba @ 2021-08-31 16:47 UTC (permalink / raw) To: Johannes Thumshirn; +Cc: Anand Jain, Jingyun He, linux-btrfs, Naohiro Aota On Tue, Aug 31, 2021 at 01:10:38PM +0000, Johannes Thumshirn wrote: > [ +Cc Naohiro ] > > On 31/08/2021 14:11, Anand Jain wrote: > > But let's see what part in btrfs_load_block_group_zone_info() is taking > > most of the time. Could you please help get this output? > > I'd suspect it's the btrfs_get_dev_zone() call that we need to do for > each block group in order to determine the current write pointer to > set block_group->alloc_offset. > > We could speed up the mount process by caching the device's REPORT ZONES > response, as we're doing a REPORT ZONES once to get all zones and then > again per block group load. On a 14TB SMR drive this results in > (14 * 1024 * 1024) / 256 + 1 = 57345 REPORT ZONES calls. OTOH > struct blk_zone is 64 bytes per zone resulting in 64 * 57344 = 3584kB > data to be cached. > > So the solution should probably somewhere in between, aka caching X > REPORT ZONES responses before we need to do a new one. That sounds like a good option. The number of new members in block group struct has grown due to zoned support, this means higher memory requirements, scaling up with device size. We'll have to do something about that eventually. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-08-31 16:47 ` David Sterba @ 2021-09-07 5:29 ` Naohiro Aota 0 siblings, 0 replies; 16+ messages in thread From: Naohiro Aota @ 2021-09-07 5:29 UTC (permalink / raw) To: dsterba, Johannes Thumshirn, Anand Jain, Jingyun He, linux-btrfs, Naohiro Aota On Tue, Aug 31, 2021 at 06:47:44PM +0200, David Sterba wrote: > On Tue, Aug 31, 2021 at 01:10:38PM +0000, Johannes Thumshirn wrote: > > [ +Cc Naohiro ] > > > > On 31/08/2021 14:11, Anand Jain wrote: > > > But let's see what part in btrfs_load_block_group_zone_info() is taking > > > most of the time. Could you please help get this output? > > > > I'd suspect it's the btrfs_get_dev_zone() call that we need to do for > > each block group in order to determine the current write pointer to > > set block_group->alloc_offset. > > > > We could speed up the mount process by caching the device's REPORT ZONES > > response, as we're doing a REPORT ZONES once to get all zones and then > > again per block group load. On a 14TB SMR drive this results in > > (14 * 1024 * 1024) / 256 + 1 = 57345 REPORT ZONES calls. OTOH > > struct blk_zone is 64 bytes per zone resulting in 64 * 57344 = 3584kB > > data to be cached. > > > > So the solution should probably somewhere in between, aka caching X > > REPORT ZONES responses before we need to do a new one. > > That sounds like a good option. The number of new members in block group > struct has grown due to zoned support, this means higher memory > requirements, scaling up with device size. We'll have to do something > about that eventually. As a reminder, I'm working on to implement the REPORT ZONES caching solution on mount time. But, I'm struggling to fill my btrfs to reproduce the issue due to some other bug (it hang while I'm filling it). So, it will take some more time implementing it. Thanks, ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAHQ7scXp9gTANB9gWHqQgXUsnF-YvTOKrK1waUScmqvHU_0ifw@mail.gmail.com>]
* Re: btrfs mount takes too long time [not found] ` <CAHQ7scXp9gTANB9gWHqQgXUsnF-YvTOKrK1waUScmqvHU_0ifw@mail.gmail.com> @ 2021-09-07 9:52 ` Anand Jain 0 siblings, 0 replies; 16+ messages in thread From: Anand Jain @ 2021-09-07 9:52 UTC (permalink / raw) To: Jingyun He; +Cc: linux-btrfs On 04/09/2021 20:45, Jingyun He wrote: > Hello, Anand > Sorry about the delay, > I have attached the result of > $./ftracegraph btrfs_load_block_group_zone_info 2 "*:mod:btrfs" "mount > /dev/vg/scratch0 /btrfs" The last trace has 8514 calls to btrfs_get_dev_zones() and, each takes around ~1.2ms to ~3ms. Naohiro is planning to cache the report zones. It would help. Thanks, Anand > Please kindly check. > > Thank you > > On Tue, Aug 31, 2021 at 7:09 PM Anand Jain <anand.jain@oracle.com> wrote: >> >> >> btrfs_load_block_group_zone_info () and btrfs_search_slot() were the >> slowest and, they are called more than 2127 times each (trace buffer >> has rolled over there isn't all the data). >> >> Sorted in the ascending order of time spent.. and bottom few... >> <snip> >> 19472.64 us | btrfs_search_slot [btrfs](); >> 23294.26 us | btrfs_load_block_group_zone_info [btrfs](); >> 23315.21 us | btrfs_load_block_group_zone_info [btrfs](); >> 224534963 us | } / btrfs_read_block_groups [btrfs] / >> >> Most likely because the zoned devices are slow, and we have to read >> all the block groups during mount, so there is not much that can be >> done about it. >> >> But let's see what part in btrfs_load_block_group_zone_info() is taking >> most of the time. Could you please help get this output? >> >> $./ftracegraph btrfs_load_block_group_zone_info 2 "*:mod:btrfs" "mount >> > /dev/vg/scratch0 /btrfs" >> >> Thx. >> >> >> On 31/08/2021 01:17, Jingyun He wrote: >>> Hi, >>> I have attached the result of >>> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >>> /dev/vg/scratch0 /btrfs" >>> >>> Please kindly check. >>> Thank you very much. >>> >>> On Mon, Aug 30, 2021 at 9:05 PM Anand Jain <anand.jain@oracle.com> wrote: >>>> >>>> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >>>> was taken by btrfs_read_block_groups(). >>>> >>>> ------------------- >>>> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ >>>> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >>>> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >>>> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >>>> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >>>> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >>>> 0) 0.865 us | btrfs_discard_resume [btrfs](); >>>> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >>>> ------------------- >>>> >>>> Now we need to run the same thing on btrfs_read_block_groups(), >>>> could you please run.. [1] (no need of the time). >>>> >>>> [1] >>>> $ umount /btrfs; >>>> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >>>> /dev/vg/scratch0 /btrfs" >>>> >>>> Thanks, Anand >>>> >>>> >>>> On 30/08/2021 14:44, Jingyun He wrote: >>>>> Hi, Anand, >>>>> I have attached the new result. >>>>> Kindly check. >>>>> >>>>> Thank you. >>>>> >>>>> On Mon, Aug 30, 2021 at 9:27 AM Anand Jain <anand.jain@oracle.com> wrote: >>>>>> >>>>>> >>>>>> Our open_ctree() took around ~223secs (~3.7mins) >>>>>> >>>>>> 1) $ 223375750 us | } /* open_ctree [btrfs] */ >>>>>> >>>>>> Unfortunately, the default trace buffer per CPU (4K) wasn't sufficient >>>>>> and, the trace-buffer rolled over. >>>>>> So we still don't know how long we spent in btrfs_read_block_groups(). >>>>>> Sorry for my mistake we should go 1 step at a time and, we have to do >>>>>> this until we narrow it down to a specific function. >>>>>> >>>>>> Could you please run with the depth = 2 (instead of 3) and use the time >>>>>> command prefix. Also, pull a new ftracegraph as I have updated it to >>>>>> display a proper time output. >>>>>> >>>>>> $ ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/vg/scratch0 >>>>>> /btrfs" >>>>>> >>>>>> Thanks, Anand >>>>>> >>>>>> On 29/08/2021 17:42, Jingyun He wrote: >>>>>>> Hi, Anand >>>>>>> >>>>>>> I have attached the file. >>>>>>> Could you kindly check this? >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> On Sun, Aug 29, 2021 at 7:47 AM Anand Jain <anand.jain@oracle.com> wrote: >>>>>>>> >>>>>>>> On 28/08/2021 19:58, Jingyun He wrote: >>>>>>>>> Hello, all >>>>>>>>> I'm new to btrfs, I have a HM-SMR 14TB disk, I have formatted it to >>>>>>>>> btrfs to store the files. >>>>>>>>> >>>>>>>>> When the device is almost full, it needs about 5 mins to mount the device. >>>>>>>>> >>>>>>>>> Is it normal? is there any mount option that I can use to reduce the mount time? >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> We need to figure out the function taking a longer time (maybe it is >>>>>>>> read-block-groups). I have similar reports on the non zoned device >>>>>>>> as well (with a few TB full of data). But there is no good data yet >>>>>>>> to analyse. >>>>>>>> >>>>>>>> Could you please collect the trace data from the ftracegraph >>>>>>>> from here [1] (It needs trace-cmd). >>>>>>>> >>>>>>>> [1] https://github.com/asj/btrfstrace.git >>>>>>>> >>>>>>>> Run it as in the example below: >>>>>>>> >>>>>>>> umount /btrfs; >>>>>>>> >>>>>>>> ./ftracegraph open_ctree 3 "*:mod:btrfs" "mount /dev/vg/scratch0 /btrfs" >>>>>>>> >>>>>>>> cat /tmp/ftracegraph.out >>>>>>>> >>>>>>>> >>>>>>>> Thanks, Anand >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-08-30 13:04 ` Anand Jain [not found] ` <CAHQ7scVOYuzz58KcO_fo2rph44CCC46ef=LFVZF8XzoKYq27ig@mail.gmail.com> @ 2021-09-01 16:11 ` Dāvis Mosāns 2021-09-01 21:30 ` Anand Jain 2021-09-01 23:42 ` Qu Wenruo 1 sibling, 2 replies; 16+ messages in thread From: Dāvis Mosāns @ 2021-09-01 16:11 UTC (permalink / raw) To: Anand Jain; +Cc: Jingyun He, Btrfs BTRFS, Johannes Thumshirn, David Sterba pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain (<anand.jain@oracle.com>) rakstīja: > > open_ctree() took 228254398 us. And 98% of it that is 225418272 us > was taken by btrfs_read_block_groups(). > > ------------------- > 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ > 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); > 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); > 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); > 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); > 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); > 0) 0.865 us | btrfs_discard_resume [btrfs](); > 0) $ 228254398 us | } /* open_ctree [btrfs] */ > ------------------- > > Now we need to run the same thing on btrfs_read_block_groups(), > could you please run.. [1] (no need of the time). > > [1] > $ umount /btrfs; > $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount > /dev/vg/scratch0 /btrfs" > > Thanks, Anand > > Hi, I also have a btrfs filesystem that takes a while to mount. So I'm interested if this could be improved. $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" kernel.ftrace_enabled = 1 real 1m33,638s user 0m0,000s sys 0m1,130s Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. Best regards, Dāvis ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-09-01 16:11 ` Dāvis Mosāns @ 2021-09-01 21:30 ` Anand Jain 2021-09-02 16:56 ` Dāvis Mosāns 2021-09-02 17:34 ` Joshua 2021-09-01 23:42 ` Qu Wenruo 1 sibling, 2 replies; 16+ messages in thread From: Anand Jain @ 2021-09-01 21:30 UTC (permalink / raw) To: Dāvis Mosāns Cc: Jingyun He, Btrfs BTRFS, Johannes Thumshirn, David Sterba On 02/09/2021 00:11, Dāvis Mosāns wrote: > pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain > (<anand.jain@oracle.com>) rakstīja: >> >> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >> was taken by btrfs_read_block_groups(). >> >> ------------------- >> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ >> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >> 0) 0.865 us | btrfs_discard_resume [btrfs](); >> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >> ------------------- >> >> Now we need to run the same thing on btrfs_read_block_groups(), >> could you please run.. [1] (no need of the time). >> >> [1] >> $ umount /btrfs; >> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >> /dev/vg/scratch0 /btrfs" >> >> Thanks, Anand >> >> > > Hi, > > I also have a btrfs filesystem that takes a while to mount. > So I'm interested if this could be improved. > > $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o > space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" It is better if we don't use the time prefix for the mount command here. The ftrace, traces time syscall as well, which is unessential. And we lose a lot of trace-buffer to it. > kernel.ftrace_enabled = 1 > > real 1m33,638s > user 0m0,000s > sys 0m1,130s > > Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz > > The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. So here is a case of a non-zoned device. Again it is btrfs_read_block_groups() which is taking ~98% of the time. 3) $ 91607669 us | } /* btrfs_read_block_groups [btrfs] */ 3) # 9399.566 us | btrfs_check_rw_degradable [btrfs](); 3) 0.922 us | btrfs_apply_pending_changes [btrfs](); 3) ! 186.540 us | btrfs_read_qgroup_config [btrfs](); 3) * 26109.92 us | btrfs_get_root_ref [btrfs](); 3) + 23.965 us | btrfs_start_pre_rw_mount [btrfs](); 3) 1.192 us | btrfs_discard_resume [btrfs](); 3) $ 93501136 us | } /* open_ctree [btrfs] */ Could we pls get this? $ ./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount ..." Hopefully, there won't be a trace-buffer rollover here, as we saw in the other case so that we could account for all the time spent. Also, let's understand how many block groups are there. $ btrfs in dump-tree <dev> | grep BLOCK_GROUP_ITEM | wc -l Thx, Anand > Best regards, > Dāvis > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-09-01 21:30 ` Anand Jain @ 2021-09-02 16:56 ` Dāvis Mosāns 2021-09-03 9:56 ` Anand Jain 2021-09-02 17:34 ` Joshua 1 sibling, 1 reply; 16+ messages in thread From: Dāvis Mosāns @ 2021-09-02 16:56 UTC (permalink / raw) To: Anand Jain; +Cc: Jingyun He, Btrfs BTRFS, Johannes Thumshirn, David Sterba ceturtd., 2021. g. 2. sept., plkst. 00:31 — lietotājs Anand Jain (<anand.jain@oracle.com>) rakstīja: > > On 02/09/2021 00:11, Dāvis Mosāns wrote: > > pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain > > (<anand.jain@oracle.com>) rakstīja: > >> > >> open_ctree() took 228254398 us. And 98% of it that is 225418272 us > >> was taken by btrfs_read_block_groups(). > >> > >> ------------------- > >> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ > >> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); > >> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); > >> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); > >> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); > >> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); > >> 0) 0.865 us | btrfs_discard_resume [btrfs](); > >> 0) $ 228254398 us | } /* open_ctree [btrfs] */ > >> ------------------- > >> > >> Now we need to run the same thing on btrfs_read_block_groups(), > >> could you please run.. [1] (no need of the time). > >> > >> [1] > >> $ umount /btrfs; > >> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount > >> /dev/vg/scratch0 /btrfs" > >> > >> Thanks, Anand > >> > >> > > > > Hi, > > > > I also have a btrfs filesystem that takes a while to mount. > > So I'm interested if this could be improved. > > > > $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o > > space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" > > It is better if we don't use the time prefix for the mount command > here. The ftrace, traces time syscall as well, which is unessential. > And we lose a lot of trace-buffer to it. > > > kernel.ftrace_enabled = 1 > > > > real 1m33,638s > > user 0m0,000s > > sys 0m1,130s > > > > Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz > > > > The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. > > So here is a case of a non-zoned device. > > Again it is btrfs_read_block_groups() which is taking ~98% of the time. > > 3) $ 91607669 us | } /* btrfs_read_block_groups [btrfs] */ > 3) # 9399.566 us | btrfs_check_rw_degradable [btrfs](); > 3) 0.922 us | btrfs_apply_pending_changes [btrfs](); > 3) ! 186.540 us | btrfs_read_qgroup_config [btrfs](); > 3) * 26109.92 us | btrfs_get_root_ref [btrfs](); > 3) + 23.965 us | btrfs_start_pre_rw_mount [btrfs](); > 3) 1.192 us | btrfs_discard_resume [btrfs](); > 3) $ 93501136 us | } /* open_ctree [btrfs] */ > > Could we pls get this? > > $ ./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount ..." > > Hopefully, there won't be a trace-buffer rollover here, as we saw in > the other case so that we could account for all the time spent. Sure, here https://dāvis.lv/files/ftracegraph_v2.out.gz > Also, let's understand how many block groups are there. > > $ btrfs in dump-tree <dev> | grep BLOCK_GROUP_ITEM | wc -l It's 22660 Also by the way `-t EXTENT_TREE` should be faster Best regards, Dāvis ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-09-02 16:56 ` Dāvis Mosāns @ 2021-09-03 9:56 ` Anand Jain 0 siblings, 0 replies; 16+ messages in thread From: Anand Jain @ 2021-09-03 9:56 UTC (permalink / raw) To: Dāvis Mosāns Cc: Jingyun He, Btrfs BTRFS, Johannes Thumshirn, David Sterba On 03/09/2021 00:56, Dāvis Mosāns wrote: > ceturtd., 2021. g. 2. sept., plkst. 00:31 — lietotājs Anand Jain > (<anand.jain@oracle.com>) rakstīja: >> >> On 02/09/2021 00:11, Dāvis Mosāns wrote: >>> pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain >>> (<anand.jain@oracle.com>) rakstīja: >>>> >>>> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >>>> was taken by btrfs_read_block_groups(). >>>> >>>> ------------------- >>>> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ >>>> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >>>> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >>>> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >>>> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >>>> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >>>> 0) 0.865 us | btrfs_discard_resume [btrfs](); >>>> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >>>> ------------------- >>>> >>>> Now we need to run the same thing on btrfs_read_block_groups(), >>>> could you please run.. [1] (no need of the time). >>>> >>>> [1] >>>> $ umount /btrfs; >>>> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >>>> /dev/vg/scratch0 /btrfs" >>>> >>>> Thanks, Anand >>>> >>>> >>> >>> Hi, >>> >>> I also have a btrfs filesystem that takes a while to mount. >>> So I'm interested if this could be improved. >>> >>> $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o >>> space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" >> >> It is better if we don't use the time prefix for the mount command >> here. The ftrace, traces time syscall as well, which is unessential. >> And we lose a lot of trace-buffer to it. >> >>> kernel.ftrace_enabled = 1 >>> >>> real 1m33,638s >>> user 0m0,000s >>> sys 0m1,130s >>> >>> Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz >>> >>> The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. >> >> So here is a case of a non-zoned device. >> >> Again it is btrfs_read_block_groups() which is taking ~98% of the time. >> >> 3) $ 91607669 us | } /* btrfs_read_block_groups [btrfs] */ >> 3) # 9399.566 us | btrfs_check_rw_degradable [btrfs](); >> 3) 0.922 us | btrfs_apply_pending_changes [btrfs](); >> 3) ! 186.540 us | btrfs_read_qgroup_config [btrfs](); >> 3) * 26109.92 us | btrfs_get_root_ref [btrfs](); >> 3) + 23.965 us | btrfs_start_pre_rw_mount [btrfs](); >> 3) 1.192 us | btrfs_discard_resume [btrfs](); >> 3) $ 93501136 us | } /* open_ctree [btrfs] */ >> >> Could we pls get this? >> >> $ ./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount ..." >> >> Hopefully, there won't be a trace-buffer rollover here, as we saw in >> the other case so that we could account for all the time spent. > > Sure, here https://dāvis.lv/files/ftracegraph_v2.out.gz Ok. For conventional devices, it's only btrfs_search_slot() contributing to the mount time significantly. > >> Also, let's understand how many block groups are there. >> >> $ btrfs in dump-tree <dev> | grep BLOCK_GROUP_ITEM | wc -l > > It's 22660 > Also by the way `-t EXTENT_TREE` should be faster > > Best regards, > Dāvis > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-09-01 21:30 ` Anand Jain 2021-09-02 16:56 ` Dāvis Mosāns @ 2021-09-02 17:34 ` Joshua 1 sibling, 0 replies; 16+ messages in thread From: Joshua @ 2021-09-02 17:34 UTC (permalink / raw) To: Dāvis Mosāns, Anand Jain, Btrfs BTRFS Cc: Jingyun He, Johannes Thumshirn, David Sterba September 2, 2021 9:56 AM, "Dāvis Mosāns" <davispuh@gmail.com> wrote: > ceturtd., 2021. g. 2. sept., plkst. 00:31 — lietotājs Anand Jain > (<anand.jain@oracle.com>) rakstīja: > >> On 02/09/2021 00:11, Dāvis Mosāns wrote: >> pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain >> (<anand.jain@oracle.com>) rakstīja: >> >> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >> was taken by btrfs_read_block_groups(). >> >> ------------------- >> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ >> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >> 0) 0.865 us | btrfs_discard_resume [btrfs](); >> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >> ------------------- >> >> Now we need to run the same thing on btrfs_read_block_groups(), >> could you please run.. [1] (no need of the time). >> >> [1] >> $ umount /btrfs; >> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >> /dev/vg/scratch0 /btrfs" >> >> Thanks, Anand >> >> Hi, >> >> I also have a btrfs filesystem that takes a while to mount. >> So I'm interested if this could be improved. >> >> $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o >> space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" >> >> It is better if we don't use the time prefix for the mount command >> here. The ftrace, traces time syscall as well, which is unessential. >> And we lose a lot of trace-buffer to it. >> >> kernel.ftrace_enabled = 1 >> >> real 1m33,638s >> user 0m0,000s >> sys 0m1,130s >> >> Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz >> >> The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. >> >> So here is a case of a non-zoned device. >> >> Again it is btrfs_read_block_groups() which is taking ~98% of the time. >> >> 3) $ 91607669 us | } /* btrfs_read_block_groups [btrfs] */ >> 3) # 9399.566 us | btrfs_check_rw_degradable [btrfs](); >> 3) 0.922 us | btrfs_apply_pending_changes [btrfs](); >> 3) ! 186.540 us | btrfs_read_qgroup_config [btrfs](); >> 3) * 26109.92 us | btrfs_get_root_ref [btrfs](); >> 3) + 23.965 us | btrfs_start_pre_rw_mount [btrfs](); >> 3) 1.192 us | btrfs_discard_resume [btrfs](); >> 3) $ 93501136 us | } /* open_ctree [btrfs] */ >> >> Could we pls get this? >> >> $ ./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount ..." >> >> Hopefully, there won't be a trace-buffer rollover here, as we saw in >> the other case so that we could account for all the time spent. > > Sure, here https://dāvis.lv/files/ftracegraph_v2.out.gz > >> Also, let's understand how many block groups are there. >> >> $ btrfs in dump-tree <dev> | grep BLOCK_GROUP_ITEM | wc -l > > It's 22660 > Also by the way `-t EXTENT_TREE` should be faster Just to add more data, I have a 10-device raid1 array with ~86TB used. My results: root@SERVER:~# btrfs in dump-tree -t EXTENT_TREE /dev/sdf | grep BLOCK_GROUP_ITEM | wc -l 44986 I estimate it takes ~83 seconds to mount. (by looking at dmesg) It sometimes times out on boot and drops to recovery mode, as systemd will only wait 90 seconds, and for some reason 'x-systemd.mount-timeout=500' doesn't work Note that I defragment the subvolume and extent tree for each subvolume weekly, as it seems to reduce my mount times, and make it *usually* boot without dropping to recovery mode. --Joshua ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: btrfs mount takes too long time 2021-09-01 16:11 ` Dāvis Mosāns 2021-09-01 21:30 ` Anand Jain @ 2021-09-01 23:42 ` Qu Wenruo 1 sibling, 0 replies; 16+ messages in thread From: Qu Wenruo @ 2021-09-01 23:42 UTC (permalink / raw) To: Dāvis Mosāns, Anand Jain Cc: Jingyun He, Btrfs BTRFS, Johannes Thumshirn, David Sterba On 2021/9/2 上午12:11, Dāvis Mosāns wrote: > pirmd., 2021. g. 30. aug., plkst. 16:08 — lietotājs Anand Jain > (<anand.jain@oracle.com>) rakstīja: >> >> open_ctree() took 228254398 us. And 98% of it that is 225418272 us >> was taken by btrfs_read_block_groups(). >> >> ------------------- >> 1) $ 225418272 us | } /* btrfs_read_block_groups [btrfs] */ As long as it's not zoned device, this is a known problem. Block group items by its original design is scattered across the huge extent tree. Grabbing them all needs quite a lot of random IO, for large fs it's very time consuming. I have purposed a skinny bg tree design to put all block group items into one tree, so that would greatly speed up the whole mount process. And with user-space tool to do the convert. But recently Josef is developing a completely new extent tree design, which will include a similar design for it. For know I'm not sure what's the way to go, either we can wait for extent-tree-v2, or should we push for the middle ground skinny bg tree? David, Any idea on this? Thanks, Qu >> 1) * 16934.96 us | btrfs_check_rw_degradable [btrfs](); >> 0) 0.967 us | btrfs_apply_pending_changes [btrfs](); >> 0) 0.239 us | btrfs_read_qgroup_config [btrfs](); >> 0) * 21017.34 us | btrfs_get_root_ref [btrfs](); >> 0) + 15.717 us | btrfs_start_pre_rw_mount [btrfs](); >> 0) 0.865 us | btrfs_discard_resume [btrfs](); >> 0) $ 228254398 us | } /* open_ctree [btrfs] */ >> ------------------- >> >> Now we need to run the same thing on btrfs_read_block_groups(), >> could you please run.. [1] (no need of the time). >> >> [1] >> $ umount /btrfs; >> $./ftracegraph btrfs_read_block_groups 2 "*:mod:btrfs" "mount >> /dev/vg/scratch0 /btrfs" >> >> Thanks, Anand >> >> > > Hi, > > I also have a btrfs filesystem that takes a while to mount. > So I'm interested if this could be improved. > > $ ./ftracegraph open_ctree 2 "*:mod:btrfs" "time mount /dev/md127 -o > space_cache=v2,compress=zstd,acl,subvol=Data /mnt/Data/" > kernel.ftrace_enabled = 1 > > real 1m33,638s > user 0m0,000s > sys 0m1,130s > > Here's the trace output https://dāvis.lv/files/ftracegraph.out.gz > > The filesystem is on top of RAID6 mdadm array which is from 9x 3TB HDDs. > > Best regards, > Dāvis > ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2021-09-07 9:53 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-28 11:58 btrfs mount takes too long time Jingyun He 2021-08-28 23:47 ` Anand Jain [not found] ` <CAHQ7scVkHp8Lcfxx2QZXv2ghkW-nYpiFGntyZa0Toz2hU3S-tQ@mail.gmail.com> 2021-08-30 1:27 ` Anand Jain [not found] ` <CAHQ7scUiNLVD_4---ArBet-0DqzfmmH5Y9JgQY0grYrUv8yhiQ@mail.gmail.com> 2021-08-30 13:04 ` Anand Jain 2021-08-30 13:04 ` Anand Jain [not found] ` <CAHQ7scVOYuzz58KcO_fo2rph44CCC46ef=LFVZF8XzoKYq27ig@mail.gmail.com> 2021-08-31 11:09 ` Anand Jain 2021-08-31 13:10 ` Johannes Thumshirn 2021-08-31 16:47 ` David Sterba 2021-09-07 5:29 ` Naohiro Aota [not found] ` <CAHQ7scXp9gTANB9gWHqQgXUsnF-YvTOKrK1waUScmqvHU_0ifw@mail.gmail.com> 2021-09-07 9:52 ` Anand Jain 2021-09-01 16:11 ` Dāvis Mosāns 2021-09-01 21:30 ` Anand Jain 2021-09-02 16:56 ` Dāvis Mosāns 2021-09-03 9:56 ` Anand Jain 2021-09-02 17:34 ` Joshua 2021-09-01 23:42 ` Qu Wenruo
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.