* REPORT: FIO random read performance degradation without "norandommap" option
@ 2015-01-09 22:42 Kudryavtsev, Andrey O
  2015-01-09 22:49 ` Elliott, Robert (Server Storage)
  0 siblings, 1 reply; 7+ messages in thread

From: Kudryavtsev, Andrey O @ 2015-01-09 22:42 UTC (permalink / raw)
To: fio

[-- Attachment #1: Type: text/plain, Size: 1033 bytes --]

Colleagues,
I ran 2-hour 4K random read (4KRR) tests to understand how performance
changes over time on a specific, very fast NVMe SSD with 1.6 TB
capacity. I noticed a side effect of the "norandommap" parameter when
performing a full-span test on the block device. Here is an example
result with the random map (i.e., without the "norandommap" option)
over a 120-minute window:

[cid:E6872B64-35D1-4447-A0CF-32E6411D9BDB]
(IOPS in blue)

As soon as I enabled the "norandommap" option, the curve changed into a
straight line, as expected.

Some technical details: I'm running CentOS 7 with a 3.18 kernel; the
SSD is, of course, in a preconditioned state. FIO 2.2.2 (I
unfortunately got higher CPU utilization with 2.2.4, which I'll report
separately).

Config file:

[global]
name=4k random read
filename=/dev/nvme0n1
ioengine=libaio
direct=1
bs=4k
rw=randread
iodepth=16
numjobs=8
buffered=0
size=100%
randrepeat=0
norandommap
refill_buffers
group_reporting

[job1]

--
Andrey Kudryavtsev,
SSD Solution Architect
Intel Corp.

[-- Attachment #2: Screen Shot 2015-01-09 at 2.33.53 PM.png --]
[-- Type: image/png, Size: 15762 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* RE: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-09 22:42 REPORT: FIO random read performance degradation without "norandommap" option Kudryavtsev, Andrey O
@ 2015-01-09 22:49 ` Elliott, Robert (Server Storage)
  2015-01-09 23:01   ` Kudryavtsev, Andrey O
  2015-01-09 23:07   ` Kulkarni, Vasu
  0 siblings, 2 replies; 7+ messages in thread

From: Elliott, Robert (Server Storage) @ 2015-01-09 22:49 UTC (permalink / raw)
To: Kudryavtsev, Andrey O, fio

> -----Original Message-----
> From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
> Behalf Of Kudryavtsev, Andrey O
> Sent: Friday, 09 January, 2015 4:42 PM
> To: fio@vger.kernel.org
> Subject: REPORT: FIO random read performance degradation without
> "norandommap" option
>
> Colleagues,
> I executed 2-hour runs of 4KRR to understand performance changes across
> the time on the specific very fast NVMe SSD with 1.6TB capacity.
> I noticed the side effect of "norandommap" parameter performing full
> span test on the block device.
> Here is the example of the result with random map (I.e. without
> "norandommap" option) within 120 minutes windows.
> [cid:E6872B64-35D1-4447-A0CF-32E6411D9BDB]
> (IOPS in blue)
>
> As soon as I enabled "norandommap" option the curve has changed into the
> straight line as expected.

It takes resources to maintain the random map table. I always run with
norandommap unless using verify, which has to remember which ones have
been accessed.

Here's a description I gave someone a while back:

With a huge device (e.g., 5.8 TB from RAID-0 made from 16 SSDs), if you
do not use "norandommap", fio allocates a bitmap for all the disk
blocks to keep track of where it has read or written. It uses this to
avoid accessing the same blocks until all the blocks have been
accessed, and to know which blocks it needs to verify if
verify=<something> is enabled.

For 5.8 TB, that is 1562714136 = 1.5 GB. Not many of those huge
allocations work, so it:
* hangs the system for a while
* generates estimates like
    [eta 1158050440d:06h:50m:22s]
* and eventually reports
    smalloc: failed adding pool
    fio: failed allocating random map. If running a large number of
    jobs, try the 'norandommap' option or set 'softrandommap'. Or give
    a larger --alloc-size to fio.

fio continues to run after that; I think it verifies only the devices
for which the allocation worked and ignores the rest.

---
Rob Elliott    HP Server Storage

^ permalink raw reply	[flat|nested] 7+ messages in thread
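[Editorial aside: the sizing discussion above can be sketched with some hedged arithmetic. The helper below is a back-of-envelope model, not fio's actual allocator: it assumes roughly one tracking bit per minimum-sized block. A raw 1-bit-per-block map for 5.8 TB at 4K blocks comes to well under the quoted 1.5 GB, so the observed footprint presumably includes per-version bookkeeping and smalloc pool overhead beyond this simple model.]

```python
def randommap_size_bytes(capacity_bytes: int, block_size: int = 4096,
                         bits_per_block: float = 1.0) -> int:
    """Back-of-envelope random-map size: ~1 tracking bit per block.

    Illustrative model only; fio's real map adds per-version and
    smalloc-pool overhead, so actual usage can be considerably larger.
    """
    blocks = capacity_bytes // block_size
    return int(blocks * bits_per_block) // 8

# The 5.8 TB RAID-0 example from the message above, with 4K blocks:
capacity = 5_800_000_000_000
print(f"{randommap_size_bytes(capacity) / 2**20:.0f} MiB "
      f"for {capacity // 4096:,} blocks")   # ~169 MiB raw bitmap
```

Either way, the map scales linearly with device capacity and inversely with block size, which is why huge devices at small block sizes are where the allocation fails.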
* Re: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-09 22:49 ` Elliott, Robert (Server Storage)
@ 2015-01-09 23:01 ` Kudryavtsev, Andrey O
  2015-01-10  9:20   ` Andrey Kuzmin
  2015-01-09 23:07   ` Kulkarni, Vasu
  1 sibling, 1 reply; 7+ messages in thread

From: Kudryavtsev, Andrey O @ 2015-01-09 23:01 UTC (permalink / raw)
To: Elliott, Robert (Server Storage), fio

Robert, thanks for the reply.
That exactly confirms my findings. Any idea whether the LBA map is
generated per job when multiple numjobs are used?

The fact that performance recovered after the degradation was
completely confusing too. I'm glad the solution was so easy.

Jens, is there any chance to add a note about this to the
documentation?

--
Andrey Kudryavtsev,
SSD Solution Architect
Intel Corp.

On 1/9/15, 2:49 PM, "Elliott, Robert (Server Storage)" <Elliott@hp.com>
wrote:
[quoted message trimmed]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-09 23:01 ` Kudryavtsev, Andrey O
@ 2015-01-10  9:20 ` Andrey Kuzmin
  2015-01-12 21:06   ` Kudryavtsev, Andrey O
  0 siblings, 1 reply; 7+ messages in thread

From: Andrey Kuzmin @ 2015-01-10 9:20 UTC (permalink / raw)
To: Kudryavtsev, Andrey O; +Cc: Elliott, Robert (Server Storage), fio

On Sat, Jan 10, 2015 at 2:01 AM, Kudryavtsev, Andrey O
<andrey.o.kudryavtsev@intel.com> wrote:
> Robert, thanks for the reply.
> That's exactly confirms my findings. Any idea if LBA map is generated
> for every job in case of multiple numjobs?
>
> The fact of the performance recovery after degradation was completely
> confusing too. I'm glad the solution was so easy.

My guess is the problem is not with the option itself, but with the way
you're trying to use it. randommap tracks written blocks to avoid
rewrites, in particular when one wants to verify data integrity after
the run. In read/write workloads, the option might be utilized to avoid
reading unallocated space on a thin-provisioned volume. In read-only
workloads, there is no point in using the random map.

Regarding the random map's RAM footprint, it's quite modest (around
1.01 bits per minimum block size of the workload, 4K in your case), so
one should only see memory pressure with capacities in the tens-of-TB
range.

Regards,
Andrey

[remainder of quoted thread trimmed]

^ permalink raw reply	[flat|nested] 7+ messages in thread
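[Editorial aside: the footprint figure above can be sanity-checked with a little arithmetic. The snippet is illustrative only, since actual usage depends on the fio version and its allocator; it applies the quoted ~1.01 bits per 4K block to the 1.6 TB device from the original report, confirming that memory pressure should not be an issue at that capacity.]

```python
# Back-of-envelope check of the ~1.01-bits-per-block figure quoted
# above, applied to the 1.6 TB device from the original report.
capacity = 1_600_000_000_000      # 1.6 TB, decimal
blocks = capacity // 4096         # minimum block size of the workload (4K)
map_mib = blocks * 1.01 / 8 / 2**20
print(f"{map_mib:.0f} MiB")       # roughly 47 MiB -- modest, as stated
```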
* Re: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-10  9:20 ` Andrey Kuzmin
@ 2015-01-12 21:06 ` Kudryavtsev, Andrey O
  2015-01-12 22:56   ` Elliott, Robert (Server Storage)
  0 siblings, 1 reply; 7+ messages in thread

From: Kudryavtsev, Andrey O @ 2015-01-12 21:06 UTC (permalink / raw)
To: Andrey Kuzmin; +Cc: Elliott, Robert (Server Storage), fio

Hi Andrey,
Thank you, that explains the purpose of this feature. Unfortunately it
was not clear to me from the manual; I wrongly assumed it increased the
randomness of the random generator's distribution. Thanks again!

I'd propose auto-setting "norandommap" when a "randread" workload is
specified. Does that make sense?

--
Andrey Kudryavtsev,
SSD Solution Architect
Intel Corp.

On 1/10/15, 1:20 AM, "Andrey Kuzmin" <andrey.v.kuzmin@gmail.com> wrote:
[quoted message trimmed]

^ permalink raw reply	[flat|nested] 7+ messages in thread
* RE: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-12 21:06 ` Kudryavtsev, Andrey O
@ 2015-01-12 22:56 ` Elliott, Robert (Server Storage)
  0 siblings, 0 replies; 7+ messages in thread

From: Elliott, Robert (Server Storage) @ 2015-01-12 22:56 UTC (permalink / raw)
To: Kudryavtsev, Andrey O, Andrey Kuzmin; +Cc: fio

> -----Original Message-----
> From: Kudryavtsev, Andrey O [mailto:andrey.o.kudryavtsev@intel.com]
> Sent: Monday, 12 January, 2015 3:06 PM
> To: Andrey Kuzmin
> Cc: Elliott, Robert (Server Storage); fio@vger.kernel.org
> Subject: Re: REPORT: FIO random read performance degradation without
> "norandommap" option
>
> Hi Andrey,
> Thank you, it explains the purpose of this feature. Unfortunately it was
> not clear to me from the manual, I faulty supposed it increases the
> randomness of the random generator distribution. Thanks again!
>
> I'd propose to have it auto set to "norandommap" if "randread" workload
> is specified. Does it make sense?

I don't think I'd change it. For randread, randommap ensures that each
LBA has been accessed once before it repeats reading any of the LBAs.
That could be useful in some cases, like scanning the media (ensuring
you've read each LBA).

---
Rob Elliott    HP Server Storage

^ permalink raw reply	[flat|nested] 7+ messages in thread
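[Editorial aside: the coverage guarantee described above can be illustrated with a toy simulation; this is a hypothetical model, not fio's actual code. Sampling block numbers without replacement, as the random map effectively enforces, touches every block exactly once per pass, while independent uniform sampling, the norandommap behavior, leaves roughly 1/e of the blocks untouched after N I/Os over N blocks.]

```python
import random

def fraction_covered(n_blocks: int, n_ios: int, use_map: bool) -> float:
    """Toy model: fraction of distinct blocks read after n_ios random reads."""
    if use_map:
        # randommap-like: no block repeats until every block has been read
        order = list(range(n_blocks))
        random.shuffle(order)
        touched = set(order[:n_ios])
    else:
        # norandommap-like: independent uniform picks, repeats allowed
        touched = {random.randrange(n_blocks) for _ in range(n_ios)}
    return len(touched) / n_blocks

random.seed(1)
print(fraction_covered(100_000, 100_000, use_map=True))   # exactly 1.0
print(fraction_covered(100_000, 100_000, use_map=False))  # ~0.63, i.e. 1 - 1/e
```

So for a media-scan-style randread the map buys guaranteed coverage per pass, while for steady-state performance measurement it only adds bookkeeping overhead.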
* Re: REPORT: FIO random read performance degradation without "norandommap" option
  2015-01-09 22:49 ` Elliott, Robert (Server Storage)
  2015-01-09 23:01 ` Kudryavtsev, Andrey O
@ 2015-01-09 23:07 ` Kulkarni, Vasu
  1 sibling, 0 replies; 7+ messages in thread

From: Kulkarni, Vasu @ 2015-01-09 23:07 UTC (permalink / raw)
To: Elliott, Robert (Server Storage); +Cc: Kudryavtsev, Andrey O, fio

Excellent!! At least this explains how to get out of the crazy ETAs. I
think the hang may be due to low RAM combined with high
iodepth/blocksize/numjobs.

On Fri, Jan 9, 2015 at 2:49 PM, Elliott, Robert (Server Storage)
<Elliott@hp.com> wrote:
[quoted message trimmed]

^ permalink raw reply	[flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-01-12 22:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-09 22:42 REPORT: FIO random read performance degradation without "norandommap" option Kudryavtsev, Andrey O
2015-01-09 22:49 ` Elliott, Robert (Server Storage)
2015-01-09 23:01   ` Kudryavtsev, Andrey O
2015-01-10  9:20     ` Andrey Kuzmin
2015-01-12 21:06       ` Kudryavtsev, Andrey O
2015-01-12 22:56         ` Elliott, Robert (Server Storage)
2015-01-09 23:07   ` Kulkarni, Vasu