* Fio 2.1.5 release upcoming
@ 2014-02-06 19:21 Jens Axboe
  2014-02-07  3:44 ` Mutex destruction, invalid memory accesses, leaks Sitsofe Wheeler
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Jens Axboe @ 2014-02-06 19:21 UTC (permalink / raw)
  To: fio

Hi,

I've been late on this release, originally wanted it out around
Christmas. But various issues here and there prevented that. However, I
now think we are getting pretty close. It'd be great if folks running on
non-linux systems could compile and ensure that everything is in working
order, because otherwise I'm going to assume that it is...

Similarly, if you know of bugs (particularly regressions from previous
releases), speak up now so we can get them fixed before 2.1.5 is cut.

I'll wait until Monday to tag the release.

-- 
Jens Axboe




* Mutex destruction, invalid memory accesses, leaks
  2014-02-06 19:21 Fio 2.1.5 release upcoming Jens Axboe
@ 2014-02-07  3:44 ` Sitsofe Wheeler
  2014-02-07 16:11   ` Jens Axboe
  2014-02-08 19:52 ` Fio 2.1.5 release upcoming Matthew Eaton
  2014-02-11 11:22 ` Paul Alcorn
  2 siblings, 1 reply; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-07  3:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Thu, Feb 06, 2014 at 12:21:35PM -0700, Jens Axboe wrote:
> 
> Similarly, if you know of bugs (particularly regressions from previous
> releases), speak up now so we can get them fixed before 2.1.5 is cut.

I think there is a problem with how mutexes are being destroyed and it's
manifesting as a reproducible segfault in libwinpthread-1.dll on
Windows. From
http://thread.gmane.org/gmane.comp.storage.fio/97/focus=136 :

> I've finally had time to reproduce this on a Windows 7 box. I use a
> different command line:
> 
> ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
> 
> The problem appears to be that the mutex is being destroyed while it
> is still being held by a different thread. Adding return; to the first
> line of fio_mutex_remove in mutex.c papers over the problem...

This issue hasn't seen much interest since it was raised a few weeks ago
and I haven't had time to come up with a proper fix but it looks similar
to the issue described in https://lwn.net/Articles/575460/ (A surprise
with mutexes and reference counts).

Additionally Dr Memory is also flagging up an invalid memory access on
the Windows version of fio (one is in a macro which makes a for loop but
I only have a non-macro fix for it at the moment) and some memory leaks
around string_to_cpu and init_io_u.

-- 
Sitsofe | http://sucs.org/~sits/



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-07  3:44 ` Mutex destruction, invalid memory accesses, leaks Sitsofe Wheeler
@ 2014-02-07 16:11   ` Jens Axboe
  2014-02-09 19:50     ` Sitsofe Wheeler
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-07 16:11 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

On 2014-02-06 20:44, Sitsofe Wheeler wrote:
> On Thu, Feb 06, 2014 at 12:21:35PM -0700, Jens Axboe wrote:
>>
>> Similarly, if you know of bugs (particularly regressions from previous
>> releases), speak up now so we can get them fixed before 2.1.5 is cut.
>
> I think there is a problem with how mutexes are being destroyed and it's
> manifesting as a reproducible segfault in libwinpthread-1.dll on
> Windows. From
> http://thread.gmane.org/gmane.comp.storage.fio/97/focus=136 :
>
>> I've finally had time to reproduce this on a Windows 7 box. I use a
>> different command line:
>>
>> ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
>>
>> The problem appears to be that the mutex is being destroyed while it
>> is still being held by a different thread. Adding return; to the first
>> line of fio_mutex_remove in mutex.c papers over the problem...
>
> This issue hasn't seen much interest since it was raised a few weeks ago
> and I haven't had time to come up with a proper fix but it looks similar
> to the issue described in https://lwn.net/Articles/575460/ (A surprise
> with mutexes and reference counts).

Does this still happen in current -git? The bug is a weird one - it 
looks like it's crashing in bringing up the thread, but the 
synchronization around that should ensure that it never gets to touch 
td->mutex. If the mutexes are broken somehow and the thread doesn't 
properly wait for the main thread to bring it up, then I can see it 
happening. Hence my question whether it's still happening after Bruce 
fixed the pthread linkage in current -git.

> Additionally Dr Memory is also flagging up an invalid memory access on
> the Windows version of fio (one is in a macro which makes a for loop but
> I only have a non-macro fix for it at the moment) and some memory leaks
> around string_to_cpu and init_io_u.

I'm going to need more info on the invalid mem access. Not surprised 
there are a few leaks around the init functions. Would be nice to get 
fixed up, but not a ship-stopper.

-- 
Jens Axboe




* Re: Fio 2.1.5 release upcoming
  2014-02-06 19:21 Fio 2.1.5 release upcoming Jens Axboe
  2014-02-07  3:44 ` Mutex destruction, invalid memory accesses, leaks Sitsofe Wheeler
@ 2014-02-08 19:52 ` Matthew Eaton
  2014-02-09 20:57   ` Jens Axboe
  2014-02-11 11:22 ` Paul Alcorn
  2 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-08 19:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Thu, Feb 6, 2014 at 11:21 AM, Jens Axboe <axboe@kernel.dk> wrote:
> Hi,
>
> I've been late on this release, originally wanted it out around
> Christmas. But various issues here and there prevented that. However, I
> now think we are getting pretty close. It'd be great if folks running on
> non-linux systems could compile and ensure that everything is in working
> order, because otherwise I'm going to assume that it is...
>
> Similarly, if you know of bugs (particularly regressions from previous
> releases), speak up now so we can get them fixed before 2.1.5 is cut.
>
> I'll wait until Monday to tag the release.
>
> --
> Jens Axboe
>

Hi Jens,

I was messing around with the openfiles flag in one of my job files
last week but was unable to get it to work.  Fio would keep defaulting
to opening all files set by nrfiles, but I'm not sure if I was doing
something wrong.  This was with fio 2.1.4 on linux.  Can you check on
this?

Matt



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-07 16:11   ` Jens Axboe
@ 2014-02-09 19:50     ` Sitsofe Wheeler
  2014-02-09 20:49       ` Jens Axboe
  2014-02-10 19:25       ` Bruce Cran
  0 siblings, 2 replies; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-09 19:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Fri, Feb 07, 2014 at 09:11:20AM -0700, Jens Axboe wrote:
> On 2014-02-06 20:44, Sitsofe Wheeler wrote:
> >On Thu, Feb 06, 2014 at 12:21:35PM -0700, Jens Axboe wrote:
> >>
> >>./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
> >>
> >>The problem appears to be that the mutex is being destroyed while it
> >>is still being held by a different thread. Adding return; to the first
> >>line of fio_mutex_remove in mutex.c papers over the problem...
> >
> Does this still happen in current -git? The bug is a weird one - it
> looks like it's crashing in bringing up the thread, but the
> synchronization around that should ensure that it never gets to
> touch td->mutex. If the mutexes are broken somehow and the thread
> doesn't properly wait for the main thread to bring it up, then I can
> see it happening. Hence my question whether it's still happening
> after Bruce fixed the pthread linkage in current -git.

Yes it's still happening with -git from a moment ago. What is stopping a
sleeping thread from holding a mutex that is destroyed and then waking
up on it after the memory has been unmapped?

> >Additionally Dr Memory is also flagging up an invalid memory access on
> >the Windows version of fio (one is in a macro which makes a for loop but
> >I only have a non-macro fix for it at the moment) and some memory leaks
> >around string_to_cpu and init_io_u.
> 
> I'm going to need more info on the invalid mem access. Not surprised
> there are a few leaks around the init functions. Would be nice to
> get fixed up, but not a ship-stopper.

Here's the Dr Memory output:
Error #1: UNADDRESSABLE ACCESS: reading 2 byte(s)
# 0 __get_mult_bytes.constprop.5               [fio/parse.c:168]
# 1 str_to_decimal                             [fio/parse.c:237]
# 2 __handle_option                            [fio/parse.c:285]
# 3 handle_option                              [fio/parse.c:861]
# 4 fill_default_options                       [fio/parse.c:1174]
# 5 main                                       [fio/fio.c:40]
Note: refers to 0 byte(s) beyond last valid byte in prior malloc

Error #2: LEAK 11 bytes 
# 0 replace_malloc                     [d:\drmemory_package\common\alloc_replace.c:2292]
# 1 msvcrt.dll!_strdup   
# 2 __handle_option                    [fio/parse.c:615]
# 3 handle_option                      [fio/parse.c:861]
# 4 fill_default_options               [fio/parse.c:1174]
# 5 main                               [fio/fio.c:40]

Error #3: LEAK 26 bytes 
# 0 replace_malloc               [d:\drmemory_package\common\alloc_replace.c:2292]
# 1 msvcrt.dll!_strdup   
# 2 fio_test_cconv               [fio/cconv.c:10]
# 3 main                         [fio/fio.c:40]

Error #4: LEAK 11 bytes 
# 0 replace_malloc               [d:\drmemory_package\common\alloc_replace.c:2292]
# 1 msvcrt.dll!_strdup   
# 2 fio_test_cconv               [fio/cconv.c:10]
# 3 main                         [fio/fio.c:40]

Error #5: POSSIBLE LEAK 35 bytes 
# 0 replace_malloc                                     [d:\drmemory_package\common\alloc_replace.c:2292]
# 1 emutls_alloc                                       [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:110]
# 2 __fio_gettime                                      [fio/gettime.c:165]
# 3 _fu0___set_invalid_parameter_handler               [/usr/src/debug/mingw64-i686-runtime-3.1.0-1/crt/crtexe.c:332]
# 4 KERNEL32.dll!BaseThreadInitThunk

Error #6: LEAK 136 bytes 
# 0 replace_calloc                       [d:\drmemory_package\common\alloc_replace.c:2310]
# 1 __emutls_get_address                 [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:159]
# 2 __fio_gettime                        [fio/gettime.c:165]
# 3 pthread_create_wrapper               [/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/thread.c:1381]
# 4 msvcrt.dll!_endthreadex
# 5 msvcrt.dll!_endthreadex
# 6 KERNEL32.dll!BaseThreadInitThunk

Error #7: POSSIBLE LEAK 35 bytes 
# 0 replace_malloc                       [d:\drmemory_package\common\alloc_replace.c:2292]
# 1 emutls_alloc                         [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:110]
# 2 __fio_gettime                        [fio/gettime.c:165]
# 3 pthread_create_wrapper               [/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/thread.c:1381]
# 4 msvcrt.dll!_endthreadex
# 5 msvcrt.dll!_endthreadex
# 6 KERNEL32.dll!BaseThreadInitThunk

===========================================================================
FINAL SUMMARY:

DUPLICATE ERROR COUNTS:
	Error #   1:     32

-- 
Sitsofe | http://sucs.org/~sits/



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-09 19:50     ` Sitsofe Wheeler
@ 2014-02-09 20:49       ` Jens Axboe
  2014-02-10  9:55         ` Sitsofe Wheeler
  2014-02-10 19:25       ` Bruce Cran
  1 sibling, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-09 20:49 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: fio

On 2014-02-09 12:50, Sitsofe Wheeler wrote:
> On Fri, Feb 07, 2014 at 09:11:20AM -0700, Jens Axboe wrote:
>> On 2014-02-06 20:44, Sitsofe Wheeler wrote:
>>> On Thu, Feb 06, 2014 at 12:21:35PM -0700, Jens Axboe wrote:
>>>>
>>>> ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
>>>>
>>>> The problem appears to be that the mutex is being destroyed while it
>>>> is still being held by a different thread. Adding return; to the first
>>>> line of fio_mutex_remove in mutex.c papers over the problem...
>>>
>> Does this still happen in current -git? The bug is a weird one - it
>> looks like it's crashing in bringing up the thread, but the
>> synchronization around that should ensure that it never gets to
>> touch td->mutex. If the mutexes are broken somehow and the thread
>> doesn't properly wait for the main thread to bring it up, then I can
>> see it happening. Hence my question whether it's still happening
>> after Bruce fixed the pthread linkage in current -git.
>
> Yes it's still happening with -git from a moment ago. What is stopping a
> sleeping thread from holding a mutex that is destroyed and then waking
> up on it after the memory has been unmapped?

If you look at the particular use case, it looks like this:

[io thread]		[main thread]
mutex_down(mutex);
			mutex_up(mutex);
mutex_kill(mutex);

and mutex isn't used after that kill. The trace you sent looks like the
io thread doing down successfully (which it should not), then proceeding
to kill the mutex. The main thread then runs into problems attempting
to up a mutex that has been freed. Hence why I think this is an issue in
the Windows pthread mutexes, which should not happen.

>>> Additionally Dr Memory is also flagging up an invalid memory access on
>>> the Windows version of fio (one is in a macro which makes a for loop but
>>> I only have a non-macro fix for it at the moment) and some memory leaks
>>> around string_to_cpu and init_io_u.
>>
>> I'm going to need more info on the invalid mem access. Not surprised
>> there are a few leaks around the init functions. Would be nice to
>> get fixed up, but not a ship-stopper.
>
> Here's the Dr Memory output:
> Error #1: UNADDRESSABLE ACCESS: reading 2 byte(s)
> # 0 __get_mult_bytes.constprop.5               [fio/parse.c:168]
> # 1 str_to_decimal                             [fio/parse.c:237]
> # 2 __handle_option                            [fio/parse.c:285]
> # 3 handle_option                              [fio/parse.c:861]
> # 4 fill_default_options                       [fio/parse.c:1174]
> # 5 main                                       [fio/fio.c:40]
> Note: refers to 0 byte(s) beyond last valid byte in prior malloc
>
> Error #2: LEAK 11 bytes
> # 0 replace_malloc                     [d:\drmemory_package\common\alloc_replace.c:2292]
> # 1 msvcrt.dll!_strdup
> # 2 __handle_option                    [fio/parse.c:615]
> # 3 handle_option                      [fio/parse.c:861]
> # 4 fill_default_options               [fio/parse.c:1174]
> # 5 main                               [fio/fio.c:40]
>
> Error #3: LEAK 26 bytes
> # 0 replace_malloc               [d:\drmemory_package\common\alloc_replace.c:2292]
> # 1 msvcrt.dll!_strdup
> # 2 fio_test_cconv               [fio/cconv.c:10]
> # 3 main                         [fio/fio.c:40]
>
> Error #4: LEAK 11 bytes
> # 0 replace_malloc               [d:\drmemory_package\common\alloc_replace.c:2292]
> # 1 msvcrt.dll!_strdup
> # 2 fio_test_cconv               [fio/cconv.c:10]
> # 3 main                         [fio/fio.c:40]
>
> Error #5: POSSIBLE LEAK 35 bytes
> # 0 replace_malloc                                     [d:\drmemory_package\common\alloc_replace.c:2292]
> # 1 emutls_alloc                                       [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:110]
> # 2 __fio_gettime                                      [fio/gettime.c:165]
> # 3 _fu0___set_invalid_parameter_handler               [/usr/src/debug/mingw64-i686-runtime-3.1.0-1/crt/crtexe.c:332]
> # 4 KERNEL32.dll!BaseThreadInitThunk
>
> Error #6: LEAK 136 bytes
> # 0 replace_calloc                       [d:\drmemory_package\common\alloc_replace.c:2310]
> # 1 __emutls_get_address                 [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:159]
> # 2 __fio_gettime                        [fio/gettime.c:165]
> # 3 pthread_create_wrapper               [/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/thread.c:1381]
> # 4 msvcrt.dll!_endthreadex
> # 5 msvcrt.dll!_endthreadex
> # 6 KERNEL32.dll!BaseThreadInitThunk
>
> Error #7: POSSIBLE LEAK 35 bytes
> # 0 replace_malloc                       [d:\drmemory_package\common\alloc_replace.c:2292]
> # 1 emutls_alloc                         [/usr/src/debug/mingw64-i686-gcc-4.8.2-2/libgcc/emutls.c:110]
> # 2 __fio_gettime                        [fio/gettime.c:165]
> # 3 pthread_create_wrapper               [/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/thread.c:1381]
> # 4 msvcrt.dll!_endthreadex
> # 5 msvcrt.dll!_endthreadex
> # 6 KERNEL32.dll!BaseThreadInitThunk

I'll take a look at these. How did you invoke fio for the above report?

-- 
Jens Axboe




* Re: Fio 2.1.5 release upcoming
  2014-02-08 19:52 ` Fio 2.1.5 release upcoming Matthew Eaton
@ 2014-02-09 20:57   ` Jens Axboe
  2014-02-10  0:26     ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-09 20:57 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio

On Sat, Feb 08 2014, Matthew Eaton wrote:
> On Thu, Feb 6, 2014 at 11:21 AM, Jens Axboe <axboe@kernel.dk> wrote:
> > Hi,
> >
> > I've been late on this release, originally wanted it out around
> > Christmas. But various issues here and there prevented that. However, I
> > now think we are getting pretty close. It'd be great if folks running on
> > non-linux systems could compile and ensure that everything is in working
> > order, because otherwise I'm going to assume that it is...
> >
> > Similarly, if you know of bugs (particularly regressions from previous
> > releases), speak up now so we can get them fixed before 2.1.5 is cut.
> >
> > I'll wait until Monday to tag the release.
> >
> > --
> > Jens Axboe
> >
> 
> Hi Jens,
> 
> I was messing around with the openfiles flag in one of my job files
> last week but was unable to get it to work.  Fio would keep defaulting
> to opening all files set by nrfiles, but I'm not sure if I was doing
> something wrong.  This was with fio 2.1.4 on linux.  Can you check on
> this?

It'd be easier if you include your job file / command line!

-- 
Jens Axboe




* Re: Fio 2.1.5 release upcoming
  2014-02-09 20:57   ` Jens Axboe
@ 2014-02-10  0:26     ` Matthew Eaton
  2014-02-10 22:14       ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-10  0:26 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Sun, Feb 9, 2014 at 12:57 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On Sat, Feb 08 2014, Matthew Eaton wrote:
>> On Thu, Feb 6, 2014 at 11:21 AM, Jens Axboe <axboe@kernel.dk> wrote:
>> > Hi,
>> >
>> > I've been late on this release, originally wanted it out around
>> > Christmas. But various issues here and there prevented that. However, I
>> > now think we are getting pretty close. It'd be great if folks running on
>> > non-linux systems could compile and ensure that everything is in working
>> > order, because otherwise I'm going to assume that it is...
>> >
>> > Similarly, if you know of bugs (particularly regressions from previous
>> > releases), speak up now so we can get them fixed before 2.1.5 is cut.
>> >
>> > I'll wait until Monday to tag the release.
>> >
>> > --
>> > Jens Axboe
>> >
>>
>> Hi Jens,
>>
>> I was messing around with the openfiles flag in one of my job files
>> last week but was unable to get it to work.  Fio would keep defaulting
>> to opening all files set by nrfiles, but I'm not sure if I was doing
>> something wrong.  This was with fio 2.1.4 on linux.  Can you check on
>> this?
>
> It'd be easier if you include your job file / command line!
>
> --
> Jens Axboe
>

Hi Jens,

Here's an example.  In fio output I get f=50 instead of f=10 which I
believe is the number of simultaneous opens?  Also strange to me is
that write io is 2000 MB instead of 1024 MB.

[job]
bs=1m
rw=write
size=1g
nrfiles=50
openfiles=10
directory=temp

job: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.4
Starting 1 process
job: Laying out IO file(s) (50 file(s) / 1023MB)
Jobs: 1 (f=50): [W] [-.-% done] [0KB/539.0MB/0KB /s] [0/539/0 iops] [eta 00m:00s]
job: (groupid=0, jobs=1): err= 0: pid=15131: Sun Feb  9 16:18:55 2014
  write: io=2000.0MB, bw=697785KB/s, iops=681, runt=  2935msec
    clat (usec): min=167, max=145348, avg=1399.89, stdev=6030.76
     lat (usec): min=175, max=145359, avg=1413.43, stdev=6031.10
    clat percentiles (usec):
     |  1.00th=[  175],  5.00th=[  207], 10.00th=[  253], 20.00th=[  262],
     | 30.00th=[  274], 40.00th=[  290], 50.00th=[  302], 60.00th=[  358],
     | 70.00th=[  516], 80.00th=[  572], 90.00th=[ 5984], 95.00th=[ 6624],
     | 99.00th=[ 7328], 99.50th=[10816], 99.90th=[115200], 99.95th=[140288],
     | 99.99th=[144384]
    bw (KB  /s): min=81160, max=1503620, per=100.00%, avg=778889.75, stdev=604179.56
    lat (usec) : 250=7.50%, 500=61.95%, 750=18.25%
    lat (msec) : 10=11.70%, 20=0.20%, 50=0.15%, 100=0.10%, 250=0.15%
  cpu          : usr=2.97%, sys=25.70%, ctx=504, majf=0, minf=33
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=2000/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=2000.0MB, aggrb=697785KB/s, minb=697785KB/s, maxb=697785KB/s, mint=2935msec, maxt=2935msec

Disk stats (read/write):
  sdb: ios=0/2740, merge=0/2, ticks=0/360280, in_queue=376924, util=90.05%
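
As a sanity check on the output above, the reported totals agree with each other: 2000 issued writes at bs=1m give 2000 MB, and so does the bandwidth multiplied by the runtime. The sketch below reproduces that arithmetic. It assumes the KB and MB in this output are 1024-based (an assumption about this fio version's default units), and the function names are hypothetical helpers, not part of fio.

```c
#include <assert.h>

/*
 * Data moved in MB, derived from a bandwidth in KB/s and a runtime in
 * milliseconds. Assumes 1 MB = 1024 KB, which is an assumption about
 * how this fio version reports units.
 */
double io_mb_from_bw(double bw_kb_per_s, double runt_msec)
{
	assert(runt_msec >= 0.0);
	return bw_kb_per_s * (runt_msec / 1000.0) / 1024.0;
}

/* Data moved in MB, derived from the number of issued I/Os and the block size in MB. */
unsigned long io_mb_from_issued(unsigned long issued, unsigned long bs_mb)
{
	return issued * bs_mb;
}
```

With the values from the run, io_mb_from_bw(697785, 2935) comes out just under 2000 and io_mb_from_issued(2000, 1) is exactly 2000. That only confirms that roughly twice the requested size=1g was really written; it does not say why.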



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-09 20:49       ` Jens Axboe
@ 2014-02-10  9:55         ` Sitsofe Wheeler
  0 siblings, 0 replies; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-10  9:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Sun, Feb 09, 2014 at 01:49:29PM -0700, Jens Axboe wrote:
> On 2014-02-09 12:50, Sitsofe Wheeler wrote:
> >
> >Yes it's still happening with -git from a moment ago. What is stopping a
> >sleeping thread from holding a mutex that is destroyed and then waking
> >up on it after the memory has been unmapped?
> 
> If you look at the particular use case, it looks like this:
> 
> [io thread]		[main thread]
> mutex_down(mutex);
> 			mutex_up(mutex);
> mutex_kill(mutex);
> 
> and mutex isn't used after that kill. The trace you sent looks like
> the io thread doing down successfully (which it should not), then
> proceeding to killing the mutex. The main thread then runs into
> problems attempting to up a mutex that has been freed. Hence why I
> think this is an issue in the windows pthread mutexes, that should
> not happen.

OK, I'll need to think on this a bit more but I have to admit I've never
seen this issue on Linux (but then again I don't tend to use thread mode
that much on Linux).

> >Here's the Dr Memory output:
> >Error #1: UNADDRESSABLE ACCESS: reading 2 byte(s)
> ># 6 KERNEL32.dll!BaseThreadInitThunk
> 
> I'll take a look at these. How did you invoke fio for the above report?

Firstly you MUST build the 32 bit version of fio - Dr Memory currently
does not work on 64 bit binaries. After downloading and installing the
latest Dr Memory exe from
https://code.google.com/p/drmemory/downloads/list :
CFLAGS="-fno-inline -fno-omit-frame-pointer" ./configure --build-32bit-win --extra-cflags="-fno-inline -fno-omit-frame-pointer"
make -j 4
'/cygdrive/c/Users/Sitsofe/Documents/DrMemory-Windows-1.6.1-2/DrMemory-Windows-1.6.1-2/bin/drmemory.exe' -brief -- ./fio.exe --debug=all --thread --filename=fiojob --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname

A word of warning, cygwin's error_start environment variable and
dumper.exe do not support 64 bit binaries. For some reason this seems to
include 32 bit binaries built using a w64-mingw.

On a side note I'm sorry to report that AddressSanitizer is not
available in mingw's gcc 4.8.2 nor is it available in cygwin's current
clang (3.1-3). If you try to use the prebuilt Windows clang builds
floating around you will find they expect the MS Visual Studio's linker
and even with fiddling those versions of clang refuse to compile fio. In
fact on Windows fio can't be compiled with the 32 bit pc-mingw32
compiler - you have to use w64-mingw or you will find lots of functions
(like rand_r) can't be found. I suspect this also means that fio can't
currently be built with Visual Studio...

-- 
Sitsofe | http://sucs.org/~sits/



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-09 19:50     ` Sitsofe Wheeler
  2014-02-09 20:49       ` Jens Axboe
@ 2014-02-10 19:25       ` Bruce Cran
  2014-02-10 20:22         ` Sitsofe Wheeler
  1 sibling, 1 reply; 36+ messages in thread
From: Bruce Cran @ 2014-02-10 19:25 UTC (permalink / raw)
  To: Sitsofe Wheeler, Jens Axboe; +Cc: fio

On 2/9/2014 12:50 PM, Sitsofe Wheeler wrote:
>
> Yes it's still happening with -git from a moment ago. What is stopping a
> sleeping thread from holding a mutex that is destroyed and then waking
> up on it after the memory has been unmapped?

In case it *is* a bug in winpthreads-1.dll, are you using version 
3.1.0-1? It was released on the 15th January and there appeared to be 
some mutex-related fixes when I looked at the svn log a few weeks ago.

-- 
Bruce



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-10 19:25       ` Bruce Cran
@ 2014-02-10 20:22         ` Sitsofe Wheeler
  2014-02-10 20:48           ` Jens Axboe
  2014-02-10 20:56           ` Jens Axboe
  0 siblings, 2 replies; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-10 20:22 UTC (permalink / raw)
  To: Bruce Cran; +Cc: Jens Axboe, fio

On Mon, Feb 10, 2014 at 12:25:32PM -0700, Bruce Cran wrote:
> On 2/9/2014 12:50 PM, Sitsofe Wheeler wrote:
> >
> >Yes it's still happening with -git from a moment ago. What is stopping a
> >sleeping thread from holding a mutex that is destroyed and then waking
> >up on it after the memory has been unmapped?
> 
> In case it *is* a bug in winpthreads-1.dll, are you using version
> 3.1.0-1? It was released on the 15th January and there appeared to
> be some mutex-related fixes when I looked at the svn log a few weeks
> ago.

Yes I am using version 3.1.0-1 on Windows 7 installed about a week
ago...

-- 
Sitsofe | http://sucs.org/~sits/



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-10 20:22         ` Sitsofe Wheeler
@ 2014-02-10 20:48           ` Jens Axboe
  2014-02-10 20:56           ` Jens Axboe
  1 sibling, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2014-02-10 20:48 UTC (permalink / raw)
  To: Sitsofe Wheeler, Bruce Cran; +Cc: fio



On 02/10/2014 01:22 PM, Sitsofe Wheeler wrote:
> On Mon, Feb 10, 2014 at 12:25:32PM -0700, Bruce Cran wrote:
>> On 2/9/2014 12:50 PM, Sitsofe Wheeler wrote:
>>>
>>> Yes it's still happening with -git from a moment ago. What is stopping a
>>> sleeping thread from holding a mutex that is destroyed and then waking
>>> up on it after the memory has been unmapped?
>>
>> In case it *is* a bug in winpthreads-1.dll, are you using version
>> 3.1.0-1? It was released on the 15th January and there appeared to
>> be some mutex-related fixes when I looked at the svn log a few weeks
>> ago.
>
> Yes I am using version 3.1.0-1 on Windows 7 installed about a week
> ago...

I think I see what is going on here... It's not that the mutexes are 
broken on Windows, it's just that it probably has different scheduling 
behaviour. If the pthread_cond_signal() ends up preempting to the thread 
being woken up, then the memory allocated to the mutex could be gone 
before we get a chance to inc and unlock it.

Could you test the below and see if it fixes it?

diff --git a/backend.c b/backend.c
index 501c59a..841e54b 100644
--- a/backend.c
+++ b/backend.c
@@ -1886,7 +1886,7 @@ static void run_threads(void)
 			m_rate += ddir_rw_sum(td->o.ratemin);
 			t_rate += ddir_rw_sum(td->o.rate);
 			todo--;
-			fio_mutex_up(td->mutex);
+			fio_mutex_up_for_removal(td->mutex);
 		}
 
 		reap_threads(&nr_running, &t_rate, &m_rate);
diff --git a/mutex.c b/mutex.c
index e1fbb60..5eb0af8 100644
--- a/mutex.c
+++ b/mutex.c
@@ -141,6 +141,21 @@ void fio_mutex_down(struct fio_mutex *mutex)
 	pthread_mutex_unlock(&mutex->lock);
 }
 
+/*
+ * Special case fio_mutex_up() that doesn't touch the mutex after
+ * waking up. As such it's only safe if we know the waker is immediately
+ * going to kill the mutex, as it's left in a half-undefined state.
+ */
+void fio_mutex_up_for_removal(struct fio_mutex *mutex)
+{
+	assert(mutex->magic == FIO_MUTEX_MAGIC);
+
+	pthread_mutex_lock(&mutex->lock);
+	read_barrier();
+	if (!mutex->value && mutex->waiters)
+		pthread_cond_signal(&mutex->cond);
+}
+
 void fio_mutex_up(struct fio_mutex *mutex)
 {
 	assert(mutex->magic == FIO_MUTEX_MAGIC);
diff --git a/mutex.h b/mutex.h
index 4f3486d..1b03ba1 100644
--- a/mutex.h
+++ b/mutex.h
@@ -27,6 +27,7 @@ enum {
 extern struct fio_mutex *fio_mutex_init(int);
 extern void fio_mutex_remove(struct fio_mutex *);
 extern void fio_mutex_up(struct fio_mutex *);
+extern void fio_mutex_up_for_removal(struct fio_mutex *);
 extern void fio_mutex_down(struct fio_mutex *);
 extern int fio_mutex_down_timeout(struct fio_mutex *, unsigned int);
 

-- 
Jens Axboe



* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-10 20:22         ` Sitsofe Wheeler
  2014-02-10 20:48           ` Jens Axboe
@ 2014-02-10 20:56           ` Jens Axboe
  2014-02-11  0:12             ` Elliott, Robert (Server Storage)
  1 sibling, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-10 20:56 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: Bruce Cran, fio

On Mon, Feb 10 2014, Sitsofe Wheeler wrote:
> On Mon, Feb 10, 2014 at 12:25:32PM -0700, Bruce Cran wrote:
> > On 2/9/2014 12:50 PM, Sitsofe Wheeler wrote:
> > >
> > >Yes it's still happening with -git from a moment ago. What is stopping a
> > >sleeping thread from holding a mutex that is destroyed and then waking
> > >up on it after the memory has been unmapped?
> > 
> > In case it *is* a bug in winpthreads-1.dll, are you using version
> > 3.1.0-1? It was released on the 15th January and there appeared to
> > be some mutex-related fixes when I looked at the svn log a few weeks
> > ago.
> 
> Yes I am using version 3.1.0-1 on Windows 7 installed about a week
> ago...

Actually, the previous patch won't work, and I don't see how to make it work.
Please try the below instead. Or just re-pull, I'll check it in now.


diff --git a/backend.c b/backend.c
index 501c59a..a607134 100644
--- a/backend.c
+++ b/backend.c
@@ -1236,13 +1236,6 @@ static void *thread_main(void *data)
 	dprint(FD_MUTEX, "done waiting on td->mutex\n");
 
 	/*
-	 * the ->mutex mutex is now no longer used, close it to avoid
-	 * eating a file descriptor
-	 */
-	fio_mutex_remove(td->mutex);
-	td->mutex = NULL;
-
-	/*
 	 * A new gid requires privilege, so we need to do this before setting
 	 * the uid.
 	 */
@@ -1521,6 +1514,9 @@ err:
 	fio_mutex_remove(td->rusage_sem);
 	td->rusage_sem = NULL;
 
+	fio_mutex_remove(td->mutex);
+	td->mutex = NULL;
+
 	td_set_runstate(td, TD_EXITED);
 	return (void *) (uintptr_t) td->error;
 }
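
A minimal sketch of the lifetime rule the patch above appears to rely on: a start-up semaphore is only torn down once no other thread can still be inside a call that touches it. The names here (demo_sem, demo_worker, demo_run) are hypothetical stand-ins, not fio's fio_mutex code, and the sketch removes the semaphore after pthread_join() rather than at the end of the thread function as the patch does. The assumption being illustrated is that the waking thread may not have returned from its "up" call when the woken thread resumes, which some pthread implementations may not tolerate if the object is destroyed at that moment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a counting semaphore built on a mutex + condvar. */
struct demo_sem {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int value;
};

static struct demo_sem *demo_sem_init(int value)
{
	struct demo_sem *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cond, NULL);
	s->value = value;
	return s;
}

static void demo_sem_down(struct demo_sem *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->value == 0)
		pthread_cond_wait(&s->cond, &s->lock);
	s->value--;
	pthread_mutex_unlock(&s->lock);
}

static void demo_sem_up(struct demo_sem *s)
{
	pthread_mutex_lock(&s->lock);
	s->value++;
	pthread_cond_signal(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

static void demo_sem_remove(struct demo_sem *s)
{
	pthread_cond_destroy(&s->cond);
	pthread_mutex_destroy(&s->lock);
	free(s);
}

struct demo_thread {
	struct demo_sem *start;
	int ran;
};

static void *demo_worker(void *data)
{
	struct demo_thread *t = data;

	demo_sem_down(t->start);
	/*
	 * Deliberately no demo_sem_remove() here: the parent may not have
	 * returned from demo_sem_up() yet.
	 */
	t->ran = 1;
	return NULL;
}

/* Returns 1 if the worker ran to completion, 0 otherwise. */
static int demo_run(void)
{
	struct demo_thread t = { demo_sem_init(0), 0 };
	pthread_t tid;

	assert(t.start != NULL);
	if (pthread_create(&tid, NULL, demo_worker, &t)) {
		demo_sem_remove(t.start);
		return 0;
	}
	demo_sem_up(t.start);
	pthread_join(tid, NULL);
	/* Both users are finished with the semaphore; removal is safe now. */
	demo_sem_remove(t.start);
	return t.ran;
}
```

Calling demo_run() from a test program exercises the whole sequence: create, wake, join, then remove.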

-- 
Jens Axboe



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-10  0:26     ` Matthew Eaton
@ 2014-02-10 22:14       ` Jens Axboe
  2014-02-10 23:11         ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-10 22:14 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio

On Sun, Feb 09 2014, Matthew Eaton wrote:
> On Sun, Feb 9, 2014 at 12:57 PM, Jens Axboe <axboe@kernel.dk> wrote:
> > On Sat, Feb 08 2014, Matthew Eaton wrote:
> >> On Thu, Feb 6, 2014 at 11:21 AM, Jens Axboe <axboe@kernel.dk> wrote:
> >> > Hi,
> >> >
> >> > I've been late on this release, originally wanted it out around
> >> > Christmas. But various issues here and there prevented that. However, I
> >> > now think we are getting pretty close. It'd be great if folks running on
> >> > non-linux systems could compile and ensure that everything is in working
> >> > order, because otherwise I'm going to assume that it is...
> >> >
> >> > Similarly, if you know of bugs (particularly regressions from previous
> >> > releases), speak up now so we can get them fixed before 2.1.5 is cut.
> >> >
> >> > I'll wait until Monday to tag the release.
> >> >
> >> > --
> >> > Jens Axboe
> >> >
> >>
> >> Hi Jens,
> >>
> >> I was messing around with the openfiles flag in one of my job files
> >> last week but was unable to get it to work.  Fio would keep defaulting
> >> to opening all files set by nrfiles, but I'm not sure if I was doing
> >> something wrong.  This was with fio 2.1.4 on linux.  Can you check on
> >> this?
> >
> > It'd be easier if you include your job file / command line!
> >
> > --
> > Jens Axboe
> >
> 
> Hi Jens,
> 
> Here's an example.  In fio output I get f=50 instead of f=10 which I
> believe is the number of simultaneous opens?  Also strange to me is
> that write io is 2000 MB instead of 1024 MB.
> 
> [job]
> bs=1m
> rw=write
> size=1g
> nrfiles=50
> openfiles=10
> directory=temp
> 
> job: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
> fio-2.1.4
> Starting 1 process
> job: Laying out IO file(s) (50 file(s) / 1023MB)
> Jobs: 1 (f=50): [W] [-.-% done] [0KB/539.0MB/0KB /s] [0/539/0 iops]
> [eta 00m:00s]
> job: (groupid=0, jobs=1): err= 0: pid=15131: Sun Feb  9 16:18:55 2014
>   write: io=2000.0MB, bw=697785KB/s, iops=681, runt=  2935msec
>     clat (usec): min=167, max=145348, avg=1399.89, stdev=6030.76
>      lat (usec): min=175, max=145359, avg=1413.43, stdev=6031.10
>     clat percentiles (usec):
>      |  1.00th=[  175],  5.00th=[  207], 10.00th=[  253], 20.00th=[  262],
>      | 30.00th=[  274], 40.00th=[  290], 50.00th=[  302], 60.00th=[  358],
>      | 70.00th=[  516], 80.00th=[  572], 90.00th=[ 5984], 95.00th=[ 6624],
>      | 99.00th=[ 7328], 99.50th=[10816], 99.90th=[115200], 99.95th=[140288],
>      | 99.99th=[144384]
>     bw (KB  /s): min=81160, max=1503620, per=100.00%, avg=778889.75,
> stdev=604179.56
>     lat (usec) : 250=7.50%, 500=61.95%, 750=18.25%
>     lat (msec) : 10=11.70%, 20=0.20%, 50=0.15%, 100=0.10%, 250=0.15%
>   cpu          : usr=2.97%, sys=25.70%, ctx=504, majf=0, minf=33
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=2000/d=0, short=r=0/w=0/d=0
> 
> Run status group 0 (all jobs):
>   WRITE: io=2000.0MB, aggrb=697785KB/s, minb=697785KB/s,
> maxb=697785KB/s, mint=2935msec, maxt=2935msec
> 
> Disk stats (read/write):
>   sdb: ios=0/2740, merge=0/2, ticks=0/360280, in_queue=376924, util=90.05%

Can you try this patch?


diff --git a/backend.c b/backend.c
index a607134..32bc265 100644
--- a/backend.c
+++ b/backend.c
@@ -52,6 +52,7 @@
 #include "server.h"
 #include "lib/getrusage.h"
 #include "idletime.h"
+#include "err.h"
 
 static pthread_t disk_util_thread;
 static struct fio_mutex *disk_thread_mutex;
@@ -478,6 +479,12 @@ static void do_verify(struct thread_data *td, uint64_t verify_bytes)
 				break;
 
 			while ((io_u = get_io_u(td)) != NULL) {
+				if (IS_ERR(io_u)) {
+					io_u = NULL;
+					ret = FIO_Q_BUSY;
+					goto reap;
+				}
+
 				/*
 				 * We are only interested in the places where
 				 * we wrote or trimmed IOs. Turn those into
@@ -574,6 +581,7 @@ sync_done:
 		 * completed io_u's first. Note that we can get BUSY even
 		 * without IO queued, if the system is resource starved.
 		 */
+reap:
 		full = queue_full(td) || (ret == FIO_Q_BUSY && td->cur_depth);
 		if (full || !td->o.iodepth_batch_complete) {
 			min_events = min(td->o.iodepth_batch_complete,
@@ -692,7 +700,14 @@ static uint64_t do_io(struct thread_data *td)
 			break;
 
 		io_u = get_io_u(td);
-		if (!io_u) {
+		if (IS_ERR_OR_NULL(io_u)) {
+			int err = PTR_ERR(io_u);
+
+			io_u = NULL;
+			if (err == -EBUSY) {
+				ret = FIO_Q_BUSY;
+				goto reap;
+			}
 			if (td->o.latency_target)
 				goto reap;
 			break;
@@ -1124,6 +1139,9 @@ static int keep_running(struct thread_data *td)
 		if (diff < td_max_bs(td))
 			return 0;
 
+		if (fio_files_done(td))
+			return 0;
+
 		return 1;
 	}
 
diff --git a/err.h b/err.h
new file mode 100644
index 0000000..5c024ee
--- /dev/null
+++ b/err.h
@@ -0,0 +1,44 @@
+#ifndef FIO_ERR_H
+#define FIO_ERR_H
+
+/*
+ * Kernel pointers have redundant information, so we can use a
+ * scheme where we can return either an error code or a dentry
+ * pointer with the same return value.
+ *
+ * This should be a per-architecture thing, to allow different
+ * error and pointer decisions.
+ */
+#define MAX_ERRNO	4095
+
+#define IS_ERR_VALUE(x) ((x) >= (unsigned long)-MAX_ERRNO)
+
+static inline void *ERR_PTR(long error)
+{
+	return (void *) error;
+}
+
+static inline long PTR_ERR(const void *ptr)
+{
+	return (long) ptr;
+}
+
+static inline long IS_ERR(const void *ptr)
+{
+	return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline long IS_ERR_OR_NULL(const void *ptr)
+{
+	return !ptr || IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline int PTR_ERR_OR_ZERO(const void *ptr)
+{
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+	else
+		return 0;
+}
+
+#endif
diff --git a/file.h b/file.h
index d7e05f4..19413fc 100644
--- a/file.h
+++ b/file.h
@@ -176,5 +176,6 @@ extern void dup_files(struct thread_data *, struct thread_data *);
 extern int get_fileno(struct thread_data *, const char *);
 extern void free_release_files(struct thread_data *);
 void fio_file_reset(struct thread_data *, struct fio_file *);
+int fio_files_done(struct thread_data *);
 
 #endif
diff --git a/filesetup.c b/filesetup.c
index d1702e2..975579a 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -639,7 +639,7 @@ static int get_file_sizes(struct thread_data *td)
 		}
 
 		if (f->real_file_size == -1ULL && td->o.size)
-			f->real_file_size = td->o.size / td->o.nr_files;
+			f->real_file_size = (td->o.size + td_min_bs(td) - 1) / td->o.nr_files;
 	}
 
 	return err;
@@ -801,7 +801,7 @@ int setup_files(struct thread_data *td)
 			 * total size divided by number of files. if that is
 			 * zero, set it to the real file size.
 			 */
-			f->io_size = o->size / o->nr_files;
+			f->io_size = (o->size + td_min_bs(td) - 1) / o->nr_files;
 			if (!f->io_size)
 				f->io_size = f->real_file_size - f->file_offset;
 		} else if (f->real_file_size < o->file_size_low ||
@@ -1386,3 +1386,15 @@ void fio_file_reset(struct thread_data *td, struct fio_file *f)
 	if (td->o.random_generator == FIO_RAND_GEN_LFSR)
 		lfsr_reset(&f->lfsr, td->rand_seeds[FIO_RAND_BLOCK_OFF]);
 }
+
+int fio_files_done(struct thread_data *td)
+{
+	struct fio_file *f;
+	unsigned int i;
+
+	for_each_file(td, f, i)
+		if (!fio_file_done(f))
+			return 0;
+
+	return 1;
+}
diff --git a/io_u.c b/io_u.c
index 64ff73c..acc1a7b 100644
--- a/io_u.c
+++ b/io_u.c
@@ -11,6 +11,7 @@
 #include "trim.h"
 #include "lib/rand.h"
 #include "lib/axmap.h"
+#include "err.h"
 
 struct io_completion_data {
 	int nr;				/* input */
@@ -985,6 +986,9 @@ static struct fio_file *get_next_file_rand(struct thread_data *td,
 		if (!fio_file_open(f)) {
 			int err;
 
+			if (td->nr_open_files >= td->o.open_files)
+				return ERR_PTR(-EBUSY);
+
 			err = td_io_open_file(td, f);
 			if (err)
 				continue;
@@ -1027,6 +1031,9 @@ static struct fio_file *get_next_file_rr(struct thread_data *td, int goodf,
 		if (!fio_file_open(f)) {
 			int err;
 
+			if (td->nr_open_files >= td->o.open_files)
+				return ERR_PTR(-EBUSY);
+
 			err = td_io_open_file(td, f);
 			if (err) {
 				dprint(FD_FILE, "error %d on open of %s\n",
@@ -1080,6 +1087,9 @@ static struct fio_file *__get_next_file(struct thread_data *td)
 	else
 		f = get_next_file_rand(td, FIO_FILE_open, FIO_FILE_closing);
 
+	if (IS_ERR(f))
+		return f;
+
 	td->file_service_file = f;
 	td->file_service_left = td->file_service_nr - 1;
 out:
@@ -1099,14 +1109,14 @@ static struct fio_file *get_next_file(struct thread_data *td)
 	return __get_next_file(td);
 }
 
-static int set_io_u_file(struct thread_data *td, struct io_u *io_u)
+static long set_io_u_file(struct thread_data *td, struct io_u *io_u)
 {
 	struct fio_file *f;
 
 	do {
 		f = get_next_file(td);
-		if (!f)
-			return 1;
+		if (IS_ERR_OR_NULL(f))
+			return PTR_ERR(f);
 
 		io_u->file = f;
 		get_file(f);
@@ -1400,6 +1410,7 @@ struct io_u *get_io_u(struct thread_data *td)
 	struct fio_file *f;
 	struct io_u *io_u;
 	int do_scramble = 0;
+	long ret = 0;
 
 	io_u = __get_io_u(td);
 	if (!io_u) {
@@ -1425,11 +1436,17 @@ struct io_u *get_io_u(struct thread_data *td)
 		if (read_iolog_get(td, io_u))
 			goto err_put;
 	} else if (set_io_u_file(td, io_u)) {
+		ret = -EBUSY;
 		dprint(FD_IO, "io_u %p, setting file failed\n", io_u);
 		goto err_put;
 	}
 
 	f = io_u->file;
+	if (!f) {
+		dprint(FD_IO, "io_u %p, setting file failed\n", io_u);
+		goto err_put;
+	}
+
 	assert(fio_file_open(f));
 
 	if (ddir_rw(io_u->ddir)) {
@@ -1478,7 +1495,7 @@ out:
 err_put:
 	dprint(FD_IO, "get_io_u failed\n");
 	put_io_u(td, io_u);
-	return NULL;
+	return ERR_PTR(ret);
 }
 
 void io_u_log_error(struct thread_data *td, struct io_u *io_u)
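
For readers unfamiliar with the err.h helpers added above, here is a small self-contained sketch of how an error code can travel inside a pointer return value. The macro and the three helpers are copied from the patch; demo_next_file(), demo_classify() and the demo_state enum are hypothetical and exist only to show the three outcomes a caller has to tell apart (a usable pointer, NULL, and an encoded -EBUSY). It assumes a platform where long and pointers have the same width.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO	4095

#define IS_ERR_VALUE(x) ((x) >= (unsigned long)-MAX_ERRNO)

static inline void *ERR_PTR(long error)
{
	return (void *) error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long) ptr;
}

static inline long IS_ERR(const void *ptr)
{
	return IS_ERR_VALUE((unsigned long)ptr);
}

/* Hypothetical outcomes a caller might distinguish. */
enum demo_state { DEMO_OK, DEMO_DONE, DEMO_BUSY, DEMO_FAIL };

static int demo_file;	/* something valid to point at */

/*
 * Hypothetical file picker: NULL when nothing is left to do, an encoded
 * -EBUSY when the open-file limit is reached, a real pointer otherwise.
 */
static void *demo_next_file(unsigned int nr_open, unsigned int open_limit,
			    int files_left)
{
	assert(open_limit > 0);

	if (!files_left)
		return NULL;
	if (nr_open >= open_limit)
		return ERR_PTR(-EBUSY);
	return &demo_file;
}

static enum demo_state demo_classify(const void *f)
{
	if (!f)
		return DEMO_DONE;
	if (IS_ERR(f))
		return PTR_ERR(f) == -EBUSY ? DEMO_BUSY : DEMO_FAIL;
	return DEMO_OK;
}
```

The point of the encoding is that "limit reached, reap completions and try again" can be told apart from "no more files" without widening the function signature.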

-- 
Jens Axboe



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-10 22:14       ` Jens Axboe
@ 2014-02-10 23:11         ` Matthew Eaton
  2014-02-10 23:15           ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-10 23:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Mon, Feb 10, 2014 at 2:14 PM, Jens Axboe <axboe@kernel.dk> wrote:
> Can you try this patch?

To be honest I'm not sure how to apply a patch.  Thus far I have only
used release versions of fio.  Do I need to get fio from git, apply
the patch, and then compile?


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-10 23:11         ` Matthew Eaton
@ 2014-02-10 23:15           ` Jens Axboe
  2014-02-11  0:00             ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-10 23:15 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio

> To be honest I'm not sure how to apply a patch.  Thus far I have only
> used release versions of fio.  Do I need to get fio from git, apply
> the patch, and then compile?

The easiest would be:

$ git clone git://git.kernel.dk/fio

Then save the patch from mail in a file, eg /tmp/patch. Then do:

$ cd fio
$ patch -p1 --dry-run < /tmp/patch

If the patch command spews any errors, the most likely explanation is 
that your mailer mangled it somehow. You can try and add -l and see if 
that makes patch happier, it'll ignore white space then.

Assuming that worked, re-run the patch command without --dry-run to
actually apply it, then just do:

$ ./configure
$ make

and re-run with ./fio and your job file.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-10 23:15           ` Jens Axboe
@ 2014-02-11  0:00             ` Matthew Eaton
  2014-02-11 15:09               ` Jens Axboe
  2014-02-11 15:27               ` Jens Axboe
  0 siblings, 2 replies; 36+ messages in thread
From: Matthew Eaton @ 2014-02-11  0:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

On Mon, Feb 10, 2014 at 3:15 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> To be honest I'm not sure how to apply a patch.  Thus far I have only
>> used release versions of fio.  Do I need to get fio from git, apply
>> the patch, and then compile?
>
>
> The easiest would be:
>
> $ git clone git://git.kernel.dk/fio
>
> Then save the patch from mail in a file, eg /tmp/patch. Then do:
>
> $ cd fio
> $ patch -p1 --dry-run < /tmp/patch
>
> If the patch command spews any errors, the most likely explanation is that
> your mailer mangled it somehow. You can try and add -l and see if that makes
> patch happier, it'll ignore white space then.
>
> Assuming that worked, re-run the patch command without --dry-run to
> actually apply it, then just do:
>
> $ ./configure
> $ make
>
> and re-run with ./fio and your job file.
>
> --
> Jens Axboe

Jens, thanks a lot for your help.  Here is the output from fio from
git + your patch.  Looks correct except that write io should be 1024
MB instead of 1000 MB?

job: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.4-48-gea66
Starting 1 process
job: Laying out IO file(s) (50 file(s) / 1024MB)
Jobs: 1 (f=10)
job: (groupid=0, jobs=1): err= 0: pid=7286: Mon Feb 10 15:56:16 2014
  write: io=1000.0MB, bw=546425KB/s, iops=533, runt=  1874msec
    clat (usec): min=331, max=18665, avg=1845.86, stdev=3825.28
     lat (usec): min=342, max=18676, avg=1861.42, stdev=3825.89
    clat percentiles (usec):
     |  1.00th=[  338],  5.00th=[  346], 10.00th=[  354], 20.00th=[  378],
     | 30.00th=[  430], 40.00th=[  438], 50.00th=[  466], 60.00th=[  628],
     | 70.00th=[  652], 80.00th=[  668], 90.00th=[ 9152], 95.00th=[13120],
     | 99.00th=[14784], 99.50th=[15424], 99.90th=[18304], 99.95th=[18560],
     | 99.99th=[18560]
    bw (KB  /s): min=374784, max=999424, per=100.00%, avg=585045.33,
stdev=358875.60
    lat (usec) : 500=52.90%, 750=35.60%
    lat (msec) : 10=2.70%, 20=8.80%
  cpu          : usr=4.38%, sys=24.07%, ctx=250, majf=0, minf=33
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1000/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=1000.0MB, aggrb=546424KB/s, minb=546424KB/s,
maxb=546424KB/s, mint=1874msec, maxt=1874msec
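
One plausible reading of the 1000 MB figure, offered as a sketch rather than a confirmed diagnosis: with size=1g split across nrfiles=50, each file gets a little over 20 MiB, and if only whole 1 MiB blocks are issued per file (an assumption, not something the output proves), 50 files x 20 blocks gives exactly 1000 MiB. The helper names below are hypothetical; the per-file formulas are the old and new expressions from the filesetup.c hunks in the patch. The same arithmetic also reproduces the "Laying out" totals of 1023MB before the patch and 1024MB after it, assuming the displayed total is truncated to whole MiB.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MIB (1024ULL * 1024ULL)

/*
 * Per-file size as computed in the patch: the old code divides size by the
 * file count, the patched code first adds (min_bs - 1).
 */
static uint64_t demo_per_file_bytes(uint64_t size, uint64_t min_bs,
				    uint64_t nr_files, int patched)
{
	assert(nr_files > 0);

	if (patched)
		return (size + min_bs - 1) / nr_files;
	return size / nr_files;
}

/*
 * Total bytes transferred if each file only receives whole blocks of size bs
 * (an assumption made for this illustration).
 */
static uint64_t demo_whole_block_bytes(uint64_t per_file, uint64_t bs,
				       uint64_t nr_files)
{
	assert(bs > 0);

	return (per_file / bs) * bs * nr_files;
}
```

For size=1g, bs=1m, nrfiles=50 this gives 21474836 bytes per file before the patch and 21495807 after; either way only 20 whole 1 MiB blocks fit per file, so the whole-block total is 1000 MiB.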


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: Mutex destruction, invalid memory accesses, leaks
  2014-02-10 20:56           ` Jens Axboe
@ 2014-02-11  0:12             ` Elliott, Robert (Server Storage)
  2014-02-11  7:07               ` Sitsofe Wheeler
  0 siblings, 1 reply; 36+ messages in thread
From: Elliott, Robert (Server Storage) @ 2014-02-11  0:12 UTC (permalink / raw)
  To: Jens Axboe, Sitsofe Wheeler; +Cc: Bruce Cran, fio

The latest pulled version with this change works on my Windows Server 2008 R2 system that was crashing with 2.1.4.

---
Rob Elliott    HP Server Storage




> -----Original Message-----
> From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
> Behalf Of Jens Axboe
> Sent: Monday, 10 February, 2014 2:57 PM
> To: Sitsofe Wheeler
> Cc: Bruce Cran; fio@vger.kernel.org
> Subject: Re: Mutex destruction, invalid memory accesses, leaks
> 
> On Mon, Feb 10 2014, Sitsofe Wheeler wrote:
> > On Mon, Feb 10, 2014 at 12:25:32PM -0700, Bruce Cran wrote:
> > > On 2/9/2014 12:50 PM, Sitsofe Wheeler wrote:
> > > >
> > > >Yes it's still happening with -git from a moment ago. What is stopping a
> > > >sleeping thread from holding a mutex that is destroyed and then waking
> > > >up on it after the memory has been unmapped?
> > >
> > > In case it *is* a bug in winpthreads-1.dll, are you using version
> > > 3.1.0-1? It was released on the 15th January and there appeared to
> > > be some mutex-related fixes when I looked at the svn log a few weeks
> > > ago.
> >
> > Yes I am using version 3.1.0-1 on Windows 7 installed about a week
> > ago...
> 
> Actually, the previous won't work, and I don't see how to make it work.
> Please try the below instead. Or just re-pull, I'll check it in now.
> 
> 
> diff --git a/backend.c b/backend.c
> index 501c59a..a607134 100644
> --- a/backend.c
> +++ b/backend.c
> @@ -1236,13 +1236,6 @@ static void *thread_main(void *data)
>  	dprint(FD_MUTEX, "done waiting on td->mutex\n");
> 
>  	/*
> -	 * the ->mutex mutex is now no longer used, close it to avoid
> -	 * eating a file descriptor
> -	 */
> -	fio_mutex_remove(td->mutex);
> -	td->mutex = NULL;
> -
> -	/*
>  	 * A new gid requires privilege, so we need to do this before setting
>  	 * the uid.
>  	 */
> @@ -1521,6 +1514,9 @@ err:
>  	fio_mutex_remove(td->rusage_sem);
>  	td->rusage_sem = NULL;
> 
> +	fio_mutex_remove(td->mutex);
> +	td->mutex = NULL;
> +
>  	td_set_runstate(td, TD_EXITED);
>  	return (void *) (uintptr_t) td->error;
>  }
> 
> --
> Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-11  0:12             ` Elliott, Robert (Server Storage)
@ 2014-02-11  7:07               ` Sitsofe Wheeler
  2014-02-11 15:30                 ` Elliott, Robert (Server Storage)
  0 siblings, 1 reply; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-11  7:07 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage); +Cc: Jens Axboe, Bruce Cran, fio

On Tue, Feb 11, 2014 at 12:12:54AM +0000, Elliott, Robert (Server Storage) wrote:
> 
> > -----Original Message-----
> > From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
> > Behalf Of Jens Axboe
> > 
> > Actually, the previous won't work, and I don't see how to make it work.
> > Please try the below instead. Or just re-pull, I'll check it in now.
> > 
> The latest pulled version with this change works on my Windows Server 2008 R2 system that was crashing with 2.1.4.

Still a problem here:

$ git rev-parse HEAD
ea66e04fe1a803f6a9ddf31cb999641d4396d67c
$ ./fio.exe --version
fio-2.1.4-48-gea66
$ gdb --args ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
GNU gdb (GDB) 7.6.50.20130728-cvs (cygwin-special)
<snip>
Reading symbols from /home/Sitsofe Wheeler/fio/fio.exe...done.
(gdb) ru
Starting program: /home/Sitsofe Wheeler/fio/fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
[New Thread 1756.0xbe0]
[New Thread 1756.0x8e8]
fio: set all debug options
io       1756  load ioengine windowsaio
parse    1756  handle_option=filename, ptr=fiojob
parse    1756  __handle_option=filename, type=5, ptr=fiojob
file     1756  add file fiojob
file     1756  resize file array to 1 files
file     1756  file 01F40008 "fiojob" added at 0
parse    1756  handle_option=thread, ptr=(null)
parse    1756  __handle_option=thread, type=10, ptr=(null)
parse    1756    ret=0, out=1
parse    1756  handle_option=size, ptr=512
parse    1756  __handle_option=size, type=3, ptr=512
parse    1756    ret=0, out=512
parse    1756  handle_option=rw, ptr=read
parse    1756  __handle_option=rw, type=1, ptr=read
parse    1756  handle_option=bs, ptr=512
parse    1756  __handle_option=bs, type=7, ptr=512
parse    1756    ret=0, out=512
parse    1756  handle_option=ioengine, ptr=sync
parse    1756  __handle_option=ioengine, type=5, ptr=sync
io       1756  free ioengine windowsaio
io       1756  load ioengine sync
parse    1756  handle_option=verify_pattern, ptr=0xdeadbeef
parse    1756  __handle_option=verify_pattern, type=1, ptr=0xdeadbeef
file     1756  dup files: 1
io       1756  load ioengine sync
parse    1756  handle_option=name, ptr=fiojobname
parse    1756  __handle_option=name, type=5, ptr=fiojobname
fiojobname: (g=0): rw=read, bs=512-512/512-512/512-512, ioengine=sync, iodepth=1
parse    1756  free options
fio-2.1.4-48-gea66
time     1756  cycles[0]=2593
time     1756  cycles[1]=2593
time     1756  cycles[2]=2593
time     1756  cycles[3]=2593
time     1756  cycles[4]=2594
time     1756  cycles[5]=2592
time     1756  cycles[6]=2593
time     1756  cycles[7]=2593
time     1756  cycles[8]=2593
time     1756  cycles[9]=2593
time     1756  cycles[10]=2593
time     1756  cycles[11]=2593
time     1756  cycles[12]=2593
time     1756  cycles[13]=2593
time     1756  cycles[14]=2593
time     1756  cycles[15]=2593
time     1756  cycles[16]=2593
time     1756  cycles[17]=2593
time     1756  cycles[18]=2593
time     1756  cycles[19]=2593
time     1756  cycles[20]=2593
time     1756  cycles[21]=2593
time     1756  cycles[22]=2593
time     1756  cycles[23]=2593
time     1756  cycles[24]=2593
time     1756  cycles[25]=2593
time     1756  cycles[26]=2593
time     1756  cycles[27]=2593
time     1756  cycles[28]=2593
time     1756  cycles[29]=2593
time     1756  cycles[30]=2593
time     1756  cycles[31]=2593
time     1756  cycles[32]=2593
time     1756  cycles[33]=2593
time     1756  cycles[34]=2593
time     1756  cycles[35]=2593
time     1756  cycles[36]=2593
time     1756  cycles[37]=2593
time     1756  cycles[38]=2593
time     1756  cycles[39]=2593
time     1756  cycles[40]=2593
time     1756  cycles[41]=2593
time     1756  cycles[42]=2593
time     1756  cycles[43]=2593
time     1756  cycles[44]=2593
time     1756  cycles[45]=2593
time     1756  cycles[46]=2593
time     1756  cycles[47]=2593
time     1756  cycles[48]=2593
time     1756  cycles[49]=2593
time     1756  avg: 2594
time     1756  mean=2593.572000, S=0.042769
time     1756  inv_cycles_per_usec=6467
mutex    1756  wait on startup_mutex
mutex    1756  done waiting on startup_mutex
Starting 1 thread
[New Thread 1756.0x1a0]

Program received signal SIGSEGV, Segmentation fault.
0x0043e1de in pthread_mutex_unlock (m=0x4f80000)
    at /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
(gdb) thread apply all bt

Thread 3 (Thread 1756.0x1a0):
#0  0x775dfd91 in ntdll!ZwDelayExecution ()
   from /cygdrive/c/Windows/system32/ntdll.dll
#1  0x775dfd91 in ntdll!ZwDelayExecution ()
   from /cygdrive/c/Windows/system32/ntdll.dll
#2  0x76933bc8 in SleepEx () from /cygdrive/c/Windows/syswow64/KERNELBASE.dll
#3  0x00000000 in ?? ()

Thread 2 (Thread 1756.0x8e8):
#0  0x775dfd91 in ntdll!ZwDelayExecution ()
   from /cygdrive/c/Windows/system32/ntdll.dll
#1  0x775dfd91 in ntdll!ZwDelayExecution ()
   from /cygdrive/c/Windows/system32/ntdll.dll
#2  0x76933bc8 in SleepEx () from /cygdrive/c/Windows/syswow64/KERNELBASE.dll
#3  0x00000000 in ?? ()

Thread 1 (Thread 1756.0xbe0):
#0  0x0043e1de in pthread_mutex_unlock (m=0x4f80000)
    at /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
#1  0x0041bc48 in fio_mutex_up (mutex=0x4f80000) at mutex.c:153
#2  0x00433c84 in run_threads () at backend.c:1885
#3  0x00434005 in fio_backend () at backend.c:1998
#4  0x00449a14 in main (argc=10, argv=0x9a28b0, envp=0x9a19a0) at fio.c:50

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: Fio 2.1.5 release upcoming
  2014-02-06 19:21 Fio 2.1.5 release upcoming Jens Axboe
  2014-02-07  3:44 ` Mutex destruction, invalid memory accesses, leaks Sitsofe Wheeler
  2014-02-08 19:52 ` Fio 2.1.5 release upcoming Matthew Eaton
@ 2014-02-11 11:22 ` Paul Alcorn
  2014-02-11 15:39   ` 'Jens Axboe'
  2 siblings, 1 reply; 36+ messages in thread
From: Paul Alcorn @ 2014-02-11 11:22 UTC (permalink / raw)
  To: 'Jens Axboe', fio

One problem I have noticed on Windows is that sequential workloads
(read/write) do not scale correctly with different thread counts. Sequential
measurements do not work correctly at all.

--
To unsubscribe from this list: send the line "unsubscribe fio" in the body
of a message to majordomo@vger.kernel.org More majordomo info at
http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11  0:00             ` Matthew Eaton
@ 2014-02-11 15:09               ` Jens Axboe
  2014-02-11 15:27               ` Jens Axboe
  1 sibling, 0 replies; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 15:09 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio



On 02/10/2014 05:00 PM, Matthew Eaton wrote:
> On Mon, Feb 10, 2014 at 3:15 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>> To be honest I'm not sure how to apply a patch.  Thus far I have only
>>> used release versions of fio.  Do I need to get fio from git, apply
>>> the patch, and then compile?
>>
>>
>> The easiest would be:
>>
>> $ git clone git://git.kernel.dk/fio
>>
>> Then save the patch from mail in a file, eg /tmp/patch. Then do:
>>
>> $ cd fio
>> $ patch -p1 --dry-run < /tmp/patch
>>
>> If the patch command spews any errors, the most likely explanation is that
>> your mailer mangled it somehow. You can try and add -l and see if that makes
>> patch happier, it'll ignore white space then.
>>
>> Assuming that worked, just do:
>>
>> $ ./configure
>> $ make
>>
>> and re-run with ./fio and your job file.
>>
>> --
>> Jens Axboe
>
> Jens, thanks a lot for your help.  Here is the output from fio from
> git + your patch.  Looks correct except that write io should be 1024
> MB instead of 1000 MB?

Yeah, but that one I can more easily explain. Unless told otherwise, fio
just divides the total size equally among the files, and aligns each to
the min bs. So that means you got 50 files of 20MB each, i.e. 1000MB in
total. I can make this a bit more clever by splitting the leftover 24MB
over 24 of the files.
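
The arithmetic above can be sketched as follows. This is a minimal
standalone illustration, not fio code: the helper names are hypothetical,
and the 1MB minimum block size is an assumption, since the job file is not
shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not fio code): round v down to a multiple of bs. */
static uint64_t align_down(uint64_t v, uint64_t bs)
{
	assert(bs > 0);
	return v - (v % bs);
}

/* Per-file size when the total is split evenly and aligned to the min bs. */
static uint64_t per_file_size(uint64_t total, unsigned int nr_files,
			      uint64_t bs)
{
	assert(nr_files > 0);
	return align_down(total / nr_files, bs);
}

/* Number of whole bs-sized blocks left unassigned after the even split. */
static uint64_t leftover_blocks(uint64_t total, unsigned int nr_files,
				uint64_t bs)
{
	return (total - per_file_size(total, nr_files, bs) * nr_files) / bs;
}
```

Under those assumptions, a 1024MB total over 50 files gives 20MB per file
(1000MB overall) and 24 leftover 1MB blocks, which is where "split the
leftover 24MB over 24 of the files" comes from.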

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11  0:00             ` Matthew Eaton
  2014-02-11 15:09               ` Jens Axboe
@ 2014-02-11 15:27               ` Jens Axboe
  2014-02-11 19:18                 ` Matthew Eaton
  1 sibling, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 15:27 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio

On Mon, Feb 10 2014, Matthew Eaton wrote:
> Jens, thanks a lot for your help.  Here is the output from fio from
> git + your patch.  Looks correct except that write io should be 1024
> MB instead of 1000 MB?

The below should be a bit better.


diff --git a/backend.c b/backend.c
index a607134..32bc265 100644
--- a/backend.c
+++ b/backend.c
@@ -52,6 +52,7 @@
 #include "server.h"
 #include "lib/getrusage.h"
 #include "idletime.h"
+#include "err.h"
 
 static pthread_t disk_util_thread;
 static struct fio_mutex *disk_thread_mutex;
@@ -478,6 +479,12 @@ static void do_verify(struct thread_data *td, uint64_t verify_bytes)
 				break;
 
 			while ((io_u = get_io_u(td)) != NULL) {
+				if (IS_ERR(io_u)) {
+					io_u = NULL;
+					ret = FIO_Q_BUSY;
+					goto reap;
+				}
+
 				/*
 				 * We are only interested in the places where
 				 * we wrote or trimmed IOs. Turn those into
@@ -574,6 +581,7 @@ sync_done:
 		 * completed io_u's first. Note that we can get BUSY even
 		 * without IO queued, if the system is resource starved.
 		 */
+reap:
 		full = queue_full(td) || (ret == FIO_Q_BUSY && td->cur_depth);
 		if (full || !td->o.iodepth_batch_complete) {
 			min_events = min(td->o.iodepth_batch_complete,
@@ -692,7 +700,14 @@ static uint64_t do_io(struct thread_data *td)
 			break;
 
 		io_u = get_io_u(td);
-		if (!io_u) {
+		if (IS_ERR_OR_NULL(io_u)) {
+			int err = PTR_ERR(io_u);
+
+			io_u = NULL;
+			if (err == -EBUSY) {
+				ret = FIO_Q_BUSY;
+				goto reap;
+			}
 			if (td->o.latency_target)
 				goto reap;
 			break;
@@ -1124,6 +1139,9 @@ static int keep_running(struct thread_data *td)
 		if (diff < td_max_bs(td))
 			return 0;
 
+		if (fio_files_done(td))
+			return 0;
+
 		return 1;
 	}
 
diff --git a/file.h b/file.h
index d7e05f4..19413fc 100644
--- a/file.h
+++ b/file.h
@@ -176,5 +176,6 @@ extern void dup_files(struct thread_data *, struct thread_data *);
 extern int get_fileno(struct thread_data *, const char *);
 extern void free_release_files(struct thread_data *);
 void fio_file_reset(struct thread_data *, struct fio_file *);
+int fio_files_done(struct thread_data *);
 
 #endif
diff --git a/filesetup.c b/filesetup.c
index d1702e2..e37307b 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -639,7 +639,7 @@ static int get_file_sizes(struct thread_data *td)
 		}
 
 		if (f->real_file_size == -1ULL && td->o.size)
-			f->real_file_size = td->o.size / td->o.nr_files;
+			f->real_file_size = (td->o.size + td_min_bs(td) - 1) / td->o.nr_files;
 	}
 
 	return err;
@@ -734,9 +734,11 @@ int setup_files(struct thread_data *td)
 	unsigned long long total_size, extend_size;
 	struct thread_options *o = &td->o;
 	struct fio_file *f;
-	unsigned int i;
+	unsigned int i, nr_fs_extra = 0;
 	int err = 0, need_extend;
 	int old_state;
+	const unsigned int bs = td_min_bs(td);
+	uint64_t fs = 0;
 
 	dprint(FD_FILE, "setup files\n");
 
@@ -786,6 +788,20 @@ int setup_files(struct thread_data *td)
 	}
 
 	/*
+	 * Calculate per-file size and potential extra size for the
+	 * first files, if needed.
+	 */
+	if (!o->file_size_low) {
+		uint64_t all_fs;
+
+		fs = o->size / o->nr_files;
+		all_fs = fs * o->nr_files;
+
+		if (all_fs < o->size)
+			nr_fs_extra = (o->size - all_fs) / bs;
+	}
+
+	/*
 	 * now file sizes are known, so we can set ->io_size. if size= is
 	 * not given, ->io_size is just equal to ->real_file_size. if size
 	 * is given, ->io_size is size / nr_files.
@@ -798,10 +814,17 @@ int setup_files(struct thread_data *td)
 		if (!o->file_size_low) {
 			/*
 			 * no file size range given, file size is equal to
-			 * total size divided by number of files. if that is
-			 * zero, set it to the real file size.
+			 * total size divided by number of files. If that is
+			 * zero, set it to the real file size. If the size
+			 * doesn't divide nicely with the min blocksize,
+			 * make the first files bigger.
 			 */
-			f->io_size = o->size / o->nr_files;
+			f->io_size = fs;
+			if (nr_fs_extra) {
+				nr_fs_extra--;
+				f->io_size += bs;
+			}
+
 			if (!f->io_size)
 				f->io_size = f->real_file_size - f->file_offset;
 		} else if (f->real_file_size < o->file_size_low ||
@@ -1386,3 +1409,15 @@ void fio_file_reset(struct thread_data *td, struct fio_file *f)
 	if (td->o.random_generator == FIO_RAND_GEN_LFSR)
 		lfsr_reset(&f->lfsr, td->rand_seeds[FIO_RAND_BLOCK_OFF]);
 }
+
+int fio_files_done(struct thread_data *td)
+{
+	struct fio_file *f;
+	unsigned int i;
+
+	for_each_file(td, f, i)
+		if (!fio_file_done(f))
+			return 0;
+
+	return 1;
+}
diff --git a/io_u.c b/io_u.c
index 64ff73c..acc1a7b 100644
--- a/io_u.c
+++ b/io_u.c
@@ -11,6 +11,7 @@
 #include "trim.h"
 #include "lib/rand.h"
 #include "lib/axmap.h"
+#include "err.h"
 
 struct io_completion_data {
 	int nr;				/* input */
@@ -985,6 +986,9 @@ static struct fio_file *get_next_file_rand(struct thread_data *td,
 		if (!fio_file_open(f)) {
 			int err;
 
+			if (td->nr_open_files >= td->o.open_files)
+				return ERR_PTR(-EBUSY);
+
 			err = td_io_open_file(td, f);
 			if (err)
 				continue;
@@ -1027,6 +1031,9 @@ static struct fio_file *get_next_file_rr(struct thread_data *td, int goodf,
 		if (!fio_file_open(f)) {
 			int err;
 
+			if (td->nr_open_files >= td->o.open_files)
+				return ERR_PTR(-EBUSY);
+
 			err = td_io_open_file(td, f);
 			if (err) {
 				dprint(FD_FILE, "error %d on open of %s\n",
@@ -1080,6 +1087,9 @@ static struct fio_file *__get_next_file(struct thread_data *td)
 	else
 		f = get_next_file_rand(td, FIO_FILE_open, FIO_FILE_closing);
 
+	if (IS_ERR(f))
+		return f;
+
 	td->file_service_file = f;
 	td->file_service_left = td->file_service_nr - 1;
 out:
@@ -1099,14 +1109,14 @@ static struct fio_file *get_next_file(struct thread_data *td)
 	return __get_next_file(td);
 }
 
-static int set_io_u_file(struct thread_data *td, struct io_u *io_u)
+static long set_io_u_file(struct thread_data *td, struct io_u *io_u)
 {
 	struct fio_file *f;
 
 	do {
 		f = get_next_file(td);
-		if (!f)
-			return 1;
+		if (IS_ERR_OR_NULL(f))
+			return PTR_ERR(f);
 
 		io_u->file = f;
 		get_file(f);
@@ -1400,6 +1410,7 @@ struct io_u *get_io_u(struct thread_data *td)
 	struct fio_file *f;
 	struct io_u *io_u;
 	int do_scramble = 0;
+	long ret = 0;
 
 	io_u = __get_io_u(td);
 	if (!io_u) {
@@ -1425,11 +1436,17 @@ struct io_u *get_io_u(struct thread_data *td)
 		if (read_iolog_get(td, io_u))
 			goto err_put;
 	} else if (set_io_u_file(td, io_u)) {
+		ret = -EBUSY;
 		dprint(FD_IO, "io_u %p, setting file failed\n", io_u);
 		goto err_put;
 	}
 
 	f = io_u->file;
+	if (!f) {
+		dprint(FD_IO, "io_u %p, setting file failed\n", io_u);
+		goto err_put;
+	}
+
 	assert(fio_file_open(f));
 
 	if (ddir_rw(io_u->ddir)) {
@@ -1478,7 +1495,7 @@ out:
 err_put:
 	dprint(FD_IO, "get_io_u failed\n");
 	put_io_u(td, io_u);
-	return NULL;
+	return ERR_PTR(ret);
 }
 
 void io_u_log_error(struct thread_data *td, struct io_u *io_u)
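
The patch leans on the error-pointer idiom (ERR_PTR, PTR_ERR, IS_ERR,
IS_ERR_OR_NULL), so that one return value can mean "got an item", "nothing
left" (NULL), or "try again later" (-EBUSY). A minimal sketch of that idiom
follows. The sketch_* helpers are illustrative stand-ins modeled on the
Linux kernel convention; the actual contents of fio's err.h are not shown
in this thread and may differ. next_item() is a hypothetical function, not
part of fio.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: small negative errno values map to the top of the address
 * space, which no valid object pointer is expected to occupy. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

static inline int sketch_is_err_or_null(const void *ptr)
{
	return !ptr || sketch_is_err(ptr);
}

static int slot;

/*
 * Hypothetical lookup with three outcomes:
 *   NULL           - nothing left to hand out
 *   ERR_PTR(-EBUSY) - open limit reached, caller should reap and retry
 *   valid pointer  - an item to work on
 */
static int *next_item(unsigned int nr_open, unsigned int max_open,
		      int have_more)
{
	if (!have_more)
		return NULL;
	if (nr_open >= max_open)
		return sketch_err_ptr(-EBUSY);
	return &slot;
}
```

Note that with this convention PTR_ERR() of a NULL pointer is 0, so a
caller that tests only the numeric result cannot tell NULL from success.
That may be why the get_io_u() hunk adds a separate !f check after
set_io_u_file().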

-- 
Jens Axboe



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* RE: Mutex destruction, invalid memory accesses, leaks
  2014-02-11  7:07               ` Sitsofe Wheeler
@ 2014-02-11 15:30                 ` Elliott, Robert (Server Storage)
  2014-02-11 15:38                   ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Elliott, Robert (Server Storage) @ 2014-02-11 15:30 UTC (permalink / raw)
  To: Sitsofe Wheeler; +Cc: Jens Axboe, Bruce Cran, fio

That specific command line does also crash on my Windows 2008 R2 system.  It does not crash if I drop --ioengine=sync.

---
Rob Elliott    HP Server Storage






^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-11 15:30                 ` Elliott, Robert (Server Storage)
@ 2014-02-11 15:38                   ` Jens Axboe
  2014-02-11 22:51                     ` Sitsofe Wheeler
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 15:38 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage); +Cc: Sitsofe Wheeler, Bruce Cran, fio

Interesting. The mutex issue should be fixed, so I'm puzzled why it
isn't, especially if the sync ioengine has something to do with it. Can
either of you dump the source around:

  at
  /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392

Perhaps that will clear things up a bit more.



On Tue, Feb 11 2014, Elliott, Robert (Server Storage) wrote:
> That specific command line does also crash on my Windows 2008 R2 system.  It does not crash if I drop --ioengine=sync.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 11:22 ` Paul Alcorn
@ 2014-02-11 15:39   ` 'Jens Axboe'
  0 siblings, 0 replies; 36+ messages in thread
From: 'Jens Axboe' @ 2014-02-11 15:39 UTC (permalink / raw)
  To: Paul Alcorn; +Cc: fio

On Tue, Feb 11 2014, Paul Alcorn wrote:
> One problem I have noticed in windows is that the sequential workloads
> (read/write) do not scale correctly with different thread counts. Sequential
> measurements do not work correctly at all. 

You have to be a little more precise than that, I'm not sure what you
mean. If you tell me "what you ran, what you expected, and what you
saw", then that might help us a bit.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 15:27               ` Jens Axboe
@ 2014-02-11 19:18                 ` Matthew Eaton
  2014-02-11 19:29                   ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-11 19:18 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

> The below should be a bit better.

I made a fresh git clone and tried to apply patch2 but got some errors.

matt@matt-work:~/fiogit/fio$ patch -p1 -l --dry-run < /home/matt/patch2
checking file backend.c
Reversed (or previously applied) patch detected!  Assume -R? [n] y
checking file file.h
Reversed (or previously applied) patch detected!  Assume -R? [n] y
checking file filesetup.c
Hunk #2 FAILED at 734.
Hunk #3 succeeded at 802 (offset 16 lines).
Hunk #4 FAILED at 812.
Hunk #5 succeeded at 1435 with fuzz 2 (offset 35 lines).
2 out of 5 hunks FAILED
checking file io_u.c
Reversed (or previously applied) patch detected!  Assume -R? [n] y
matt@matt-work:~/fiogit/fio$


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 19:18                 ` Matthew Eaton
@ 2014-02-11 19:29                   ` Jens Axboe
  2014-02-11 20:52                     ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 19:29 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio



On 02/11/2014 12:18 PM, Matthew Eaton wrote:
>> The below should be a bit better.
>
> I made a fresh git clone and tried to apply patch2 but got some errors.
>
> matt@matt-work:~/fiogit/fio$ patch -p1 -l --dry-run < /home/matt/patch2
> checking file backend.c
> Reversed (or previously applied) patch detected!  Assume -R? [n] y
> checking file file.h
> Reversed (or previously applied) patch detected!  Assume -R? [n] y
> checking file filesetup.c
> Hunk #2 FAILED at 734.
> Hunk #3 succeeded at 802 (offset 16 lines).
> Hunk #4 FAILED at 812.
> Hunk #5 succeeded at 1435 with fuzz 2 (offset 35 lines).
> 2 out of 5 hunks FAILED
> checking file io_u.c
> Reversed (or previously applied) patch detected!  Assume -R? [n] y
> matt@matt-work:~/fiogit/fio$

Just run current -git, I applied the patch.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 19:29                   ` Jens Axboe
@ 2014-02-11 20:52                     ` Matthew Eaton
  2014-02-11 21:21                       ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-11 20:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

> Just run current -git, I applied the patch.

Ok done, but write io is still 1000 MB.  I checked the size of each
test file and they are all the same size, 21474836 bytes.  Not sure if
the patch was supposed to make a difference here but just fyi.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 20:52                     ` Matthew Eaton
@ 2014-02-11 21:21                       ` Jens Axboe
  2014-02-11 21:38                         ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 21:21 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio



On 02/11/2014 01:52 PM, Matthew Eaton wrote:
>> Just run current -git, I applied the patch.
>
> Ok done, but write io is still 1000 MB.  I checked the size of each
> test file and they are all the same size, 21474836 bytes.  Not sure if
> the patch was supposed to make a difference here but just fyi.

Try and delete the output files and re-run it.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 21:21                       ` Jens Axboe
@ 2014-02-11 21:38                         ` Matthew Eaton
  2014-02-11 21:42                           ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-11 21:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

> Try and delete the output files and re-run it.

Ok, but got the same result.  Here is the output.

matt@matt-work:~/fiogit/fio$ rm -r ~/temp/*
matt@matt-work:~/fiogit/fio$ ls -l ~/temp
total 0
matt@matt-work:~/fiogit/fio$ ./fio ~/fio.job
job: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
fio-2.1.4-51-geb0c7
Starting 1 process
job: Laying out IO file(s) (50 file(s) / 1023MB)
Jobs: 1 (f=10)
job: (groupid=0, jobs=1): err= 0: pid=18777: Tue Feb 11 13:37:19 2014
  write: io=1000.0MB, bw=562947KB/s, iops=549, runt=  1819msec
    clat (usec): min=331, max=18299, avg=1790.14, stdev=3952.06
     lat (usec): min=342, max=18310, avg=1804.84, stdev=3952.43
    clat percentiles (usec):
     |  1.00th=[  338],  5.00th=[  346], 10.00th=[  354], 20.00th=[  374],
     | 30.00th=[  398], 40.00th=[  434], 50.00th=[  442], 60.00th=[  482],
     | 70.00th=[  644], 80.00th=[  660], 90.00th=[ 9408], 95.00th=[13632],
     | 99.00th=[16512], 99.50th=[17792], 99.90th=[18048], 99.95th=[18304],
     | 99.99th=[18304]
    bw (KB  /s): min=387072, max=1038307, per=100.00%, avg=606881.00, stdev=373648.33
    lat (usec) : 500=61.20%, 750=28.70%
    lat (msec) : 10=0.20%, 20=9.90%
  cpu          : usr=2.20%, sys=25.91%, ctx=261, majf=0, minf=33
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1000/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=1000.0MB, aggrb=562946KB/s, minb=562946KB/s, maxb=562946KB/s, mint=1819msec, maxt=1819msec

Disk stats (read/write):
  sda: ios=0/1005, merge=0/0, ticks=0/191632, in_queue=207380, util=83.30%
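
As a sanity check on the summary above, the reported bandwidth and IOPS follow from the io= and runt= figures. A minimal sketch in Python; the function name is hypothetical, and fio's own integer rounding may differ slightly from this floating-point version:

```python
# Recompute bandwidth and IOPS from the figures in the fio summary.
# `rates` is a hypothetical helper, not part of fio.

def rates(io_kib, n_ios, runtime_ms):
    """Return (KiB per second, I/Os per second) for a finished run."""
    seconds = runtime_ms / 1000
    return io_kib / seconds, n_ios / seconds

# io=1000.0MB, issued w=1000, runt=1819msec
bw_kib_s, iops = rates(io_kib=1000 * 1024, n_ios=1000, runtime_ms=1819)

print(f"bandwidth: {bw_kib_s:.2f} KiB/s")
print(f"iops:      {iops:.2f}")
```

Truncated to integers, these values line up with aggrb=562946KB/s and iops=549 in the output.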


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 21:38                         ` Matthew Eaton
@ 2014-02-11 21:42                           ` Jens Axboe
  2014-02-12  0:01                             ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-11 21:42 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio

On 2014-02-11 14:38, Matthew Eaton wrote:
>> Try and delete the output files and re-run it.
>
> Ok, but got the same result.  Here is the output.
>
> matt@matt-work:~/fiogit/fio$ rm -r ~/temp/*
> matt@matt-work:~/fiogit/fio$ ls -l ~/temp
> total 0
> matt@matt-work:~/fiogit/fio$ ./fio ~/fio.job
> job: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=1
> fio-2.1.4-51-geb0c7
> Starting 1 process
> job: Laying out IO file(s) (50 file(s) / 1023MB)
> Jobs: 1 (f=10)
> job: (groupid=0, jobs=1): err= 0: pid=18777: Tue Feb 11 13:37:19 2014
>    write: io=1000.0MB, bw=562947KB/s, iops=549, runt=  1819msec
>      clat (usec): min=331, max=18299, avg=1790.14, stdev=3952.06
>       lat (usec): min=342, max=18310, avg=1804.84, stdev=3952.43
>      clat percentiles (usec):
>       |  1.00th=[  338],  5.00th=[  346], 10.00th=[  354], 20.00th=[  374],
>       | 30.00th=[  398], 40.00th=[  434], 50.00th=[  442], 60.00th=[  482],
>       | 70.00th=[  644], 80.00th=[  660], 90.00th=[ 9408], 95.00th=[13632],
>       | 99.00th=[16512], 99.50th=[17792], 99.90th=[18048], 99.95th=[18304],
>       | 99.99th=[18304]
>      bw (KB  /s): min=387072, max=1038307, per=100.00%, avg=606881.00,
> stdev=373648.33
>      lat (usec) : 500=61.20%, 750=28.70%
>      lat (msec) : 10=0.20%, 20=9.90%
>    cpu          : usr=2.20%, sys=25.91%, ctx=261, majf=0, minf=33
>    IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       issued    : total=r=0/w=1000/d=0, short=r=0/w=0/d=0
>       latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>    WRITE: io=1000.0MB, aggrb=562946KB/s, minb=562946KB/s,
> maxb=562946KB/s, mint=1819msec, maxt=1819msec
>
> Disk stats (read/write):
>    sda: ios=0/1005, merge=0/0, ticks=0/191632, in_queue=207380, util=83.30%

Weird, this is what I get with your config file:

[...]
Run status group 0 (all jobs):
   WRITE: io=1023.9MB, aggrb=130026KB/s, minb=130026KB/s, 
maxb=130026KB/s, mint=8063msec, maxt=8063msec

and each file in temp/ is 21474836, which should give us 1023.99MB of IO.
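
The arithmetic behind that figure can be sketched in Python. The sketch also shows one possible reading of the 1000 MB result, which is an assumption and not something confirmed in this thread: with bs=1M, only 20 whole blocks fit in each 21474836-byte file, and 20 blocks across 50 files is exactly 1000 writes. The helper name is hypothetical.

```python
# Total bytes laid out versus bytes covered by whole 1 MiB blocks.
# `layout_totals` is a hypothetical helper, not part of fio.

MIB = 1024 * 1024

def layout_totals(file_size, nr_files, block_size):
    """Return (total bytes across all files, count of whole blocks that fit)."""
    total_bytes = file_size * nr_files
    full_blocks = (file_size // block_size) * nr_files
    return total_bytes, full_blocks

# 50 files of 21474836 bytes each, written with bs=1M
total_bytes, full_blocks = layout_totals(21474836, 50, MIB)

print(f"total laid out:    {total_bytes} bytes ({total_bytes / MIB:.5f} MiB)")
print(f"whole 1 MiB blocks: {full_blocks}")
```

If the trailing partial block of each file is skipped, the run issues 1000 writes of 1 MiB; if it is written, the total approaches 1024 MiB.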

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-11 15:38                   ` Jens Axboe
@ 2014-02-11 22:51                     ` Sitsofe Wheeler
  2014-02-12  6:32                       ` Sitsofe Wheeler
  0 siblings, 1 reply; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-11 22:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Elliott, Robert (Server Storage), Bruce Cran, fio

I can't tell if this is just a gdb quirk because I haven't hand built
winpthreads but:

Program received signal SIGSEGV, Segmentation fault.
0x0043e1de in pthread_mutex_unlock (m=0x790000) at
/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
(gdb) list
Line number 392 out of range;
/usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c has 228
lines.

As Elliott mentioned the windowsaio engine doesn't exhibit this problem
(but perhaps it causes different thread scheduling?)...

On Tue, Feb 11, 2014 at 08:38:46AM -0700, Jens Axboe wrote:
> Interesting. The mutex issue should be fixed, I'm puzzled why it isn't.
> And especially if the sync ioengine has something to do with it. Can
> either of you dump the source around:
> 
>   at
>   /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
> 
> perhaps that will clear things up a bit more?
> 
> On Tue, Feb 11 2014, Elliott, Robert (Server Storage) wrote:
> > That specific command line does also crash on my Windows 2008 R2 system.  It does not crash if I drop --ioengine=sync.
> > 
> > > -----Original Message-----
> > > From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
> > > Behalf Of Sitsofe Wheeler
> > > 
> > > $ gdb --args ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-11 21:42                           ` Jens Axboe
@ 2014-02-12  0:01                             ` Matthew Eaton
  2014-02-12  1:46                               ` Jens Axboe
  0 siblings, 1 reply; 36+ messages in thread
From: Matthew Eaton @ 2014-02-12  0:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

> Weird, this is what I get with your config file:
>
> [...]
>
> Run status group 0 (all jobs):
>   WRITE: io=1023.9MB, aggrb=130026KB/s, minb=130026KB/s, maxb=130026KB/s,
> mint=8063msec, maxt=8063msec
>
> and each file in temp/ is 21474836, which should give us 1023.99MB of IO.
>

Well, that is weird.  I tried on two systems now and both report 1000 MB.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-12  0:01                             ` Matthew Eaton
@ 2014-02-12  1:46                               ` Jens Axboe
  2014-02-12  2:30                                 ` Matthew Eaton
  0 siblings, 1 reply; 36+ messages in thread
From: Jens Axboe @ 2014-02-12  1:46 UTC (permalink / raw)
  To: Matthew Eaton; +Cc: fio



On 02/11/2014 05:01 PM, Matthew Eaton wrote:
>> Weird, this is what I get with your config file:
>>
>> [...]
>>
>> Run status group 0 (all jobs):
>>    WRITE: io=1023.9MB, aggrb=130026KB/s, minb=130026KB/s, maxb=130026KB/s,
>> mint=8063msec, maxt=8063msec
>>
>> and each file in temp/ is 21474836, which should give us 1023.99MB of IO.
>>
>
> Well, that is weird.  I tried on two systems now and both report 1000 MB.

That is weird! I'll look into it. For now, 2.1.5 will go out with that 
behaviour. It's an improvement over 2.1.4 in any case, this last issue 
is minor compared to the openfiles= one.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Fio 2.1.5 release upcoming
  2014-02-12  1:46                               ` Jens Axboe
@ 2014-02-12  2:30                                 ` Matthew Eaton
  0 siblings, 0 replies; 36+ messages in thread
From: Matthew Eaton @ 2014-02-12  2:30 UTC (permalink / raw)
  To: Jens Axboe; +Cc: fio

> That is weird! I'll look into it. For now, 2.1.5 will go out with that
> behaviour. It's an improvement over 2.1.4 in any case, this last issue is
> minor compared to the openfiles= one.

Agreed, thanks again for your help!


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Mutex destruction, invalid memory accesses, leaks
  2014-02-11 22:51                     ` Sitsofe Wheeler
@ 2014-02-12  6:32                       ` Sitsofe Wheeler
  0 siblings, 0 replies; 36+ messages in thread
From: Sitsofe Wheeler @ 2014-02-12  6:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Elliott, Robert (Server Storage), Bruce Cran, fio

OK linking against a hand built winpthreads (with -O1 in CFLAGS and
LDFLAGS):

$ gdb --args ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
GNU gdb (GDB) 7.6.50.20130728-cvs (cygwin-special)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
..
Reading symbols from /home/Sitsofe Wheeler/fio/fio.exe...done.
(gdb) ru
Starting program: /home/Sitsofe Wheeler/fio/fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname
[New Thread 1224.0xeb8]
[New Thread 1224.0xf8c]
fio: set all debug options
io       1224  load ioengine windowsaio
parse    1224  handle_option=filename, ptr=fiojob
parse    1224  __handle_option=filename, type=5, ptr=fiojob
file     1224  add file fiojob
file     1224  resize file array to 1 files
file     1224  file 02130008 "fiojob" added at 0
parse    1224  handle_option=thread, ptr=(null)
parse    1224  __handle_option=thread, type=10, ptr=(null)
parse    1224    ret=0, out=1
parse    1224  handle_option=size, ptr=512
parse    1224  __handle_option=size, type=3, ptr=512
parse    1224    ret=0, out=512
parse    1224  handle_option=rw, ptr=read
parse    1224  __handle_option=rw, type=1, ptr=read
parse    1224  handle_option=bs, ptr=512
parse    1224  __handle_option=bs, type=7, ptr=512
parse    1224    ret=0, out=512
parse    1224  handle_option=ioengine, ptr=sync
parse    1224  __handle_option=ioengine, type=5, ptr=sync
io       1224  free ioengine windowsaio
io       1224  load ioengine sync
parse    1224  handle_option=verify_pattern, ptr=0xdeadbeef
parse    1224  __handle_option=verify_pattern, type=1, ptr=0xdeadbeef
file     1224  dup files: 1
io       1224  load ioengine sync
parse    1224  handle_option=name, ptr=fiojobname
parse    1224  __handle_option=name, type=5, ptr=fiojobname
fiojobname: (g=0): rw=read, bs=512-512/512-512/512-512, ioengine=sync, iodepth=1
parse    1224  free options
fio-2.1.4-48-gea66
time     1224  cycles[0]=2593
time     1224  cycles[1]=2593
time     1224  cycles[2]=2593
time     1224  cycles[3]=2592
time     1224  cycles[4]=2593
time     1224  cycles[5]=2758
time     1224  cycles[6]=2594
time     1224  cycles[7]=2594
time     1224  cycles[8]=2593
time     1224  cycles[9]=2598
time     1224  cycles[10]=2592
time     1224  cycles[11]=2593
time     1224  cycles[12]=2593
time     1224  cycles[13]=2593
time     1224  cycles[14]=2593
time     1224  cycles[15]=3518
time     1224  cycles[16]=2593
time     1224  cycles[17]=2593
time     1224  cycles[18]=2593
time     1224  cycles[19]=2593
time     1224  cycles[20]=2593
time     1224  cycles[21]=2593
time     1224  cycles[22]=2593
time     1224  cycles[23]=2593
time     1224  cycles[24]=2593
time     1224  cycles[25]=2593
time     1224  cycles[26]=2593
time     1224  cycles[27]=2593
time     1224  cycles[28]=2593
time     1224  cycles[29]=2593
time     1224  cycles[30]=2592
time     1224  cycles[31]=2593
time     1224  cycles[32]=2593
time     1224  cycles[33]=2593
time     1224  cycles[34]=2593
time     1224  cycles[35]=2593
time     1224  cycles[36]=2593
time     1224  cycles[37]=2593
time     1224  cycles[38]=2593
time     1224  cycles[39]=2593
time     1224  cycles[40]=2593
time     1224  cycles[41]=2593
time     1224  cycles[42]=2593
time     1224  cycles[43]=2593
time     1224  cycles[44]=2593
time     1224  cycles[45]=2593
time     1224  cycles[46]=2593
time     1224  cycles[47]=2593
time     1224  cycles[48]=2593
time     1224  cycles[49]=2593
time     1224  avg: 2593
time     1224  mean=2615.262000, S=26.484294
time     1224  inv_cycles_per_usec=6470
mutex    1224  wait on startup_mutex
mutex    1224  done waiting on startup_mutex
Starting 1 thread
[New Thread 1224.0xc40]

Program received signal SIGSEGV, Segmentation fault.
mutex_unref (m=m@entry=0x830000, r=r@entry=0) at src/mutex.c:42
42        mutex_t *m_ = (mutex_t *)*m;
(gdb) bt
#0  mutex_unref (m=m@entry=0x830000, r=r@entry=0) at src/mutex.c:42
#1  0x00438e3f in pthread_mutex_unlock (m=m@entry=0x830000) at src/mutex.c:392
#2  0x004188a4 in fio_mutex_up (mutex=0x830000) at mutex.c:153
#3  0x0042f5b4 in run_threads () at backend.c:1885
#4  0x0042f790 in fio_backend () at backend.c:1998
#5  0x00438afe in main (argc=10, argv=0x3d28a0, envp=0x3d1998) at fio.c:50
(gdb) print *m
Cannot access memory at address 0x830000
(gdb) list
37      static pthread_spinlock_t mutex_global_static = PTHREAD_SPINLOCK_INITIALIZER;
38
39      static WINPTHREADS_ATTRIBUTE((noinline)) int
40      mutex_unref (pthread_mutex_t *m, int r)
41      {
42        mutex_t *m_ = (mutex_t *)*m;
43        pthread_spin_lock (&mutex_global);
44      #ifdef WINPTHREAD_DBG
45        assert((m_->valid == LIFE_MUTEX) && (m_->busy > 0));
46      #endif
(gdb) up
#1  0x00438e3f in pthread_mutex_unlock (m=m@entry=0x830000) at src/mutex.c:392
392       return mutex_unref(m,0);
(gdb) list pthread_mutex_unlock
334       r = pthread_mutex_lock_intern(m, (ct > t ? 0 : (t - ct)));
335       return  r;
336     }
337
338     int pthread_mutex_unlock(pthread_mutex_t *m)
339     {
340       mutex_t *_m;
341       int r = mutex_ref_unlock(m);
342
343       if(r) {
(gdb)
344     #if 0
345         printf("thread %d, la pool, no user unset in mutex %p\n", GetCurrentThreadId(), m);
346     #endif
347         return r;
348       }
349
350       _m = (mutex_t *)*m;
351
352       if (_m->type == PTHREAD_MUTEX_NORMAL)
353       {
(gdb)
354         if (!COND_LOCKED(_m))
355           {
356     #if 0
357               printf("thread %d, mutex %p never locked, actually :p\n", GetCurrentThreadId(), m);
358     #endif
359               return mutex_unref(m, EPERM);
360           }
361       }
362       else if (!COND_LOCKED(_m) || !COND_OWNER(_m)) {
363     #if 0
(gdb)
364         printf("thread %d, mutex %p never locked or not owner, actually :p\n", GetCurrentThreadId(), m);
365     #endif
366         return mutex_unref(m,EPERM);
367       }
368
369       if (_m->type == PTHREAD_MUTEX_RECURSIVE)
370       {
371         if(InterlockedDecrement(&_m->count)) {
372     #if 0
373               printf("thread %d, mutex %p decreasing recursive\n", GetCurrentThreadId(), m);
(gdb)
374     #endif
375               return mutex_unref(m,0);
376             }
377       }
378     #if 0
379       printf("thread %d,unsetting owner of mutex %p\n", GetCurrentThreadId(), m);
380     #endif
381       UNSET_OWNER(_m);
382
383       if (_m->h != NULL && !ReleaseSemaphore(_m->h, 1, NULL)) {
(gdb)
384             SET_OWNER(_m);
385     #if 0
386             printf("Error, not released! thread %d, setting owner of mutex m\n", GetCurrentThreadId(), m);
387     #endif
388         /* restore our own bookkeeping */
389         return mutex_unref(m,EPERM);
390       }
391
392       return mutex_unref(m,0);
393     }
(gdb)
394
395     static WINPTHREADS_ATTRIBUTE((noinline)) int
396     _mutex_trylock(pthread_mutex_t *m)
397     {
398       int r = 0;
399       mutex_t *_m = (mutex_t *)*m;
400
401       if (_m->type != PTHREAD_MUTEX_NORMAL)
402       {
403         if (COND_LOCKED(_m))
(gdb) info locals
_m = 0x3d45d0
r = <optimized out>
(gdb) print m
$1 = (pthread_mutex_t *) 0x830000

On Tue, Feb 11, 2014 at 10:51:49PM +0000, Sitsofe Wheeler wrote:
> I can't tell if this is just a gdb quirk because I haven't hand built
> winpthreads but:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0043e1de in pthread_mutex_unlock (m=0x790000) at
> /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
> (gdb) list
> Line number 392 out of range;
> /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c has 228
> lines.
> 
> As Elliott mentioned the windowsaio engine doesn't exhibit this problem
> (but perhaps it causes different thread scheduling?)...
> 
> On Tue, Feb 11, 2014 at 08:38:46AM -0700, Jens Axboe wrote:
> > Interesting. The mutex issue should be fixed, I'm puzzled why it isn't.
> > And especially if the sync ioengine has something to do with it. Can
> > either of you dump the source around:
> > 
> >   at
> >   /usr/src/debug/mingw64-i686-winpthreads-3.1.0-1/src/mutex.c:392
> > 
> > perhaps that will clear things up a bit more?
> > 
> > On Tue, Feb 11 2014, Elliott, Robert (Server Storage) wrote:
> > > That specific command line does also crash on my Windows 2008 R2 system.  It does not crash if I drop --ioengine=sync.
> > > 
> > > > -----Original Message-----
> > > > From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
> > > > Behalf Of Sitsofe Wheeler
> > > > 
> > > > $ gdb --args ./fio.exe --debug=all --filename=fiojob --thread --size=512 --rw=read --bs=512 --ioengine=sync --verify_pattern=0xdeadbeef --name=fiojobname

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2014-02-12  6:32 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-06 19:21 Fio 2.1.5 release upcoming Jens Axboe
2014-02-07  3:44 ` Mutex destruction, invalid memory accesses, leaks Sitsofe Wheeler
2014-02-07 16:11   ` Jens Axboe
2014-02-09 19:50     ` Sitsofe Wheeler
2014-02-09 20:49       ` Jens Axboe
2014-02-10  9:55         ` Sitsofe Wheeler
2014-02-10 19:25       ` Bruce Cran
2014-02-10 20:22         ` Sitsofe Wheeler
2014-02-10 20:48           ` Jens Axboe
2014-02-10 20:56           ` Jens Axboe
2014-02-11  0:12             ` Elliott, Robert (Server Storage)
2014-02-11  7:07               ` Sitsofe Wheeler
2014-02-11 15:30                 ` Elliott, Robert (Server Storage)
2014-02-11 15:38                   ` Jens Axboe
2014-02-11 22:51                     ` Sitsofe Wheeler
2014-02-12  6:32                       ` Sitsofe Wheeler
2014-02-08 19:52 ` Fio 2.1.5 release upcoming Matthew Eaton
2014-02-09 20:57   ` Jens Axboe
2014-02-10  0:26     ` Matthew Eaton
2014-02-10 22:14       ` Jens Axboe
2014-02-10 23:11         ` Matthew Eaton
2014-02-10 23:15           ` Jens Axboe
2014-02-11  0:00             ` Matthew Eaton
2014-02-11 15:09               ` Jens Axboe
2014-02-11 15:27               ` Jens Axboe
2014-02-11 19:18                 ` Matthew Eaton
2014-02-11 19:29                   ` Jens Axboe
2014-02-11 20:52                     ` Matthew Eaton
2014-02-11 21:21                       ` Jens Axboe
2014-02-11 21:38                         ` Matthew Eaton
2014-02-11 21:42                           ` Jens Axboe
2014-02-12  0:01                             ` Matthew Eaton
2014-02-12  1:46                               ` Jens Axboe
2014-02-12  2:30                                 ` Matthew Eaton
2014-02-11 11:22 ` Paul Alcorn
2014-02-11 15:39   ` 'Jens Axboe'
