linux-kernel.vger.kernel.org archive mirror
* Re: BUG: aio/direct-io data corruption in 4.7
@ 2018-11-05 15:16 Gregory Shapiro
  2018-11-06  7:28 ` Jack Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Gregory Shapiro @ 2018-11-05 15:16 UTC (permalink / raw)
  To: hch, jnicklin
  Cc: linux-kernel, linux-fsdevel, gregory.shapiro, Gregory Shapiro

Hello, my name is Gregory Shapiro and I am a newbie on this list.
I recently encountered data corruption: the kernel acknowledged a write
(the "io_getevents" system call reported the correct number of bytes),
but the underlying write to disk had failed.
After investigating the problem, I found that it is identical to the
direct-io.c issue described in the thread below:
https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/
Is there a reason the proposed patch was not applied to the kernel?
When can I expect it to be applied?
Thanks,
 Gregory

* BUG: aio/direct-io data corruption in 4.7
@ 2016-09-12 18:38 Jonathan Nicklin
  2016-09-21 14:15 ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Nicklin @ 2016-09-12 18:38 UTC (permalink / raw)
  To: linux-kernel

In 4.7.2, the kernel is acknowledging block writes that have not
completed to disk. To reproduce: create an MD array, run FIO (direct +
libaio), and pull all drives. FIO will continue to run without
receiving I/O errors. I have also reproduced the bug using physical
drives. In this case, only a limited number of I/Os are incorrectly
acknowledged; FIO eventually receives an I/O error after the device
reference is removed.

The root cause of the problem is that dio_complete() does not
correctly propagate I/O errors in the is_async case. Specifically,
generic_write_sync() appears to be overwriting the return status
destined for ki_complete().
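
A minimal user-space sketch of the pattern described above, not the
kernel code itself. All names here (model_write_sync,
model_complete_buggy, model_complete_fixed) are hypothetical stand-ins,
and the "fixed" variant is only one plausible way to preserve the error,
assuming the sync helper returns the byte count it was given when there
is nothing to sync.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the write-sync helper: with no sync work
 * to do, it simply hands back the byte count it was given. */
static long model_write_sync(long count)
{
    return count;
}

/* Buggy pattern: the helper's return value replaces ret
 * unconditionally, so an earlier I/O error is lost. */
long model_complete_buggy(long io_error, long transferred)
{
    assert(io_error <= 0 && transferred >= 0);

    long ret = io_error ? io_error : transferred;

    ret = model_write_sync(transferred);  /* overwrites -EIO */
    return ret;  /* value that would reach the completion callback */
}

/* One possible repair: only call the helper when the I/O succeeded,
 * so a negative status passes through untouched. */
long model_complete_fixed(long io_error, long transferred)
{
    assert(io_error <= 0 && transferred >= 0);

    long ret = io_error ? io_error : transferred;

    if (ret > 0)
        ret = model_write_sync(ret);
    return ret;
}
```

With io_error = -EIO and transferred = 4096, the buggy variant reports
4096 bytes written while the fixed variant reports -EIO, which matches
the symptom of writes being acknowledged after the drives are gone.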

This bug appears to have been introduced by the following commit:

Description: "fs: simplify the generic_write_sync prototype"
Committed: Apr 7, 2016
Hash: e259221763a40403d5bb232209998e8c45804ab8
Affects: 4.7-rc1 - master

I have confirmed a fix for the AIO/Direct-IO failure condition but
have not reviewed the rest of the changes associated with that commit.
If you would like a small patch for direct-io.c, let me know.

Regards,
-Jonathan

Thread overview: 6+ messages
2018-11-05 15:16 BUG: aio/direct-io data corruption in 4.7 Gregory Shapiro
2018-11-06  7:28 ` Jack Wang
2018-11-06 11:31   ` Gregory Shapiro
2018-11-09 15:44     ` Jack Wang
  -- strict thread matches above, loose matches on Subject: below --
2016-09-12 18:38 Jonathan Nicklin
2016-09-21 14:15 ` Christoph Hellwig
