All of lore.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@bugzilla.kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 208827] [fio io_uring] io_uring write data crc32c verify failed
Date: Tue, 11 Aug 2020 01:15:34 +0000	[thread overview]
Message-ID: <bug-208827-201763-kjEiuEIBSj@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-208827-201763@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=208827

--- Comment #7 from Jens Axboe (axboe@kernel.dk) ---
On 8/10/20 3:08 AM, Dave Chinner wrote:
> On Mon, Aug 10, 2020 at 05:08:07PM +1000, Dave Chinner wrote:
>> [cc Jens]
>>
>> [Jens, data corruption w/ io_uring and simple fio reproducer. see
>> the bz link below.]
>>
>> On Mon, Aug 10, 2020 at 01:56:05PM +1000, Dave Chinner wrote:
>>> On Mon, Aug 10, 2020 at 10:09:32AM +1000, Dave Chinner wrote:
>>>> On Fri, Aug 07, 2020 at 03:12:03AM +0000,
>>>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>>>> --- Comment #1 from Dave Chinner (david@fromorbit.com) ---
>>>>> On Thu, Aug 06, 2020 at 04:57:58AM +0000,
>>>>> bugzilla-daemon@bugzilla.kernel.org
>>>>> wrote:
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=208827
>>>>>>
>>>>>>             Bug ID: 208827
>>>>>>            Summary: [fio io_uring] io_uring write data crc32c verify
>>>>>>                     failed
>>>>>>            Product: File System
>>>>>>            Version: 2.5
>>>>>>     Kernel Version: xfs-linux xfs-5.9-merge-7 + v5.8-rc4
>>>>
>>>> FWIW, I can reproduce this with a vanilla 5.8 release kernel,
>>>> so this isn't related to contents of the XFS dev tree at all...
>>>>
>>>> In fact, this bug isn't a recent regression. AFAICT, it was
>>>> introduced between in 5.4 and 5.5 - 5.4 did not reproduce, 5.5 did
>>>> reproduce. More info once I've finished bisecting it....
>>>
>>> f67676d160c6ee2ed82917fadfed6d29cab8237c is the first bad commit
>>> commit f67676d160c6ee2ed82917fadfed6d29cab8237c
>>> Author: Jens Axboe <axboe@kernel.dk>
>>> Date:   Mon Dec 2 11:03:47 2019 -0700
>>>
>>>     io_uring: ensure async punted read/write requests copy iovec
>>
>> ....
>>
>> Ok, I went back to vanilla 5.8 to continue debugging and adding
>> tracepoints, and it's proving strangely difficult to reproduce now.
> 
> Which turns out to be caused by a tracepoint I inserted to try to
> narrow down if this was an invalidation race. I put this in
> invalidate_complete_page:
> 
> 
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -257,8 +257,11 @@ int invalidate_inode_page(struct page *page)
>         struct address_space *mapping = page_mapping(page);
>         if (!mapping)
>                 return 0;
> -       if (PageDirty(page) || PageWriteback(page))
> +       if (PageDirty(page) || PageWriteback(page)) {
> +               trace_printk("ino 0x%lx page %p, offset 0x%lx\n",
> +                       mapping->host->i_ino, page, page->index * PAGE_SIZE);
>                 return 0;
> +       }
>         if (page_mapped(page))
>                 return 0;
>         return invalidate_complete_page(mapping, page);
> 
> 
> And that alone, without even enabling tracepoints, made the
> corruption go completely away. So I suspect a page state race
> condition and look at POSIX_FADV_DONTNEED, which fio is issuing
> before running it's verification reads. First thing that does:
> 
>       if (!inode_write_congested(mapping->host))
>               __filemap_fdatawrite_range(mapping, offset, endbyte,
>                                          WB_SYNC_NONE);
> 
> It starts async writeback of the dirty pages. There's 256MB of dirty
> pages on these inodes, and iomap tracing indicates the entire 256MB
> immediately runs through the trace_iomap_writepage() tracepoint.
> i.e. every page goes Dirty -> Writeback and is submitted for async
> IO.
> 
> Then the POSIX_FADV_DONTNEED code goes and runs
> invalidate_mapping_pages(), which ends up try-locking each page and
> then running invalidate_inode_page() on the page, which is where the
> trace debug I put in on pages under writeback gets hit. So if
> changing the invalidation code for pages under writeback makes the
> problem go away, then stopping invalidate_mapping_pages() from
> racing with page writeback should make the problem go away, too.
> 
> This does indeed make the corruption go away:
> 
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -109,9 +109,8 @@ int generic_fadvise(struct file *file, loff_t offset,
> loff_t len, int advice)
>         case POSIX_FADV_NOREUSE:
>                 break;
>         case POSIX_FADV_DONTNEED:
>                 if (!inode_write_congested(mapping->host))
> -                       __filemap_fdatawrite_range(mapping, offset, endbyte,
> -                                                  WB_SYNC_NONE);
> +                       filemap_write_and_wait_range(mapping, offset,
> endbyte);
>  
>                 /*
>                  * First and last FULL page! Partial pages are deliberately
> 
> by making the invalidation wait for the pages to go fully to the
> clean state before starting.
> 
> This, however, only fixes the specific symptom being tripped over
> here.  To further test this, I removed this writeback from
> POSIX_FADV_DONTNEED completely so I could trigger writeback via
> controlled background writeback. And, as I expected, whenever
> background writeback ran to write back these dirty files, the
> verification failures triggered again. It is quite reliable.
> 
> So it looks like there is some kind of writeback completion vs page
> invalidation race condition occurring, but more work is needed to
> isolate it further. I don't know what part the async read plays in
> the corruption yet, because I don't know how we are getting pages in
> the cache where page->index != the file offset stamped in the data.
> That smells of leaking PageUptodate flags...

The async read really isn't doing anything that you could not do with
separate threads. Unfortunately it's not that easy to have multiple
threads working on the same region with fio, or we could've reproduced
it with a job file written to use that instead.

I'll dig a bit here...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

  parent reply	other threads:[~2020-08-11  1:15 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-06  4:57 [Bug 208827] New: [fio io_uring] io_uring write data crc32c verify failed bugzilla-daemon
2020-08-07  2:42 ` Dave Chinner
2020-08-07  3:12 ` [Bug 208827] " bugzilla-daemon
2020-08-10  0:09   ` Dave Chinner
2020-08-10  3:56     ` Dave Chinner
2020-08-10  7:08       ` Dave Chinner
2020-08-10  9:08         ` Dave Chinner
2020-08-11  1:15           ` Jens Axboe
2020-08-11  1:50             ` Jens Axboe
2020-08-11  2:01               ` Jens Axboe
2020-08-11  3:01                 ` Jens Axboe
2020-08-11 20:56                 ` Jeff Moyer
2020-08-11 22:09                   ` Dave Chinner
2020-08-12 15:13                     ` Jens Axboe
2020-08-12 15:24                       ` Jeff Moyer
2020-08-12 15:26                         ` Jens Axboe
2020-08-11  2:00           ` Dave Chinner
2020-08-11  2:19             ` Jens Axboe
2020-08-11  5:53               ` Dave Chinner
2020-08-11  7:05               ` Dave Chinner
2020-08-11 13:10                 ` Jens Axboe
2020-08-11 21:59                   ` Dave Chinner
2020-08-11 23:00                     ` Dave Chinner
2020-08-12 15:19                       ` Jens Axboe
2020-08-11  1:07         ` Jens Axboe
2020-08-10  0:09 ` bugzilla-daemon
2020-08-10  3:56 ` bugzilla-daemon
2020-08-10  7:08 ` bugzilla-daemon
2020-08-10  9:09 ` bugzilla-daemon
2020-08-11  1:07 ` bugzilla-daemon
2020-08-11  1:15 ` bugzilla-daemon [this message]
2020-08-11  1:50 ` bugzilla-daemon
2020-08-11  2:00 ` bugzilla-daemon
2020-08-11  2:01 ` bugzilla-daemon
2020-08-11  2:20 ` bugzilla-daemon
2020-08-11  3:01 ` bugzilla-daemon
2020-08-11  5:53 ` bugzilla-daemon
2020-08-11  7:05 ` bugzilla-daemon
2020-08-11 13:10 ` bugzilla-daemon
2020-08-11 16:16 ` bugzilla-daemon
2020-08-11 20:56 ` bugzilla-daemon
2020-08-11 21:59 ` bugzilla-daemon
2020-08-11 22:09 ` bugzilla-daemon
2020-08-11 23:00 ` bugzilla-daemon
2020-08-12  3:15 ` bugzilla-daemon
2020-08-12 15:14 ` bugzilla-daemon
2020-08-12 15:19 ` bugzilla-daemon
2020-08-12 15:24 ` bugzilla-daemon
2020-08-12 15:26 ` bugzilla-daemon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-208827-201763-kjEiuEIBSj@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@bugzilla.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.