linux-fsdevel.vger.kernel.org archive mirror
* Bio read race with different ranges inside the same page?
@ 2021-02-26  7:02 Qu Wenruo
  2021-03-01  7:55 ` Qu Wenruo
From: Qu Wenruo @ 2021-02-26  7:02 UTC
  To: Linux FS Devel, linux-btrfs

Hi,

Is it possible for multiple ranges of the same page to be submitted to
one or more bios, and for such ranges to race with each other and cause
data corruption?

I've recently been trying to add subpage read/write support for btrfs,
and noticed a strange false data corruption.

E.g., there is a 64K page to be read from disk:

0	16K	32K	48K	64K
|///////|       |///////|       |

Where |///| means data which needs to be read from disk,
and |   | means a hole, for which we just zero the range.

Currently the code will:

- Submit bio for [0, 16K)
- Zero [16K, 32K)
- Submit bio for [32K, 48K)
- Zero [48K, 64K)

Between bio submission and zeroing, there is no need to wait for the
submitted bio to finish, as I assume the submitted bio won't touch any
range of the page except the one specified.
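The plan above can be modelled in user space to confirm that the bio
ranges and the zeroed ranges never overlap. This is only a sketch of the
layout in the diagram, not btrfs code; `LAYOUT`, `plan_read()` and
`overlaps()` are hypothetical names made up for the illustration.

```python
K = 1024
PAGE_SIZE = 64 * K

# Layout from the diagram, as half-open byte ranges: (start, end, kind).
LAYOUT = [
    (0, 16 * K, "data"),
    (16 * K, 32 * K, "hole"),
    (32 * K, 48 * K, "data"),
    (48 * K, 64 * K, "hole"),
]


def plan_read(layout):
    """Split a page layout into ranges read by bios and ranges zeroed."""
    bios, zeroed = [], []
    for start, end, kind in layout:
        if kind == "data":
            bios.append((start, end))
        else:
            zeroed.append((start, end))
    return bios, zeroed


def overlaps(a, b):
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


bios, zeroed = plan_read(LAYOUT)
# Every (bio range, zeroed range) pair that would touch the same bytes.
conflicts = [(b, z) for b in bios for z in zeroed if overlaps(b, z)]
print("bios:", bios)
print("zeroed:", zeroed)
print("conflicts:", conflicts)
```

If each operation really stays inside its own range, the conflict list
is empty, so submission order alone should not matter.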

But randomly (not reliably reproducible), btrfs csum verification at
endio time reports that the data read from disk does not match its csum.

However, the following observations suggest that something is wrong in
the read path:
- On-disk data matches the csum

- If the read path is fully serialized, the error disappears
   E.g. if I change the read path to:
   - Submit bio for [0, 16K)
   - Wait for bio for [0, 16K) to finish
   - Zero [16K, 32K)
   - Submit bio for [32K, 48K)
   - Wait for bio for [32K, 48K) to finish
   - Zero [48K, 64K)
   then the problem completely disappears.
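The serialized ordering can be sketched in user space as below. This is
only a model under stated assumptions: a worker thread stands in for the
device, `read_range()` stands in for a bio, and all names are
hypothetical rather than kernel interfaces.

```python
from concurrent.futures import ThreadPoolExecutor

K = 1024
PAGE_SIZE = 64 * K


def read_range(page, start, end, fill):
    """Stand-in for a bio: fills only [start, end) of the page."""
    page[start:end] = bytes([fill]) * (end - start)


def zero_range(page, start, end):
    """Zero [start, end) of the page, as done for a hole."""
    page[start:end] = bytes(end - start)


def read_page_serialized():
    """Submit, wait, then zero, for each data range in turn."""
    page = bytearray(b"\xff" * PAGE_SIZE)  # 0xff marks untouched bytes
    log = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for start, end, fill in ((0, 16 * K, 0xAA), (32 * K, 48 * K, 0xBB)):
            future = pool.submit(read_range, page, start, end, fill)
            log.append("submit")
            future.result()  # wait for this "bio" before touching the page
            log.append("wait")
            zero_range(page, end, end + 16 * K)  # the hole after the data
            log.append("zero")
    return page, log


page, log = read_page_serialized()
print(log)
```

Because every wait happens before the next zeroing step, no two
operations on the page are ever in flight at the same time in this
model.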

So it looks like the read path's hole zeroing and bio submission are
racing with each other?

Shouldn't a bio only touch the range specified, and nothing else?

Or is there something I missed, like an off-by-one bug?

Thanks,
Qu


* Re: Bio read race with different ranges inside the same page?
  2021-02-26  7:02 Bio read race with different ranges inside the same page? Qu Wenruo
@ 2021-03-01  7:55 ` Qu Wenruo
From: Qu Wenruo @ 2021-03-01  7:55 UTC
  To: Linux FS Devel, linux-btrfs



On 2021/2/26 3:02 PM, Qu Wenruo wrote:
> Hi,
>
> Is it possible for multiple ranges of the same page to be submitted to
> one or more bios, and for such ranges to race with each other and cause
> data corruption?
>
> I've recently been trying to add subpage read/write support for btrfs,
> and noticed a strange false data corruption.
>
> E.g., there is a 64K page to be read from disk:
>
> 0    16K    32K    48K    64K
> |///////|       |///////|       |
>
> Where |///| means data which needs to be read from disk,
> and |   | means a hole, for which we just zero the range.
>
> Currently the code will:
>
> - Submit bio for [0, 16K)
> - Zero [16K, 32K)
> - Submit bio for [32K, 48K)
> - Zero [48K, 64K)
>
> Between bio submission and zeroing, there is no need to wait for the
> submitted bio to finish, as I assume the submitted bio won't touch any
> range of the page except the one specified.
>
> But randomly (not reliably reproducible), btrfs csum verification at
> endio time reports that the data read from disk does not match its csum.
>
> However, the following observations suggest that something is wrong in
> the read path:
> - On-disk data matches the csum
>
> - If the read path is fully serialized, the error disappears
>    E.g. if I change the read path to:
>    - Submit bio for [0, 16K)
>    - Wait for bio for [0, 16K) to finish
>    - Zero [16K, 32K)
>    - Submit bio for [32K, 48K)
>    - Wait for bio for [32K, 48K) to finish
>    - Zero [48K, 64K)
>    then the problem completely disappears.

Never mind, the bio read part is doing what we expect; the bios don't
touch anything beyond the range specified.

The problem is the zeroing done in the btrfs endio handler,
end_bio_extent_readpage(), which always zeroes up to the page end.
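A minimal user-space model of that failure mode follows. It is a sketch
under loud assumptions, not the btrfs code: `zero_start` is a
hypothetical offset (the only fact taken from above is that the zeroing
runs to the page end), `zlib.crc32` is a stand-in for the btrfs csum,
and clamping the zeroing to the bio's own range is just one way to
confine it, not necessarily the fix that gets applied.

```python
import zlib

K = 1024
PAGE_SIZE = 64 * K


def csum(buf):
    """Stand-in checksum; btrfs uses its own csum algorithms."""
    return zlib.crc32(buf)


def complete_read(page, start, end, disk_data):
    """Model a bio completing: the device fills only [start, end)."""
    page[start:end] = disk_data


def endio_zero(page, zero_start, zero_end):
    """Model the zeroing done by an endio handler on [zero_start, zero_end)."""
    page[zero_start:zero_end] = bytes(max(0, zero_end - zero_start))


def second_range_csum_ok(zero_to_page_end):
    """Return True if [32K, 48K) still matches its csum at verification."""
    page = bytearray(PAGE_SIZE)
    disk_a = b"\xaa" * (16 * K)
    disk_b = b"\xbb" * (16 * K)
    expected_b = csum(disk_b)

    # Both bios were submitted, and the device has filled both ranges.
    complete_read(page, 0, 16 * K, disk_a)
    complete_read(page, 32 * K, 48 * K, disk_b)

    # endio for [0, 16K) runs first and zeroes from a hypothetical offset.
    zero_start = 16 * K  # assumption: not specified in this thread
    zero_end = PAGE_SIZE if zero_to_page_end else 16 * K
    endio_zero(page, zero_start, zero_end)

    # endio for [32K, 48K) runs next and verifies what is now in the page.
    return csum(bytes(page[32 * K:48 * K])) == expected_b


print("zero to page end:", second_range_csum_ok(True))
print("zero clamped to own range:", second_range_csum_ok(False))
```

In this model the on-disk data is fine in both cases; only the page
contents seen at verification time differ, which matches a csum error
that is not a real corruption and that disappears once the reads are
serialized.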

Thanks,
Qu

>
> So it looks like the read path's hole zeroing and bio submission are
> racing with each other?
>
> Shouldn't a bio only touch the range specified, and nothing else?
>
> Or is there something I missed, like an off-by-one bug?
>
> Thanks,
> Qu
