Linux-RDMA Archive on lore.kernel.org
From: Ming Lei <ming.lei@redhat.com>
To: Stephen Rust <srust@blockbridge.com>
Cc: Rob Townley <rob.townley@gmail.com>,
	Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-scsi@vger.kernel.org, martin.petersen@oracle.com,
	target-devel@vger.kernel.org, Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Sagi Grimberg <sagi@grimberg.me>,
	Max Gurtovoy <maxg@mellanox.com>
Subject: Re: Data corruption in kernel 5.1+ with iSER attached ramdisk
Date: Thu, 5 Dec 2019 11:05:40 +0800
Message-ID: <20191205030540.GA20684@ming.t460p> (raw)
In-Reply-To: <CAAFE1bcwcdVuzAG5+x1UNcTaa22bf0tOaT=QOWrTup98sFXxuQ@mail.gmail.com>

On Wed, Dec 04, 2019 at 09:28:43PM -0500, Stephen Rust wrote:
> Hi Ming,
> 
> Thanks for all your help and insight. I really appreciate it.
> 
> > > Presumably non-brd devices, ie: real scsi devices work for these test
> > > cases because they accept un-aligned buffers?
> >
> > Right, not every driver supports such un-aligned buffers.
> 
> Can you please clarify: does the block layer require that it is called
> with 512-byte aligned buffers? If that is the case, would it make
> sense for the block interface (bio_add_page() or other) to reject
> buffers that are not aligned?

Things are a bit complicated; see the following xfs commits:

f8f9ee479439 xfs: add kmem_alloc_io()
d916275aa4dd xfs: get allocation alignment from the buftarg

These apply the request queue's DMA alignment limit, which may be
smaller than 512 bytes. Before this report, xfs should have been the
only known user passing un-aligned buffers.

So we can't add the check in bio_add_page(), where the request queue
may not be available. Also, bio_add_page() is a really hot path, and
people hate adding unnecessary code to this function.

IMO, it is better for every FS and other user of bio_add_page() to
pass 512-byte aligned buffers.
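
To illustrate the difference between the two limits, here is a rough
user-space sketch, not kernel code. The helper name buf_is_aligned() and
the mask values are assumptions made for this example only; the mask is
the alignment minus one (511 for 512-byte sectors, or a smaller value
for a queue's DMA alignment limit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: returns true when both the
 * offset of the data within its page and its length are multiples of
 * (mask + 1).
 */
static bool buf_is_aligned(size_t offset, size_t len, size_t mask)
{
	/* mask must have the form 2^n - 1, e.g. 511 for 512-byte sectors */
	assert((mask & (mask + 1)) == 0);

	/* any low bit set in either value means the buffer is un-aligned */
	return ((offset | len) & mask) == 0;
}
```

For example, a buffer at page offset 64 with length 4096 passes a
4-byte limit (mask 3) but fails the 512-byte check (mask 511), which is
the kind of buffer that only some drivers can handle.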

> 
> It seems that passing these buffers on to underlying drivers that
> don't support un-aligned buffers can result in silent data corruption.
> Perhaps it would be better to fail the I/O up front. This would also
> help future proof the block interface when changes/new target drivers
> are added.

It is a brd device, so strictly speaking it doesn't matter whether we
fail the I/O or not, given that either way should cause data loss.

> 
> I'm also curious how these same unaligned buffers from iSER made it to
> brd and were written successfully in the pre "multi-page bvec" world.
> (Just trying to understand, if you have any thoughts, as this same
> test case worked fine in 4.14+ until 5.1)

I am pretty sure that brd has never supported un-aligned buffers, and I
have no idea why the 'multi-page bvec' helpers can cause this issue.
However, I am happy to investigate further if you can run the previous
trace on a pre 'multi-page bvec' kernel.

> 
> > I am not familiar with RDMA, but from the trace we have done so far,
> > it is highly related with iser driver.
> 
> Do you think it is fair to say that the iSER/block integration is
> causing corruption by using un-aligned buffers?

As you saw, XFS changed the un-aligned buffer into an aligned one to
avoid the issue, so I think it is pretty fair to say that.

Thanks,
Ming

