From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Ming Lei <tom.leiming@gmail.com>,
	 linux-block <linux-block@vger.kernel.org>,
	 linux-mm <linux-mm@kvack.org>,
	 Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	 "open list:XFS FILESYSTEM" <linux-xfs@vger.kernel.org>,
	 Dave Chinner <dchinner@redhat.com>,
	 Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	 Christoph Hellwig <hch@lst.de>,  Jens Axboe <axboe@kernel.dk>
Subject: Re: block: DMA alignment of IO buffer allocated from slab
Date: Wed, 19 Sep 2018 13:15:00 +0200
Message-ID: <8736u53fij.fsf@vitty.brq.redhat.com>
In-Reply-To: <20180919100256.GD23172@ming.t460p> (Ming Lei's message of "Wed, 19 Sep 2018 18:02:57 +0800")

Ming Lei <ming.lei@redhat.com> writes:

> Hi Vitaly,
>
> On Wed, Sep 19, 2018 at 11:41:07AM +0200, Vitaly Kuznetsov wrote:
>> Ming Lei <tom.leiming@gmail.com> writes:
>> 
>> > Hi Guys,
>> >
>> > Some storage controllers have DMA alignment limit, which is often set via
>> > blk_queue_dma_alignment(), such as 512-byte alignment for IO buffer.
>> 
>> While mostly drivers use 512-byte alignment it is not a rule of thumb,
>> 'git grep' tell me we have:
>> ide-cd.c with 32-byte alignment
>> ps3disk.c and rsxx/dev.c with variable alignment.
>> 
>> What if our block configuration consists of several devices (in raid
>> array, for example) with different requirements, e.g. one requiring
>> 512-byte alignment and the other requiring 256?
>
> 512-byte alignment is also 256-byte aligned, and the sector size is 512 byte.
>

Yes, but it doesn't work the other way around: what if some device has,
e.g., a PAGE_SIZE alignment requirement (which would likely imply that
its sector size is also not 512, I guess)?
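
To make the asymmetry concrete, here is a minimal user-space sketch (not
kernel code). buf_is_aligned() is a hypothetical helper, and it assumes
the alignment is passed as a mask (alignment - 1), which is the
convention blk_queue_dma_alignment() uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: check an address against a
 * power-of-two alignment expressed as a mask (alignment - 1).
 */
static bool buf_is_aligned(uintptr_t addr, unsigned long mask)
{
	/* A valid mask is one less than a power of two. */
	assert((mask & (mask + 1)) == 0);
	return (addr & mask) == 0;
}
```

An address such as 0x1200 (9 * 512) passes the check for masks 511 and
255 but fails it for mask 4095, so a buffer that satisfies a 512-byte
requirement says nothing about a PAGE_SIZE one.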

>
> From the Red Hat BZ, looks I understand this issue is only triggered when
> KASAN is enabled, or you have figured out how to reproduce it without
> KASAN involved?

Yes, it reproduces without KASAN: any SLUB debugging triggers it, e.g.
building your kernel with SLUB_DEBUG_ON or booting with slub_debug=
options (red zoning, user tracking, ...); any of them will trigger it.
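
A rough illustration of why extra per-object debug data breaks the
alignment, as a user-space sketch only. It assumes objects are packed
back to back from an aligned slab start; obj_offset(),
offset_is_aligned() and the 64 debug bytes are made up for the example
and do not describe SLUB's real layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Offset of object idx inside a slab when every object carries
 * debug_bytes of extra data (red zones, tracking info) with it.
 * Illustration only; the real SLUB layout is more involved.
 */
static size_t obj_offset(size_t idx, size_t obj_size, size_t debug_bytes)
{
	return idx * (obj_size + debug_bytes);
}

/* True if off is a multiple of the power-of-two alignment align. */
static bool offset_is_aligned(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off & (align - 1)) == 0;
}
```

With 512-byte objects and no debug data, object 1 sits at offset 512 and
is 512-byte aligned. With the assumed 64 extra bytes it moves to offset
576, which is not.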

>
>> 
>> >
>> > 3) If slab can't guarantee to return 512-aligned buffer, how to fix
>> > this data corruption issue?
>> 
>> I'm no expert in block layer but in case of complex block device
>> configurations when bio submitter can't know all the requirements I see
>> no other choice than bouncing.
>
> I guess that might be the last straw, given the current way without
> bouncing works for decades, and seems no one complains before.

Not many drivers have alignment requirements and not many filesystems
issue requests of this kind. Another option would be to provide an API
that reports the alignment requirement for the whole block stack (i.e.
an alignment that works for _all_ devices in the stack, not just for
one of them) and to mandate that all users use it when allocating
buffers.
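
Such an API could look roughly like the user-space sketch below. Every
name in it (struct fake_dev, stack_dma_align_mask(), alloc_io_buf()) is
hypothetical, and it assumes each requirement is a power-of-two
alignment stored as a mask (alignment - 1):

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one device in a stacked configuration. */
struct fake_dev {
	unsigned long dma_align_mask;	/* alignment - 1 */
};

/*
 * Hypothetical helper: a mask that satisfies every device in the stack.
 * For masks of the form 2^n - 1, OR-ing them yields the largest one.
 */
static unsigned long stack_dma_align_mask(const struct fake_dev *devs, size_t n)
{
	unsigned long mask = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long m = devs[i].dma_align_mask;

		assert((m & (m + 1)) == 0);	/* must be 2^n - 1 */
		mask |= m;
	}
	return mask;
}

/* Allocate an I/O buffer honouring the combined mask; NULL on failure. */
static void *alloc_io_buf(size_t len, unsigned long mask)
{
	size_t align = mask + 1;
	void *p = NULL;

	/* posix_memalign() needs a multiple of sizeof(void *). */
	if (align < sizeof(void *))
		align = sizeof(void *);
	if (posix_memalign(&p, align, len) != 0)
		return NULL;
	return p;
}
```

For a stack whose members need masks 511, 31 and 4095, the combined mask
is 4095, and a buffer allocated with it is acceptable to all three
devices.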

-- 
  Vitaly
