linux-kernel.vger.kernel.org archive mirror
* Alignment Issue with Direct IO to NVMe Drive
@ 2012-11-27  0:35 Laine Walker-Avina
  2012-11-27 12:09 ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Laine Walker-Avina @ 2012-11-27  0:35 UTC
  To: Jens Axboe, linux-nvme, linux-kernel, willy; +Cc: lwalkera

Hi all,

We are experiencing an issue with doing direct IO to an NVMe device I'm
helping to develop. Every so often, the physical address given by
sg_dma_address() is aligned to 0x800 instead of 0x1000 as specified by
blk_queue_dma_alignment(queue, 4095) when the queue is initialized.
The request is also split over multiple segments to make up for the
missing space (e.g., for a 4k IO it's split into two segments 2k in
size, and for an 8k IO it's split into 3 segments: 2k, 4k, 2k). Our
design requires the physical segments given to the device be aligned
to 4k boundaries and be multiples of 4k in size. When not doing direct
IO the physical addresses appear to always be 4k aligned as expected.
One possible issue is the kernel we're primarily testing against is
2.6.32-220 from CentOS, but we have observed similar behavior from a
vanilla 3.3 kernel as well. Any help would be greatly appreciated.

Thanks,
Laine Walker-Avina
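
A minimal sketch, not taken from the thread, of the arithmetic behind
the segment counts reported above. It assumes, hypothetically, that the
buffer starts 0x800 bytes into a 4k page and that one segment is
produced per page touched. The names split_by_page and
meets_device_constraints are illustrative only; they are not kernel
functions.

```python
PAGE_SIZE = 0x1000  # 4k, the alignment the device in the thread requires


def split_by_page(addr, length, page_size=PAGE_SIZE):
    """Split [addr, addr + length) into (start, size) pieces at page boundaries."""
    segments = []
    end = addr + length
    while addr < end:
        # First page boundary strictly above the current address.
        next_boundary = (addr // page_size + 1) * page_size
        chunk_end = min(end, next_boundary)
        segments.append((addr, chunk_end - addr))
        addr = chunk_end
    return segments


def meets_device_constraints(segments, page_size=PAGE_SIZE):
    """True if every segment starts on a page boundary and is a whole number of pages."""
    return all(
        start % page_size == 0 and size % page_size == 0
        for start, size in segments
    )
```

Under these assumptions, a 4k request starting at the made-up address
0x10800 yields two 2k pieces and an 8k request yields 2k, 4k, 2k, which
matches the pattern described in the report. Neither passes
meets_device_constraints, while the same requests starting at 0x10000
do.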


* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27  0:35 Alignment Issue with Direct IO to NVMe Drive Laine Walker-Avina
@ 2012-11-27 12:09 ` Jens Axboe
  2012-11-27 17:05   ` Matthew Wilcox
  2012-11-27 17:43   ` Laine Walker-Avina
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2012-11-27 12:09 UTC
  To: Laine Walker-Avina; +Cc: linux-nvme, linux-kernel, willy, lwalkera

On 2012-11-27 01:35, Laine Walker-Avina wrote:
> Hi all,
> 
> We are experiencing an issue with doing direct IO to an NVMe device I'm
> helping to develop. Every so often, the physical address given by
> sg_dma_address() is aligned to 0x800 instead of 0x1000 as specified by
> blk_queue_dma_alignment(queue, 4095) when the queue is initialized.
> The request is also split over multiple segments to make up for the
> missing space (e.g., for a 4k IO it's split into two segments 2k in
> size, and for an 8k IO it's split into 3 segments: 2k, 4k, 2k). Our
> design requires the physical segments given to the device be aligned
> to 4k boundaries and be multiples of 4k in size. When not doing direct
> IO the physical addresses appear to always be 4k aligned as expected.
> One possible issue is the kernel we're primarily testing against is
> 2.6.32-220 from CentOS, but we have observed similar behavior from a
> vanilla 3.3 kernel as well. Any help would be greatly appreciated.

I'm assuming you set the hardware sector size to 4k as well?

-- 
Jens Axboe



* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27 12:09 ` Jens Axboe
@ 2012-11-27 17:05   ` Matthew Wilcox
  2012-11-27 17:47     ` Laine Walker-Avina
  2012-11-27 17:43   ` Laine Walker-Avina
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2012-11-27 17:05 UTC
  To: Jens Axboe; +Cc: Laine Walker-Avina, linux-nvme, linux-kernel, lwalkera

On Tue, Nov 27, 2012 at 01:09:46PM +0100, Jens Axboe wrote:
> On 2012-11-27 01:35, Laine Walker-Avina wrote:
> > Hi all,
> > 
> > We are experiencing an issue with doing direct IO to an NVMe device I'm
> > helping to develop. Every so often, the physical address given by
> > sg_dma_address() is aligned to 0x800 instead of 0x1000 as specified by
> > blk_queue_dma_alignment(queue, 4095) when the queue is initialized.

FYI, this is a modification to the driver that Laine has made, presumably
to work around a limitation of the prototype hardware he's working with.
The NVMe spec requires the device to be able to do I/Os to 4-byte
boundaries.

Laine, when this occurs, what is the alignment of 'offset' in the sg
entry you're looking at?  If userspace is passing in an unaligned address,
I don't think there's anything we do to try to align it.

> > The request is also split over multiple segments to make up for the
> > missing space (e.g., for a 4k IO it's split into two segments 2k in
> > size, and for an 8k IO it's split into 3 segments: 2k, 4k, 2k). Our
> > design requires the physical segments given to the device be aligned
> > to 4k boundaries and be multiples of 4k in size. When not doing direct
> > IO the physical addresses appear to always be 4k aligned as expected.
> > One possible issue is the kernel we're primarily testing against is
> > 2.6.32-220 from CentOS, but we have observed similar behavior from a
> > vanilla 3.3 kernel as well. Any help would be greatly appreciated.
> 
> I'm assuming you set the hardware sector size to 4k as well?
> 
> -- 
> Jens Axboe


* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27 12:09 ` Jens Axboe
  2012-11-27 17:05   ` Matthew Wilcox
@ 2012-11-27 17:43   ` Laine Walker-Avina
  1 sibling, 0 replies; 7+ messages in thread
From: Laine Walker-Avina @ 2012-11-27 17:43 UTC
  To: Jens Axboe; +Cc: linux-nvme, linux-kernel, willy, lwalkera

On Tue, Nov 27, 2012 at 4:09 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 2012-11-27 01:35, Laine Walker-Avina wrote:
>> Hi all,
>>
>> We are experiencing an issue with doing direct IO to an NVMe device I'm
>> helping to develop. Every so often, the physical address given by
>> sg_dma_address() is aligned to 0x800 instead of 0x1000 as specified by
>> blk_queue_dma_alignment(queue, 4095) when the queue is initialized.
>> The request is also split over multiple segments to make up for the
>> missing space (e.g., for a 4k IO it's split into two segments 2k in
>> size, and for an 8k IO it's split into 3 segments: 2k, 4k, 2k). Our
>> design requires the physical segments given to the device be aligned
>> to 4k boundaries and be multiples of 4k in size. When not doing direct
>> IO the physical addresses appear to always be 4k aligned as expected.
>> One possible issue is the kernel we're primarily testing against is
>> 2.6.32-220 from CentOS, but we have observed similar behavior from a
>> vanilla 3.3 kernel as well. Any help would be greatly appreciated.
>
> I'm assuming you set the hardware sector size to 4k as well?
>
> --
> Jens Axboe
>

Yes, as well as the logical block size and io_min.

Laine Walker-Avina
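
A minimal sketch, not from the thread, of one way to confirm from user
space that the queue limits mentioned above were applied. It assumes
the block device exposes the usual sysfs attributes
logical_block_size, physical_block_size and minimum_io_size under its
queue directory; the device name in the usage note is hypothetical, and
read_queue_limits and all_equal_to are illustrative helper names.

```python
from pathlib import Path

# sysfs attribute names corresponding to the limits discussed in the thread.
QUEUE_ATTRS = ("logical_block_size", "physical_block_size", "minimum_io_size")


def read_queue_limits(queue_dir):
    """Return {attribute: int value} for each attribute file present in queue_dir."""
    limits = {}
    for name in QUEUE_ATTRS:
        path = Path(queue_dir) / name
        if path.exists():
            limits[name] = int(path.read_text().strip())
    return limits


def all_equal_to(limits, expected=4096):
    """True if at least one limit was read and every one equals `expected`."""
    return bool(limits) and all(value == expected for value in limits.values())
```

For a device that appears as, say, nvme0n1 (an assumed name), the call
would be read_queue_limits("/sys/block/nvme0n1/queue"). Matching limits
only show what the block layer was told; they do not by themselves show
what DMA addresses the driver ends up with.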


* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27 17:05   ` Matthew Wilcox
@ 2012-11-27 17:47     ` Laine Walker-Avina
  2012-11-27 21:39       ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Laine Walker-Avina @ 2012-11-27 17:47 UTC
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-nvme, linux-kernel, lwalkera

On Tue, Nov 27, 2012 at 9:05 AM, Matthew Wilcox <willy@linux.intel.com> wrote:
> Laine, when this occurs, what is the alignment of 'offset' in the sg
> entry you're looking at?  If userspace is passing in an unaligned address,
> I don't think there's anything we do to try to align it.

I thought this was the case as well, but it appears to be aligned
looking at the request in the user space program (fio using libaio in
this case) before the IO is submitted.

Laine
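
A minimal sketch, not from the thread, of the kind of user-space check
described above: take the virtual address of an IO buffer and test it
against the 4k requirement. It uses an anonymous mmap as a stand-in for
the buffer fio would allocate, which is an assumption for illustration;
buffer_address and is_aligned are hypothetical helper names.

```python
import ctypes
import mmap


def buffer_address(buf):
    """Return the virtual address of a writable buffer object such as an mmap."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def is_aligned(addr, alignment=4096):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


if __name__ == "__main__":
    buf = mmap.mmap(-1, 8192)  # anonymous mapping, starts on a page boundary
    addr = buffer_address(buf)
    print(hex(addr), is_aligned(addr, mmap.PAGESIZE))
```

This only inspects the virtual address seen by the process. As the
thread goes on to discuss, an aligned virtual address does not by
itself guarantee an aligned DMA address.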


* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27 17:47     ` Laine Walker-Avina
@ 2012-11-27 21:39       ` Matthew Wilcox
  2012-11-27 22:25         ` Laine Walker-Avina
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2012-11-27 21:39 UTC
  To: Laine Walker-Avina; +Cc: Jens Axboe, linux-nvme, linux-kernel, lwalkera

On Tue, Nov 27, 2012 at 09:47:48AM -0800, Laine Walker-Avina wrote:
> On Tue, Nov 27, 2012 at 9:05 AM, Matthew Wilcox <willy@linux.intel.com> wrote:
> > Laine, when this occurs, what is the alignment of 'offset' in the sg
> > entry you're looking at?  If userspace is passing in an unaligned address,
> > I don't think there's anything we do to try to align it.
> 
> I thought this was the case as well, but it appears to be aligned
> looking at the request in the user space program (fio using libaio in
> this case) before the IO is submitted.

OK, so we have an aligned virtual address being translated into an
unaligned DMA address.  I have a suspicion ... are you using the swiotlb,
or are you using a real IOMMU?  Grep dmesg for 'IOMMU' if you're not sure.
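
A minimal sketch, not from the thread, of the "grep dmesg" step
suggested above. It filters already-captured kernel log text for lines
mentioning the two terms used in this message, IOMMU and swiotlb,
ignoring case. The function name find_dma_lines is hypothetical, and no
particular kernel message wording is assumed.

```python
def find_dma_lines(log_text, keywords=("iommu", "swiotlb")):
    """Return the log lines that contain any keyword, compared case-insensitively."""
    lowered = tuple(k.lower() for k in keywords)
    return [
        line
        for line in log_text.splitlines()
        if any(k in line.lower() for k in lowered)
    ]
```

The input could be the captured output of dmesg, which may require
elevated privileges on some systems. An empty result means the log has
no line mentioning either term.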


* Re: Alignment Issue with Direct IO to NVMe Drive
  2012-11-27 21:39       ` Matthew Wilcox
@ 2012-11-27 22:25         ` Laine Walker-Avina
  0 siblings, 0 replies; 7+ messages in thread
From: Laine Walker-Avina @ 2012-11-27 22:25 UTC
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-nvme, linux-kernel, lwalkera

On Tue, Nov 27, 2012 at 1:39 PM, Matthew Wilcox <willy@linux.intel.com> wrote:
> On Tue, Nov 27, 2012 at 09:47:48AM -0800, Laine Walker-Avina wrote:
>> On Tue, Nov 27, 2012 at 9:05 AM, Matthew Wilcox <willy@linux.intel.com> wrote:
>> > Laine, when this occurs, what is the alignment of 'offset' in the sg
>> > entry you're looking at?  If userspace is passing in an unaligned address,
>> > I don't think there's anything we do to try to align it.
>>
>> I thought this was the case as well, but it appears to be aligned
>> looking at the request in the user space program (fio using libaio in
>> this case) before the IO is submitted.
>
> OK, so we have an aligned virtual address being translated into an
> unaligned DMA address.  I have a suspicion ... are you using the swiotlb,
> or are you using a real IOMMU?  Grep dmesg for 'IOMMU' if you're not sure.

There is no mention of IOMMU in dmesg on a fresh boot. The processor
in the test system I'm using is an i5-2400 (Sandy Bridge).

-Laine
