* NCQ high speed data integrity issues
@ 2009-12-15 19:29 Dan Porat
  2009-12-15 20:47 ` Grant Grundler
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Porat @ 2009-12-15 19:29 UTC (permalink / raw)
  To: linux-ide

I am seeing data integrity issues when running sgp_dd version 1.20 with
31 threads against any sg device.
Kernel: 2.6.28 (Ubuntu).
At low speed (20 MB/s) everything runs well.
When I push it towards 50 MB/s, data integrity issues appear.

Against sd devices there are no integrity issues, but bpt is
not deterministic and changes.

Any idea how to fix the sg data integrity issues?

Thanks

Dan Porat


* Re: NCQ high speed data integrity issues
  2009-12-15 19:29 NCQ high speed data integrity issues Dan Porat
@ 2009-12-15 20:47 ` Grant Grundler
  2009-12-17  6:49   ` Dan Porat
  0 siblings, 1 reply; 4+ messages in thread
From: Grant Grundler @ 2009-12-15 20:47 UTC (permalink / raw)
  To: Dan Porat; +Cc: linux-ide

On Tue, Dec 15, 2009 at 11:29 AM, Dan Porat <dan.porat@gmail.com> wrote:
> I am seeing data integrity issues when running sgp_dd version 1.20 with
> 31 threads against any sg device.
> Kernel: 2.6.28 (Ubuntu).
> At low speed (20 MB/s) everything runs well.
> When I push it towards 50 MB/s, data integrity issues appear.

How are you varying the speed?

> Against sd devices there are no integrity issues, but bpt is
> not deterministic and changes.

Hrm? bpt is a parameter. What do you mean?

I don't know offhand why sg vs sd would make a difference but it's
obviously of interest.

> Any idea how to fix the sg data integrity issues?

Can you post Good vs Bad data?
Knowing the offset, size, and type of corruption will help narrow down
possible sources of corruption.
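
One way to produce such a report, assuming the test region was written with a single known fill byte and the read-back data has been captured into a buffer, is to scan for runs that differ from the fill byte. This is only a minimal Python sketch: `find_bad_runs`, `describe`, and the 512-byte sector size are illustrative assumptions and are not part of sgp_dd.

```python
from typing import List, Tuple

SECTOR = 512  # assumed logical block size; adjust for the actual device


def find_bad_runs(data: bytes, fill: int) -> List[Tuple[int, int]]:
    """Return (offset, length) for each contiguous run of bytes != fill."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if b != fill:
            if start is None:
                start = i  # a new bad run begins here
        elif start is not None:
            runs.append((start, i - start))  # the bad run ended at i
            start = None
    if start is not None:
        runs.append((start, len(data) - start))  # run reaches end of buffer
    return runs


def describe(runs: List[Tuple[int, int]], sector: int = SECTOR) -> List[str]:
    """Format each run with its offset, size, and sector alignment."""
    lines = []
    for off, length in runs:
        aligned = off % sector == 0 and length % sector == 0
        lines.append(f"offset={off} len={length} sector_aligned={aligned}")
    return lines
```

Whether the bad runs are sector-aligned, and whether their size matches the transfer size, is the kind of detail that helps separate whole-transfer mixups from partial overwrites.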

Any errors reported by the device driver, memory controller, sg device?

cheers,
grant


* Re: NCQ high speed data integrity issues
  2009-12-15 20:47 ` Grant Grundler
@ 2009-12-17  6:49   ` Dan Porat
  2009-12-17 18:16     ` Grant Grundler
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Porat @ 2009-12-17  6:49 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-ide

I vary the speed using usleep.

bpt is a parameter, but when used over sd it seems to be ignored somehow, and
transfers have varying sizes.

Regarding the data: since the problem is a timing or
synchronization issue, it does not always appear.
The disk is written with the same character throughout, and when it is read
back using sgp_dd, some of the data is sometimes just different.

We know for sure that the data is written correctly, because sometimes
the whole read does succeed.
No errors are seen at the kernel level, and definitely none on the
application side.

We have seen that the more threads there are, the more likely this
problem is to occur.
The same goes for the amount of data:
the bigger the data, the more likely we are to see the problem.

Any suggestions on how to investigate?


Dan Porat



* Re: NCQ high speed data integrity issues
  2009-12-17  6:49   ` Dan Porat
@ 2009-12-17 18:16     ` Grant Grundler
  0 siblings, 0 replies; 4+ messages in thread
From: Grant Grundler @ 2009-12-17 18:16 UTC (permalink / raw)
  To: Dan Porat; +Cc: linux-ide

On Wed, Dec 16, 2009 at 10:49 PM, Dan Porat <dan.porat@gmail.com> wrote:
> I vary the speed using usleep.
>
> bpt is a parameter, but when used over sd it seems to be ignored somehow, and
> transfers have varying sizes.
>
> Regarding the data: since the problem is a timing or
> synchronization issue, it does not always appear.
> The disk is written with the same character throughout, and when it is read
> back using sgp_dd, some of the data is sometimes just different.

To track this down, "good" vs "bad" data needs to be presented such
that one can compare alignment and size of "bad" data.

> We know for sure that the data is written correctly, because sometimes
> the whole read does succeed.

Does the data appear correct if it is re-read from the disk?
i.e., read once and see bad data, then read again: is it still bad?
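
The re-read check could be scripted roughly as follows. This is a hedged Python sketch, not something taken from sgp_dd: `classify_reread` and the `read_fn(offset, length)` callable are hypothetical names, and the caller is assumed to supply a reader that really goes back to the device (a read served from a cache would hide the difference).

```python
from typing import Callable


def classify_reread(read_fn: Callable[[int, int], bytes],
                    offset: int, length: int, fill: int) -> str:
    """Read a region, and if it is bad, read it again and compare."""
    expected = bytes([fill]) * length
    first = read_fn(offset, length)
    if first == expected:
        return "good"
    second = read_fn(offset, length)
    if second == expected:
        # Bad once, good on re-read: the stored data looks fine,
        # so the read path is the more likely suspect.
        return "transient"
    if second == first:
        # Identical bad data twice: the bad data may really be stored.
        return "persistent"
    # Bad both times, with different contents each time.
    return "unstable"
```

A "transient" result would point towards the read path, while "persistent" would suggest looking at the write side as well.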

> No errors are seen at the kernel level, and definitely none on the
> application side.

Good. Rules out a large amount of code.

hth,
grant

