All of lore.kernel.org
 help / color / mirror / Atom feed
* scsi_cmnd data_buffer checksum
@ 2010-09-09  3:36 Anil kumar
  2010-09-09  8:00 ` Christof Schmitt
  0 siblings, 1 reply; 9+ messages in thread
From: Anil kumar @ 2010-09-09  3:36 UTC (permalink / raw)
  To: linux-kernel, linux-scsi


I am writing a checksum calculation of scsi_cmnd data buffer in the driver. 

I calculate the checksum of the scsi_cmd data buffer(request_buffer) in driver queuecommand.

Now when the command is completed from the hardware and before driver sends it back to mid-layer, I calculate the checksum again of the same scsi_cmd data_buffer again.

Sometimes the checksums don't match. I mean somehow looks like OS changed the scsi_cmd data_buffer(request_buffer) in the meantime when driver is working on the command.
I print the address of the scsi_cmd data_buffer (virtual address) and its same and the contents of the buffer is also same during both the calculations.
 
Can this happen?



      

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  3:36 scsi_cmnd data_buffer checksum Anil kumar
@ 2010-09-09  8:00 ` Christof Schmitt
  2010-09-09  8:35   ` Anil kumar
  0 siblings, 1 reply; 9+ messages in thread
From: Christof Schmitt @ 2010-09-09  8:00 UTC (permalink / raw)
  To: Anil kumar; +Cc: linux-kernel, linux-scsi

On Wed, Sep 08, 2010 at 08:36:32PM -0700, Anil kumar wrote:
> 
> I am writing a checksum calculation of scsi_cmnd data buffer in the driver. 
> 
> I calculate the checksum of the scsi_cmd data buffer(request_buffer) in driver queuecommand.
> 
> Now when the command is completed from the hardware and before driver sends it back to mid-layer, I calculate the checksum again of the same scsi_cmd data_buffer again.
> 
> Sometimes the checksums don't match. I mean somehow looks like OS changed the scsi_cmd data_buffer(request_buffer) in the meantime when driver is working on the command.
> I print the address of the scsi_cmd data_buffer (virtual address) and its same and the contents of the buffer is also same during both the calculations.
> 
> Can this happen?

Yes. While a write I/O is being processed in Linux or in flight, the
data buffers can change. This causes problems for other checksums as
well; i ran into this when looking at the DIF/DIX checksums for SCSI
commands.

At the moment, this problem can be avoided e.g. by running only direct
I/O on the xfs filesystem. There have been discussions about this, a
short summary is here, look for "stable pages":
http://lwn.net/Articles/399148/

Christof

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  8:00 ` Christof Schmitt
@ 2010-09-09  8:35   ` Anil kumar
  2010-09-09  8:51     ` Christof Schmitt
  0 siblings, 1 reply; 9+ messages in thread
From: Anil kumar @ 2010-09-09  8:35 UTC (permalink / raw)
  To: Christof Schmitt; +Cc: linux-kernel, linux-scsi

Hi Christof,

Thanks for the response.

I am running mkfs.ext3 command.

I am doing the following in the driver for write(10):

Queuecommand:

sg = scsi_sglist(cmd->scsi_cmd);
cmd->write_buf = (u8 *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);
Calculate checksum for write_buf

Write Done:
Calculate checksum for cmd->write_buf

and checksums don't match. I am wondering how come OS changed the cmd->write_buf when I have not even unmapped the buffer. Is filesystem changing this cmd->write_buf pages when driver/HW is working on it?

Is there anyway I can avoid this. How about if we allocate a local buffer(kmalloc/pci_alloc_consistent) and memcpy kmap_atomic to that local buffer and then calculate checksum on that local buffer. Will this help?

--Anil

--- On Thu, 9/9/10, Christof Schmitt <christof.schmitt@de.ibm.com> wrote:

> From: Christof Schmitt <christof.schmitt@de.ibm.com>
> Subject: Re: scsi_cmnd data_buffer checksum
> To: "Anil kumar" <anils_r@yahoo.com>
> Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
> Date: Thursday, September 9, 2010, 4:00 AM
> On Wed, Sep 08, 2010 at 08:36:32PM
> -0700, Anil kumar wrote:
> > 
> > I am writing a checksum calculation of scsi_cmnd data
> buffer in the driver. 
> > 
> > I calculate the checksum of the scsi_cmd data
> buffer(request_buffer) in driver queuecommand.
> > 
> > Now when the command is completed from the hardware
> and before driver sends it back to mid-layer, I calculate
> the checksum again of the same scsi_cmd data_buffer again.
> > 
> > Sometimes the checksums don't match. I mean somehow
> looks like OS changed the scsi_cmd
> data_buffer(request_buffer) in the meantime when driver is
> working on the command.
> > I print the address of the scsi_cmd data_buffer
> (virtual address) and its same and the contents of the
> buffer is also same during both the calculations.
> > 
> > Can this happen?
> 
> Yes. While a write I/O is being processed in Linux or in
> flight, the
> data buffers can change. This causes problems for other
> checksums as
> well; i ran into this when looking at the DIF/DIX checksums
> for SCSI
> commands.
> 
> At the moment, this problem can be avoided e.g. by running
> only direct
> I/O on the xfs filesystem. There have been discussions
> about this, a
> short summary is here, look for "stable pages":
> http://lwn.net/Articles/399148/
> 
> Christof
> 


      

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  8:35   ` Anil kumar
@ 2010-09-09  8:51     ` Christof Schmitt
  2010-09-09  9:09       ` Anil kumar
  0 siblings, 1 reply; 9+ messages in thread
From: Christof Schmitt @ 2010-09-09  8:51 UTC (permalink / raw)
  To: Anil kumar; +Cc: linux-kernel, linux-scsi

On Thu, Sep 09, 2010 at 01:35:02AM -0700, Anil kumar wrote:
> Hi Christof,
> 
> Thanks for the response.
> 
> I am running mkfs.ext3 command.
> 
> I am doing the following in the driver for write(10):
> 
> Queuecommand:
> 
> sg = scsi_sglist(cmd->scsi_cmd);
> cmd->write_buf = (u8 *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);
> Calculate checksum for write_buf
> 
> Write Done:
> Calculate checksum for cmd->write_buf
> 
> and checksums don't match. I am wondering how come OS changed the cmd->write_buf when I have not even unmapped the buffer. Is filesystem changing this cmd->write_buf pages when driver/HW is working on it?

Yes, the driver has direct access to the data. Usually, the data is
not copied for I/O requests. The driver gets one sg list that points
to the data pages of file system (or whatever the data source is).
When the filesystem decides to change the data, this single data
buffer is changed.

> Is there anyway I can avoid this. How about if we allocate a local buffer(kmalloc/pci_alloc_consistent) and memcpy kmap_atomic to that local buffer and then calculate checksum on that local buffer. Will this help?

Sure, you can create copies of data buffers in the driver, calculate
the checksum of the copy and submit the data copy with the checksum to
the hardware controller. This is usually not done for performance
reasons, and you probably should keep a mempool to be able to issue
I/Os when memory is low.

Christof

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  8:51     ` Christof Schmitt
@ 2010-09-09  9:09       ` Anil kumar
  2010-09-09  9:29         ` Christof Schmitt
  2010-09-09 13:23         ` Martin K. Petersen
  0 siblings, 2 replies; 9+ messages in thread
From: Anil kumar @ 2010-09-09  9:09 UTC (permalink / raw)
  To: Christof Schmitt; +Cc: linux-kernel, linux-scsi

I quickly tried copying the data buffer to local buffers as follows:

In Queuecommand:

cmd->local_write_buf = pci_alloc_consistent(...);
cmd->write_buf = (u8 *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);

memcpy(cmd->local_write_buf, cmd->write_buf, scsibufflen(scsi_cmnd));

Now I calculate checksums of cmd->write_buf and my local cmd->local_write_buf

and the checksum fails. Am I doing something wrong here?

--- On Thu, 9/9/10, Christof Schmitt <christof.schmitt@de.ibm.com> wrote:

> From: Christof Schmitt <christof.schmitt@de.ibm.com>
> Subject: Re: scsi_cmnd data_buffer checksum
> To: "Anil kumar" <anils_r@yahoo.com>
> Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
> Date: Thursday, September 9, 2010, 4:51 AM
> On Thu, Sep 09, 2010 at 01:35:02AM
> -0700, Anil kumar wrote:
> > Hi Christof,
> > 
> > Thanks for the response.
> > 
> > I am running mkfs.ext3 command.
> > 
> > I am doing the following in the driver for write(10):
> > 
> > Queuecommand:
> > 
> > sg = scsi_sglist(cmd->scsi_cmd);
> > cmd->write_buf = (u8 *)(kmap_atomic(sg->page,
> KM_IRQ0) + sg->offset);
> > Calculate checksum for write_buf
> > 
> > Write Done:
> > Calculate checksum for cmd->write_buf
> > 
> > and checksums don't match. I am wondering how come OS
> changed the cmd->write_buf when I have not even unmapped
> the buffer. Is filesystem changing this cmd->write_buf
> pages when driver/HW is working on it?
> 
> Yes, the driver has direct access to the data. Usually, the
> data is
> not copied for I/O requests. The driver gets one sg list
> that points
> to the data pages of file system (or whatever the data
> source is).
> When the filesystem decides to change the data, this single
> data
> buffer is changed.
> 
> > Is there anyway I can avoid this. How about if we
> allocate a local buffer(kmalloc/pci_alloc_consistent) and
> memcpy kmap_atomic to that local buffer and then calculate
> checksum on that local buffer. Will this help?
> 
> Sure, you can create copies of data buffers in the driver,
> calculate
> the checksum of the copy and submit the data copy with the
> checksum to
> the hardware controller. This is usually not done for
> performance
> reasons, and you probably should keep a mempool to be able
> to issue
> I/Os when memory is low.
> 
> Christof
> 


      

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  9:09       ` Anil kumar
@ 2010-09-09  9:29         ` Christof Schmitt
  2010-09-09 13:23         ` Martin K. Petersen
  1 sibling, 0 replies; 9+ messages in thread
From: Christof Schmitt @ 2010-09-09  9:29 UTC (permalink / raw)
  To: Anil kumar; +Cc: linux-kernel, linux-scsi

On Thu, Sep 09, 2010 at 02:09:28AM -0700, Anil kumar wrote:
> I quickly tried copying the data buffer to local buffers as follows:
> 
> In Queuecommand:
> 
> cmd->local_write_buf = pci_alloc_consistent(...);
> cmd->write_buf = (u8 *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);
> 
> memcpy(cmd->local_write_buf, cmd->write_buf, scsibufflen(scsi_cmnd));
> 
> Now I calculate checksums of cmd->write_buf and my local cmd->local_write_buf
> 
> and the checksum fails. Am I doing something wrong here?

Are you only copying one element of the sg list? Are you sure that
there is only one element? If not, you have to use something like
scsi_kmap_atomic_sg and create a copy of each element (maybe use
scsi_for_each_sg).

Or maybe there is another problem... hard to guess from the distance.

Christof

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09  9:09       ` Anil kumar
  2010-09-09  9:29         ` Christof Schmitt
@ 2010-09-09 13:23         ` Martin K. Petersen
  2010-09-09 19:34           ` Anil kumar
  1 sibling, 1 reply; 9+ messages in thread
From: Martin K. Petersen @ 2010-09-09 13:23 UTC (permalink / raw)
  To: Anil kumar; +Cc: Christof Schmitt, linux-kernel, linux-scsi

>>>>> "Anil" == Anil kumar <anils_r@yahoo.com> writes:

Anil> cmd-> write_buf = (u8 *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);
Anil> memcpy(cmd->local_write_buf, cmd->write_buf, cmd->scsibufflen(scsi_cmnd));

You need to walk the scatterlist.  You can't assume that the pages are
contiguous in memory.

But why are you doing this checksumming stuff manually when the existing
DIX/DIF code can do it for you?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09 13:23         ` Martin K. Petersen
@ 2010-09-09 19:34           ` Anil kumar
  2010-09-10  3:33             ` Martin K. Petersen
  0 siblings, 1 reply; 9+ messages in thread
From: Anil kumar @ 2010-09-09 19:34 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Christof Schmitt, linux-kernel, linux-scsi

Martin,

Thanks. I will go thru the DIF document and code. 

But I had a quick question, isn't that drive has to support "protection enable" also. Also, does Controller FW need to do anything extra to send DIF block down to drive. Ideally we will be sending 512 byte blocks down to drive.

-Anil

--- On Thu, 9/9/10, Martin K. Petersen <martin.petersen@oracle.com> wrote:

> From: Martin K. Petersen <martin.petersen@oracle.com>
> Subject: Re: scsi_cmnd data_buffer checksum
> To: "Anil kumar" <anils_r@yahoo.com>
> Cc: "Christof Schmitt" <christof.schmitt@de.ibm.com>, linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
> Date: Thursday, September 9, 2010, 9:23 AM
> >>>>> "Anil" == Anil
> kumar <anils_r@yahoo.com>
> writes:
> 
> Anil> cmd-> write_buf = (u8
> *)(kmap_atomic(sg->page, KM_IRQ0) + sg->offset);
> Anil> memcpy(cmd->local_write_buf, cmd->write_buf,
> cmd->scsibufflen(scsi_cmnd));
> 
> You need to walk the scatterlist.  You can't assume
> that the pages are
> contiguous in memory.
> 
> But why are you doing this checksumming stuff manually when
> the existing
> DIX/DIF code can do it for you?
> 
> -- 
> Martin K. Petersen    Oracle Linux
> Engineering
> 


      

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: scsi_cmnd data_buffer checksum
  2010-09-09 19:34           ` Anil kumar
@ 2010-09-10  3:33             ` Martin K. Petersen
  0 siblings, 0 replies; 9+ messages in thread
From: Martin K. Petersen @ 2010-09-10  3:33 UTC (permalink / raw)
  To: Anil kumar; +Cc: Martin K. Petersen, Christof Schmitt, linux-kernel, linux-scsi

>>>>> "Anil" == Anil kumar <anils_r@yahoo.com> writes:

Anil> But I had a quick question, isn't that drive has to support
Anil> "protection enable" also.

Just register DIX Type 0 in the host's protection mask.  That'll enable
protected transfer to the HBA only.


Anil> Also, does Controller FW need to do anything extra to send DIF
Anil> block down to drive. Ideally we will be sending 512 byte blocks
Anil> down to drive.

	http://oss.oracle.com/~mkp/docs/dix-draft.pdf

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-09-10  3:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-09  3:36 scsi_cmnd data_buffer checksum Anil kumar
2010-09-09  8:00 ` Christof Schmitt
2010-09-09  8:35   ` Anil kumar
2010-09-09  8:51     ` Christof Schmitt
2010-09-09  9:09       ` Anil kumar
2010-09-09  9:29         ` Christof Schmitt
2010-09-09 13:23         ` Martin K. Petersen
2010-09-09 19:34           ` Anil kumar
2010-09-10  3:33             ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.