block: wrong return value by bio_end_sector?

* block: wrong return value by bio_end_sector?
@ 2022-09-30 15:59 Paolo Valente
  2022-09-30 16:35 ` Jens Axboe
  2022-10-01  0:50 ` Damien Le Moal
  0 siblings, 2 replies; 6+ messages in thread
From: Paolo Valente @ 2022-09-30 15:59 UTC (permalink / raw)
  To: Arie van der Hoeven, Tyler Erickson, Muhammad Ahmad, linux-block,
	linux-kernel, Jan Kara, andrea.righi, glen.valante, axboe,
	Michael English, Andrew Ring, Varun Boddu, Damien Le Moal

Hi Jens, Damien, all other possibly interested people,
this is to raise attention on a mistake that has emerged in a
thread on a bfq extension for multi-actuary drives [1].

The mistake is apparently in the macro bio_end_sector (defined in
include/linux/bio.h), which seems to be translated (incorrectly) as
sector+size, and not as sector+size-1.

For your convenience, I'm pasting a detailed description of the
problem, by Tyler (description taken from the above thread [1]).

The drive reports the actuator ranges as a starting LBA and a count of LBAs for the range.
If the code reading the reported values simply does startingLBA + range, this is an incorrect ending LBA for that actuator. This is because LBAs are zero indexed and this simple addition is not taking that into account.
The proper way to get the endingLBA is startingLBA + range - 1 to get the last LBA value for where to issue a final IO read/write to account for LBA values starting at zero rather than one.

Here is an example from the output in SeaChest/openSeaChest:
====Concurrent Positioning Ranges====

Range#     #Elements            Lowest LBA          # of LBAs      
  0            1                                               0           17578328064
  1            1                         17578328064           17578328064

If using the incorrect formula to get the final LBA for actuator 0, you would get 17578328064, but this is the starting LBA reported by the drive for actuator 1.
So to be consistent for all ranges, the final LBA for a given actuator should be calculated as starting LBA + range - 1.

I had reached out to Seagate's T10 and T13 representatives for clarification and verification and this is most likely what is causing the error is a missing - 1 somewhere after getting the information reported by the device. They agreed that the reporting from the drive and the SCSI to ATA translation is correct.

I'm not sure where this is being read and calculated, but it is not an error in the low-level libata or sd level of the kernel. It may be in bfq, or it may be in some other place after the sd layer. I know there were some additions to read this and report it up the stack, but I did not think those were wrong as they seemed to pass the drive reported information up the stack.

Jens, Damien, can you shed a light on this?

Thanks,
Paolo

[1] https://www.spinics.net/lists/kernel/msg4507408.html

^ permalink raw reply	[flat|nested] 6+ messages in thread