* optimal io size / custom alignment
From: Tom Yan @ 2015-06-13 14:52 UTC
To: util-linux

As I have mentioned in previous mails, I have a SATA/USB3 adapter which
can work in UAS mode, and when it does, it reports a weird optimal I/O
size:

Disk /dev/sdb: 74.5 GiB, 80026361856 bytes, 156301488 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 33553920 bytes

http://www.linuxquestions.org/questions/linux-newbie-8/how-to-foramt-2tb-external-hard-drive-4175529792/

The above link shows another, similar case of an external drive with
4k physical sectors.

I am not sure if there's anything wrong with the device(s) or the
kernel, but in any case I doubt that fdisk should determine alignment
from this size. As you can calculate, it is not necessarily a multiple
of the physical sector size, or of the common erase block sizes of SSDs
(which are not reported anywhere AFAIK).

Perhaps this I/O size does matter for alignment in certain cases, but
shouldn't the physical sector or erase block size at least have higher
priority when it comes to alignment?

In any case, it would be nice if fdisk allowed custom alignment (like
gdisk does), so that users can at least decide how partitions should be
aligned in weird cases like this. With that, the long-deprecated "dos
compatibility" might be able to go as well.
* Re: optimal io size / custom alignment
From: Karel Zak @ 2015-06-15 13:31 UTC
To: Tom Yan
Cc: util-linux, Martin K. Petersen

On Sat, Jun 13, 2015 at 10:52:04PM +0800, Tom Yan wrote:
> As I have mentioned in previous mails, I have a SATA/USB3 adapter
> which can work in UAS mode, and when it does, it reports a weird
> optimal I/O size:
>
> Disk /dev/sdb: 74.5 GiB, 80026361856 bytes, 156301488 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 33553920 bytes

This is no problem (33553920 % 512 = 0) with the current kernel and
the current util-linux git tree, where we support non-power-of-2
alignment.

> http://www.linuxquestions.org/questions/linux-newbie-8/how-to-foramt-2tb-external-hard-drive-4175529792/
>
> The above link shows another, similar case of an external drive with
> 4k physical sectors.

From the link:

Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33553920 bytes

This is a problem (33553920 % 4096 != 0), and frankly it seems like a
pretty strange thing; maybe the kernel guys can comment on it (CC:
Martin).

> I am not sure if there's anything wrong with the device(s) or the
> kernel, but in any case I doubt that fdisk should determine alignment
> from this size. As you can calculate, it is not necessarily a multiple
> of the physical sector size, or of the common erase block sizes of
> SSDs (which are not reported anywhere AFAIK).
>
> Perhaps this I/O size does matter for alignment in certain cases, but
> shouldn't the physical sector or erase block size at least have higher
> priority when it comes to alignment?

I think we can test "optimal_io_size % physical_sector_size" and use
the physical sector size as the granularity if the optimal_io_size is a
strange number.

> In any case, it would be nice if fdisk allowed custom alignment
> (like gdisk does), so that users can at least decide how partitions
> should be aligned in weird cases like this. With that, the
> long-deprecated "dos compatibility" might be able to go as well.

I'll think about it...

    Karel

--
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com
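The modulo test Karel proposes can be sketched roughly as follows. This is only an illustration of the idea, not the libfdisk implementation; the function name `alignment_granularity` is hypothetical, and treating a zero `optimal_io_size` as "not reported" is an assumption.

```python
def alignment_granularity(optimal_io_size: int, physical_sector_size: int) -> int:
    """Pick a granularity (in bytes) for partition alignment.

    Hypothetical helper: keep optimal_io_size only if it is reported
    (non-zero) and is a whole multiple of the physical sector size;
    otherwise fall back to the physical sector size.
    """
    if optimal_io_size and optimal_io_size % physical_sector_size == 0:
        return optimal_io_size
    return physical_sector_size


# The two devices discussed in the thread, both reporting 33553920 bytes.
print(alignment_granularity(33553920, 512))   # multiple of 512, kept as-is
print(alignment_granularity(33553920, 4096))  # not a multiple of 4096, falls back
```

For the 4k drive the remainder is 33553920 % 4096 = 3584, so the sketch falls back to 4096 bytes; for the 512-byte drive the reported value passes the test even though it is not a power of two.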
* Re: optimal io size / custom alignment
From: Tom Yan @ 2015-06-16 5:20 UTC
To: Karel Zak
Cc: util-linux, Martin K. Petersen

http://www.spinics.net/lists/linux-usb/msg125988.html

This optimal I/O size is derived from an "Optimal transfer length"
provided by the hardware through "VPD". The issue might not have seemed
common because not all drives provide VPDs and not all drivers read
them.

On the adapter/drive I have, it is the same as the "Maximum transfer
length", and they seem to be simply limits of the SCSI "WRITE SAME
(10/16)" command:

[tom@localhost ~]$ sudo sg_inq -p 0xb0 /dev/sdb
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 65535 blocks
  Optimal transfer length: 65535 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 65535 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 0
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x0 blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0

[tom@localhost ~]$ sudo sg_inq -p 0xb0 /dev/sdc
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 0 blocks
  Maximum transfer length: 8388607 blocks
  Optimal transfer length: 8388607 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks

The thing is, why should any I/O or transfer size/length be considered
when it comes to partition alignment? From what I understand, partition
alignment only exists to make sure partitions start at physical
boundaries of the disk, because of the mismatch between logical sectors
(512 bytes) and physical sectors (4096 bytes) or the pages/erase blocks
of SSDs.
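As a worked check of the numbers in this message: the "Optimal transfer length" values from `sg_inq` are in logical blocks, and multiplying by the 512-byte logical block size reproduces the optimal I/O sizes that fdisk prints for these devices. The snippet below is a small sketch of that arithmetic only; the helper name `transfer_length_bytes` is hypothetical.

```python
LOGICAL_BLOCK_SIZE = 512  # both devices in the thread report 512-byte logical sectors


def transfer_length_bytes(blocks: int, logical_block_size: int = LOGICAL_BLOCK_SIZE) -> int:
    """Convert a VPD transfer length given in logical blocks to bytes."""
    return blocks * logical_block_size


# Optimal transfer lengths as reported by sg_inq in the thread.
for dev, blocks in (("/dev/sdb", 65535), ("/dev/sdc", 8388607)):
    size = transfer_length_bytes(blocks)
    # A non-zero remainder means the size is not 4 KiB aligned.
    print(dev, size, "bytes, remainder mod 4096:", size % 4096)
```

Both block counts are odd, so the resulting byte sizes (33553920 and 4294966784) are multiples of 512 but can never be multiples of 4096.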
* Re: optimal io size / custom alignment
From: Tom Yan @ 2015-06-16 5:37 UTC
To: Karel Zak
Cc: util-linux, Martin K. Petersen

I forgot to highlight that it might be a very bad idea to simply "mod
check" the physical sector size against the optimal I/O size. For one,
the physical sector size doesn't reflect anything about SSDs. Also, as
you can see in my last mail, the optimal I/O size can be huge. (And
since the numbers seem to be "SCSI standards", I'd say that shows they
simply mean nothing for partition alignment.)

IMHO we should find out in which cases (if any) the optimal I/O size
REALLY matters for partition alignment, and only use it to derive
alignment in those cases (and only if they can be rationally
differentiated).
* Re: optimal io size / custom alignment
From: Karel Zak @ 2015-06-16 9:43 UTC
To: Tom Yan
Cc: util-linux, Martin K. Petersen

On Tue, Jun 16, 2015 at 01:20:37PM +0800, Tom Yan wrote:
> The thing is, why should any I/O or transfer size/length be considered
> when it comes to partition alignment? From what I understand,
> partition alignment only exists to make sure partitions start at
> physical boundaries of the disk, because of the mismatch between
> logical sectors (512 bytes) and physical sectors (4096 bytes) or the
> pages/erase blocks of SSDs.

It's more complicated. The I/O limits matter most for RAIDs, where the
optimal I/O size is usually the stripe size, and you want to use it for
partition alignment to get better performance (if you align only to the
sector size, a read/write on an unaligned partition may have to be
performed on more disks). And it's not only fdisk that cares; it's also
important for mkfs.<type> (for example, XFS aligns according to the I/O
limits).

And because all this is a mess, and sometimes the HW does not provide
relevant information, and because people use dd(1) to copy partition
tables, we have decided to use 1MiB granularity if possible. If 1MiB is
useless then we use optimal_io_size, if that is undefined then
minimal_io_size, and if that is undefined then sector_size.

  http://people.redhat.com/msnitzer/docs/io-limits.txt

Unfortunately the current code does not check whether optimal_io_size
makes sense, so a thing like 33553920 for a 4k device is blindly
accepted ;-(

    Karel

--
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com
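The fallback order Karel describes can be sketched like this. It is a rough reading of the message, not the actual fdisk code: the function name `pick_granularity` is hypothetical, a value of 0 stands for "undefined", and "1MiB is useless" is interpreted here (as an assumption) to mean that 1 MiB is not a whole multiple of the device's I/O hint.

```python
MIB = 1024 * 1024


def pick_granularity(sector_size: int, minimum_io_size: int = 0,
                     optimal_io_size: int = 0) -> int:
    """Choose an alignment granularity in bytes.

    Hypothetical sketch: take the first defined value of
    optimal_io_size, minimum_io_size, sector_size as the device hint,
    then prefer 1 MiB whenever 1 MiB is a multiple of that hint.
    """
    hint = optimal_io_size or minimum_io_size or sector_size
    if MIB % hint == 0:
        return MIB   # 1 MiB already satisfies the hint
    return hint      # otherwise the hint is used as reported, unchecked


print(pick_granularity(512))                   # plain disk: 1 MiB
print(pick_granularity(512, 65536, 262144))    # e.g. a 256 KiB stripe: 1 MiB
print(pick_granularity(512, 4096, 33553920))   # the odd value is taken as-is
```

Under this reading, a well-behaved device always ends up on 1 MiB boundaries, while a value such as 33553920 passes straight through, which matches the "blindly accepted" behaviour described in the message.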
* Re: optimal io size / custom alignment
From: Tom Yan @ 2015-06-16 10:22 UTC
To: Karel Zak, linux-scsi
Cc: util-linux, Martin K. Petersen

I have heard that it matters for RAID, but since I don't really know
RAID I can't comment.

I do wonder, then, whether the SCSI disk driver should derive the
minimum/optimal I/O size from the VPD at all. It might still be
"tolerable" if it's the limit of WRITE SAME (10), but definitely not if
it's that of WRITE SAME (16):

[tom@localhost ~]$ sudo fdisk /dev/sdc

Welcome to fdisk (util-linux 2.26.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xccb261a9.

Command (m for help): p
Disk /dev/sdc: 29.2 GiB, 31376707072 bytes, 61282631 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 4294966784 bytes
Disklabel type: dos
Disk identifier: 0xccb261a9

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (1-4, default 1):
First sector (8388607-61282630, default 8388607):
Last sector, +sectors or +size{K,M,G,T,P} (8388607-61282630, default 61282630):

Created a new partition 1 of type 'Linux' and of size 25.2 GiB.
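The fdisk session above is consistent with the first usable sector being rounded up to the reported optimal I/O size. A minimal sketch of that arithmetic follows; the helper `align_up` is hypothetical, and starting the search at sector 1 is an assumption made only for illustration.

```python
SECTOR_SIZE = 512


def align_up(sector: int, granularity_sectors: int) -> int:
    """Round a sector number up to the next multiple of the granularity."""
    return (sector + granularity_sectors - 1) // granularity_sectors * granularity_sectors


total_sectors = 61282631                   # /dev/sdc, from the fdisk output
granularity = 4294966784 // SECTOR_SIZE    # optimal I/O size expressed in sectors

first = align_up(1, granularity)           # assumed: search starts at sector 1
usable_gib = (total_sectors - first) * SECTOR_SIZE / 2**30

print("first sector:", first)
print("partition size in GiB: %.1f" % usable_gib)
```

With these inputs the first sector comes out as 8388607, so roughly 4 GiB at the start of the 29.2 GiB device is skipped, leaving the 25.2 GiB partition that fdisk created.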
* Re: optimal io size / custom alignment
From: Martin K. Petersen @ 2015-06-16 17:08 UTC
To: Tom Yan
Cc: Karel Zak, util-linux, Martin K. Petersen

>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes:

Tom> On the adapter/drive I have, it is the same as the "Maximum
Tom> transfer length", and they seem to be simply limits of the SCSI
Tom> "WRITE SAME (10/16)" command:

The two values have nothing to do with each other. They just happen to
be the same in your case (65535 is the maximum block count for the
WRITE SAME(10) command).

Tom> [tom@localhost ~]$ sudo sg_inq -p 0xb0 /dev/sdb
Tom> VPD INQUIRY: Block limits page (SBC)
Tom>   Maximum compare and write length: 0 blocks
Tom>   Optimal transfer length granularity: 1 blocks
Tom>   Maximum transfer length: 65535 blocks
Tom>   Optimal transfer length: 65535 blocks

Your device sets the transfer length granularity to 1 logical block and
the optimal transfer length to 65535 logical blocks. If it then reports
a 4096-byte physical block size in response to READ CAPACITY(16) then
it's clearly on crack.

There's only so much we can do about devices that report garbage.

Also, the kernel only reports things. It is up to Karel to decide
whether to sanity check the values before he uses them.

I would probably err on the side of trusting the physical block size
reporting more than anything seeded from the Block Limits VPD. And in
this case, assuming the alignment offset is reported to be 0, I guess
one could entertain aligning to the nearest 4K boundary. But on the
other hand it'll quickly get hairy to have to maintain these kinds of
heuristics.

The best fix, of course, is to complain to the manufacturer of your
broken widget and hope for a firmware upgrade. Failing that, adjust
your partitions manually.

Tom> The thing is, why should any I/O or transfer size/length be
Tom> considered when it comes to partition alignment? From what I
Tom> understand, partition alignment only exists to make sure
Tom> partitions start at physical boundaries of the disk, because of
Tom> the mismatch between logical sectors (512 bytes) and physical
Tom> sectors (4096 bytes) or the pages/erase blocks of SSDs.

For RAID it makes a big difference to ensure the partition is aligned
on a stripe boundary.

--
Martin K. Petersen	Oracle Linux Engineering
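To illustrate why stripe alignment matters on RAID, the following sketch counts how many stripe chunks a single chunk-sized I/O touches, depending on where the partition starts. The 64 KiB chunk size, the sector-63 and sector-2048 start positions, and the helper name `chunks_touched` are all assumptions chosen for the example, not values taken from the thread.

```python
SECTOR_SIZE = 512
CHUNK_SIZE = 64 * 1024  # assumed per-disk stripe chunk, for illustration only


def chunks_touched(start_byte: int, length: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of stripe chunks covered by an I/O of `length` bytes at `start_byte`."""
    first = start_byte // chunk_size
    last = (start_byte + length - 1) // chunk_size
    return last - first + 1


# One chunk-sized write at the very beginning of the partition.
misaligned_start = 63 * SECTOR_SIZE    # old DOS-style start at sector 63
aligned_start = 2048 * SECTOR_SIZE     # 1 MiB boundary

print(chunks_touched(misaligned_start, CHUNK_SIZE))  # straddles two chunks
print(chunks_touched(aligned_start, CHUNK_SIZE))     # stays within one chunk
```

In the misaligned case every chunk-sized request is split across two chunks, and therefore possibly across two member disks, which is the extra work that alignment to the stripe size is meant to avoid.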
* Re: optimal io size / custom alignment
From: Tom Yan @ 2015-06-16 19:26 UTC
To: Martin K. Petersen, linux-scsi
Cc: Karel Zak, util-linux, linux-usb

On 17 June 2015 at 01:08, Martin K. Petersen <martin.petersen@oracle.com> wrote:
> The two values have nothing to do with each other. They just happen to
> be the same in your case (65535 is the maximum block count for the
> WRITE SAME(10) command).
>
> Your device sets the transfer length granularity to 1 logical block
> and the optimal transfer length to 65535 logical blocks. If it then
> reports a 4096-byte physical block size in response to READ
> CAPACITY(16) then it's clearly on crack.
>
> There's only so much we can do about devices that report garbage.

All the drives I have are flash drives, so none of them reports 4k
physical sectors. But it does seem possible in the case I linked.

The thing is, these VPDs/transfer lengths are probably provided by the
USB to ATA(/SCSI?) bridges. I can't judge whether they are wrong to set
the lengths that way, but it seems to be a common practice.

I have two USB devices that provide the SBC-2 Block Limits VPD: one is
a SanDisk Extreme USB (SDCZ80), the other an Intel X25-M Gen1 on an
ASMedia SATA adapter, and both of them set the Optimal transfer length.

The usb-storage driver does not read the VPD so it won't be a thing
there, but the uas driver does.

> Also, the kernel only reports things. It is up to Karel to decide
> whether to sanity check the values before he uses them.

I just feel like the kernel shouldn't bind values from totally
different sources (RAID stripe vs. VPD limit) to the same variable. I
don't know what else makes use of this variable, but considering only
the fdisk case, it seems the SCSI disk driver is the one that should
stop binding it.

> The best fix, of course, is to complain to the manufacturer of your
> broken widget and hope for a firmware upgrade.

This is simply too idealistic, especially when it seems that this issue
mostly happens on USB bridges. I am not even sure whether the SCSI
standards have anything to say about this practice.

> Failing that, adjust your partitions manually.

Yeah, that's why I said fdisk should allow custom alignment.
* Re: optimal io size / custom alignment @ 2015-06-16 19:26 ` Tom Yan 0 siblings, 0 replies; 22+ messages in thread From: Tom Yan @ 2015-06-16 19:26 UTC (permalink / raw) To: Martin K. Petersen, linux-scsi Cc: Karel Zak, util-linux, linux-usb On 17 June 2015 at 01:08, Martin K. Petersen <martin.petersen@oracle.com> wrote: > The two values have nothing to do with each other. They just happen to > be the same in your case (65535 is the maximum block count for the WRITE > SAME(10) command). > > Your device sets the transfer length granularity to 1 logical block and > the optimal transfer length to 65535 logical blocks. If it then reports > a 4096-byte physical block size in response to READ CAPACITY(16) then > it's clearly on crack. > > There's only so much we can do about devices that report garbage. All the drives I have are flash drives, so none of them reports 4k physical sectors. But it does seem possible in the case I linked. The thing is, these VPDs/transfer lengths are probably provided by the USB to ATA(/SCSI?) bridges. I can't judge whether they are wrong to set the lengths that way, but it seems to be a common practice. I have two USB devices that provide the SBC-2 Block Limits VPD, one a SanDisk Extreme USB (SDCZ80), the other an Intel X25-M Gen1 on an ASMedia SATA adapter, and both of them set the Optimal transfer length. The usb-storage driver does not read the VPD so it won't be a thing there, but the uas driver does. > Also, the kernel only reports things. It is up to Karel to decide > whether to sanity check the values before he uses them. I just feel like the kernel shouldn't bind values from totally different sources (raid stripe vs vpd limit) to the same variable. I don't know what else makes use of this variable, but considering only the fdisk case, it seems the scsi disk driver should be the one to stop binding.
> The best fix, of course, is to complain to the manufacturer of your > broken widget and hope for a firmware upgrade. This is simply too idealistic, especially when it seems that this issue mostly happens on USB bridges. I am not even sure if the SCSI standards have anything to say about this practice. > Failing that, adjust your partitions manually. Yeah, that's why I said fdisk should allow custom alignment. On 17 June 2015 at 01:08, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes: > > Tom> From the adapter/drive I have, it is the same as the "Maximum > Tom> transfer length" and they seem to be simply limits of SCSI "WRITE > Tom> SAME (10/16)" command: > > The two values have nothing to do with each other. They just happen to > be the same in your case (65535 is the maximum block count for the WRITE > SAME(10) command). > > Tom> [tom@localhost ~]$ sudo sg_inq -p 0xb0 /dev/sdb VPD INQUIRY: Block > Tom> limits page (SBC) Maximum compare and write length: 0 blocks > Tom> Optimal transfer length granularity: 1 blocks Maximum transfer > Tom> length: 65535 blocks Optimal transfer length: 65535 blocks > > Your device sets the transfer length granularity to 1 logical block and > the optimal transfer length to 65535 logical blocks. If it then reports > a 4096-byte physical block size in response to READ CAPACITY(16) then > it's clearly on crack. > > There's only so much we can do about devices that report garbage. > > Also, the kernel only reports things. It is up to Karel to decide > whether to sanity check the values before he uses them. > > I would probably err on the side of trusting the physical block size > reporting more than anything seeded from the Block Limits VPD. And in > this case, assuming the alignment offset is reported to be 0, I guess > one could entertain aligning to the nearest 4K boundary.
But on the > other hand it'll quickly get hairy to have to maintain this kind of > heuristics. > > The best fix, of course, is to complain to the manufacturer of your > broken widget and hope for a firmware upgrade. Failing that, adjust your > partitions manually. > > Tom> The thing is, why any io/transfer size/length should be considered > Tom> when it comes to partition alignment? From what I understand, > Tom> partition alignment is only to make sure partition starts at > Tom> physical boundaries of the disk because of the mismatch between > Tom> logical sector (512 bytes) and physical sectors (4096 bytes) or > Tom> pages/erase blocks of SSDs. > > For RAID it makes a big difference to ensure the partition is aligned on > a stripe boundary. > > -- > Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 22+ messages in thread
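The arithmetic behind the numbers quoted in this message can be checked directly. What follows is only a minimal Python sketch, not code from the kernel or util-linux; the function name `optimal_io_bytes` is made up for illustration. It assumes the reported optimal I/O size is simply the Optimal transfer length from the Block Limits VPD multiplied by the logical block size, which matches the 33553920 bytes shown by fdisk for a 65535-block length on a 512-byte logical block device.

```python
def optimal_io_bytes(optimal_transfer_blocks: int, logical_block_size: int) -> int:
    """Convert an Optimal transfer length (in logical blocks) to bytes.

    Hypothetical helper for illustration only.
    """
    return optimal_transfer_blocks * logical_block_size


# Values taken from the sg_inq output quoted in the thread.
opt = optimal_io_bytes(65535, 512)

print(opt)          # 33553920, the "optimal" I/O size fdisk displayed
print(opt % 512)    # 0: a whole number of 512-byte physical sectors
print(opt % 4096)   # 3584: not a whole number of 4096-byte physical sectors
```

The non-zero remainder in the last line is the reason the value cannot be used as-is for alignment on a device that also reports 4096-byte physical sectors.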
* Re: optimal io size / custom alignment 2015-06-16 19:26 ` Tom Yan (?) @ 2015-06-16 21:28 ` Martin K. Petersen 2015-06-17 9:49 ` Tom Yan -1 siblings, 1 reply; 22+ messages in thread From: Martin K. Petersen @ 2015-06-16 21:28 UTC (permalink / raw) To: Tom Yan; +Cc: Martin K. Petersen, linux-scsi, Karel Zak, util-linux, linux-usb >>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes: Tom> All drives I have are flash drives so none of them reports 4k Tom> physical sectors. There are plenty of SSDs that report 4K physical sectors, fwiw. Tom> The usb-storage driver does not read vpd so it won't be a thing, Tom> but the uas driver does. We gave up on USB-SATA bridges long ago. Their designers appear to have a pretty comprehensive misunderstanding of both the ATA and SCSI protocols. We had higher hopes for UAS since it provided a clean slate. So far, however, the results are equally discouraging. Tom> I just feel like the kernel shouldn't bind values from totally Tom> different source (raid stripe vs vpd limit) to the same variable. RAID devices communicate the stripe width through the Block Limits VPD. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: optimal io size / custom alignment @ 2015-06-17 9:49 ` Tom Yan 0 siblings, 0 replies; 22+ messages in thread From: Tom Yan @ 2015-06-17 9:49 UTC (permalink / raw) To: Martin K. Petersen; +Cc: linux-scsi, Karel Zak, util-linux, linux-usb On 17 June 2015 at 05:28, Martin K. Petersen <martin.petersen@oracle.com> wrote: > There are plenty of SSDs that report 4K physical sectors, fwiw. Oh, I didn't know that. I wonder if it's yet more garbage info, though 4k is often a nice value to make use of. > We gave up on USB-SATA bridges long ago. Their designers appear to have > a pretty comprehensive misunderstanding of both the ATA and SCSI > protocols. Aren't there tons of thumb drives that make use of them anyway? > Tom> I just feel like the kernel shouldn't bind values from totally > Tom> different source (raid stripe vs vpd limit) to the same variable. > > RAID devices communicate the stripe width through the Block Limits VPD. No, I put it the wrong way. What I meant was "sd vs md". For example, couldn't the scsi disk driver bind the value it reads from the VPD to another variable instead of "optimal i/o size", so that this value would be exclusively for RAID (and other virtual devices)? Is it even necessary for it to report this value? It seems only to make this variable ambiguous. If it HAS TO BE ambiguous, I see no reason why fdisk should use it to derive the alignment. It should simply let users use their own judgement and provide a way for them to adjust it manually. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: optimal io size / custom alignment @ 2015-06-18 21:01 ` Martin K. Petersen 0 siblings, 0 replies; 22+ messages in thread From: Martin K. Petersen @ 2015-06-18 21:01 UTC (permalink / raw) To: Tom Yan; +Cc: Martin K. Petersen, linux-scsi, Karel Zak, util-linux, linux-usb >>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes: Tom> No I put it in the wrong way. What I meant was "sd vs md". For Tom> example, couldn't the scsi disk driver bind the value it reads from Tom> the VPD to another variable instead of "optimal i/o size", so that Tom> this value would be exclusively for RAID (and other virtual Tom> devices)? Who says that RAID is a virtual device? Hardware RAID controllers as well as SAS, iSCSI and Fibre Channel disk arrays all use the Block Limits VPD to communicate their preferred I/O size and alignment to us. As do enterprise disk drives. We deal with broken devices by blacklisting them. I suggest you try to find a way we can reliably identify your UAS devices. If there is a common pattern, we can entertain adding a workaround. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: optimal io size / custom alignment @ 2015-06-20 16:01 ` Tom Yan 0 siblings, 0 replies; 22+ messages in thread From: Tom Yan @ 2015-06-20 16:01 UTC (permalink / raw) To: Martin K. Petersen; +Cc: linux-scsi, Karel Zak, util-linux, linux-usb I was not saying RAIDs are virtual devices. I just mentioned it because I saw things like virtio-blk or zram use blk_queue_io_opt(). I know they all use VPDs, but the main point is whether those hardware RAIDs or so are handled by sd_mod, and whether that "transfer length" info is still important when it's just a simple drive. To me they look to be of a different nature. That's why I think it's inappropriate that they use the same "variable" / "file" to report, because that makes it hard for tools like fdisk to determine when those values really matter. In fact, (maybe I am just unlucky :P) the VPDs of all my devices are to some extent broken. I just found out today that my Intel 530 SSD, connected directly to SATA, also reports totally garbage values for TRIM : ( To be honest, the UAS thing doesn't really affect me a lot; I mostly use gdisk now (which doesn't care about i/o size AFAIK). I can also disable uas with the quirk so that VPDs are skipped when I really need fdisk for msdos/mbr. I just think that it kind of reveals a problem that has to be dealt with sooner or later, though you can optimistically think that vendors would do better on VPDs in the future. On 19 June 2015 at 05:01, Martin K. Petersen <martin.petersen@oracle.com> wrote: >>>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes: > > Tom> No I put it in the wrong way. What I meant was "sd vs md". For > Tom> example, couldn't the scsi disk driver bind the value it reads from > Tom> the VPD to another variable instead of "optimal i/o size", so that > Tom> this value would be exclusively for RAID (and other virtual > Tom> devices)? > > Who says that RAID is a virtual device?
Hardware RAID controllers as > well as SAS, iSCSI and Fibre Channel disk arrays all use the Block > Limits VPD to communicate their preferred I/O size and alignment to > us. As do enterprise disk drives. > > We deal with broken devices by blacklisting them. I suggest you try to > find a way we can reliably identify your UAS devices. If there is a > common pattern, we can entertain adding a workaround. > > -- > Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: optimal io size / custom alignment @ 2015-06-21 0:12 ` Martin K. Petersen 0 siblings, 0 replies; 22+ messages in thread From: Martin K. Petersen @ 2015-06-21 0:12 UTC (permalink / raw) To: Tom Yan; +Cc: Martin K. Petersen, linux-scsi, Karel Zak, util-linux, linux-usb >>>>> "Tom" == Tom Yan <tom.ty89@gmail.com> writes: Tom> I know they all use VPDs, but the main point is whether those Tom> hardware RAIDs or so are handled by sd_mod, and whether those Tom> "transfer lengths" info are still important when it's just a simple Tom> drive. To me they look like to be of different nature. We don't know whether a discovered device is "a simple drive". And once again: The whole point of the queue limit is to have a common abstraction for all block devices. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 22+ messages in thread
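The common abstraction mentioned here is visible from user space as the per-device queue limits in sysfs. As a rough illustration, the Python sketch below reads the four limits discussed in this thread from `/sys/block/<device>/queue/`. The function name `read_queue_limits` and the `sysfs_root` parameter are assumptions made for this example (the parameter exists only so the function can be pointed at a test directory); this is not how fdisk or libblkid obtain the values internally.

```python
from pathlib import Path

# Queue limit files exported by the Linux block layer under
# /sys/block/<device>/queue/. All values are in bytes.
QUEUE_LIMITS = (
    "logical_block_size",
    "physical_block_size",
    "minimum_io_size",
    "optimal_io_size",
)


def read_queue_limits(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the queue limits of a block device as a dict of ints.

    A limit that is missing or unreadable is reported as None.
    Hypothetical helper for illustration only.
    """
    queue_dir = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in QUEUE_LIMITS:
        try:
            limits[name] = int((queue_dir / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


# Example usage on a Linux system (device name is just an example):
#   print(read_queue_limits("sdb"))
```

The same files are filled in whether the device is a single disk, a hardware RAID volume, or a virtual device, which is why a tool reading them cannot tell where a given value came from.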
* Re: optimal io size / custom alignment @ 2015-06-22 14:32 ` Alan Stern 0 siblings, 0 replies; 22+ messages in thread From: Alan Stern @ 2015-06-22 14:32 UTC (permalink / raw) To: Tom Yan; +Cc: Martin K. Petersen, linux-scsi, Karel Zak, util-linux, linux-usb On Sun, 21 Jun 2015, Tom Yan wrote: > I was not saying RAIDs are virtual devices. I just mentioned it > because I saw things like virtio-blk or zram use blk_queue_io_opt(). > > I know they all use VPDs, but the main point is whether those hardware > RAIDs or so are handled by sd_mod, and whether those "transfer > lengths" info are still important when it's just a simple drive. To me > they look like to be of different nature. That's why I think it's > inappropriate that they use the same "variable" / "file" to report > because that makes tools like fdisk have trouble determining when does > those values really matters. > > In fact, (maybe I am just unlucky :P) VPDs of all my devices are to > some extent broken. I just found out today my Intel 530 SSD connecting > directly to SATA also reports totally garbage values for TRIM : ( > > To be honest the UAS thing doesn't really affect me a lot, I mostly > use gdisk now (which doesn't care about i/o size AFAIK). I can also > disable uas with the quirk so that VPDs are skipped when I really need > fdisk for msdos/mbr. It's just I think that it kind of reveal a > problem that has to be dealt with sooner or later, though you can > optimistically think that vendors would do better on VPDs in the > future. Regardless of all these issues, it is clear that a lot of devices don't implement the VPD data correctly. Therefore the information in the kernel will often be wrong. And consequently, fdisk needs to offer the user an option to override the default partition-alignment setting. Alan Stern ^ permalink raw reply [flat|nested] 22+ messages in thread
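To make the idea of an override concrete, here is a small Python sketch of how a partitioning tool could let a user-supplied alignment take precedence over the value reported by the device. It is an assumption-laden illustration, not fdisk's actual behaviour: the names `pick_alignment` and `align_up` are hypothetical, and the 1 MiB fallback is chosen only for the example.

```python
from typing import Optional

MIB = 1024 * 1024


def pick_alignment(reported_opt_bytes: int,
                   user_override_bytes: Optional[int] = None,
                   fallback_bytes: int = MIB) -> int:
    """Choose the alignment in bytes: user override first, then the
    device-reported optimal I/O size, then a fallback (assumed 1 MiB)."""
    if user_override_bytes:
        return user_override_bytes
    if reported_opt_bytes:
        return reported_opt_bytes
    return fallback_bytes


def align_up(sector: int, alignment_bytes: int,
             logical_block_size: int = 512) -> int:
    """Round a start sector up to the next multiple of the alignment."""
    if alignment_bytes <= 0 or alignment_bytes % logical_block_size:
        raise ValueError("alignment must be a positive multiple of the "
                         "logical block size")
    grain = alignment_bytes // logical_block_size  # alignment in sectors
    return -(-sector // grain) * grain             # ceiling division


# With the odd value reported over UAS, the first usable start sector
# after sector 34 lands on a multiple of 65535 sectors:
print(align_up(34, pick_alignment(33553920)))        # 65535

# With a user override of 1 MiB it lands on sector 2048 instead:
print(align_up(34, pick_alignment(33553920, MIB)))   # 2048
```

The point of the sketch is only the precedence order: whatever the device reports is treated as a default that the user can replace.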
* Re: optimal io size / custom alignment -- caution on custom aligns 2015-06-15 13:31 ` Karel Zak 2015-06-16 5:20 ` Tom Yan @ 2015-07-12 4:19 ` Linda Walsh 1 sibling, 0 replies; 22+ messages in thread From: Linda Walsh @ 2015-07-12 4:19 UTC (permalink / raw) To: Karel Zak; +Cc: Tom Yan, util-linux, Martin K. Petersen Karel Zak wrote: > On Sat, Jun 13, 2015 at 10:52:04PM +0800, Tom Yan wrote > > I think we can test "optimal_io_size % physical_sector_size" and use physical > sector size as the granularity if the optimal_io_size is a strange number. > >> In any case, it would be nice if fdisk can allow customize alignment >> (like gdisk does), so that users can at least decide how partitions >> should be aligned in weird cases like this. With that, the long-time >> deprecated "dos compatibility" might be able to go as well. > > I'll think about it... > > Karel > --------------------- > I know it's been a while since the above note was written, but I just saw it while reviewing old messages and thought I'd pass on a warning. Warning: if your stripe size (the strip size on one disk x the number of data disks) is not a power of 2, don't bother trying to build perl. It uses the gnu DB libraries, which choke on non-power-of-two "optimal" I/O sizes. (I had a RAID 50 where I took three 4-data-spindle RAIDs and striped them: strip=64k, stripe=256k, and the optimal size with 3 stripes was listed as 768k.) Several of the gnu libs assumed that the optimal size would be a power of 2. If it was not, the DB would become corrupt. Of course it was only my machine, and it wasn't until I read the code that I saw the power of 2 assumption....ARG! That took almost a year, since the first version of perl I found it in was 5.14. It wasn't fixed for 5.16 or 5.18; I don't know about now. Just a random caution: the bug was in the gdbm/ndbm code. Some time later I needed a disk replacement and went to a RAID10 (2 mirrors, striped). The old bug hasn't cropped up since. I recommended to the perl folk that they should test for that case.
They didn't think it would be a problem, given I was the only reporter... But others may have run into it, or not used DBs... but the p5p team didn't want the extra work, so they closed the bug as invalid.... invalid?! ^ permalink raw reply [flat|nested] 22+ messages in thread
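Both checks mentioned in this message are easy to express in code. The Python sketch below shows one possible reading of the test Karel describes in the quoted text, plus the power-of-two check that the affected libraries were implicitly relying on. The function names are hypothetical, and treating "a strange number" as "not a multiple of the physical sector size" is an interpretation made for this example, not a statement of what fdisk does.

```python
def choose_granularity(optimal_io_size: int, physical_sector_size: int) -> int:
    """Use the optimal I/O size only if it is a non-zero multiple of the
    physical sector size; otherwise fall back to the physical sector size."""
    if optimal_io_size and optimal_io_size % physical_sector_size == 0:
        return optimal_io_size
    return physical_sector_size


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


# The two cases from the start of the thread:
print(choose_granularity(33553920, 512))    # 33553920
print(choose_granularity(33553920, 4096))   # 4096

# The RAID 50 layout described above: 64 KiB strip, 4 data spindles per
# RAID set, 3 sets striped together.
KIB = 1024
stripe = 64 * KIB * 4        # 256 KiB per RAID set
optimal = stripe * 3         # 768 KiB across the three sets
print(is_power_of_two(stripe))    # True
print(is_power_of_two(optimal))   # False
```

Note that 768 KiB is a perfectly valid multiple of a 4096-byte sector, so it passes the first check; it only breaks software that additionally assumes a power of two.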
end of thread, other threads:[~2015-07-12 4:40 UTC | newest] Thread overview: 22+ messages 2015-06-13 14:52 optimal io size / custom alignment Tom Yan 2015-06-15 13:31 ` Karel Zak 2015-06-16 5:20 ` Tom Yan 2015-06-16 5:37 ` Tom Yan 2015-06-16 9:43 ` Karel Zak 2015-06-16 10:22 ` Tom Yan 2015-06-16 10:22 ` Tom Yan 2015-06-16 17:08 ` Martin K. Petersen 2015-06-16 19:26 ` Tom Yan 2015-06-16 19:26 ` Tom Yan 2015-06-16 21:28 ` Martin K. Petersen 2015-06-17 9:49 ` Tom Yan 2015-06-17 9:49 ` Tom Yan 2015-06-18 21:01 ` Martin K. Petersen 2015-06-18 21:01 ` Martin K. Petersen 2015-06-20 16:01 ` Tom Yan 2015-06-20 16:01 ` Tom Yan 2015-06-21 0:12 ` Martin K. Petersen 2015-06-21 0:12 ` Martin K. Petersen 2015-06-22 14:32 ` Alan Stern 2015-06-22 14:32 ` Alan Stern 2015-07-12 4:19 ` optimal io size / custom alignment -- caution on custom aligns Linda Walsh