* Dell BIOS issue reading Disk Extended data
@ 2021-01-22 18:35 Guilherme G. Piccoli
  2021-01-22 18:41 ` Limonciello, Mario
  0 siblings, 1 reply; 6+ messages in thread
From: Guilherme G. Piccoli @ 2021-01-22 18:35 UTC
  To: mario.limonciello, divya.bharathi, Alexander.Barabash,
	amit.engel, crag.wang, david.chen7, Narendra.K
  Cc: gpiccoli, Guilherme G. Piccoli, halves, Jay Vosburgh,
	Dan Streetman, Gavin Guo, x86, grub-devel

Hello Dell folks, I'm Guilherme Piccoli from Canonical. First of all,
apologies for the out-of-nowhere communication. We've been investigating
an issue that seems to date back a long time, and we eventually managed
to narrow it down to what appears to be a Dell BIOS bug. Note that I'm
also CCing the kernel x86 mailing list and grub-devel, just to archive
this discussion on public lists and help others who may hit the same
issue in the future.

Since I don't have contacts for Dell representatives, I've put together
a list of people from Dell who contributed to the kernel in the last 2
years. Maybe one of you could point me to the proper contact/channel to
discuss such an issue. If not, I'm sorry for the noise.

Let me detail the problem we're observing. Note that all of this is about
legacy BIOS mode, not UEFI.

After creating a HW RAID on a Dell PowerEdge R730 (RAID5, 8 TB in total),
GRUB fails to load its modules and drops to "rescue mode". After a lot
of investigation, we narrowed the issue down to a bad return value from
the BIOS for service 48h of int 13h [0], which is how GRUB collects disk
size information. To double-check that, I booted Linux in 16-bit real
mode and observed that the EDD module [1] gets the same wrong value for
total sectors: both GRUB and the kernel EDD code get 0xFFFFFFFF. The
correct value would be 0x3A3600000 according to the SCSI Read Capacity
16 command (tested through the sg_readcap tool). The P.S. section below
has the outputs collected from the GRUB instrumentation, kernel EDD and
the sg_readcap tool.
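
To put the two values side by side, here is a minimal sketch in Python
(not part of GRUB or the kernel), assuming 512-byte sectors as reported
in the P.S. output (bytesp_s=200). The helper name capacity_tib is made
up for illustration. It only shows that the correct sector count doesn't
fit in 32 bits, while the value returned by the BIOS is exactly the
32-bit maximum; whether the firmware really clamps a 32-bit field is
only a guess, not something confirmed.

```python
SECTOR_SIZE = 0x200          # bytes per sector, from the BIOS output (bytesp_s=200)
U32_MAX = 0xFFFFFFFF         # largest value a 32-bit field can hold

bios_total = 0xFFFFFFFF      # total sectors returned by int 13h/48h
scsi_total = 0x3A3600000     # total sectors per SCSI READ CAPACITY (16)


def capacity_tib(sectors, sector_size=SECTOR_SIZE):
    """Convert a sector count to TiB (hypothetical helper, illustration only)."""
    return sectors * sector_size / 2**40


# The real sector count needs more than 32 bits ...
fits_in_32_bits = scsi_total <= U32_MAX
# ... and the BIOS value is exactly the 32-bit maximum.
bios_is_u32_max = bios_total == U32_MAX

print(f"BIOS reports {capacity_tib(bios_total):.2f} TiB")
print(f"SCSI reports {capacity_tib(scsi_total):.2f} TiB")
print(f"real count fits in 32 bits: {fits_in_32_bits}")
print(f"BIOS value is the 32-bit maximum: {bios_is_u32_max}")
```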

There are some workarounds, like having a smaller partition _before_
the rootfs in the disk topology to hold the GRUB modules and the
linux/initrd images. In that case the BIOS seems to answer the int
13h/48h service with proper values. But this issue has been around for a
while [2][3], so I'm seeking a proper discussion with Dell firmware
engineers to understand whether it could be fixed, or at least to
understand the root cause of this limitation.

Thanks in advance,


Guilherme


[0] https://en.wikipedia.org/wiki/INT_13H#INT_13h_AH=48h:_Extended_Read_Drive_Parameters

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/boot/edd.c

[2] https://askubuntu.com/q/867047

[3] https://askubuntu.com/q/416418


P.S. GRUB debug output, values in hex [dump of struct grub_biosdisk_drp
in the grub_biosdisk_get_diskinfo_real() function]:

size=1e, flags=9
cyl=0, heads=0, sec=0
bytesp_s=200, total=ffffffff,
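
For reference, the fields above can be decoded from the raw result
buffer roughly as sketched below (Python, for illustration only). The
layout is the usual EDD drive-parameters layout, which is assumed here
to match struct grub_biosdisk_drp for the first 0x1e bytes. The
dictionary keys, the helper names and the zero EDD pointer in the sample
buffer are hypothetical; the other sample values are the ones from the
dump.

```python
import struct

# Assumed little-endian layout of the first 0x1e bytes of the result buffer:
# size (u16), flags (u16), cylinders (u32), heads (u32), sectors per track (u32),
# total sectors (u64), bytes per sector (u16), EDD config pointer (u32).
DRP_FORMAT = "<HHIIIQHI"


def parse_drp(buf):
    """Decode the fixed part of a drive-parameters buffer into a dict."""
    size, flags, cyl, heads, sec, total, bps, _edd_ptr = struct.unpack_from(
        DRP_FORMAT, buf
    )
    return {
        "size": size,
        "flags": flags,
        "cylinders": cyl,
        "heads": heads,
        "sectors": sec,
        "total_sectors": total,
        "bytes_per_sector": bps,
    }


def looks_like_u32_max(total_sectors):
    """True if the 64-bit total holds exactly the 32-bit maximum."""
    return total_sectors == 0xFFFFFFFF


# Rebuild a buffer from the values in the dump (EDD pointer set to 0 here).
sample = struct.pack(DRP_FORMAT, 0x1E, 0x9, 0, 0, 0, 0xFFFFFFFF, 0x200, 0)
drp = parse_drp(sample)

for name, value in drp.items():
    print(f"{name}={value:x}")
print("suspicious total:", looks_like_u32_max(drp["total_sectors"]))
```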


kernel EDD output:
[    0.741378] edd[0]->total_secs=ffffffff
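
On a running system with the edd module loaded, the value the BIOS
reported at boot is normally also exposed through sysfs, so it can be
compared against the size the block layer sees. The sketch below
(Python) assumes that the boot disk is BIOS device 80h and maps to
/dev/sdb, that both files hold a plain decimal count, and that both
counts are in 512-byte sectors; all of that needs checking per machine.
The function names are made up for illustration.

```python
from pathlib import Path


def read_sector_count(path):
    """Read a decimal sector count from a sysfs-style text file."""
    return int(Path(path).read_text().strip())


def edd_matches_block_layer(edd_path, block_path):
    """Compare the BIOS-reported sector count with the block layer's."""
    return read_sector_count(edd_path) == read_sector_count(block_path)


if __name__ == "__main__":
    # Assumed paths: adjust the int13_devXX entry and the disk name.
    edd = "/sys/firmware/edd/int13_dev80/sectors"
    blk = "/sys/block/sdb/size"
    if Path(edd).exists() and Path(blk).exists():
        print("EDD sectors:  ", read_sector_count(edd))
        print("block sectors:", read_sector_count(blk))
        print("match:", edd_matches_block_layer(edd, blk))
    else:
        print("EDD or block device entry not found; nothing to compare")
```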


sg_readcap output:
$ sg_readcap /dev/sdb
READ CAPACITY (10) indicates device capacity too large
  now trying 16 byte cdb variant
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last logical block address=15625879551 (0x3a35fffff), Number of logical blocks=15625879552
[...]
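
The relation between these numbers and the 0x3A3600000 value quoted in
the message body can be checked with a few lines of Python. This is just
a sketch that parses the text line shown above; the regular expression
is written for this exact line and is not meant as a general sg_readcap
parser.

```python
import re

# The relevant line from the sg_readcap output above, as a single string.
line = (
    "Last logical block address=15625879551 (0x3a35fffff), "
    "Number of logical blocks=15625879552"
)

pattern = (
    r"address=(\d+) \((0x[0-9a-f]+)\), "
    r"Number of\s+logical blocks=(\d+)"
)
match = re.search(pattern, line)

last_lba = int(match.group(1))          # decimal last LBA
last_lba_hex = int(match.group(2), 16)  # same value, printed in hex
num_blocks = int(match.group(3))        # total number of logical blocks

# The block count is the last LBA plus one, since LBAs start at 0.
print("decimal and hex agree:", last_lba == last_lba_hex)
print("blocks == last LBA + 1:", num_blocks == last_lba + 1)
print("number of blocks in hex:", hex(num_blocks))
```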



Thread overview: 6+ messages
2021-01-22 18:35 Dell BIOS issue reading Disk Extended data Guilherme G. Piccoli
2021-01-22 18:41 ` Limonciello, Mario
2021-03-12 15:44   ` Guilherme Piccoli
2021-03-30 17:43     ` K, Narendra
2021-03-31 14:43       ` Guilherme Piccoli
2021-03-31 17:24         ` Jordan Uggla
