From mboxrd@z Thu Jan 1 00:00:00 1970
From: Greg KH
Date: Fri, 12 Sep 2014 22:42:49 +0000
Subject: Re: Improper Naming in /dev/disk/by-id and Drives Offline
Message-Id: <20140912224249.GA8397@kroah.com>
List-Id:
References:
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
To: linux-hotplug@vger.kernel.org

On Fri, Sep 12, 2014 at 12:53:23PM -0600, Brandon R Schwartz wrote:
> On Fri, Sep 12, 2014 at 12:03 PM, Greg KH wrote:
> > On Fri, Sep 12, 2014 at 11:52:30AM -0600, Brandon R Schwartz wrote:
> >> On Wed, Sep 10, 2014 at 8:53 PM, Greg KH wrote:
> >> > On Wed, Sep 10, 2014 at 08:34:06PM -0600, Brandon R Schwartz wrote:
> >> >> Hi,
> >> >>
> >> >> I'm working on a particular issue (possibly two separate issues) where
> >> >> our HDDs are (1) getting mislabeled in /dev/disk/by-id and (2)
> >> >> dropping offline even though drive and controller logs show that the
> >> >> drive is communicating and working as expected. I don't have much
> >> >> knowledge on the udev side of things so it would be great if someone
> >> >> could offer some insight into the way udev assigns device names and if
> >> >> there are thoughts as to why the OS cannot see the drive in certain
> >> >> cases (timing issue?).
> >> >>
> >> >> The first issue, the mislabeling problem, is that on reboots or power
> >> >> cycles we occasionally see our drives become mislabeled in
> >> >> /dev/disk/by-id. We expect to see something like:
> >> >>
> >> >> ata-ST3000DM001-1CH166_W1F26HKK
> >> >> ata-ST3000DM001-1CH166_Z1F2FBBY
> >> >>
> >> >> But instead we see:
> >> >>
> >> >> ata-ST3000DM001-1CH166_W1F26HKK
> >> >> scsi-35000c500668a9bdb
> >> >>
> >> >> The "scsi" drive is assigned a drive letter and the OS can communicate
> >> >> with the drive. Drive logs and controller logs show the drive is
> >> >> working properly, but for some reason it's getting labeled incorrectly
> >> >> in /dev/disk/by-id. We have looked through dmesg and enabled logging
> >> >> in udev (udevadm control --log-priority=debug), but we have not seen
> >> >> where these labels are coming from.
> >> >
> >> > Sounds like blkid didn't read the uuid properly. Is this happening in
> >> > your initrd? Is this a systemd init system, or something else? What
> >> > distro / version is this? What kernel version is this?
> >> >
> >>
> >> Hi Greg,
> >>
> >> The distro is RHEL 6.3 with kernel version 2.6.32.
> >
> > Then I strongly suggest you get support from Red Hat, as you are paying
> > for it :)
> >
> >> We have also seen
> >> the issue on a Debian based system with kernel 3.2.45. We ran into
> >> this issue again yesterday on RHEL and tested the command 'udevadm
> >> trigger' and it repopulated /dev/disk/by-id with the correct
> >> information. Is there another level of debugging that we can enable
> >> to see where the information might be getting read improperly?
> >
> > I don't know how RHEL is set up at all, it's such an old kernel, and
> > userspace, the community can't help you out, sorry.
>
> Haha, that is true, but we do see the failures more often on the
> Debian based system. If you think we'd be better off working with the
> RHEL community or the Debian forums we'll try our luck there. Thanks
> for all the help so far!

The "RHEL community" is corporate support, which you are paying for,
use it!

As for the fact that it seems reproducible on two very different, and
both old, distros, it might be a hardware issue, try using a more
"modern" distro to see if it really is a kernel/udev issue, or hardware.

good luck,

greg k-h