linux-kernel.vger.kernel.org archive mirror
From: Kurt Garloff <kurt@garloff.de>
To: Denis Efremov <efremov@linux.com>,
	linux-block@vger.kernel.org,
	Linux-kernel <linux-kernel@vger.kernel.org>,
	Jiri Kosina <jkosina@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Wim Osterholt <wim@djo.tudelft.nl>,
	dmarkh@cfl.rr.com, markh@compro.net
Subject: Re: [BUG] FLOPPY DRIVER since 5.10.20
Date: Sun, 1 Aug 2021 13:14:43 +0200
Message-ID: <9d46f063-8d8d-95ee-e262-9300f54d527c@garloff.de>
In-Reply-To: <ffb8ca1c-ac8c-95c4-c05a-1269c4831b0a@linux.com>

Hi Denis,

Here's what I did to uncover and reproduce the bug:
* Built a VM image with openSUSE-15.2
* The VM image includes cloud-init, which reads from a config drive (with fallback to network) to customize the generic image on (first) boot
* Started the image in KVM (qemu-5.x) with an attached floppy disk that contained valid cloud-config (cidata) information
  (-drive file=/tmp/ci-disk.17138.3Jf2/seed-17138.img,format=raw,if=floppy,id=cidata)

This worked fine until openSUSE updated libblkid to include a backport that opens CD-ROMs and floppies with O_NONBLOCK (to avoid spurious CD tray closes on CD-ROM drives).

History is at
https://bugzilla.suse.com/show_bug.cgi?id=1181018

My understanding is that
* qemu reports media changed once on the attached floppy (it's a removable device after all) -- I have no idea whether that behavior from qemu is reasonable.
* old libblkid used to probe it without O_NONBLOCK, finding out that there is a medium inserted and clearing the media change flag
* a subsequent mount attempt (by cloud-init) would succeed

With new libblkid, using O_NONBLOCK, the media change was not cleared, and the mount would not succeed, DESPITE a valid floppy being attached before booting.

With the kernel update, the access would work again, despite libblkid using O_NONBLOCK.
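
A minimal sketch of the two probing styles described above, assuming a prober that simply opens the device read-only and reads its first bytes. The helper names probe_open_flags() and probe_first_bytes() are hypothetical and are not taken from libblkid; the only difference between the two styles is whether O_NONBLOCK is set on open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: flags for a probing open, with or without O_NONBLOCK. */
int probe_open_flags(int nonblocking)
{
    int flags = O_RDONLY;

    if (nonblocking)
        flags |= O_NONBLOCK;
    return flags;
}

/*
 * Hypothetical helper: open `path` and read up to `len` bytes from its start.
 * Returns the number of bytes read, or a negative errno value on failure.
 */
ssize_t probe_first_bytes(const char *path, int nonblocking,
                          void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, probe_open_flags(nonblocking));
    if (fd < 0)
        return -errno;

    ssize_t n = read(fd, buf, len);
    int saved_errno = errno;   /* close() may overwrite errno */

    close(fd);
    return n < 0 ? -saved_errno : n;
}
```

Calling probe_first_bytes("/dev/fd0", 0, buf, 512) would correspond to the old blocking probe and probe_first_bytes("/dev/fd0", 1, buf, 512) to the new one. What the floppy driver does with the media change flag in each case depends on the kernel version, as discussed in this thread. Note that glibc defines O_NDELAY as O_NONBLOCK, so both names refer to the same flag here.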

Jiri should be able to understand this in more detail than I do -- I am no expert in handling of removable block devices ...

Let me know if you need a VM image to reproduce this issue -- I can find it in my archives and push it to some place for downloading.

-- 
Kurt Garloff <kurt@garloff.de>
Cologne, Germany

On 30/07/2021 07:15, Denis Efremov wrote:
> Hi,
>
> On 7/26/21 7:34 PM, Denis Efremov wrote:
>>
>> On 7/26/21 3:23 PM, Mark Hounschell wrote:
>>> On 7/26/21 7:37 AM, Denis Efremov wrote:
>>>>
>>>> On 7/26/21 2:17 PM, Mark Hounschell wrote:
>>>>> On 7/26/21 3:57 AM, Denis Efremov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 7/23/21 9:47 PM, Mark Hounschell wrote:
>>>>>>> These 2 incremental patches, patch-5.10.19-20 and patch-5.11.2-3 have broken the user land fd = open("/dev/fd0", (O_RDWR | O_NDELAY)); functionality.
>>>>>> Thank you for the report, I'm looking into this.
>>>>>>
>>>>>>> Since FOREVER before the patch, when using O_NDELAY, one could open the floppy device with no media inserted or even with write protected media without error. "Read-only file system" status is returned only when we actually tried to write to it. We have software still in use today that relies on this functionality.
>>>>>> If it's a project with open sources could you please give a link?
>>>>>>
>>>>>> Regards,
>>>>>> Denis
>>>>>>
>>>>> This is immaterial, but fdutils and libdsk both rely on this flag. Who knows who else does? The point is it should NOT have been changed.
>>>> Yes, I asked this only to add utils and this behavior to the tests.
>>>> And be more specific about why we should preserve this behavior in
>>>> next commit messages.
>>>>
>>> Well, first thing is now you can't open a floppy with a write protected floppy installed. I don't think that was intended but that is now how it is.
>>>
>>> Next there are commands that can be sent to the floppy via "ioctl(fd, FDRAWCMD,  &raw_cmd);" that do NOT require a floppy diskette to be installed.
>>>
>>> All commands issued to the device that require a floppy diskette without a diskette installed fail with the proper status letting you know the device is not ready / no diskette installed. That goes for write protected floppies too.
>>>
>>> There is no reason to force a user to only be able to operate on Linux fdformat formatted floppies.
>>>
>> It appears that the story behind the issue is long enough.
>> I'll try to sum up the things:
>> [1] 09954bad4487 floppy: refactor open() flags handling
>> [2] ff06db1efb2a floppy: fix open(O_ACCMODE) for ioctl-only open
>> [3] 468c298ad3ed Revert "floppy: fix open(O_ACCMODE) for ioctl-only open"
>> [4] f2791e7eadf4 Revert "floppy: refactor open() flags handling"
>> [5] 8a0c014cd205 floppy: reintroduce O_NDELAY fix
>>
>> In [1] we tried to fix O_NDELAY behavior because it's hard to define
>> proper non-blocking behavior for floppies. We also added
>> "!(mode & (FMODE_READ|FMODE_WRITE))" sanity check for open in that patch.
>> Motivation for the changes was that it's easy to livelock the system with
>> floppy's O_NDELAY and syzkaller spotted it. Just for the record, /dev/fd0
>> is only accessible by the root user in recent distros. 
>>
>> Patch [1] broke ioctl-only opens in fdutils because:
>> $ grep -nre open ./setfdprm.c 
>> 60:     if ((fd = open(argv[0],3)) < 0) { /* 3 == no access at all */
>> Patch [2] reverted "!(mode & (FMODE_READ|FMODE_WRITE))" to fix ioctls.
>> I guess [2] was not enough and Jens completely reverted [1] with [3] [4].
>>
>> The last patch, [5], restores the open function to the [2] state (it's possible
>> to use ioctl with open O_ACCMODE). [5] was added because libblkid uses O_NONBLOCK
>> for probing devices, and the floppy driver prints many I/O errors to the kernel log.
>> There are also problems with mounts afterwards. I'm afraid a simple revert of [5] is
>> not enough, otherwise we will face the libblkid issues once again. I'll try to test
>> things and find a more elegant solution.
>>
> I performed some tests and here is a small example that can be reproduced
> even with qemu.
> With O_NDELAY fix:
> $ fdlist # no floppy inserted
> fdlist (): drive fd0 does not exist
>
> Without O_NDELAY fix:
> $ fdlist # no floppy inserted
> NAME   TYPE  STATUS
>  fd0  2880K  not mounted
>
> That's because of O_RDONLY|O_NDELAY open in probe_drive:
> https://sources.debian.org/src/fdutils/5.6-2/src/fdmount.c/#L390
>
> I guess that's why the original patch was reverted
> f2791e7eadf4 Revert "floppy: refactor open() flags handling"
> We still have software that depends on O_NDELAY in floppies
> and this patch will be reverted again.
>
> Meanwhile I can't fully reproduce the issues with libblkid.
> I know that systemd-udevd tries to open /dev/fd0 during boot
> with O_RDONLY|O_NDELAY. With O_NDELAY implemented we don't call
> floppy_revalidate() which results in an attempt to read block 0
> https://elixir.bootlin.com/linux/v5.10.20/source/drivers/block/floppy.c#L4127
> However, *with* O_NDELAY fix we try to read block 0 and get kernel
> log errors if no floppy inserted:
> [    1.732360] blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [    1.732822] floppy: error 10 while reading block 0
> If a floppy is inserted, this results in a boot delay on a
> real system.
>
> Jiri, Kurt, can you give more details about test conditions
> for O_NDELAY problem or maybe even provide some examples?
>
> Maybe it will be cheaper to implement a special probing for
> floppies in new software than drop O_NDELAY for all already
> written software. Of course, if there is no cheap and obvious
> in-kernel fix.
>
> Denis
>

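The "!(mode & (FMODE_READ|FMODE_WRITE))" check and the open(argv[0],3) call quoted above are connected by how the kernel derives the file mode from the open flags. The sketch below restates that arithmetic in userspace, assuming the OPEN_FMODE() definition from include/linux/fs.h in kernels of that era, (flag + 1) & O_ACCMODE, and the usual FMODE_READ = 0x1, FMODE_WRITE = 0x2 bit values. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>

/* Assumed to mirror the kernel's FMODE_READ / FMODE_WRITE bits. */
#define SKETCH_FMODE_READ  0x1u
#define SKETCH_FMODE_WRITE 0x2u

/* Userspace restatement of the kernel's OPEN_FMODE(): (flags + 1) & O_ACCMODE. */
unsigned int sketch_open_fmode(int flags)
{
    return (unsigned int)(flags + 1) & O_ACCMODE;
}

/* True when the open carries neither read nor write access (ioctl-only). */
int sketch_is_ioctl_only(int flags)
{
    return !(sketch_open_fmode(flags) &
             (SKETCH_FMODE_READ | SKETCH_FMODE_WRITE));
}

/* Walks through the four possible access modes. */
void sketch_self_check(void)
{
    assert(sketch_open_fmode(O_RDONLY) == SKETCH_FMODE_READ);
    assert(sketch_open_fmode(O_WRONLY) == SKETCH_FMODE_WRITE);
    assert(sketch_open_fmode(O_RDWR) ==
           (SKETCH_FMODE_READ | SKETCH_FMODE_WRITE));
    assert(sketch_is_ioctl_only(O_ACCMODE));            /* access mode 3 */
    assert(!sketch_is_ioctl_only(O_RDWR | O_NDELAY));   /* extra flags do not matter */
}
```

Under these assumptions, access mode 3 maps to a mode with neither bit set, which is why a check that rejects such opens would break ioctl-only users like setfdprm.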

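Mark's point that some ioctls need no diskette in the drive could be exercised roughly as follows. This is only a sketch: FDGETDRVTYP from <linux/fd.h> is my choice of example and is not named in the thread, query_drive_type() is a hypothetical helper, and whether the open succeeds without media depends on the kernel version being discussed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fd.h>

/*
 * Hypothetical helper: ask the driver for the drive type name using an
 * ioctl-only, non-blocking open. Returns 0 on success or a negative
 * errno value on failure.
 */
int query_drive_type(const char *path, floppy_drive_name name)
{
    assert(path != NULL && name != NULL);

    /* O_ACCMODE is access mode 3, as in setfdprm's open(argv[0],3). */
    int fd = open(path, O_ACCMODE | O_NDELAY);
    if (fd < 0)
        return -errno;

    int rc = 0;
    if (ioctl(fd, FDGETDRVTYP, name) < 0)
        rc = -errno;

    close(fd);
    return rc;
}
```

A caller would declare a floppy_drive_name buffer, call query_drive_type("/dev/fd0", name), and print name on success. On anything that is not a floppy device the ioctl is expected to fail, so a negative return value is the normal outcome there.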
Thread overview: 13+ messages
2021-07-23 18:47 [BUG] FLOPPY DRIVER since 5.10.20 Mark Hounschell
2021-07-26  7:57 ` Denis Efremov
2021-07-26 11:15   ` Mark Hounschell
2021-07-26 11:17   ` Mark Hounschell
2021-07-26 11:37     ` Denis Efremov
2021-07-26 12:23       ` Mark Hounschell
2021-07-26 16:34         ` Denis Efremov
2021-07-30  5:15           ` Denis Efremov
2021-08-01 11:14             ` Kurt Garloff [this message]
2021-08-08  7:42 ` [PATCH] Revert "floppy: reintroduce O_NDELAY fix" Denis Efremov
2021-08-16  7:17   ` Jiri Kosina
2021-08-18 15:53     ` Denis Efremov
2021-08-30  9:20   ` Thorsten Leemhuis
