Linux-Fsdevel Archive on lore.kernel.org
From: "Michal Suchánek" <msuchanek@suse.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	Jonathan Corbet <corbet@lwn.net>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Eric Biggers <ebiggers@google.com>,
	"J. Bruce Fields" <bfields@redhat.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Benjamin Coddington <bcodding@redhat.com>,
	Ming Lei <ming.lei@redhat.com>,
	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Hou Tao <houtao1@huawei.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Jan Kara <jack@suse.cz>, Hannes Reinecke <hare@suse.com>,
	"Ewan D. Milne" <emilne@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v4 rebase 00/10] Fix cdrom autoclose
Date: Wed, 4 Dec 2019 20:01:54 +0100
Message-ID: <20191204190154.GA28406@kitsune.suse.cz>
In-Reply-To: <20191127081144.GZ11661@kitsune.suse.cz>

On Wed, Nov 27, 2019 at 09:11:44AM +0100, Michal Suchánek wrote:
> On Tue, Nov 26, 2019 at 04:13:32PM -0700, Jens Axboe wrote:
> > On 11/26/19 1:21 PM, Michal Suchánek wrote:
> > > On Tue, Nov 26, 2019 at 01:01:42PM -0700, Jens Axboe wrote:
> > >> On 11/26/19 12:54 PM, Michal Suchanek wrote:
> > >>> Hello,
> > >>>
> > >>> there is cdrom autoclose feature that is supposed to close the tray,
> > >>> wait for the disc to become ready, and then open the device.
> > >>>
> > >>> This used to work in ancient times. Then in old times there was a hack
> > >>> in util-linux which worked around the breakage which probably resulted
> > >>> from switching to scsi emulation.
> > >>>
> > >>> Currently util-linux maintainer refuses to merge another hack on the
> > >>> basis that kernel still has the feature so it should be fixed there.
> > >>> The code needs not be replicated in every userspace utility like mount
> > >>> or dd which has no business knowing which devices are CD-roms and where
> > >>> the autoclose setting is in the kernel.
> > >>>
> > >>> This is rebase on top of current master.
> > >>>
> > >>> Also it seems that most people think that this is a fix for VMware
> > >>> because there is one patch dealing with VMware.
> > >>
> > >> I think the main complaint with this is that it's kind of a stretch to
> > >> add core functionality for a device type that's barely being
> > >> manufactured anymore and is mostly used in a virtualized fashion. I

The claim that optical drives are hardly manufactured is kind of a
stretch. I have no problem obtaining drives from a few manufacturers in
any nearby computer store. While using DVDs may be slowly going out of
fashion, the same applies to all optical drives, including Blu-ray. Some
of these will stay around for the foreseeable future.

> > >> think if you could fix this without 10 patches of churn and without
> > >> adding a new ->open() addition to fops, then people would be a lot more
> > >> receptive to the idea of improving cdrom auto-close.
> > > 
> > > I see no way to do that cleanly.
> > > 
> > > There are two open modes for cdrom devices - blocking and
> > > non-blocking.
> > > 
> > > In blocking mode open() should analyze the medium so that it's ready
> > > when it returns. In non-blocking mode it should return immediately so
> > > long as you can talk to the device.
> > > 
> > > When waiting in open() with locks held, the processes trying to open
> > > the device are locked out regardless of the mode they use.
> > > 
> > > The only way to solve this is to pretend that the device is open and
> > > do the wait afterwards with the device unlocked.
> > 
> > How is this any different from an open on a file that needs to bring in
> > meta data on a busy rotating device, which can also take seconds?
> 
> First, accessing a file will take seconds only when your system is
> seriously overloaded or misconfigured. The access time for rotational
> storage is tens of milliseconds. With cdrom the access time after
> closing the door is measured in tens of seconds on common hardware. It
> can be shorter but also possibly longer. I am not aware of any limit
> there. It may be reasonable to want to get device status during this
> time.
> 
> Second, fetching the metadata for the file does not block operations that
> don't need the metadata. Here waiting for the drive to get ready blocks
> all access. You could get drive status if you did not try to open it
> but once you do you can no longer talk to it.

So let's look at the alternatives. One proposed alternative was to
make the acquisition of the locks that are held while waiting in
open() interruptible, so that impatient users can at least kill
processes waiting for their CD medium to become ready.

What is held are sr_mutex and bd_mutex.

bd_mutex is per-device, so any open() or close() on the same CD-ROM
device is blocked. There are a number of other sites where bd_mutex is
locked, and it would be necessary to figure out which of these can be
called with a CD-ROM device and change them to killable so that
processes waiting on the lock don't get uninterruptibly stuck. This may
be more code churn than this patchset. I think we can exclude loop.c
and zram-dev.c, but ioctl.c, xen-blkfront.c, and block_dev.c apply. I
don't know about dasd.

The bd_mutex is held in iterate_bdevs so all bets are off wrt being able
to operate the system.  Once a process is stuck waiting in blkdev_get()
which calls open() on the cdrom you cannot iterate block devices. With
boot times measured in seconds and medium analysis times measured in
tens of seconds users will not be amused.

autoclose defaults to on, and blkid reads devices in blocking mode,
causing all the fun stuff to trigger (a patch to change that is
pending). Nonetheless, any number of utilities out there that are still
not aware of this nonblock quirk will try to open the device sooner or
later, blocking all operations that require iterating the list of block
devices.
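
For illustration, a utility that only wants the drive status can stay
off the blocking path by opening with O_NONBLOCK and issuing the
CDROM_DRIVE_STATUS ioctl from <linux/cdrom.h>. The following is a
minimal user-space sketch, not part of the patchset; the helper names
drive_status_name() and probe_drive_status() are made up for this
example.

```c
/* Sketch only: query CD-ROM drive status from user space. */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/cdrom.h>

/* Map a CDROM_DRIVE_STATUS result to a human-readable string. */
static const char *drive_status_name(int status)
{
	switch (status) {
	case CDS_NO_INFO:         return "no information";
	case CDS_NO_DISC:         return "no disc";
	case CDS_TRAY_OPEN:       return "tray open";
	case CDS_DRIVE_NOT_READY: return "drive not ready";
	case CDS_DISC_OK:         return "disc ok";
	default:                  return "unknown";
	}
}

/*
 * Open the device in non-blocking mode and ask for the drive status.
 * Returns a CDS_* value, or -1 with errno set if open() or ioctl()
 * failed.
 */
static int probe_drive_status(const char *path)
{
	int fd = open(path, O_RDONLY | O_NONBLOCK);
	int status, saved_errno;

	if (fd < 0)
		return -1;

	status = ioctl(fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
	saved_errno = errno;	/* close() may overwrite errno */
	close(fd);
	errno = saved_errno;
	return status;
}
```

A caller would do something like
puts(drive_status_name(probe_drive_status("/dev/sr0"))), where the
device path is an assumption about the local system. Utilities that
omit O_NONBLOCK are the ones that run into the blocking open described
above.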

Adding tens of seconds to block device opening time (which I assume
might need iterating the list of block devices) might even overflow
some systemd timeout and fail the boot. There is a timeout for each
particular job, but there are also cumulative timeouts for something
like 'locate device with this UUID' which are fixed regardless of the
number of layers (LVM, md, ..) involved.

The other approach is to do what hard disks do. With a hard disk the
medium is 'fixed' - that is, assumed to be always present. Any error
accessing the medium is reported on read() or write() and not
necessarily on open(). This would require hooking the autoclose into
the operations that require the actual medium - probably something like
count_tracks() - and eschewing calls to these from open(). That would
work but breaks the contract described in the current API
documentation - that is, if you don't open with O_NONBLOCK and there is
an obvious medium error like no medium at all or no usable track, you
get the error on open() rather than on whatever operation tries to use
the track.
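
To make the difference in contract concrete, here is a toy user-space
model. It is not kernel code and does not reflect the actual cdrom or
sr implementation; struct toy_drive and the toy_*() functions are
hypothetical names used only to contrast where a missing medium gets
reported.

```c
/* Toy model, user space only: where does "no medium" surface? */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_drive {
	bool medium_present;	/* is a usable disc in the drive? */
	bool checked;		/* has the medium been looked at yet? */
};

/*
 * Current documented contract: a blocking open checks the medium and
 * reports the error right away; a non-blocking open does not check.
 */
static int toy_open_eager(struct toy_drive *d, bool nonblock)
{
	if (nonblock)
		return 0;
	d->checked = true;	/* autoclose and the wait would go here */
	return d->medium_present ? 0 : -ENOMEDIUM;
}

/* Hard-disk style: open never looks at the medium and always succeeds. */
static int toy_open_lazy(struct toy_drive *d)
{
	(void)d;
	return 0;
}

/*
 * Hard-disk style: the first operation that needs the medium performs
 * the check, so this is where the error shows up.
 */
static long toy_read(struct toy_drive *d, size_t len)
{
	if (!d->checked)
		d->checked = true;	/* autoclose and the wait would go here */
	if (!d->medium_present)
		return -ENOMEDIUM;
	return (long)len;
}
```

With an empty drive, toy_open_eager(&d, false) fails immediately, while
toy_open_lazy(&d) succeeds and the failure only appears on toy_read().
That shift is exactly the change in behaviour that the current API
documentation does not allow for.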

Thanks

Michal


Thread overview: 16+ messages
2019-11-26 19:54 Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 01/10] cdrom: add poll_event_interruptible Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 02/10] cdrom: factor out common open_for_* code Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 03/10] cdrom: wait for the tray to close Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 04/10] cdrom: export autoclose logic as a separate function Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 05/10] cdrom: unify log messages Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 06/10] bdev: reset first_open when looping in __blkget_dev Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 07/10] bdev: separate parts of __blkdev_get as helper functions Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 08/10] bdev: add open_finish Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 09/10] scsi: blacklist: add VMware ESXi cdrom - broken tray emulation Michal Suchanek
2019-11-26 19:54 ` [PATCH v4 rebase 10/10] scsi: sr: wait for the medium to become ready Michal Suchanek
2019-11-26 20:01 ` [PATCH v4 rebase 00/10] Fix cdrom autoclose Jens Axboe
2019-11-26 20:21   ` Michal Suchánek
2019-11-26 23:13     ` Jens Axboe
2019-11-27  8:11       ` Michal Suchánek
2019-12-04 19:01         ` Michal Suchánek [this message]
