On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
> On 7/30/19 6:36 AM, Jan Kara wrote:
> > On Tue 30-07-19 12:16:46, Jan Kara wrote:
> > > On Tue 30-07-19 10:36:59, John Lenton wrote:
> > > > On Tue, 30 Jul 2019 at 10:29, Jan Kara wrote:
> > > > >
> > > > > Thanks for the notice and the references. What's your version of
> > > > > util-linux? What your test script does is indeed racy. You have there:
> > > > >
> > > > > echo Running:
> > > > > for i in {a..z}{a..z}; do
> > > > >     mount $i.squash /mnt/$i &
> > > > > done
> > > > >
> > > > > So all mount(8) commands will run in parallel and race to set up loop
> > > > > devices with LOOP_SET_FD and mount them. However, util-linux (at least
> > > > > in the current version) seems to handle EBUSY from LOOP_SET_FD just
> > > > > fine and retries with a new loop device. So at this point I don't see
> > > > > why the patch makes a difference... I guess I'll need to reproduce it
> > > > > and see what's going on in detail.
> > > >
> > > > We've observed this on Arch with util-linux 2.34, and on Ubuntu 19.10
> > > > (Eoan Ermine) with util-linux 2.33.
> > > >
> > > > Just to be clear, the initial reports didn't involve a zany loop of
> > > > mounts, but were triggered by effectively the same thing as systemd
> > > > booted a system with a lot of snaps. The reproducer just tries to make
> > > > things simpler to reproduce :-). FWIW, the systemd versions were 244
> > > > and 242 on those systems, respectively.
> > >
> > > Thanks for the info! So I think I see what's going on. The two mounts
> > > race like:
> > >
> > > MOUNT1                                  MOUNT2
> > > num = ioctl(LOOP_CTL_GET_FREE)
> > >                                         num = ioctl(LOOP_CTL_GET_FREE)
> > > ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > >   - returns OK
> > >                                         ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > >                                           - acquires exclusive loop$num
> > >                                             reference
> > > mount("/dev/loop$num", ...)
> > >   - sees exclusive reference from
> > >     MOUNT2 and fails
> > >                                           - sees loop device is already
> > >                                             bound and fails
> > >
> > > It is a bug in the scheme I've chosen that a racing LOOP_SET_FD can
> > > block a perfectly valid mount. I'll think about how to fix this...
> >
> > So how about the attached patch? It fixes the regression for me.

Hi Bart,

> A new kernel warning is triggered by blktests block/001 that did not
> happen without this patch. Reverting commit 89e524c04fa9 ("loop: Fix
> mount(2) failure due to race with LOOP_SET_FD") makes that kernel
> warning disappear. Is this reproducible on your setup?

Thanks for the report! Hum, no, the warning doesn't seem to trigger in my
test VM. But reviewing the mentioned commit with a fresh head, I can see
where I made a mistake during my conversion of blkdev_get(). Does the
attached patch fix the warning for you?

								Honza
--
Jan Kara
SUSE Labs, CR
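[Editor's note: for readers following the thread, the EBUSY retry pattern that util-linux relies on, and which the race above defeats at the mount(2) step, can be sketched in C roughly as below. This is a minimal illustration under stated assumptions, not the actual util-linux code; the helper name `attach_loop` and the retry limit are made up.]

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/loop.h>

/* Attach backing_fd to a free loop device, retrying with a new device
 * when another process wins the race for the device number that
 * LOOP_CTL_GET_FREE returned. Returns the loop device number on
 * success, -1 on failure. */
static int attach_loop(int backing_fd)
{
	int ctl = open("/dev/loop-control", O_RDWR);
	if (ctl < 0)
		return -1;

	for (int tries = 0; tries < 16; tries++) {
		/* Ask the kernel for a currently unused loop device number.
		 * Nothing reserves it for us: another task can grab it. */
		int num = ioctl(ctl, LOOP_CTL_GET_FREE);
		if (num < 0)
			break;

		char path[32];
		snprintf(path, sizeof(path), "/dev/loop%d", num);
		int dev = open(path, O_RDWR);
		if (dev < 0)
			break;

		if (ioctl(dev, LOOP_SET_FD, backing_fd) == 0) {
			/* Device is now bound to backing_fd. */
			close(dev);
			close(ctl);
			return num;
		}

		int err = errno;
		close(dev);
		if (err != EBUSY)	/* EBUSY: lost the race; try again
					 * with a fresh device number. */
			break;
	}
	close(ctl);
	return -1;
}
```

The point of the thread is that this userspace retry handles a lost LOOP_SET_FD race fine, but the original commit made the *winner's* subsequent mount(2) fail when the loser still held an exclusive reference, which no retry loop in userspace can compensate for.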