* handling device or resource busy errors.
@ 2015-12-10  0:25 Ed Peschko
  2015-12-10  2:05 ` Carlos E. R.
  2015-12-10  2:13 ` Dave Chinner
  0 siblings, 2 replies; 6+ messages in thread
From: Ed Peschko @ 2015-12-10  0:25 UTC (permalink / raw)
  To: xfs


All,

we are getting 'device or resource busy' errors which we *know* are
spurious (lsof shows nothing, no multipath daemon, the partition that we
are trying to access was just created).

So the working theory is that the futex in question is spurious, that
something didn't clean up after itself and we are stuck waiting for a
non-existent process to fix it.

And the force option doesn't work for some reason. So a couple of questions.

    1. Why doesn't force work in this case? With parted and partx it does -
in the case of mkfs.xfs it is a fatal error.
    2. could an 'extra force' option - one which ignored the futex - be
added in cases of backwards compatibility?
    3. is there any way to list out what holds mutexes in the linux kernel
so we could try to root out the ultimate cause of the issue? lsof is
useless, as is dmesg and /var/log/messages.
    4. in the case of #2 how easy would it be? Where is the source code
that centos' version of mkfs.xfs uses on the web? And which check would you
remove?

Thanks for any assistance, this is driving me nuts.

greg

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: handling device or resource busy errors.
  2015-12-10  0:25 handling device or resource busy errors Ed Peschko
@ 2015-12-10  2:05 ` Carlos E. R.
  2015-12-10  2:18   ` Dave Chinner
  2015-12-10  2:13 ` Dave Chinner
  1 sibling, 1 reply; 6+ messages in thread
From: Carlos E. R. @ 2015-12-10  2:05 UTC (permalink / raw)
  To: XFS mailing list

On 2015-12-10 01:25, Ed Peschko wrote:
> All,
> 
> we are getting 'device or resource busy' errors which we *know*
> are spurious (lsof shows nothing, no multipath daemon, the
> partition that we are trying to access was just created).

I don't know if this is the case, but sometimes, when creating
filesystems under a graphical desktop (like gnome or kde) automatic
systems on those desktops would probe the new partition and try to
find its type and mount it, without action on my part. Even in the
middle of formatting!

I had to resort to using "less clever" desktops, even text mode.

Just a wild shot.


HTH. :-)

-- 
Cheers / Saludos,

		Carlos E. R.

  (from 13.1 x86_64 "Bottle" (Minas Tirith))


* Re: handling device or resource busy errors.
  2015-12-10  0:25 handling device or resource busy errors Ed Peschko
  2015-12-10  2:05 ` Carlos E. R.
@ 2015-12-10  2:13 ` Dave Chinner
  2015-12-10  7:58   ` Ed Peschko
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2015-12-10  2:13 UTC (permalink / raw)
  To: Ed Peschko; +Cc: xfs

On Wed, Dec 09, 2015 at 04:25:06PM -0800, Ed Peschko wrote:
> All,
> 
> we are getting 'device or resource busy' errors which we *know* are
> spurious (lsof shows nothing, no multipath daemon, the partition that we
> are trying to access was just created).

Please show your working - it saves us having to guess at how you
came to that conclusion.

> So the working theory is that the futex in question is spurious, that
> something didn't clean up after itself and we are stuck waiting for a
> non-existent process to fix it.

Sorry, what futex has anything to do with whether a block device can
be opened or not?

> And the force option doesn't work for some reason. So a couple of questions.
> 
>     1. Why doesn't force work in this case? With parted and partx it does -
> in the case of mkfs.xfs it is a fatal error.

Because mkfs.xfs uses O_EXCL in its open() call. If there are other
active references to the block device, then we sure aren't going to
overwrite anything on it.
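
As an illustration of the behaviour described above, here is a minimal
sketch in Python (not the mkfs.xfs code itself, which is C). It assumes
Linux, where open(2) with O_EXCL and without O_CREAT is meaningful for
block devices and fails with EBUSY while the device is in use by the
system (for example, mounted). The function name try_exclusive_open is
hypothetical.

```python
import errno
import os


def try_exclusive_open(path):
    """Open `path` read-write with O_EXCL; return the fd, or None if busy.

    Illustrative only: on Linux, O_EXCL without O_CREAT applies to block
    devices, where the open fails with EBUSY while the device is in use.
    """
    try:
        return os.open(path, os.O_RDWR | os.O_EXCL)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return None  # something else holds the device; leave it alone
        raise  # any other failure (ENOENT, EACCES, ...) is a real error
```

A caller would treat a None result as "do not write to this device" and
close the returned descriptor with os.close() otherwise.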

>     2. could an 'extra force' option - one which ignored the futex - be
> added in cases of backwards compatibility?

What futex?

>     3. is there any way to list out what holds mutexes in the linux kernel
> so we could try to root out the ultimate cause of the issue? lsof is
> useless, as is dmesg and /var/log/messages.

sysrq-l
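
For anyone without console access to the keyboard combination, the same
request can usually be made by writing the key to /proc/sysrq-trigger as
root; the 'l' key asks the kernel to log a stack backtrace for all active
CPUs, which then shows up in dmesg. A small sketch, assuming Python is
available on the box; the helper name trigger_sysrq and its trigger_path
parameter are hypothetical and exist only to make the example testable.

```python
def trigger_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write a single sysrq key (e.g. 'l') to the sysrq trigger file.

    Needs root on a real system; the output appears in the kernel log.
    """
    if len(key) != 1:
        raise ValueError("sysrq key must be exactly one character")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
```

Running trigger_sysrq("l") as root and then reading dmesg should give
the same information as pressing the key combination.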

Does waiting a few seconds make the problem go away? What about
running 'udevadm settle' before mkfs?
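
The two suggestions above (wait a little, and let udev finish first) can
be combined in a script. The following is only a sketch under stated
assumptions: the helper names retry_while_busy and udev_settle are
hypothetical, the attempt count and delay are arbitrary, and udevadm is
assumed to be on PATH.

```python
import errno
import subprocess
import time


def retry_while_busy(action, attempts=5, delay=1.0, settle=None):
    """Call `action()` until it stops raising EBUSY or attempts run out.

    `settle`, if given, is called before every attempt.
    """
    for attempt in range(1, attempts + 1):
        if settle is not None:
            settle()
        try:
            return action()
        except OSError as exc:
            # Give up on non-EBUSY errors, or when out of attempts.
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            time.sleep(delay)


def udev_settle():
    """Wait for the udev event queue to drain (needs udevadm on PATH)."""
    subprocess.run(["udevadm", "settle"], check=True)
```

For example, passing an action that performs the O_EXCL open of the new
partition together with settle=udev_settle would retry the open a few
times before reporting the device as busy. If the error persists after
that, the holder is probably not a short-lived probe.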

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: handling device or resource busy errors.
  2015-12-10  2:05 ` Carlos E. R.
@ 2015-12-10  2:18   ` Dave Chinner
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2015-12-10  2:18 UTC (permalink / raw)
  To: Carlos E. R.; +Cc: XFS mailing list

On Thu, Dec 10, 2015 at 03:05:20AM +0100, Carlos E. R. wrote:
> On 2015-12-10 01:25, Ed Peschko wrote:
> > All,
> > 
> > we are getting 'device or resource busy' errors which we *know*
> > are spurious (lsof shows nothing, no multipath daemon, the
> > partition that we are trying to access was just created).
> 
> I don't know if this is the case, but sometimes, when creating
> filesystems under a graphical desktop (like gnome or kde) automatic
> systems on those desktops would probe the new partition and try to
> find its type and mount it, without action on my part. Even in the
> middle of formatting!

Yup, mounts/probing triggered by udev and/or i/d/fanotify events are
a PITA to discover, especially when they are only active for a
short period (like blkid scanning). In general that is enough time
to make the command in the script after partitioning fail, but not
enough time to run any diagnostics that will capture the cause...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: handling device or resource busy errors.
  2015-12-10  2:13 ` Dave Chinner
@ 2015-12-10  7:58   ` Ed Peschko
  2015-12-10 22:35     ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread
From: Ed Peschko @ 2015-12-10  7:58 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs


Dave,

futex comes from doing a simple strace, which showed the futex as the call
that was being blocked. My fear is that it is a holdover from an old, dead
process; we went through each process on the system and lsof -p'd it,
didn't see anything.

Thanks for the suggestion of sysrq, btw. I'll try that the next time I have
access to the box, but waiting doesn't help. Also, I don't see in sysrq the
actual ability to *kill* wayward locks; still, it will be a great help in
tracking down the underlying issue.

Thanks,

Ed



* Re: handling device or resource busy errors.
  2015-12-10  7:58   ` Ed Peschko
@ 2015-12-10 22:35     ` Dave Chinner
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2015-12-10 22:35 UTC (permalink / raw)
  To: Ed Peschko; +Cc: xfs

On Wed, Dec 09, 2015 at 11:58:41PM -0800, Ed Peschko wrote:
> Dave,
> 
> futex comes from doing a simple strace, which showed the futex as the call
> that was being blocked. My fear is that it is a holdover from an old, dead
> process; we went through each process on the system and lsof -p'd it,
> didn't see anything.

Please show your working, i.e. attach the output of strace for
whatever process is tripping over this. I've still got no idea
what you are talking about, and I won't until I see the output
of the commands you are talking about.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
