All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Retry dm device removal if busy
@ 2011-09-13 14:31 Peter Rajnoha
  2011-09-13 14:55 ` Milan Broz
  2011-09-13 18:29 ` Jonathan Brassow
  0 siblings, 2 replies; 6+ messages in thread
From: Peter Rajnoha @ 2011-09-13 14:31 UTC (permalink / raw)
  To: lvm-devel

If a dm device is being opened in parallel while we're
trying to remove it, we'll end up with an error that
the device is busy. This is a legitimate error, but
with udev in play and asynchronous events generated
as a result of using the WATCH udev rule, we can get
into a situation where such failure is very unpleasant.

Let's try the removal a few times then. Though this is
not a complete solution to the problem, let's use this
until we have one.

Peter
---
 libdm/ioctl/libdm-iface.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/libdm/ioctl/libdm-iface.c b/libdm/ioctl/libdm-iface.c
index 816f1e6..e4a3a1f 100644
--- a/libdm/ioctl/libdm-iface.c
+++ b/libdm/ioctl/libdm-iface.c
@@ -1539,11 +1539,14 @@ static const char *_sanitise_message(char *message)
 	return sanitised_message;
 }
 
+#define DM_REMOVE_IOCTL_RETRIES 5
+
 static struct dm_ioctl *_do_dm_ioctl(struct dm_task *dmt, unsigned command,
 				     unsigned repeat_count)
 {
 	struct dm_ioctl *dmi;
 	int ioctl_with_uevent;
+	int retries = DM_REMOVE_IOCTL_RETRIES;
 
 	dmi = _flatten(dmt, repeat_count);
 	if (!dmi) {
@@ -1627,11 +1630,23 @@ static struct dm_ioctl *_do_dm_ioctl(struct dm_task *dmt, unsigned command,
 		  dmt->sector, _sanitise_message(dmt->message),
 		  dmi->data_size);
 #ifdef DM_IOCTLS
+repeat_dm_ioctl:
 	if (ioctl(_control_fd, command, dmi) < 0) {
 		if (errno == ENXIO && ((dmt->type == DM_DEVICE_INFO) ||
 				       (dmt->type == DM_DEVICE_MKNODES) ||
 				       (dmt->type == DM_DEVICE_STATUS)))
 			dmi->flags &= ~DM_EXISTS_FLAG;	/* FIXME */
+		/*
+		 * FIXME: This is a workaround for asynchronous events generated
+		 *        as a result of using the WATCH udev rule with which we
+		 *        have no way of synchronizing. Processing such events in
+		 *        parallel causes devices to be open.
+		 */
+		else if (errno == EBUSY && (dmt->type == DM_DEVICE_REMOVE) && retries--) {
+			log_debug("device-mapper: device is busy, retrying removal");
+			sleep(1);
+			goto repeat_dm_ioctl;
+		}
 		else {
 			if (_log_suppress)
 				log_verbose("device-mapper: %s ioctl "



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] Retry dm device removal if busy
  2011-09-13 14:31 [PATCH] Retry dm device removal if busy Peter Rajnoha
@ 2011-09-13 14:55 ` Milan Broz
  2011-09-13 15:01   ` Joe Thornber
  2011-09-13 18:29 ` Jonathan Brassow
  1 sibling, 1 reply; 6+ messages in thread
From: Milan Broz @ 2011-09-13 14:55 UTC (permalink / raw)
  To: lvm-devel

On 09/13/2011 04:31 PM, Peter Rajnoha wrote:
> If a dm device is being opened in parallel while we're
> trying to remove it, we'll end up with an error that
> the device is busy. This is a legitimate error, but
> with udev in play and asynchronous events generated
> as a result of using the WATCH udev rule, we can get
> into a situation where such failure is very unpleasant.
> 
> Let's try the removal a few times then. Though this is
> not a complete solution to the problem, let's use this
> until we have one.

Before anyone start to complain:

we spent incredible amount of time with fixing the unfixable,

I really prefer make it to work despite it is not 100% correct.
It is ugly but it will allow us to fix another things and not
waste time here for now.

This approach is already used in cryptsetup, mdadm and perhaps
other similar packages.

ACK for me (in principle), just with different time (see below).

(please keep this as internal workaround, no lvm.conf bloat etc. Users
are already very very confused by trillion switches.)

> diff --git a/libdm/ioctl/libdm-iface.c b/libdm/ioctl/libdm-iface.c
> index 816f1e6..e4a3a1f 100644
> --- a/libdm/ioctl/libdm-iface.c
> +++ b/libdm/ioctl/libdm-iface.c
> @@ -1539,11 +1539,14 @@ static const char *_sanitise_message(char *message)
>  	return sanitised_message;
>  }
>  
> +#define DM_REMOVE_IOCTL_RETRIES 5

I would use the same as array stop for mdadm : 25 x usleep(200000).

#define DM_REMOVE_IOCTL_RETRIES 25

maybe lower that to 10 ?

> +
>  static struct dm_ioctl *_do_dm_ioctl(struct dm_task *dmt, unsigned command,
>  				     unsigned repeat_count)
>  {
>  	struct dm_ioctl *dmi;
>  	int ioctl_with_uevent;
> +	int retries = DM_REMOVE_IOCTL_RETRIES;
>  
>  	dmi = _flatten(dmt, repeat_count);
>  	if (!dmi) {
> @@ -1627,11 +1630,23 @@ static struct dm_ioctl *_do_dm_ioctl(struct dm_task *dmt, unsigned command,
>  		  dmt->sector, _sanitise_message(dmt->message),
>  		  dmi->data_size);
>  #ifdef DM_IOCTLS
> +repeat_dm_ioctl:
>  	if (ioctl(_control_fd, command, dmi) < 0) {
>  		if (errno == ENXIO && ((dmt->type == DM_DEVICE_INFO) ||
>  				       (dmt->type == DM_DEVICE_MKNODES) ||
>  				       (dmt->type == DM_DEVICE_STATUS)))
>  			dmi->flags &= ~DM_EXISTS_FLAG;	/* FIXME */
> +		/*
> +		 * FIXME: This is a workaround for asynchronous events generated
> +		 *        as a result of using the WATCH udev rule with which we
> +		 *        have no way of synchronizing. Processing such events in
> +		 *        parallel causes devices to be open.
> +		 */
> +		else if (errno == EBUSY && (dmt->type == DM_DEVICE_REMOVE) && retries--) {
> +			log_debug("device-mapper: device is busy, retrying removal");
> +			sleep(1);

usleep(200000);

It would be good to not retry if we are sure that it is not transient use of device.
Not sure if it is possible here though.

> +			goto repeat_dm_ioctl;
> +		}
>  		else {
>  			if (_log_suppress)
>  				log_verbose("device-mapper: %s ioctl "

Milan



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Retry dm device removal if busy
  2011-09-13 14:55 ` Milan Broz
@ 2011-09-13 15:01   ` Joe Thornber
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Thornber @ 2011-09-13 15:01 UTC (permalink / raw)
  To: lvm-devel

On Tue, Sep 13, 2011 at 04:55:10PM +0200, Milan Broz wrote:
> Before anyone start to complain:
> 
> we spent incredible amount of time with fixing the unfixable,
> 
> I really prefer make it to work despite it is not 100% correct.
> It is ugly but it will allow us to fix another things and not
> waste time here for now.
> 
> This approach is already used in cryptsetup, mdadm and perhaps
> other similar packages.

I've resorted to doing exactly this in the thin test suite.

- Joe



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Retry dm device removal if busy
  2011-09-13 14:31 [PATCH] Retry dm device removal if busy Peter Rajnoha
  2011-09-13 14:55 ` Milan Broz
@ 2011-09-13 18:29 ` Jonathan Brassow
  2011-09-14 12:14   ` Zdenek Kabelac
  1 sibling, 1 reply; 6+ messages in thread
From: Jonathan Brassow @ 2011-09-13 18:29 UTC (permalink / raw)
  To: lvm-devel

Yes please!

All to often we come across this issue.  There are a number of bugs that are likely fixed by doing this change - even if it isn't academically perfect.

 brassow

On Sep 13, 2011, at 9:31 AM, Peter Rajnoha wrote:

> If a dm device is being opened in parallel while we're
> trying to remove it, we'll end up with an error that
> the device is busy. This is a legitimate error, but
> with udev in play and asynchronous events generated
> as a result of using the WATCH udev rule, we can get
> into a situation where such failure is very unpleasant.
> 
> Let's try the removal a few times then. Though this is
> not a complete solution to the problem, let's use this
> until we have one.
> 
> Peter

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/lvm-devel/attachments/20110913/f1201b63/attachment.htm>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Retry dm device removal if busy
  2011-09-13 18:29 ` Jonathan Brassow
@ 2011-09-14 12:14   ` Zdenek Kabelac
  2011-09-14 12:26     ` Milan Broz
  0 siblings, 1 reply; 6+ messages in thread
From: Zdenek Kabelac @ 2011-09-14 12:14 UTC (permalink / raw)
  To: lvm-devel

Dne 13.9.2011 20:29, Jonathan Brassow napsal(a):
> Yes please!
> 
> All to often we come across this issue.  There are a number of bugs that are likely fixed by doing this change - even if it isn't academically perfect.
> 


Workaround is for 'external' tools cooperating with lvm.
lvm itself should work well without this hack.

(Thus I believe Petr should add lvm.conf option - and for buildbot this hack
should be disabled - so the internal bugs are clearly visible.)

Zdenek


>  brassow
> 
> On Sep 13, 2011, at 9:31 AM, Peter Rajnoha wrote:
> 
>> If a dm device is being opened in parallel while we're
>> trying to remove it, we'll end up with an error that
>> the device is busy. This is a legitimate error, but
>> with udev in play and asynchronous events generated
>> as a result of using the WATCH udev rule, we can get
>> into a situation where such failure is very unpleasant.
>>
>> Let's try the removal a few times then. Though this is
>> not a complete solution to the problem, let's use this
>> until we have one.
>>
>> Peter



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Retry dm device removal if busy
  2011-09-14 12:14   ` Zdenek Kabelac
@ 2011-09-14 12:26     ` Milan Broz
  0 siblings, 0 replies; 6+ messages in thread
From: Milan Broz @ 2011-09-14 12:26 UTC (permalink / raw)
  To: lvm-devel

On 09/14/2011 02:14 PM, Zdenek Kabelac wrote:
> (Thus I believe Petr should add lvm.conf option - and for buildbot this hack
> should be disabled - so the internal bugs are clearly visible.)

Builbot is just another user, I do not think we should test it without flag.
What it will prove? That udisks is still using watch? it does, really :-)

I have to always manually remove udev rule (breaking other things),
it is not dedicated machine.

Milan



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-09-14 12:26 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-13 14:31 [PATCH] Retry dm device removal if busy Peter Rajnoha
2011-09-13 14:55 ` Milan Broz
2011-09-13 15:01   ` Joe Thornber
2011-09-13 18:29 ` Jonathan Brassow
2011-09-14 12:14   ` Zdenek Kabelac
2011-09-14 12:26     ` Milan Broz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.