All of lore.kernel.org
 help / color / mirror / Atom feed
* SUSE multipath-tools patch resync
@ 2011-05-18 15:03 Hannes Reinecke
  2011-05-20  6:26 ` Christophe Varoqui
  2011-05-25  7:51 ` Christophe Varoqui
  0 siblings, 2 replies; 10+ messages in thread
From: Hannes Reinecke @ 2011-05-18 15:03 UTC (permalink / raw)
  To: christophe varoqui; +Cc: device-mapper development, Mike Snitzer

Hi Christophe,

finally I've finished rebasing my patch queue for
multipath-tools to the latest upstream git commit.
The resulting tree is at:

git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git
branch for-christophe

Here are the summary lines of the patches:
  * Reload map for device read-only setting changes
  * Increase priority value for emc priority callout
  * Allow dev_loss to be set to 'infinity'
  * Update manpages
  * multipath: add '-t' option to dump internal hwtable
  * Reassign existing device-mapper maps
  * alua: Handle LBA_DEPENDENT state
  * Update multipathd init script for SuSE
  * multipathd: Add PID to 'show daemon' cli command
  * Update pid file handling
  * Serialize startup on large machines
  * Add 'shutdown' cli command
  * multipathd: Update 'max_fds' handling
  * multipathd: crash in reconfigure CLI command
  * Add 'max_polling_interval' config variable
  * libmultipath: Only count UP and GHOST paths for prio update
  * Always synchronize with dm state
  * multipathd is not starting waitevent checker for single paths
  * More debugging output when synchronizing path states
  * libmultipath: Make path state message unique
  * Update dev_loss_tmo for no_path_retry
  * Check for valid argument in update_multipath_strings()
  * libmultipath: Remove duplicate calls to path_offline()
  * Check for offline path in get_prio()
  * Use state name in get_state()
  * Only check offline status for SCSI devices
  * libmultipath: zero out sense buffer in do_inq()
  * multipathd: fix memory issues in cli.c
  * multipathd: Remove handling of 'umount' events
  * Safe memory allocation in cli_handlers
  * multipathd: Fix uxlsnr race condition on shutdown
  * Rework sysfs device handling in multipathd
  * multipath: add SuSE init file
  * Checker name is not displayed on failure
  * Unload priority modules
  * Display avg priority as group priority
  * libmultipath: Improve debugging of log messages
  * libmultipath: rework sysfs handling
  * libmultipath: update waiter handling
  * libmultipath: do not fallback to search /proc/partitions
  * Start uevent service handler from main thread
  * Use lists for uevent processing
  * Remove sysfs_attr cache
  * Fixup strip trailing whitespaces for getuid return value
  * Check return value for select_alias()
  * Use enum free_path_mode for free_paths argument
  * vector paranoia
  * Update cli request processing
  * Implement 'bindings_file' option
  * kpartx: Intendation fix in devmapper.c
  * Use regmatch when checking for duplicates in hwtable
  * Update hwtable factorization
  * Specify prioritizer in the multipath section, too
  * Static Load balancing prioritizer
  * Synchronize dm states for IO errors
  * Add 'max_polling_interval' config variable
  * libmultipath: do not access dm structures after dm_task_destroy
  * Correctly remove logical partition maps
  * rdac checker: Whitespace cleanup
  * libmultipath: Fix possible string overflow
  * More debugging output for feature changes
  * Fixup debugging in dmparser.c
  * Correctly update feature strings
  * Select 'features' from multipath configuration
  * Increase uevent buffer size
  * libmultipath: remove strcmp_chomp()
  * Remove sysfs_lookup_devpath_by_subsys_id()
  * libmultipath: resolve hash collisions in pgcmp()
  * libmultipath: correct path count in setup_map()
  * Make status variable local
  * Make params variable local
  * multipath segfaults if configuration file contains errors
  * Use '--replace-whitespace' option for scsi_id
  * Asynchronous mode for tur checker
  * Move setup_thread_attr() to uevent.c
  * TUR checker should not return 'failed' for reservation conflict
  * directio: Reset 'running' parameter
  * Use timeout parameter for directio
  * libmultipath: check argument length in execute_program()
  * readsector0: block count calculation wrong
  * checkers: argument paranoia
  * Use refcounting for checkers
  * kpartx: Add option '-f' to force devmap creation
  * Add 'no_partitions' feature
  * Implement 'update' option for kpartx
  * kpartx: Indentation fix
  * Add compability flag LIBDM_API_COOKIE

(Did I mention I had quite a few outstanding?)
They are ordered in reverse order, in case you'd want to
cross-check against the git tree.

How to proceed best here?
Sending them all for review to the mailinglist seems a
bit excessive, but then that'll be automated anyway,
so I don't care.
Just tell me what you prefer.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-18 15:03 SUSE multipath-tools patch resync Hannes Reinecke
@ 2011-05-20  6:26 ` Christophe Varoqui
  2011-05-25  7:51 ` Christophe Varoqui
  1 sibling, 0 replies; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-20  6:26 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: device-mapper development, Mike Snitzer, christophe varoqui

On mer., 2011-05-18 at 17:03 +0200, Hannes Reinecke wrote:
> Hi Christophe,
> 
> finally I've finished rebasing my patch queue for
> multipath-tools to the latest upstream git commit.
> The resulting tree is at:
> 
> git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git
> branch for-christophe
> 
I'll start merging directly from your repo on wednesday. Reviewers
please dig the code at the source there.

Thanks for the rebase Hannes.

> Here are the summary lines of the patches:
>   * Reload map for device read-only setting changes
>   * Increase priority value for emc priority callout
>   * Allow dev_loss to be set to 'infinity'
>   * Update manpages
>   * multipath: add '-t' option to dump internal hwtable
>   * Reassign existing device-mapper maps
>   * alua: Handle LBA_DEPENDENT state
>   * Update multipathd init script for SuSE
>   * multipathd: Add PID to 'show daemon' cli command
>   * Update pid file handling
>   * Serialize startup on large machines
>   * Add 'shutdown' cli command
>   * multipathd: Update 'max_fds' handling
>   * multipathd: crash in reconfigure CLI command
>   * Add 'max_polling_interval' config variable
>   * libmultipath: Only count UP and GHOST paths for prio update
>   * Always synchronize with dm state
>   * multipathd is not starting waitevent checker for single paths
>   * More debugging output when synchronizing path states
>   * libmultipath: Make path state message unique
>   * Update dev_loss_tmo for no_path_retry
>   * Check for valid argument in update_multipath_strings()
>   * libmultipath: Remove duplicate calls to path_offline()
>   * Check for offline path in get_prio()
>   * Use state name in get_state()
>   * Only check offline status for SCSI devices
>   * libmultipath: zero out sense buffer in do_inq()
>   * multipathd: fix memory issues in cli.c
>   * multipathd: Remove handling of 'umount' events
>   * Safe memory allocation in cli_handlers
>   * multipathd: Fix uxlsnr race condition on shutdown
>   * Rework sysfs device handling in multipathd
>   * multipath: add SuSE init file
>   * Checker name is not displayed on failure
>   * Unload priority modules
>   * Display avg priority as group priority
>   * libmultipath: Improve debugging of log messages
>   * libmultipath: rework sysfs handling
>   * libmultipath: update waiter handling
>   * libmultipath: do not fallback to search /proc/partitions
>   * Start uevent service handler from main thread
>   * Use lists for uevent processing
>   * Remove sysfs_attr cache
>   * Fixup strip trailing whitespaces for getuid return value
>   * Check return value for select_alias()
>   * Use enum free_path_mode for free_paths argument
>   * vector paranoia
>   * Update cli request processing
>   * Implement 'bindings_file' option
>   * kpartx: Intendation fix in devmapper.c
>   * Use regmatch when checking for duplicates in hwtable
>   * Update hwtable factorization
>   * Specify prioritizer in the multipath section, too
>   * Static Load balancing prioritizer
>   * Synchronize dm states for IO errors
>   * Add 'max_polling_interval' config variable
>   * libmultipath: do not access dm structures after dm_task_destroy
>   * Correctly remove logical partition maps
>   * rdac checker: Whitespace cleanup
>   * libmultipath: Fix possible string overflow
>   * More debugging output for feature changes
>   * Fixup debugging in dmparser.c
>   * Correctly update feature strings
>   * Select 'features' from multipath configuration
>   * Increase uevent buffer size
>   * libmultipath: remove strcmp_chomp()
>   * Remove sysfs_lookup_devpath_by_subsys_id()
>   * libmultipath: resolve hash collisions in pgcmp()
>   * libmultipath: correct path count in setup_map()
>   * Make status variable local
>   * Make params variable local
>   * multipath segfaults if configuration file contains errors
>   * Use '--replace-whitespace' option for scsi_id
>   * Asynchronous mode for tur checker
>   * Move setup_thread_attr() to uevent.c
>   * TUR checker should not return 'failed' for reservation conflict
>   * directio: Reset 'running' parameter
>   * Use timeout parameter for directio
>   * libmultipath: check argument length in execute_program()
>   * readsector0: block count calculation wrong
>   * checkers: argument paranoia
>   * Use refcounting for checkers
>   * kpartx: Add option '-f' to force devmap creation
>   * Add 'no_partitions' feature
>   * Implement 'update' option for kpartx
>   * kpartx: Indentation fix
>   * Add compability flag LIBDM_API_COOKIE
> 
> (Did I mention I had quite a few outstanding?)
> They are ordered in reverse order, in case you'd want to
> cross-check against the git tree.
> 
> How to proceed best here?
> Sending them all for review to the mailinglist seems a
> bit excessive, but then that'll be automated anyway,
> so I don't care.
> Just tell me what you prefer.
> 
> Cheers,
> 
> Hannes

-- 
Christophe Varoqui
OpenSVC - Tools to scale
http://www.opensvc.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-18 15:03 SUSE multipath-tools patch resync Hannes Reinecke
  2011-05-20  6:26 ` Christophe Varoqui
@ 2011-05-25  7:51 ` Christophe Varoqui
  2011-05-25  9:08   ` Hannes Reinecke
  1 sibling, 1 reply; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-25  7:51 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development

On mer., 2011-05-18 at 17:03 +0200, Hannes Reinecke wrote:
> git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git

I started the merge. I'll post comments along the reading. It seems
there won't be much : this patchset is clearly a must-have.

...

Shouldn't this one be reverted ? Seems the real thing is
10d68a92b2a0027d5468cf76656692f34f6dbc54


commit efc8ace4b335e752a7d28aca6040af0f9fe37530
Author: Hannes Reinecke <hare@suse.de>
Date:   Mon Feb 1 09:46:57 2010 +0100

    Add 'max_polling_interval' config variable
    
    We should be able to set the 'max_polling_interval' variable
    manually. Especially systems requiring precise failover timing
    will want to disable the automatic polling interval increase.
    
    Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/multipathd/main.c b/multipathd/main.c
index 7659bb2..6867f23 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1096,9 +1096,9 @@ check_path (struct vectors * vecs, struct path *
pp)
                         * max at conf->max_checkint
                         */
                        if (pp->checkint < (conf->max_checkint / 2))
-                               pp->checkint = 2 * pp->checkint;
+                           pp->checkint = 2 * pp->checkint;
                        else
-                               pp->checkint = conf->max_checkint;
+                           pp->checkint = conf->max_checkint;
 
                        pp->tick = pp->checkint;
                        condlog(4, "%s: delay next check %is",

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25  7:51 ` Christophe Varoqui
@ 2011-05-25  9:08   ` Hannes Reinecke
  2011-05-25 11:14     ` Christophe Varoqui
  0 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2011-05-25  9:08 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: device-mapper development, Christophe Varoqui

On 05/25/2011 09:51 AM, Christophe Varoqui wrote:
> On mer., 2011-05-18 at 17:03 +0200, Hannes Reinecke wrote:
>> git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git
>
> I started the merge. I'll post comments along the reading. It seems
> there won't be much : this patchset is clearly a must-have.
>
:-)

With some things so bloody obvious that I keep wondering if I'm the
only one seeing these ...

> ...
>
> Shouldn't this one be reverted ? Seems the real thing is
> 10d68a92b2a0027d5468cf76656692f34f6dbc54
>
>
> commit efc8ace4b335e752a7d28aca6040af0f9fe37530
> Author: Hannes Reinecke<hare@suse.de>
> Date:   Mon Feb 1 09:46:57 2010 +0100
>
>      Add 'max_polling_interval' config variable
>
>      We should be able to set the 'max_polling_interval' variable
>      manually. Especially systems requiring precise failover timing
>      will want to disable the automatic polling interval increase.
>
>      Signed-off-by: Hannes Reinecke<hare@suse.de>
>
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 7659bb2..6867f23 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1096,9 +1096,9 @@ check_path (struct vectors * vecs, struct path *
> pp)
>                           * max at conf->max_checkint
>                           */
>                          if (pp->checkint<  (conf->max_checkint / 2))
> -                               pp->checkint = 2 * pp->checkint;
> +                           pp->checkint = 2 * pp->checkint;
>                          else
> -                               pp->checkint = conf->max_checkint;
> +                           pp->checkint = conf->max_checkint;
>
>                          pp->tick = pp->checkint;
>                          condlog(4, "%s: delay next check %is",
>
>
>
Oh, yes. Merge error. Please do not include this.
Thanks for your effort.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25  9:08   ` Hannes Reinecke
@ 2011-05-25 11:14     ` Christophe Varoqui
  2011-05-25 12:10       ` Christophe Varoqui
  0 siblings, 1 reply; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-25 11:14 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development

On mer., 2011-05-25 at 11:08 +0200, Hannes Reinecke wrote:
> On 05/25/2011 09:51 AM, Christophe Varoqui wrote:
> > On mer., 2011-05-18 at 17:03 +0200, Hannes Reinecke wrote:
> >> git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git
> >
> > I started the merge. I'll post comments along the reading. It seems
> > there won't be much : this patchset is clearly a must-have.
> >
> :-)
> 
> With some things so bloody obvious that I keep wondering if I'm the
> only one seeing these ...
> 
The merge is done. No change wrt your branch. Still not pushed to korg.
I struggle with a hwtable corruption with no /etc/multipath.conf ... can
you reproduce that ?

For example, iterating through the hwtable and printing ->product from
config.c:find_hwe() gives:

...
(StorEdge 3510|T4)
FC2502
RAIGE VOLUME
CSM200_R
LCSM100_[IEFS]
INF-01-00
FLEXLINE 380
\x01
\x01\x02hd\x03\x10
\x01dcssblk

The final spurious entries are clearly from the blacklist mem region. I
verified the hwtable is not corrupted after initialization. Corruption
thus occurs between hwtable init and path discovery.

I'll continue tracking it, but if you see something obvious ...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25 11:14     ` Christophe Varoqui
@ 2011-05-25 12:10       ` Christophe Varoqui
  2011-05-25 12:22         ` Christophe Varoqui
  0 siblings, 1 reply; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-25 12:10 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development

On mer., 2011-05-25 at 13:14 +0200, Christophe Varoqui wrote:
> On mer., 2011-05-25 at 11:08 +0200, Hannes Reinecke wrote:
> > On 05/25/2011 09:51 AM, Christophe Varoqui wrote:
> > > On mer., 2011-05-18 at 17:03 +0200, Hannes Reinecke wrote:
> > >> git.kernel.org:/pub/scm/linux/kernel/git/hare/multipath-tools.git
> > >
> > > I started the merge. I'll post comments along the reading. It seems
> > > there won't be much : this patchset is clearly a must-have.
> > >
> > :-)
> > 
> > With some things so bloody obvious that I keep wondering if I'm the
> > only one seeing these ...
> > 
> The merge is done. No change wrt your branch. Still not pushed to korg.
> I struggle with a hwtable corruption with no /etc/multipath.conf ... can
> you reproduce that ?
> 
Well, forget it ... it was just a matter of 'make clean && make' as some
structures changed length.

The merge is over and pushed to korg.
I catched only to problems.
You might want to review the fixes.

1/
commit e2ae02287aaec0b56ac832841130e3c855a5a471
Author: Christophe Varoqui <christophe.varoqui@opensvc.com>
Date:   Wed May 25 14:00:52 2011 +0200

    Fix segfault in dm reassign code path
    
    alias is allocated and freed in multipathd/main.c:uev_add_map().
    
    Don't free it in ev_add_map() called from uev_add_map() to avoid
    double free.

diff --git a/multipathd/main.c b/multipathd/main.c
index 4497609..cc75921 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -272,7 +272,6 @@ ev_add_map (char * dev, char * alias, struct vectors
* vecs)
                                alias);
                        dm_reassign(alias);
                }
-               FREE(alias);
                return 0;
        }

2/
commit 7b541d1e2cee70ad5e61edf12333e7ff49615c8c
Author: Christophe Varoqui <christophe.varoqui@opensvc.com>
Date:   Wed May 25 13:59:53 2011 +0200

    Set an internal default for feature
    
    Fix segfault when no /etc/multipath.conf is present.

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 87039f0..4236088 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -493,6 +493,7 @@ load_config (char * file)
        conf->bindings_file = set_default(DEFAULT_BINDINGS_FILE);
        conf->bindings_read_only = 0;
        conf->multipath_dir = set_default(DEFAULT_MULTIPATHDIR);
+       conf->features = set_default(DEFAULT_FEATURES);
        conf->flush_on_last_del = 0;
        conf->attribute_flags = 0;
        conf->reassign_maps = DEFAULT_REASSIGN_MAPS;

-- 
Christophe Varoqui
OpenSVC - Tools to scale
http://www.opensvc.com/

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25 12:10       ` Christophe Varoqui
@ 2011-05-25 12:22         ` Christophe Varoqui
  2011-05-25 12:45           ` Hannes Reinecke
  2011-05-25 21:29           ` Christophe Varoqui
  0 siblings, 2 replies; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-25 12:22 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development

> Well, forget it ... it was just a matter of 'make clean && make' as some
> structures changed length.
> 
> The merge is over and pushed to korg.
> I catched only to problems.
> You might want to review the fixes.
> 
Hmm, there might be other problems.

on a 'multipathd reconfigure' command, the uxclient gets stuck and the
multipathd daemon strace shows:

$ sudo strace -f -p 17721
Process 17721 attached with 7 threads - interrupt to quit
[pid 17757] futex(0x7fdc6a1540a4, FUTEX_WAIT_PRIVATE, 3, NULL
<unfinished ...>
[pid 17756] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
<unfinished ...>
[pid 17755] ioctl(3, DM_DEV_WAIT <unfinished ...>
[pid 17724] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
<unfinished ...>
[pid 17723] recvmsg(6,  <unfinished ...>
[pid 17722] futex(0x110a1b4, FUTEX_WAIT_PRIVATE, 15, NULL
<unfinished ...>
[pid 17721] futex(0x612624, FUTEX_WAIT_PRIVATE, 1, NULL

-- 
Christophe Varoqui
OpenSVC - Tools to scale
http://www.opensvc.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25 12:22         ` Christophe Varoqui
@ 2011-05-25 12:45           ` Hannes Reinecke
  2011-05-25 21:29           ` Christophe Varoqui
  1 sibling, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2011-05-25 12:45 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: device-mapper development, Christophe Varoqui

On 05/25/2011 02:22 PM, Christophe Varoqui wrote:
>> Well, forget it ... it was just a matter of 'make clean&&  make' as some
>> structures changed length.
>>
>> The merge is over and pushed to korg.
>> I catched only to problems.
>> You might want to review the fixes.
>>
> Hmm, there might be other problems.
>
> on a 'multipathd reconfigure' command, the uxclient gets stuck and the
> multipathd daemon strace shows:
>
> $ sudo strace -f -p 17721
> Process 17721 attached with 7 threads - interrupt to quit
> [pid 17757] futex(0x7fdc6a1540a4, FUTEX_WAIT_PRIVATE, 3, NULL
> <unfinished ...>
> [pid 17756] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
> <unfinished ...>
> [pid 17755] ioctl(3, DM_DEV_WAIT<unfinished ...>
> [pid 17724] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
> <unfinished ...>
> [pid 17723] recvmsg(6,<unfinished ...>
> [pid 17722] futex(0x110a1b4, FUTEX_WAIT_PRIVATE, 15, NULL
> <unfinished ...>
> [pid 17721] futex(0x612624, FUTEX_WAIT_PRIVATE, 1, NULL
>
I've just send some more fixes which have been uncovered during
recent tests.

- Use refcounting for sysfs devices: We do use sysdev_get() and
   sysdev_put(), but forgot to refcount them. So occasionally
   uevents would try to call sysdev_put() on already freed devices.
- Race condition during shutdown with stop_waiter_thread():
   Trying to access anything inside the waiter structure is
   suicidal, as the waiter thread might be gone at any time.
- Do not handle external renames from multipathd thread:
   Fallout from the above patch; and what's more, the external
   rename should be handled by the waiter thread already.
   So no point in doing so from the daemon itself.

Patches should appear on the mailing list pretty soon.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25 12:22         ` Christophe Varoqui
  2011-05-25 12:45           ` Hannes Reinecke
@ 2011-05-25 21:29           ` Christophe Varoqui
  2011-05-26  6:28             ` Hannes Reinecke
  1 sibling, 1 reply; 10+ messages in thread
From: Christophe Varoqui @ 2011-05-25 21:29 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development


> on a 'multipathd reconfigure' command, the uxclient gets stuck and the
> multipathd daemon strace shows:
> 
> $ sudo strace -f -p 17721
> Process 17721 attached with 7 threads - interrupt to quit
> [pid 17757] futex(0x7fdc6a1540a4, FUTEX_WAIT_PRIVATE, 3, NULL
> <unfinished ...>
> [pid 17756] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
> <unfinished ...>
> [pid 17755] ioctl(3, DM_DEV_WAIT <unfinished ...>
> [pid 17724] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
> <unfinished ...>
> [pid 17723] recvmsg(6,  <unfinished ...>
> [pid 17722] futex(0x110a1b4, FUTEX_WAIT_PRIVATE, 15, NULL
> <unfinished ...>
> [pid 17721] futex(0x612624, FUTEX_WAIT_PRIVATE, 1, NULL
> 
ok, I dug it to 9e7b4d8d6fa8dc9433c1e60d4bd6717aec2f5296

Here you add acquire/release the vector lock inside
multipathd/main.c:reconfigure(), but as seen in the following LCKDBG
trace, the lock is already acquired in
multipathd/main.c:uxsock_trigger()

Hence the lock -> lock = hang.

I commited and pushed a partial revert of
9e7b4d8d6fa8dc9433c1e60d4bd6717aec2f5296

But maybe you'd rather see us stop acquiring the lock from
uxsock_trigger() to acquire more selectively in the functions called
from there ... Please comment.

Regards,
-- 
Christophe Varoqui
OpenSVC - Tools to scale
http://www.opensvc.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: SUSE multipath-tools patch resync
  2011-05-25 21:29           ` Christophe Varoqui
@ 2011-05-26  6:28             ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2011-05-26  6:28 UTC (permalink / raw)
  To: christophe.varoqui; +Cc: device-mapper development, Christophe Varoqui

On 05/25/2011 11:29 PM, Christophe Varoqui wrote:
>
>> on a 'multipathd reconfigure' command, the uxclient gets stuck and the
>> multipathd daemon strace shows:
>>
>> $ sudo strace -f -p 17721
>> Process 17721 attached with 7 threads - interrupt to quit
>> [pid 17757] futex(0x7fdc6a1540a4, FUTEX_WAIT_PRIVATE, 3, NULL
>> <unfinished ...>
>> [pid 17756] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
>> <unfinished ...>
>> [pid 17755] ioctl(3, DM_DEV_WAIT<unfinished ...>
>> [pid 17724] futex(0x11167f0, FUTEX_WAIT_PRIVATE, 2, NULL
>> <unfinished ...>
>> [pid 17723] recvmsg(6,<unfinished ...>
>> [pid 17722] futex(0x110a1b4, FUTEX_WAIT_PRIVATE, 15, NULL
>> <unfinished ...>
>> [pid 17721] futex(0x612624, FUTEX_WAIT_PRIVATE, 1, NULL
>>
> ok, I dug it to 9e7b4d8d6fa8dc9433c1e60d4bd6717aec2f5296
>
> Here you add acquire/release the vector lock inside
> multipathd/main.c:reconfigure(), but as seen in the following LCKDBG
> trace, the lock is already acquired in
> multipathd/main.c:uxsock_trigger()
>
> Hence the lock ->  lock = hang.
>
> I commited and pushed a partial revert of
> 9e7b4d8d6fa8dc9433c1e60d4bd6717aec2f5296
>
> But maybe you'd rather see us stop acquiring the lock from
> uxsock_trigger() to acquire more selectively in the functions called
> from there ... Please comment.
>
Hmm. Yes, your fix appears to be correct.
I had several locking issues during startup (calling cli commands
while the daemon is still starting up is a nice way of testing it),
and several (unsuccessful) attempts in fixing it.
Real cause was a missing locking during initial configuration,
so it looks as if 9e7b4d8d6fa8dc9433c1e60d4bd6717aec2f5296
was in fact a left-over from the earlier attempts.

So yeah, your patch seems to be fine.

Will be doing more testing here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-05-26  6:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-18 15:03 SUSE multipath-tools patch resync Hannes Reinecke
2011-05-20  6:26 ` Christophe Varoqui
2011-05-25  7:51 ` Christophe Varoqui
2011-05-25  9:08   ` Hannes Reinecke
2011-05-25 11:14     ` Christophe Varoqui
2011-05-25 12:10       ` Christophe Varoqui
2011-05-25 12:22         ` Christophe Varoqui
2011-05-25 12:45           ` Hannes Reinecke
2011-05-25 21:29           ` Christophe Varoqui
2011-05-26  6:28             ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.