* [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
@ 2010-04-19  9:15 Line Holen
       [not found] ` <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Line Holen @ 2010-04-19  9:15 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
This can happen if a path request is handled while LFT updates to the fabric
are in progress.
The LFT of the switch data structure is updated as part of the LFT response
processing. So while the SM is busy pushing the LFT updates, some switches have
up-to-date LFT info while others are not yet updated and still contain the LFT
of the previous routing. For a (short) time interval there is a potential for
loops in the fabric. The livelock occurs if a path request is received during
this time interval, because the walk from source to destination then follows
the loop and never terminates.
Both LFT response handling and path request processing need the SM lock.
When the livelock occurs, the LFT response handling blocks forever waiting for
the lock to be released.

The suggested fix is simply to introduce a maximum number of hops that may
be traversed while handling the path request. If this maximum is exceeded,
the request returns with a NO_RECORD response and releases the SM lock.
This way the LFT processing will be able to complete.

Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>

---
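
For illustration only (not part of the patch): a minimal, self-contained
sketch of the hop-budget idea, assuming a toy fabric in which next_hop[i]
names the switch that switch i forwards to for one fixed destination.
walk_path(), NO_ROUTE and the max_hops parameter are hypothetical names
chosen for this sketch; they are not OpenSM API.

```c
#include <assert.h>
#include <stddef.h>

#define NO_ROUTE (-1)	/* hypothetical "give up" result for this sketch */

/*
 * Follow next_hop[] from src until dst is reached.
 * Returns the number of hops taken, or NO_ROUTE if the table points
 * outside the fabric or more than max_hops hops would be needed
 * (which is what a forwarding loop looks like to the caller).
 */
static int walk_path(const size_t *next_hop, size_t n,
		     size_t src, size_t dst, int max_hops)
{
	size_t cur = src;
	int hops = 0;

	assert(src < n && dst < n);

	while (cur != dst) {
		cur = next_hop[cur];
		if (cur >= n)
			return NO_ROUTE;	/* entry leaves the toy fabric */

		/* update number of hops traversed and enforce the budget */
		hops++;
		if (hops > max_hops)
			return NO_ROUTE;	/* probably a loop: stop walking */
	}
	return hops;
}
```

With next_hop = {1, 2, 1, 3} and destination 3, a walk from switch 0 circles
between switches 1 and 2 and returns NO_ROUTE once the budget is spent,
instead of spinning forever while holding a lock.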

diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index c4c3f86..b399b70 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -4,6 +4,7 @@
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
  * Copyright (c) 2009 HNR Consulting. All rights reserved.
+ * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -69,6 +70,9 @@
 #include <opensm/osm_prefix_route.h>
 #include <opensm/osm_ucast_lash.h>
 
+
+#define MAX_HOPS 128
+
 typedef struct osm_pr_item {
 	cl_list_item_t list_item;
 	ib_path_rec_t path_rec;
@@ -178,6 +182,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
 	osm_qos_level_t *p_qos_level = NULL;
 	uint16_t valid_sl_mask = 0xffff;
 	int is_lash;
+	int hops = 0;
 
 	OSM_LOG_ENTER(sa->p_log);
 
@@ -369,6 +374,25 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa,
 				goto Exit;
 			}
 		}
+
+		/* update number of hops traversed */
+		hops++;
+		if (hops > MAX_HOPS) {
+
+			OSM_LOG(sa->p_log, OSM_LOG_ERROR,
+			    "Path from GUID 0x%016" PRIx64 " (%s) to lid %u GUID 0x%016"
+			    PRIx64 " (%s) needs more than %d hops, "
+			    "max %d hops allowed\n",
+			    cl_ntoh64(osm_physp_get_port_guid(p_src_physp)),
+			    p_src_physp->p_node->print_desc,
+			    dest_lid_ho,
+			    cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
+			    p_dest_physp->p_node->print_desc,
+			    hops, MAX_HOPS);
+
+			status = IB_NOT_FOUND;
+			goto Exit;
+		}
 	}
 
 	/*
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
       [not found] ` <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
@ 2010-04-19 15:34   ` Sasha Khapyorsky
  2010-04-19 18:32     ` Line Holen
  2010-04-19 18:20   ` Hal Rosenstock
  1 sibling, 1 reply; 8+ messages in thread
From: Sasha Khapyorsky @ 2010-04-19 15:34 UTC (permalink / raw)
  To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11:15 Mon 19 Apr     , Line Holen wrote:
> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
> This can happen if a path request is handled while LFT updates to the fabric
> are in progress. 
> The LFT of the switch data structure is updated as part of the LFT response 
> processing. So while the SM is busy pushing the LFT updates, some switches have
> up-to-date LFT info while others are not yet updated and contain the LFT of
> the previous routing. For a (short) time interval there is a potential for
> loops in the fabric. The livelock occurs if a path request is received during
> this time interval.
> Both LFT response handling and path request processing need the SM lock.
> When the livelock occurs the LFT response handling blocks forever waiting for 
> the lock to be released.
> 
> The suggested fix is simply to introduce a max number of hops that should
> be traversed while handling the path request. If this max is reached then
> the request will return with NO_RECORD response and release the SM lock.
> This way the LFT processing will be able to complete.
> 
> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>

Applied. Thanks. See minor question/note below.

> 
> ---
> 
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index c4c3f86..b399b70 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -4,6 +4,7 @@
>   * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
>   * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
>   * Copyright (c) 2009 HNR Consulting. All rights reserved.
> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -69,6 +70,9 @@
>  #include <opensm/osm_prefix_route.h>
>  #include <opensm/osm_ucast_lash.h>
>  
> +
> +#define MAX_HOPS 128

The IB spec defines a maximum number of hops for a fabric, which is 64.
Would it be better to use this value here?

Sasha


* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
       [not found] ` <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
  2010-04-19 15:34   ` Sasha Khapyorsky
@ 2010-04-19 18:20   ` Hal Rosenstock
       [not found]     ` <j2uf0e08f231004191120oc1e78130l683b9ae0ca51003a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Hal Rosenstock @ 2010-04-19 18:20 UTC (permalink / raw)
  To: Line Holen
  Cc: sashak-smomgflXvOZWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Apr 19, 2010 at 5:15 AM, Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org> wrote:
> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
> This can happen if a path request is handled while LFT updates to the fabric
> are in progress.
> The LFT of the switch data structure is updated as part of the LFT response
> processing. So while the SM is busy pushing the LFT updates, some switches have
> up-to-date LFT info while others are not yet updated and contain the LFT of
> the previous routing. For a (short) time interval there is a potential for
> loops in the fabric. The livelock occurs if a path request is received during
> this time interval.
> Both LFT response handling and path request processing need the SM lock.
> When the livelock occurs the LFT response handling blocks forever waiting for
> the lock to be released.
>
> The suggested fix is simply to introduce a max number of hops that should
> be traversed while handling the path request. If this max is reached then
> the request will return with NO_RECORD response

To me, this raises the question of whether this should return a BUSY
status rather than no record (and whether SA clients should handle
those two differently), but that is a bigger change (and may require
some end node change as well).

Also, should a similar change be made in the SA MPR code, in
mpr_rcv_get_path_parms()?

-- Hal


* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
  2010-04-19 15:34   ` Sasha Khapyorsky
@ 2010-04-19 18:32     ` Line Holen
       [not found]       ` <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Line Holen @ 2010-04-19 18:32 UTC (permalink / raw)
  To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/19/10 05:34 PM, Sasha Khapyorsky wrote:
> On 11:15 Mon 19 Apr     , Line Holen wrote:
>> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
>> This can happen if a path request is handled while LFT updates to the fabric
>> are in progress. 
>> The LFT of the switch data structure is updated as part of the LFT response 
>> processing. So while the SM is busy pushing the LFT updates, some switches have
>> up-to-date LFT info while others are not yet updated and contain the LFT of
>> the previous routing. For a (short) time interval there is a potential for
>> loops in the fabric. The livelock occurs if a path request is received during
>> this time interval.
>> Both LFT response handling and path request processing need the SM lock.
>> When the livelock occurs the LFT response handling blocks forever waiting for 
>> the lock to be released.
>>
>> The suggested fix is simply to introduce a max number of hops that should
>> be traversed while handling the path request. If this max is reached then
>> the request will return with NO_RECORD response and release the SM lock.
>> This way the LFT processing will be able to complete.
>>
>> Signed-off-by: Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org>
> 
> Applied. Thanks. See minor question/note below.
> 
>> ---
>>
>> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
>> index c4c3f86..b399b70 100644
>> --- a/opensm/opensm/osm_sa_path_record.c
>> +++ b/opensm/opensm/osm_sa_path_record.c
>> @@ -4,6 +4,7 @@
>>   * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
>>   * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved.
>>   * Copyright (c) 2009 HNR Consulting. All rights reserved.
>> + * Copyright (c) 2010 Sun Microsystems, Inc. All rights reserved.
>>   *
>>   * This software is available to you under a choice of one of two
>>   * licenses.  You may choose to be licensed under the terms of the GNU
>> @@ -69,6 +70,9 @@
>>  #include <opensm/osm_prefix_route.h>
>>  #include <opensm/osm_ucast_lash.h>
>>  
>> +
>> +#define MAX_HOPS 128
> 
> The IB spec defines a maximum number of hops for a fabric, which is 64.
> Would it be better to use this value here?
> 
> Sasha

The value of 128 was chosen as 2x max DR path allowing the SM to be in
the middle of a fabric. But I have no problem lowering to 64.

Line



* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
       [not found]     ` <j2uf0e08f231004191120oc1e78130l683b9ae0ca51003a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-04-19 18:48       ` Line Holen
  0 siblings, 0 replies; 8+ messages in thread
From: Line Holen @ 2010-04-19 18:48 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: sashak-smomgflXvOZWk0Htik3J/w, linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/19/10 08:20 PM, Hal Rosenstock wrote:
> On Mon, Apr 19, 2010 at 5:15 AM, Line Holen <Line.Holen-xsfywfwIY+M@public.gmane.org> wrote:
>> SA path request handling can end up in a livelock in pr_rcv_get_path_parms().
>> This can happen if a path request is handled while LFT updates to the fabric
>> are in progress.
>> The LFT of the switch data structure is updated as part of the LFT response
>> processing. So while the SM is busy pushing the LFT updates, some switches have
>> up-to-date LFT info while others are not yet updated and contain the LFT of
>> the previous routing. For a (short) time interval there is a potential for
>> loops in the fabric. The livelock occurs if a path request is received during
>> this time interval.
>> Both LFT response handling and path request processing need the SM lock.
>> When the livelock occurs the LFT response handling blocks forever waiting for
>> the lock to be released.
>>
>> The suggested fix is simply to introduce a max number of hops that should
>> be traversed while handling the path request. If this max is reached then
>> the request will return with NO_RECORD response
> 
> To me, this raises the question of whether this should return a BUSY
> status rather than no record (and whether SA clients should handle
> those two differently), but that is a bigger change (and may require
> some end node change as well).

I think the fundamental issue here is that the path request handling is
operating on inconsistent data: a mixture of old and new LFT setup. A proper
fix would be to use a consistent LFT setup (either old or new) or deny service
(return BUSY) while LFT updates are in progress. A check on the number of hops
still makes sense though, because the routing could generate loops too.
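
To make the "deny service while LFT updates are in progress" option concrete,
here is a rough sketch. Everything in it is hypothetical: the fabric_state
struct, the pending counter and the QUERY_BUSY status are invented for
illustration and do not correspond to existing OpenSM structures or to a
defined SA status. It also assumes the counter is only touched with the SM
lock held.

```c
#include <assert.h>

/* Hypothetical admission result for a path query. */
enum query_status { QUERY_OK, QUERY_BUSY };

/* Hypothetical bookkeeping: LFT blocks sent but not yet acknowledged. */
struct fabric_state {
	unsigned pending_lft_updates;
};

/* Called when an LFT block is pushed to a switch. */
static void lft_update_sent(struct fabric_state *f)
{
	f->pending_lft_updates++;
}

/* Called when the matching LFT response has been processed. */
static void lft_update_acked(struct fabric_state *f)
{
	assert(f->pending_lft_updates > 0);
	f->pending_lft_updates--;
}

/*
 * Decide whether a path query may walk the tables right now.
 * While any update is outstanding the tables may mix old and new
 * routing, so the query is refused instead of walking them.
 */
static enum query_status path_query_admit(const struct fabric_state *f)
{
	return f->pending_lft_updates ? QUERY_BUSY : QUERY_OK;
}
```

Even with such a gate, the hop check stays useful as a second line of
defence, since the routing itself could produce loops.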

> 
> Also, should a similar change be made in the SA MPR code, in
> mpr_rcv_get_path_parms()?

Could be. I haven't checked that code.

Line



* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
       [not found]       ` <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
@ 2010-04-21 10:16         ` Sasha Khapyorsky
  2010-04-21 10:21         ` Sasha Khapyorsky
  1 sibling, 0 replies; 8+ messages in thread
From: Sasha Khapyorsky @ 2010-04-21 10:16 UTC (permalink / raw)
  To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 20:32 Mon 19 Apr     , Line Holen wrote:
> >> @@ -69,6 +70,9 @@
> >>  #include <opensm/osm_prefix_route.h>
> >>  #include <opensm/osm_ucast_lash.h>
> >>  
> >> +
> >> +#define MAX_HOPS 128
> > 
> > The IB spec defines a maximum number of hops for a fabric, which is 64.
> > Would it be better to use this value here?
> > 
> > Sasha
> 
> The value of 128 was chosen as 2x max DR path allowing the SM to be in
> the middle of a fabric. But I have no problem lowering to 64.

The path in this calculation is between ports, and the SM is not part of
the game.

To me it seems that 64 would be a better number. Hypothetically it could
even be unrelated to the LFT transition issue: when a path exceeds 64 hops,
the SA can return NOT FOUND just as well.

Sasha

* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
       [not found]       ` <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
  2010-04-21 10:16         ` Sasha Khapyorsky
@ 2010-04-21 10:21         ` Sasha Khapyorsky
  2010-04-21 10:40           ` Line Holen
  1 sibling, 1 reply; 8+ messages in thread
From: Sasha Khapyorsky @ 2010-04-21 10:21 UTC (permalink / raw)
  To: Line Holen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 20:32 Mon 19 Apr     , Line Holen wrote:
> 
> The value of 128 was chosen as 2x max DR path allowing the SM to be in
> the middle of a fabric. But I have no problem lowering to 64.

Would you care to send a patch?

Sasha

* Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms
  2010-04-21 10:21         ` Sasha Khapyorsky
@ 2010-04-21 10:40           ` Line Holen
  0 siblings, 0 replies; 8+ messages in thread
From: Line Holen @ 2010-04-21 10:40 UTC (permalink / raw)
  To: Sasha Khapyorsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/21/10 12:21 PM, Sasha Khapyorsky wrote:
> On 20:32 Mon 19 Apr     , Line Holen wrote:
>> The value of 128 was chosen as 2x max DR path allowing the SM to be in
>> the middle of a fabric. But I have no problem lowering to 64.
> 
> Would you care to send a patch?

Sure, I can send a patch.

Line



end of thread, other threads:[~2010-04-21 10:40 UTC | newest]

Thread overview: 8+ messages
2010-04-19  9:15 [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms Line Holen
     [not found] ` <4BCC1F3F.5080000-UdXhSnd/wVw@public.gmane.org>
2010-04-19 15:34   ` Sasha Khapyorsky
2010-04-19 18:32     ` Line Holen
     [not found]       ` <4BCCA1C5.5000904-UdXhSnd/wVw@public.gmane.org>
2010-04-21 10:16         ` Sasha Khapyorsky
2010-04-21 10:21         ` Sasha Khapyorsky
2010-04-21 10:40           ` Line Holen
2010-04-19 18:20   ` Hal Rosenstock
     [not found]     ` <j2uf0e08f231004191120oc1e78130l683b9ae0ca51003a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-04-19 18:48       ` Line Holen
