From: Jon Maloy <jon.maloy@ericsson.com>
To: Kolmakov Dmitriy <kolmakov.dmitriy@huawei.com>,
	David Miller <davem@davemloft.net>
Cc: Ying Xue <ying.xue@windriver.com>,
	"tipc-discussion@lists.sourceforge.net" 
	<tipc-discussion@lists.sourceforge.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH] tipc: fix stall during bclink wakeup procedure
Date: Thu, 3 Sep 2015 12:38:43 +0000	[thread overview]
Message-ID: <A2BAEFC30C8FD34388F02C9B3121859D22319EC7@eusaamb103.ericsson.se> (raw)
In-Reply-To: <F44CB10AAC4D8B448FFBC398F7C73866FBAD7A@lhreml501-mbb>



> -----Original Message-----
> From: Kolmakov Dmitriy [mailto:kolmakov.dmitriy@huawei.com]
> Sent: Thursday, 03 September, 2015 04:31
> To: David Miller
> Cc: Jon Maloy; Ying Xue; tipc-discussion@lists.sourceforge.net;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH] tipc: fix stall during bclink wakeup procedure
> 
> From: David Miller [mailto:davem@davemloft.net]
> >
> > From: Kolmakov Dmitriy <kolmakov.dmitriy@huawei.com>
> > Date: Wed, 2 Sep 2015 15:33:00 +0000
> >
> > > If an attempt to wake up users of the broadcast link is made when
> > > there is not enough space in the send queue, it may hang inside the
> > > tipc_sk_rcv() function, since the loop breaks only after the wakeup
> > > queue becomes empty. This can lead to a complete CPU stall with the
> > > following message generated by RCU:
> >
> > I don't understand how it can loop forever.
> >
> > It should either successfully deliver each packet to the socket, or
> > respond with a TIPC_ERR_OVERLOAD.
> >
> > In both cases, the SKB is dequeued from the queue and forward progress
> > is made.
> 
> The issue occurs only when tipc_sk_rcv() is used to wake up postponed
> senders. In this case the call stack is as follows:
> 
> 	tipc_bclink_wakeup_users()
> 		// wakeupq is a queue consisting of special
> 		// messages with SOCK_WAKEUP type.
> 		tipc_sk_rcv(wakeupq)
> 			...
> 			while (skb_queue_len(inputq)) {
> 				filter_rcv(skb)
> 					// Here the type of the message is
> 					// checked, and if it is SOCK_WAKEUP
> 					// then it tries to wake up a sender.
> 					tipc_write_space(sk)
> 						wake_up_interruptible_sync_poll()
> 			}
> 
> After the sender thread is woken up it can get scheduled and attempt to
> send a message. But if there is not enough space in the send queue, it
> will call the link_schedule_user() function, which puts a message of type
> SOCK_WAKEUP back on the wakeup queue and puts the sender to sleep. Thus the
> size of the queue does not actually change and the while() loop never exits.
> 
> The approach I proposed is to wake up only those senders for which there is
> enough space in the send queue, so the described issue cannot occur. Moreover,
> the same approach is already used to wake up senders on unicast links, so it
> was possible to reuse existing code.

The issue seems real, but I don't feel comfortable with your solution, because
of its possible effect on the unicast code. I would prefer that you write a
separate "prepare_wakeup()" inside bcast.c, and use that one. Then, post it to
"net" so it gets applied as far back as possible.

You may not like the code duplication this entails, but it is only temporary. I am
just doing the final testing on a series that basically redesigns the broadcast
implementation and inherently solves this issue. It will be delivered when
net-next re-opens.
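
Just to illustrate the direction I have in mind -- an untested sketch only;
bclink_prepare_wakeup() is a placeholder name, and the fields it touches mirror
what link_prepare_wakeup() in link.c uses today, so they may need adjusting
for bcast.c:

static void bclink_prepare_wakeup(struct tipc_link *bcl,
				  struct sk_buff_head *resultq)
{
	int pnd[TIPC_SYSTEM_IMPORTANCE + 1] = {0,};
	int imp, lim;
	struct sk_buff *skb, *tmp;

	/* Move a SOCK_WAKEUP message to 'resultq' only if the broadcast
	 * send queue has room for its sender's pending chain, so that
	 * tipc_sk_rcv() cannot spin on wakeups that would immediately be
	 * re-queued by link_schedule_user().
	 */
	skb_queue_walk_safe(&bcl->wakeupq, skb, tmp) {
		imp = TIPC_SKB_CB(skb)->chain_imp;
		lim = bcl->window + bcl->backlog[imp].limit;
		pnd[imp] += TIPC_SKB_CB(skb)->chain_sz;
		if ((pnd[imp] + bcl->backlog[imp].len) >= lim)
			continue;
		skb_unlink(skb, &bcl->wakeupq);
		skb_queue_tail(resultq, skb);
	}
}

tipc_bclink_wakeup_users() would then hand 'resultq' to tipc_sk_rcv() instead
of the raw wakeup queue, and the unicast path in link.c stays untouched.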

///jon

> 
> >
> > If there really is a problem somewhere in here, then two things:
> >
> > 1) You need to describe exactly the sequence of tests and conditions
> >    that lead to the endless loop in this code, because I cannot see
> >    it.
> 
> I ran into the issue in our product code, but to reproduce it I modified
> a benchmark test application (from tipcutils/demos/benchmark) to
> perform the following scenario:
> 	1. Run 64 instances of the test application (nodes). This can be
> 	   done on one physical machine.
> 	2. Each application connects to all the others using TIPC sockets
> 	   in RDM mode.
> 	3. When setup is done, all nodes start sending broadcast messages
> 	   simultaneously.
> 	4. Everything hangs.
> 
> The issue is reproducible only when congestion occurs on the broadcast link.
> For example, with only 8 nodes it works fine since no congestion occurs.
> The send queue limit is 40 in my case (I use the critical importance
> level), and when 64 nodes send a message at the same moment congestion
> occurs every time.
> 
> >
> > 2) I suspect the fix is more likely to be appropriate in tipc_sk_rcv()
> >    or similar, rather than creating a dummy queue to work around its
> >    behavior.
> >
> > Thanks.
> 
> BR,
> DK
