linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net/smc: Transfer remaining wait queue entries during fallback
@ 2021-11-12  3:30 Wen Gu
  2021-11-12 14:02 ` Heiko Carstens
  0 siblings, 1 reply; 3+ messages in thread
From: Wen Gu @ 2021-11-12  3:30 UTC (permalink / raw)
  To: kgraul
  Cc: davem, kuba, linux-s390, netdev, linux-kernel, guwen, tonylu,
	dust.li, xuanzhuo

The SMC fallback is incomplete currently. There may be some
wait queue entries remaining in smc socket->wq, which should
be removed to clcsocket->wq during the fallback.

For example, in nginx/wrk benchmark, this issue causes an
all-zeros test result:

server: nginx -g 'daemon off;'
client: smc_run wrk -c 1 -t 1 -d 5 http://11.200.15.93/index.html

  Running 5s test @ http://11.200.15.93/index.html
     1 threads and 1 connections
     Thread Stats   Avg      Stdev     Max   ± Stdev
     	Latency     0.00us    0.00us   0.00us    -nan%
	Req/Sec     0.00      0.00     0.00      -nan%
	0 requests in 5.00s, 0.00B read
     Requests/sec:      0.00
     Transfer/sec:       0.00B

The reason for this all-zeros result is that when wrk used SMC
to replace TCP, it added an eppoll_entry into smc socket->wq
and expected to be notified if epoll events like EPOLL_IN/
EPOLL_OUT occurred on the smc socket.

However, once a fallback occurred, wrk switches to use clcsocket.
Now it is clcsocket->wq instead of smc socket->wq which will
be woken up. The eppoll_entry remaining in smc socket->wq does
not work anymore and wrk stops the test.

This patch fixes this issue by removing remaining wait queue
entries from smc socket->wq to clcsocket->wq during the fallback.

Link: https://www.spinics.net/lists/netdev/msg779769.html
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
---
 net/smc/af_smc.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 0cf7ed2..11a966a 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -562,6 +562,10 @@ static void smc_stat_fallback(struct smc_sock *smc)
 
 static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
 {
+	wait_queue_head_t *smc_wait = sk_sleep(&smc->sk);
+	wait_queue_head_t *clc_wait = sk_sleep(smc->clcsock->sk);
+	unsigned long flags;
+
 	smc->use_fallback = true;
 	smc->fallback_rsn = reason_code;
 	smc_stat_fallback(smc);
@@ -571,6 +575,16 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
 		smc->clcsock->file->private_data = smc->clcsock;
 		smc->clcsock->wq.fasync_list =
 			smc->sk.sk_socket->wq.fasync_list;
+
+		/* There might be some wait queue entries remaining
+		 * in smc socket->wq, which should be removed to
+		 * clcsocket->wq during the fallback.
+		 */
+		spin_lock_irqsave(&smc_wait->lock, flags);
+		spin_lock_irqsave(&clc_wait->lock, flags);
+		list_splice_init(&smc_wait->head, &clc_wait->head);
+		spin_unlock_irqrestore(&clc_wait->lock, flags);
+		spin_unlock_irqrestore(&smc_wait->lock, flags);
 	}
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net] net/smc: Transfer remaining wait queue entries during fallback
  2021-11-12  3:30 [PATCH net] net/smc: Transfer remaining wait queue entries during fallback Wen Gu
@ 2021-11-12 14:02 ` Heiko Carstens
  2021-11-12 15:28   ` Wen Gu
  0 siblings, 1 reply; 3+ messages in thread
From: Heiko Carstens @ 2021-11-12 14:02 UTC (permalink / raw)
  To: Wen Gu
  Cc: kgraul, davem, kuba, linux-s390, netdev, linux-kernel, tonylu,
	dust.li, xuanzhuo

On Fri, Nov 12, 2021 at 11:30:39AM +0800, Wen Gu wrote:
...
> +	wait_queue_head_t *smc_wait = sk_sleep(&smc->sk);
> +	wait_queue_head_t *clc_wait = sk_sleep(smc->clcsock->sk);
> +	unsigned long flags;
> +
>  	smc->use_fallback = true;
>  	smc->fallback_rsn = reason_code;
>  	smc_stat_fallback(smc);
> @@ -571,6 +575,16 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
>  		smc->clcsock->file->private_data = smc->clcsock;
>  		smc->clcsock->wq.fasync_list =
>  			smc->sk.sk_socket->wq.fasync_list;
> +
> +		/* There might be some wait queue entries remaining
> +		 * in smc socket->wq, which should be removed to
> +		 * clcsocket->wq during the fallback.
> +		 */
> +		spin_lock_irqsave(&smc_wait->lock, flags);
> +		spin_lock_irqsave(&clc_wait->lock, flags);
> +		list_splice_init(&smc_wait->head, &clc_wait->head);
> +		spin_unlock_irqrestore(&clc_wait->lock, flags);
> +		spin_unlock_irqrestore(&smc_wait->lock, flags);

No idea if the rest of the patch makes sense, however this usage of
spin_lock_irqsave() is not correct. The second spin_lock_irqsave()
would always save a state with irqs disabled into "flags", and
therefore this path would always be left with irqs disabled,
regardless if irqs were enabled or disabled when entering.

You need to change it to something like

> +		spin_lock_irqsave(&smc_wait->lock, flags);
> +		spin_lock(&clc_wait->lock);
> +		list_splice_init(&smc_wait->head, &clc_wait->head);
> +		spin_unlock(&clc_wait->lock);
> +		spin_unlock_irqrestore(&smc_wait->lock, flags);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net] net/smc: Transfer remaining wait queue entries during fallback
  2021-11-12 14:02 ` Heiko Carstens
@ 2021-11-12 15:28   ` Wen Gu
  0 siblings, 0 replies; 3+ messages in thread
From: Wen Gu @ 2021-11-12 15:28 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: kgraul, davem, kuba, linux-s390, netdev, linux-kernel, tonylu,
	dust.li, xuanzhuo



On 2021/11/12 10:02 pm, Heiko Carstens wrote:
> On Fri, Nov 12, 2021 at 11:30:39AM +0800, Wen Gu wrote:
> ...
>> +	wait_queue_head_t *smc_wait = sk_sleep(&smc->sk);
>> +	wait_queue_head_t *clc_wait = sk_sleep(smc->clcsock->sk);
>> +	unsigned long flags;
>> +
>>   	smc->use_fallback = true;
>>   	smc->fallback_rsn = reason_code;
>>   	smc_stat_fallback(smc);
>> @@ -571,6 +575,16 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
>>   		smc->clcsock->file->private_data = smc->clcsock;
>>   		smc->clcsock->wq.fasync_list =
>>   			smc->sk.sk_socket->wq.fasync_list;
>> +
>> +		/* There might be some wait queue entries remaining
>> +		 * in smc socket->wq, which should be removed to
>> +		 * clcsocket->wq during the fallback.
>> +		 */
>> +		spin_lock_irqsave(&smc_wait->lock, flags);
>> +		spin_lock_irqsave(&clc_wait->lock, flags);
>> +		list_splice_init(&smc_wait->head, &clc_wait->head);
>> +		spin_unlock_irqrestore(&clc_wait->lock, flags);
>> +		spin_unlock_irqrestore(&smc_wait->lock, flags);
> 
> No idea if the rest of the patch makes sense, however this usage of
> spin_lock_irqsave() is not correct. The second spin_lock_irqsave()
> would always save a state with irqs disabled into "flags", and
> therefore this path would always be left with irqs disabled,
> regardless if irqs were enabled or disabled when entering.
> 
> You need to change it to something like
> 
>> +		spin_lock_irqsave(&smc_wait->lock, flags);
>> +		spin_lock(&clc_wait->lock);
>> +		list_splice_init(&smc_wait->head, &clc_wait->head);
>> +		spin_unlock(&clc_wait->lock);
>> +		spin_unlock_irqrestore(&smc_wait->lock, flags);

Sorry for the mistake... I will correct this.

Thanks,
Wen Gu

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-11-12 15:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-12  3:30 [PATCH net] net/smc: Transfer remaining wait queue entries during fallback Wen Gu
2021-11-12 14:02 ` Heiko Carstens
2021-11-12 15:28   ` Wen Gu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).