netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX
@ 2022-08-18 18:26 Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 01/17] net: Fix data-races around sysctl_[rw]mem_(max|default) Kuniyuki Iwashima
                   ` (17 more replies)
  0 siblings, 18 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

This series fixes data-races around all knobs in net_core_table and
netns_core_table except for bpf stuff.

These knobs are skipped:

  - 4 bpf knobs
  - netdev_rss_key: Written only once by net_get_random_once() and
                    read-only knob
  - rps_sock_flow_entries: Protected with sock_flow_mutex
  - flow_limit_cpu_bitmap: Protected with flow_limit_update_mutex
  - flow_limit_table_len: Protected with flow_limit_update_mutex
  - default_qdisc: Protected with qdisc_mod_lock
  - warnings: Unused
  - high_order_alloc_disable: Protected with static_key_mutex
  - skb_defer_max: Already using READ_ONCE()
  - sysctl_txrehash: Already using READ_ONCE()

Note 5th patch fixes net.core.message_cost and net.core.message_burst,
and lib/ratelimit.c does not have an explicit maintainer.

Changes:
  v3:
    * Fix build failures of CONFIG_SYSCTL=n case in 13th & 14th patches

  v2: https://lore.kernel.org/netdev/20220818035227.81567-1-kuniyu@amazon.com/
    * Remove 4 bpf knobs and added 6 knobs

  v1: https://lore.kernel.org/netdev/20220816052347.70042-1-kuniyu@amazon.com/

Kuniyuki Iwashima (17):
  net: Fix data-races around sysctl_[rw]mem_(max|default).
  net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  net: Fix data-races around netdev_max_backlog.
  net: Fix data-races around netdev_tstamp_prequeue.
  ratelimit: Fix data-races in ___ratelimit().
  net: Fix data-races around sysctl_optmem_max.
  net: Fix a data-race around sysctl_tstamp_allow_data.
  net: Fix a data-race around sysctl_net_busy_poll.
  net: Fix a data-race around sysctl_net_busy_read.
  net: Fix a data-race around netdev_budget.
  net: Fix data-races around sysctl_max_skb_frags.
  net: Fix a data-race around netdev_budget_usecs.
  net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
  net: Fix data-races around sysctl_devconf_inherit_init_net.
  net: Fix a data-race around gro_normal_batch.
  net: Fix a data-race around netdev_unregister_timeout_secs.
  net: Fix a data-race around sysctl_somaxconn.

 Documentation/admin-guide/sysctl/net.rst |  2 +-
 include/linux/netdevice.h                | 20 +++++++++++++++++---
 include/net/busy_poll.h                  |  2 +-
 include/net/gro.h                        |  2 +-
 lib/ratelimit.c                          |  8 +++++---
 net/core/bpf_sk_storage.c                |  5 +++--
 net/core/dev.c                           | 20 ++++++++++----------
 net/core/filter.c                        | 13 +++++++------
 net/core/gro_cells.c                     |  2 +-
 net/core/skbuff.c                        |  2 +-
 net/core/sock.c                          | 18 ++++++++++--------
 net/core/sysctl_net_core.c               |  6 ++++--
 net/ipv4/devinet.c                       | 16 ++++++++++------
 net/ipv4/ip_output.c                     |  2 +-
 net/ipv4/ip_sockglue.c                   |  6 +++---
 net/ipv4/tcp.c                           |  4 ++--
 net/ipv4/tcp_output.c                    |  2 +-
 net/ipv6/addrconf.c                      |  5 ++---
 net/ipv6/ipv6_sockglue.c                 |  4 ++--
 net/mptcp/protocol.c                     |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c          |  4 ++--
 net/sched/sch_generic.c                  |  2 +-
 net/socket.c                             |  2 +-
 net/xfrm/espintcp.c                      |  2 +-
 net/xfrm/xfrm_input.c                    |  2 +-
 25 files changed, 89 insertions(+), 64 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 net 01/17] net: Fix data-races around sysctl_[rw]mem_(max|default).
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Kuniyuki Iwashima
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading sysctl_[rw]mem_(max|default), they can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/filter.c               | 4 ++--
 net/core/sock.c                 | 8 ++++----
 net/ipv4/ip_output.c            | 2 +-
 net/ipv4/tcp_output.c           | 2 +-
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index e8508aaafd27..c4f14ad82029 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5034,14 +5034,14 @@ static int __bpf_setsockopt(struct sock *sk, int level, int optname,
 		/* Only some socketops are supported */
 		switch (optname) {
 		case SO_RCVBUF:
-			val = min_t(u32, val, sysctl_rmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_rmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 			WRITE_ONCE(sk->sk_rcvbuf,
 				   max_t(int, val * 2, SOCK_MIN_RCVBUF));
 			break;
 		case SO_SNDBUF:
-			val = min_t(u32, val, sysctl_wmem_max);
+			val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 			val = min_t(int, val, INT_MAX / 2);
 			sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 			WRITE_ONCE(sk->sk_sndbuf,
diff --git a/net/core/sock.c b/net/core/sock.c
index 4cb957d934a2..303af52f3b79 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1101,7 +1101,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		val = min_t(u32, val, sysctl_wmem_max);
+		val = min_t(u32, val, READ_ONCE(sysctl_wmem_max));
 set_sndbuf:
 		/* Ensure val * 2 fits into an int, to prevent max_t()
 		 * from treating it as a negative value.
@@ -1133,7 +1133,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		 * play 'guess the biggest size' games. RCVBUF/SNDBUF
 		 * are treated in BSD as hints
 		 */
-		__sock_set_rcvbuf(sk, min_t(u32, val, sysctl_rmem_max));
+		__sock_set_rcvbuf(sk, min_t(u32, val, READ_ONCE(sysctl_rmem_max)));
 		break;
 
 	case SO_RCVBUFFORCE:
@@ -3309,8 +3309,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation	=	GFP_KERNEL;
-	sk->sk_rcvbuf		=	sysctl_rmem_default;
-	sk->sk_sndbuf		=	sysctl_wmem_default;
+	sk->sk_rcvbuf		=	READ_ONCE(sysctl_rmem_default);
+	sk->sk_sndbuf		=	READ_ONCE(sysctl_wmem_default);
 	sk->sk_state		=	TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d7bd1daf022b..04e2034f2f8e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1730,7 +1730,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = READ_ONCE(sysctl_wmem_default);
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 78b654ff421b..290019de766d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -239,7 +239,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
 	if (wscale_ok) {
 		/* Set window scaling on max possible window */
 		space = max_t(u32, space, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
-		space = max_t(u32, space, sysctl_rmem_max);
+		space = max_t(u32, space, READ_ONCE(sysctl_rmem_max));
 		space = min_t(u32, space, *window_clamp);
 		*rcv_wscale = clamp_t(int, ilog2(space) - 15,
 				      0, TCP_MAX_WSCALE);
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 9d43277b8b4f..a56fd0b5a430 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1280,12 +1280,12 @@ static void set_sock_size(struct sock *sk, int mode, int val)
 	lock_sock(sk);
 	if (mode) {
 		val = clamp_t(int, val, (SOCK_MIN_SNDBUF + 1) / 2,
-			      sysctl_wmem_max);
+			      READ_ONCE(sysctl_wmem_max));
 		sk->sk_sndbuf = val * 2;
 		sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
 	} else {
 		val = clamp_t(int, val, (SOCK_MIN_RCVBUF + 1) / 2,
-			      sysctl_rmem_max);
+			      READ_ONCE(sysctl_rmem_max));
 		sk->sk_rcvbuf = val * 2;
 		sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 01/17] net: Fix data-races around sysctl_[rw]mem_(max|default) Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-20  0:03   ` Jakub Kicinski
  2022-08-18 18:26 ` [PATCH v3 net 03/17] net: Fix data-races around netdev_max_backlog Kuniyuki Iwashima
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Matthias Tafelmeier

While reading weight_p and dev_weight_[rt]x_bias, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 3d48b53fb2ae ("net: dev_weight: TX/RX orthogonality")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Matthias Tafelmeier <matthias.tafelmeier@gmx.net>
---
 net/core/dev.c             | 2 +-
 net/core/sysctl_net_core.c | 6 ++++--
 net/sched/sch_generic.c    | 2 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 716df64fcfa5..b5b92dcd5eea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5918,7 +5918,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
 		net_rps_action_and_irq_enable(sd);
 	}
 
-	napi->weight = dev_rx_weight;
+	napi->weight = READ_ONCE(dev_rx_weight);
 	while (again) {
 		struct sk_buff *skb;
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 71a13596ea2b..d82ba0c27175 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -240,8 +240,10 @@ static int proc_do_dev_weight(struct ctl_table *table, int write,
 	if (ret != 0)
 		return ret;
 
-	dev_rx_weight = weight_p * dev_weight_rx_bias;
-	dev_tx_weight = weight_p * dev_weight_tx_bias;
+	WRITE_ONCE(dev_rx_weight,
+		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_rx_bias));
+	WRITE_ONCE(dev_tx_weight,
+		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_tx_bias));
 
 	return ret;
 }
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d47b9689eba6..99b697ad2b98 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -409,7 +409,7 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
 
 void __qdisc_run(struct Qdisc *q)
 {
-	int quota = dev_tx_weight;
+	int quota = READ_ONCE(dev_tx_weight);
 	int packets;
 
 	while (qdisc_restart(q, &packets)) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 03/17] net: Fix data-races around netdev_max_backlog.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 01/17] net: Fix data-races around sysctl_[rw]mem_(max|default) Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 04/17] net: Fix data-races around netdev_tstamp_prequeue Kuniyuki Iwashima
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading netdev_max_backlog, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

While at it, we remove the unnecessary spaces in the doc.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 Documentation/admin-guide/sysctl/net.rst | 2 +-
 net/core/dev.c                           | 4 ++--
 net/core/gro_cells.c                     | 2 +-
 net/xfrm/espintcp.c                      | 2 +-
 net/xfrm/xfrm_input.c                    | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 805f2281e000..60d44165fba7 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -271,7 +271,7 @@ poll cycle or the number of packets processed reaches netdev_budget.
 netdev_max_backlog
 ------------------
 
-Maximum number  of  packets,  queued  on  the  INPUT  side, when the interface
+Maximum number of packets, queued on the INPUT side, when the interface
 receives packets faster than kernel can process them.
 
 netdev_rss_key
diff --git a/net/core/dev.c b/net/core/dev.c
index b5b92dcd5eea..07da69c1ac0a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4624,7 +4624,7 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
 	struct softnet_data *sd;
 	unsigned int old_flow, new_flow;
 
-	if (qlen < (netdev_max_backlog >> 1))
+	if (qlen < (READ_ONCE(netdev_max_backlog) >> 1))
 		return false;
 
 	sd = this_cpu_ptr(&softnet_data);
@@ -4672,7 +4672,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 	if (!netif_running(skb->dev))
 		goto drop;
 	qlen = skb_queue_len(&sd->input_pkt_queue);
-	if (qlen <= netdev_max_backlog && !skb_flow_limit(skb, qlen)) {
+	if (qlen <= READ_ONCE(netdev_max_backlog) && !skb_flow_limit(skb, qlen)) {
 		if (qlen) {
 enqueue:
 			__skb_queue_tail(&sd->input_pkt_queue, skb);
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index 541c7a72a28a..21619c70a82b 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -26,7 +26,7 @@ int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb)
 
 	cell = this_cpu_ptr(gcells->cells);
 
-	if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+	if (skb_queue_len(&cell->napi_skbs) > READ_ONCE(netdev_max_backlog)) {
 drop:
 		dev_core_stats_rx_dropped_inc(dev);
 		kfree_skb(skb);
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 82d14eea1b5a..974eb97b77d2 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -168,7 +168,7 @@ int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
 {
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 
-	if (skb_queue_len(&ctx->out_queue) >= netdev_max_backlog)
+	if (skb_queue_len(&ctx->out_queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	__skb_queue_tail(&ctx->out_queue, skb);
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 144238a50f3d..a3eb21a85810 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -783,7 +783,7 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
 
 	trans = this_cpu_ptr(&xfrm_trans_tasklet);
 
-	if (skb_queue_len(&trans->queue) >= netdev_max_backlog)
+	if (skb_queue_len(&trans->queue) >= READ_ONCE(netdev_max_backlog))
 		return -ENOBUFS;
 
 	BUILD_BUG_ON(sizeof(struct xfrm_trans_cb) > sizeof(skb->cb));
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 04/17] net: Fix data-races around netdev_tstamp_prequeue.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (2 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 03/17] net: Fix data-races around netdev_max_backlog Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit() Kuniyuki Iwashima
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading netdev_tstamp_prequeue, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 3b098e2d7c69 ("net: Consistent skb timestamping")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 07da69c1ac0a..4705e6630efa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4928,7 +4928,7 @@ static int netif_rx_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_rx(skb);
 
@@ -5281,7 +5281,7 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
 	int ret = NET_RX_DROP;
 	__be16 type;
 
-	net_timestamp_check(!netdev_tstamp_prequeue, skb);
+	net_timestamp_check(!READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	trace_netif_receive_skb(skb);
 
@@ -5664,7 +5664,7 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
 {
 	int ret;
 
-	net_timestamp_check(netdev_tstamp_prequeue, skb);
+	net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 
 	if (skb_defer_rx_timestamp(skb))
 		return NET_RX_SUCCESS;
@@ -5694,7 +5694,7 @@ void netif_receive_skb_list_internal(struct list_head *head)
 
 	INIT_LIST_HEAD(&sublist);
 	list_for_each_entry_safe(skb, next, head, list) {
-		net_timestamp_check(netdev_tstamp_prequeue, skb);
+		net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
 		skb_list_del_init(skb);
 		if (!skb_defer_rx_timestamp(skb))
 			list_add_tail(&skb->list, &sublist);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (3 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 04/17] net: Fix data-races around netdev_tstamp_prequeue Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-20  0:04   ` Jakub Kicinski
  2022-08-22 19:00   ` Eric Dumazet
  2022-08-18 18:26 ` [PATCH v3 net 06/17] net: Fix data-races around sysctl_optmem_max Kuniyuki Iwashima
                   ` (12 subsequent siblings)
  17 siblings, 2 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading rs->interval and rs->burst, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 lib/ratelimit.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index e01a93f46f83..b59a1d3d0cc3 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -26,10 +26,12 @@
  */
 int ___ratelimit(struct ratelimit_state *rs, const char *func)
 {
+	int interval = READ_ONCE(rs->interval);
+	int burst = READ_ONCE(rs->burst);
 	unsigned long flags;
 	int ret;
 
-	if (!rs->interval)
+	if (!interval)
 		return 1;
 
 	/*
@@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 	if (!rs->begin)
 		rs->begin = jiffies;
 
-	if (time_is_before_jiffies(rs->begin + rs->interval)) {
+	if (time_is_before_jiffies(rs->begin + interval)) {
 		if (rs->missed) {
 			if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
 				printk_deferred(KERN_WARNING
@@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
 		rs->begin   = jiffies;
 		rs->printed = 0;
 	}
-	if (rs->burst && rs->burst > rs->printed) {
+	if (burst && burst > rs->printed) {
 		rs->printed++;
 		ret = 1;
 	} else {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 06/17] net: Fix data-races around sysctl_optmem_max.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (4 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit() Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 07/17] net: Fix a data-race around sysctl_tstamp_allow_data Kuniyuki Iwashima
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading sysctl_optmem_max, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/bpf_sk_storage.c | 5 +++--
 net/core/filter.c         | 9 +++++----
 net/core/sock.c           | 8 +++++---
 net/ipv4/ip_sockglue.c    | 6 +++---
 net/ipv6/ipv6_sockglue.c  | 4 ++--
 5 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 1b7f385643b4..94374d529ea4 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -310,11 +310,12 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
 static int bpf_sk_storage_charge(struct bpf_local_storage_map *smap,
 				 void *owner, u32 size)
 {
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
 	struct sock *sk = (struct sock *)owner;
 
 	/* same check as in sock_kmalloc() */
-	if (size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+	if (size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
 		atomic_add(size, &sk->sk_omem_alloc);
 		return 0;
 	}
diff --git a/net/core/filter.c b/net/core/filter.c
index c4f14ad82029..c191db80ce93 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1214,10 +1214,11 @@ void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp)
 static bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
 {
 	u32 filter_size = bpf_prog_size(fp->prog->len);
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
 
 	/* same check as in sock_kmalloc() */
-	if (filter_size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
+	if (filter_size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + filter_size < optmem_max) {
 		atomic_add(filter_size, &sk->sk_omem_alloc);
 		return true;
 	}
@@ -1548,7 +1549,7 @@ int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
-	if (bpf_prog_size(prog->len) > sysctl_optmem_max)
+	if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max))
 		err = -ENOMEM;
 	else
 		err = reuseport_attach_prog(sk, prog);
@@ -1615,7 +1616,7 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
 		}
 	} else {
 		/* BPF_PROG_TYPE_SOCKET_FILTER */
-		if (bpf_prog_size(prog->len) > sysctl_optmem_max) {
+		if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max)) {
 			err = -ENOMEM;
 			goto err_prog_put;
 		}
diff --git a/net/core/sock.c b/net/core/sock.c
index 303af52f3b79..95abf4604d88 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2536,7 +2536,7 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,
 
 	/* small safe race: SKB_TRUESIZE may differ from final skb->truesize */
 	if (atomic_read(&sk->sk_omem_alloc) + SKB_TRUESIZE(size) >
-	    sysctl_optmem_max)
+	    READ_ONCE(sysctl_optmem_max))
 		return NULL;
 
 	skb = alloc_skb(size, priority);
@@ -2554,8 +2554,10 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,
  */
 void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
 {
-	if ((unsigned int)size <= sysctl_optmem_max &&
-	    atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+	int optmem_max = READ_ONCE(sysctl_optmem_max);
+
+	if ((unsigned int)size <= optmem_max &&
+	    atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
 		void *mem;
 		/* First do the add, to avoid the race if kmalloc
 		 * might sleep.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index a8a323ecbb54..e49a61a053a6 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -772,7 +772,7 @@ static int ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval, int optlen)
 
 	if (optlen < GROUP_FILTER_SIZE(0))
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max)
+	if (optlen > READ_ONCE(sysctl_optmem_max))
 		return -ENOBUFS;
 
 	gsf = memdup_sockptr(optval, optlen);
@@ -808,7 +808,7 @@ static int compat_ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < size0)
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max - 4)
+	if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
 		return -ENOBUFS;
 
 	p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -1233,7 +1233,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
 
 		if (optlen < IP_MSFILTER_SIZE(0))
 			goto e_inval;
-		if (optlen > sysctl_optmem_max) {
+		if (optlen > READ_ONCE(sysctl_optmem_max)) {
 			err = -ENOBUFS;
 			break;
 		}
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 222f6bf220ba..e0dcc7a193df 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -210,7 +210,7 @@ static int ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < GROUP_FILTER_SIZE(0))
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max)
+	if (optlen > READ_ONCE(sysctl_optmem_max))
 		return -ENOBUFS;
 
 	gsf = memdup_sockptr(optval, optlen);
@@ -244,7 +244,7 @@ static int compat_ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
 
 	if (optlen < size0)
 		return -EINVAL;
-	if (optlen > sysctl_optmem_max - 4)
+	if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
 		return -ENOBUFS;
 
 	p = kmalloc(optlen + 4, GFP_KERNEL);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 07/17] net: Fix a data-race around sysctl_tstamp_allow_data.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (5 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 06/17] net: Fix data-races around sysctl_optmem_max Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 08/17] net: Fix a data-race around sysctl_net_busy_poll Kuniyuki Iwashima
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Willem de Bruijn

While reading sysctl_tstamp_allow_data, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: b245be1f4db1 ("net-timestamp: no-payload only sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Willem de Bruijn <willemb@google.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 974bbbbe7138..174f34124c06 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4798,7 +4798,7 @@ static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
 {
 	bool ret;
 
-	if (likely(sysctl_tstamp_allow_data || tsonly))
+	if (likely(READ_ONCE(sysctl_tstamp_allow_data) || tsonly))
 		return true;
 
 	read_lock_bh(&sk->sk_callback_lock);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 08/17] net: Fix a data-race around sysctl_net_busy_poll.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (6 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 07/17] net: Fix a data-race around sysctl_tstamp_allow_data Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 09/17] net: Fix a data-race around sysctl_net_busy_read Kuniyuki Iwashima
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Eliezer Tamir

While reading sysctl_net_busy_poll, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 060212928670 ("net: add low latency socket poll")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 include/net/busy_poll.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index c4898fcbf923..f90f0021f5f2 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -33,7 +33,7 @@ extern unsigned int sysctl_net_busy_poll __read_mostly;
 
 static inline bool net_busy_loop_on(void)
 {
-	return sysctl_net_busy_poll;
+	return READ_ONCE(sysctl_net_busy_poll);
 }
 
 static inline bool sk_can_busy_loop(const struct sock *sk)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 09/17] net: Fix a data-race around sysctl_net_busy_read.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (7 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 08/17] net: Fix a data-race around sysctl_net_busy_poll Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 10/17] net: Fix a data-race around netdev_budget Kuniyuki Iwashima
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Eliezer Tamir

While reading sysctl_net_busy_read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 2d48d67fa8cd ("net: poll/select low latency socket support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 95abf4604d88..788c1372663c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3367,7 +3367,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	sk->sk_napi_id		=	0;
-	sk->sk_ll_usec		=	sysctl_net_busy_read;
+	sk->sk_ll_usec		=	READ_ONCE(sysctl_net_busy_read);
 #endif
 
 	sk->sk_max_pacing_rate = ~0UL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 10/17] net: Fix a data-race around netdev_budget.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (8 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 09/17] net: Fix a data-race around sysctl_net_busy_read Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 11/17] net: Fix data-races around sysctl_max_skb_frags Kuniyuki Iwashima
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Stephen Hemminger

While reading netdev_budget, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 51b0bdedb8e7 ("[NET]: Separate two usages of netdev_max_backlog.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Stephen Hemminger <shemminger@osdl.org>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4705e6630efa..c83e23cfc57d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6666,7 +6666,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies +
 		usecs_to_jiffies(netdev_budget_usecs);
-	int budget = netdev_budget;
+	int budget = READ_ONCE(netdev_budget);
 	LIST_HEAD(list);
 	LIST_HEAD(repoll);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 11/17] net: Fix data-races around sysctl_max_skb_frags.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (9 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 10/17] net: Fix a data-race around netdev_budget Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 12/17] net: Fix a data-race around netdev_budget_usecs Kuniyuki Iwashima
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Hans Westgaard Ry

While reading sysctl_max_skb_frags, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
---
 net/ipv4/tcp.c       | 4 ++--
 net/mptcp/protocol.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 970e9a2cca4a..9a6fe3d6ab26 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1000,7 +1000,7 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 
 	i = skb_shinfo(skb)->nr_frags;
 	can_coalesce = skb_can_coalesce(skb, i, page, offset);
-	if (!can_coalesce && i >= sysctl_max_skb_frags) {
+	if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
 		tcp_mark_push(tp, skb);
 		goto new_segment;
 	}
@@ -1354,7 +1354,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 			if (!skb_can_coalesce(skb, i, pfrag->page,
 					      pfrag->offset)) {
-				if (i >= sysctl_max_skb_frags) {
+				if (i >= READ_ONCE(sysctl_max_skb_frags)) {
 					tcp_mark_push(tp, skb);
 					goto new_segment;
 				}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index da4257504fad..d398f3810662 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1263,7 +1263,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 
 		i = skb_shinfo(skb)->nr_frags;
 		can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset);
-		if (!can_coalesce && i >= sysctl_max_skb_frags) {
+		if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
 			tcp_mark_push(tcp_sk(ssk), skb);
 			goto alloc_skb;
 		}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 12/17] net: Fix a data-race around netdev_budget_usecs.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (10 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 11/17] net: Fix data-races around sysctl_max_skb_frags Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 13/17] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Kuniyuki Iwashima
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Matthew Whitehead

While reading netdev_budget_usecs, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 7acf8a1e8a28 ("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Matthew Whitehead <tedheadster@gmail.com>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c83e23cfc57d..8221322d86db 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6665,7 +6665,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
 	unsigned long time_limit = jiffies +
-		usecs_to_jiffies(netdev_budget_usecs);
+		usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
 	int budget = READ_ONCE(netdev_budget);
 	LIST_HEAD(list);
 	LIST_HEAD(repoll);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 13/17] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (11 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 12/17] net: Fix a data-race around netdev_budget_usecs Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 14/17] net: Fix data-races around sysctl_devconf_inherit_init_net Kuniyuki Iwashima
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading sysctl_fb_tunnels_only_for_init_net, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/linux/netdevice.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1a3cb93c3dcc..6d3a33fd0cdb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -640,9 +640,14 @@ extern int sysctl_devconf_inherit_init_net;
  */
 static inline bool net_has_fallback_tunnels(const struct net *net)
 {
-	return !IS_ENABLED(CONFIG_SYSCTL) ||
-	       !sysctl_fb_tunnels_only_for_init_net ||
-	       (net == &init_net && sysctl_fb_tunnels_only_for_init_net == 1);
+#if IS_ENABLED(CONFIG_SYSCTL)
+	int fb_tunnels_only_for_init_net = READ_ONCE(sysctl_fb_tunnels_only_for_init_net);
+
+	return !fb_tunnels_only_for_init_net ||
+		(net_eq(net, &init_net) && fb_tunnels_only_for_init_net == 1);
+#else
+	return true;
+#endif
 }
 
 static inline int netdev_queue_numa_node_read(const struct netdev_queue *q)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 14/17] net: Fix data-races around sysctl_devconf_inherit_init_net.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (12 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 13/17] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 15/17] net: Fix a data-race around gro_normal_batch Kuniyuki Iwashima
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Cong Wang

While reading sysctl_devconf_inherit_init_net, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 856c395cfa63 ("net: introduce a knob to control whether to inherit devconf config")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
CC: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/linux/netdevice.h |  9 +++++++++
 net/ipv4/devinet.c        | 16 ++++++++++------
 net/ipv6/addrconf.c       |  5 ++---
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6d3a33fd0cdb..05d6f3facd5a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -650,6 +650,15 @@ static inline bool net_has_fallback_tunnels(const struct net *net)
 #endif
 }
 
+static inline int net_inherit_devconf(void)
+{
+#if IS_ENABLED(CONFIG_SYSCTL)
+	return READ_ONCE(sysctl_devconf_inherit_init_net);
+#else
+	return 0;
+#endif
+}
+
 static inline int netdev_queue_numa_node_read(const struct netdev_queue *q)
 {
 #if defined(CONFIG_XPS) && defined(CONFIG_NUMA)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 92b778e423df..e8b9a9202fec 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2682,23 +2682,27 @@ static __net_init int devinet_init_net(struct net *net)
 #endif
 
 	if (!net_eq(net, &init_net)) {
-		if (IS_ENABLED(CONFIG_SYSCTL) &&
-		    sysctl_devconf_inherit_init_net == 3) {
+		switch (net_inherit_devconf()) {
+		case 3:
 			/* copy from the current netns */
 			memcpy(all, current->nsproxy->net_ns->ipv4.devconf_all,
 			       sizeof(ipv4_devconf));
 			memcpy(dflt,
 			       current->nsproxy->net_ns->ipv4.devconf_dflt,
 			       sizeof(ipv4_devconf_dflt));
-		} else if (!IS_ENABLED(CONFIG_SYSCTL) ||
-			   sysctl_devconf_inherit_init_net != 2) {
-			/* inherit == 0 or 1: copy from init_net */
+			break;
+		case 0:
+		case 1:
+			/* copy from init_net */
 			memcpy(all, init_net.ipv4.devconf_all,
 			       sizeof(ipv4_devconf));
 			memcpy(dflt, init_net.ipv4.devconf_dflt,
 			       sizeof(ipv4_devconf_dflt));
+			break;
+		case 2:
+			/* use compiled values */
+			break;
 		}
-		/* else inherit == 2: use compiled values */
 	}
 
 #ifdef CONFIG_SYSCTL
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b624e3d8c5f0..e15f64f22fa8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -7162,9 +7162,8 @@ static int __net_init addrconf_init_net(struct net *net)
 	if (!dflt)
 		goto err_alloc_dflt;
 
-	if (IS_ENABLED(CONFIG_SYSCTL) &&
-	    !net_eq(net, &init_net)) {
-		switch (sysctl_devconf_inherit_init_net) {
+	if (!net_eq(net, &init_net)) {
+		switch (net_inherit_devconf()) {
 		case 1:  /* copy from init_net */
 			memcpy(all, init_net.ipv6.devconf_all,
 			       sizeof(ipv6_devconf));
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 15/17] net: Fix a data-race around gro_normal_batch.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (13 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 14/17] net: Fix data-races around sysctl_devconf_inherit_init_net Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 16/17] net: Fix a data-race around netdev_unregister_timeout_secs Kuniyuki Iwashima
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Edward Cree

While reading gro_normal_batch, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 323ebb61e32b ("net: use listified RX for handling GRO_NORMAL skbs")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
---
 include/net/gro.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 867656b0739c..24003dea8fa4 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -439,7 +439,7 @@ static inline void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb,
 {
 	list_add_tail(&skb->list, &napi->rx_list);
 	napi->rx_count += segs;
-	if (napi->rx_count >= gro_normal_batch)
+	if (napi->rx_count >= READ_ONCE(gro_normal_batch))
 		gro_normal_list(napi);
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 16/17] net: Fix a data-race around netdev_unregister_timeout_secs.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (14 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 15/17] net: Fix a data-race around gro_normal_batch Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 18:26 ` [PATCH v3 net 17/17] net: Fix a data-race around sysctl_somaxconn Kuniyuki Iwashima
  2022-08-18 19:00 ` [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Jakub Kicinski
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Dmitry Vyukov

While reading netdev_unregister_timeout_secs, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 5aa3afe107d9 ("net: make unregister netdev warning timeout configurable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8221322d86db..56c8b0921c9f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10284,7 +10284,7 @@ static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
 				return dev;
 
 		if (time_after(jiffies, warning_time +
-			       netdev_unregister_timeout_secs * HZ)) {
+			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
 			list_for_each_entry(dev, list, todo_list) {
 				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
 					 dev->name, netdev_refcnt_read(dev));
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net 17/17] net: Fix a data-race around sysctl_somaxconn.
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (15 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 16/17] net: Fix a data-race around netdev_unregister_timeout_secs Kuniyuki Iwashima
@ 2022-08-18 18:26 ` Kuniyuki Iwashima
  2022-08-18 19:00 ` [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Jakub Kicinski
  17 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 18:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

While reading sysctl_somaxconn, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 9b27c5e4e5ba..7378375d3a5b 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1801,7 +1801,7 @@ int __sys_listen(int fd, int backlog)
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
 	if (sock) {
-		somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
+		somaxconn = READ_ONCE(sock_net(sock->sk)->core.sysctl_somaxconn);
 		if ((unsigned int)backlog > somaxconn)
 			backlog = somaxconn;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX
  2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
                   ` (16 preceding siblings ...)
  2022-08-18 18:26 ` [PATCH v3 net 17/17] net: Fix a data-race around sysctl_somaxconn Kuniyuki Iwashima
@ 2022-08-18 19:00 ` Jakub Kicinski
  2022-08-18 19:09   ` Kuniyuki Iwashima
  17 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2022-08-18 19:00 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Kuniyuki Iwashima, netdev

On Thu, 18 Aug 2022 11:26:36 -0700 Kuniyuki Iwashima wrote:
> This series fixes data-races around all knobs in net_core_table and
> netns_core_table except for bpf stuff.

No need to repost or split this one, but for future reference please 
be reminded that the limit is 15 patches per series:

https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX
  2022-08-18 19:00 ` [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Jakub Kicinski
@ 2022-08-18 19:09   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-18 19:09 UTC (permalink / raw)
  To: kuba; +Cc: davem, edumazet, kuni1840, kuniyu, netdev, pabeni

From:   Jakub Kicinski <kuba@kernel.org>
Date:   Thu, 18 Aug 2022 12:00:25 -0700
> On Thu, 18 Aug 2022 11:26:36 -0700 Kuniyuki Iwashima wrote:
> > This series fixes data-races around all knobs in net_core_table and
> > netns_core_table except for bpf stuff.
> 
> No need to repost or split this one, but for future reference please 
> be reminded that the limit is 15 patches per series:
> 
> https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html

Sorry, I'll keep it in mind.
Thank you.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  2022-08-18 18:26 ` [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Kuniyuki Iwashima
@ 2022-08-20  0:03   ` Jakub Kicinski
  2022-08-22 18:30     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2022-08-20  0:03 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Kuniyuki Iwashima,
	netdev, Matthias Tafelmeier

On Thu, 18 Aug 2022 11:26:38 -0700 Kuniyuki Iwashima wrote:
> -	dev_rx_weight = weight_p * dev_weight_rx_bias;
> -	dev_tx_weight = weight_p * dev_weight_tx_bias;
> +	WRITE_ONCE(dev_rx_weight,
> +		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_rx_bias));
> +	WRITE_ONCE(dev_tx_weight,
> +		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_tx_bias));

Is there some locking on procfs writes? Otherwise one interrupted write
may get overtaken by another and we'll end up with inconsistent values.
OTOH if there is some locking we shouldn't have to protect weight_p
here.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-18 18:26 ` [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit() Kuniyuki Iwashima
@ 2022-08-20  0:04   ` Jakub Kicinski
  2022-08-22 18:32     ` Kuniyuki Iwashima
  2022-08-22 19:00   ` Eric Dumazet
  1 sibling, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2022-08-20  0:04 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Kuniyuki Iwashima, netdev

On Thu, 18 Aug 2022 11:26:41 -0700 Kuniyuki Iwashima wrote:
> +	int interval = READ_ONCE(rs->interval);
> +	int burst = READ_ONCE(rs->burst);

Also feels a little bit like papering over an issue if we read 
two values separately.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
  2022-08-20  0:03   ` Jakub Kicinski
@ 2022-08-22 18:30     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-22 18:30 UTC (permalink / raw)
  To: kuba
  Cc: davem, edumazet, kuni1840, kuniyu, matthias.tafelmeier, netdev, pabeni

From:   Jakub Kicinski <kuba@kernel.org>
Date:   Fri, 19 Aug 2022 17:03:01 -0700
> On Thu, 18 Aug 2022 11:26:38 -0700 Kuniyuki Iwashima wrote:
> > -	dev_rx_weight = weight_p * dev_weight_rx_bias;
> > -	dev_tx_weight = weight_p * dev_weight_tx_bias;
> > +	WRITE_ONCE(dev_rx_weight,
> > +		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_rx_bias));
> > +	WRITE_ONCE(dev_tx_weight,
> > +		   READ_ONCE(weight_p) * READ_ONCE(dev_weight_tx_bias));
> 
> Is there some locking on procfs writes? Otherwise one interrupted write
> may get overtaken by another and we'll end up with inconsistent values.

Thanks for catching!
procfs doesn't provide locking for writes, so we need a mutex like other
knobs.


> OTOH if there is some locking we shouldn't have to protect weight_p
> here.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-20  0:04   ` Jakub Kicinski
@ 2022-08-22 18:32     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-22 18:32 UTC (permalink / raw)
  To: kuba; +Cc: davem, edumazet, kuni1840, kuniyu, netdev, pabeni

From:   Jakub Kicinski <kuba@kernel.org>
Date:   Fri, 19 Aug 2022 17:04:46 -0700
> On Thu, 18 Aug 2022 11:26:41 -0700 Kuniyuki Iwashima wrote:
> > +	int interval = READ_ONCE(rs->interval);
> > +	int burst = READ_ONCE(rs->burst);
> 
> Also feels a little bit like papering over an issue if we read 
> two values separately.

Exactly, we have to protect it with a single lock.
Considering ___ratelimit() can be called in many paths, it seems better
to add a spin lock in struct ratelimit_state.

Thank you.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-18 18:26 ` [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit() Kuniyuki Iwashima
  2022-08-20  0:04   ` Jakub Kicinski
@ 2022-08-22 19:00   ` Eric Dumazet
  2022-08-22 19:14     ` Kuniyuki Iwashima
  1 sibling, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2022-08-22 19:00 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima, netdev

On Thu, Aug 18, 2022 at 11:29 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> While reading rs->interval and rs->burst, they can be changed
> concurrently.  Thus, we need to add READ_ONCE() to their readers.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  lib/ratelimit.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> index e01a93f46f83..b59a1d3d0cc3 100644
> --- a/lib/ratelimit.c
> +++ b/lib/ratelimit.c
> @@ -26,10 +26,12 @@
>   */
>  int ___ratelimit(struct ratelimit_state *rs, const char *func)
>  {
> +       int interval = READ_ONCE(rs->interval);
> +       int burst = READ_ONCE(rs->burst);

I thought rs->interval and rs->burst were constants...

Can you point to the part where they are changed ?

Ideally such a patch should also add corresponding WRITE_ONCE(), and
comments to pair them,
 this would really help reviewing it.


>         unsigned long flags;
>         int ret;
>
> -       if (!rs->interval)
> +       if (!interval)
>                 return 1;
>
>         /*
> @@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
>         if (!rs->begin)
>                 rs->begin = jiffies;
>
> -       if (time_is_before_jiffies(rs->begin + rs->interval)) {
> +       if (time_is_before_jiffies(rs->begin + interval)) {
>                 if (rs->missed) {
>                         if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
>                                 printk_deferred(KERN_WARNING
> @@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
>                 rs->begin   = jiffies;
>                 rs->printed = 0;
>         }
> -       if (rs->burst && rs->burst > rs->printed) {
> +       if (burst && burst > rs->printed) {
>                 rs->printed++;
>                 ret = 1;
>         } else {
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-22 19:00   ` Eric Dumazet
@ 2022-08-22 19:14     ` Kuniyuki Iwashima
  2022-08-22 19:22       ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-22 19:14 UTC (permalink / raw)
  To: edumazet; +Cc: davem, kuba, kuni1840, kuniyu, netdev, pabeni

From:   Eric Dumazet <edumazet@google.com>
Date:   Mon, 22 Aug 2022 12:00:11 -0700
> On Thu, Aug 18, 2022 at 11:29 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > While reading rs->interval and rs->burst, they can be changed
> > concurrently.  Thus, we need to add READ_ONCE() to their readers.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > ---
> >  lib/ratelimit.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > index e01a93f46f83..b59a1d3d0cc3 100644
> > --- a/lib/ratelimit.c
> > +++ b/lib/ratelimit.c
> > @@ -26,10 +26,12 @@
> >   */
> >  int ___ratelimit(struct ratelimit_state *rs, const char *func)
> >  {
> > +       int interval = READ_ONCE(rs->interval);
> > +       int burst = READ_ONCE(rs->burst);
> 
> I thought rs->interval and rs->burst were constants...
> 
> Can you point to the part where they are changed ?
> 
> Ideally such a patch should also add corresponding WRITE_ONCE(), and
> comments to pair them,
>  this would really help reviewing it.

In this case, &net_ratelimit_state.(burst|interval) are directly
passed to proc_handlers, and exactly the relation is unclear.

As Jakub pointed out, two reads can be inconsistent, so I'll add
a spin lock in struct ratelimit_state and two dedicated proc
handlers for each member.  Then, I'll add few more comments to
make that relation clear.

Thanks for feedback!


> >         unsigned long flags;
> >         int ret;
> >
> > -       if (!rs->interval)
> > +       if (!interval)
> >                 return 1;
> >
> >         /*
> > @@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> >         if (!rs->begin)
> >                 rs->begin = jiffies;
> >
> > -       if (time_is_before_jiffies(rs->begin + rs->interval)) {
> > +       if (time_is_before_jiffies(rs->begin + interval)) {
> >                 if (rs->missed) {
> >                         if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
> >                                 printk_deferred(KERN_WARNING
> > @@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> >                 rs->begin   = jiffies;
> >                 rs->printed = 0;
> >         }
> > -       if (rs->burst && rs->burst > rs->printed) {
> > +       if (burst && burst > rs->printed) {
> >                 rs->printed++;
> >                 ret = 1;
> >         } else {
> > --
> > 2.30.2
> >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-22 19:14     ` Kuniyuki Iwashima
@ 2022-08-22 19:22       ` Eric Dumazet
  2022-08-22 19:49         ` Kuniyuki Iwashima
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2022-08-22 19:22 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David Miller, Jakub Kicinski, Kuniyuki Iwashima, netdev, Paolo Abeni

On Mon, Aug 22, 2022 at 12:15 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From:   Eric Dumazet <edumazet@google.com>
> Date:   Mon, 22 Aug 2022 12:00:11 -0700
> > On Thu, Aug 18, 2022 at 11:29 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > >
> > > While reading rs->interval and rs->burst, they can be changed
> > > concurrently.  Thus, we need to add READ_ONCE() to their readers.
> > >
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > ---
> > >  lib/ratelimit.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > > index e01a93f46f83..b59a1d3d0cc3 100644
> > > --- a/lib/ratelimit.c
> > > +++ b/lib/ratelimit.c
> > > @@ -26,10 +26,12 @@
> > >   */
> > >  int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > >  {
> > > +       int interval = READ_ONCE(rs->interval);
> > > +       int burst = READ_ONCE(rs->burst);
> >
> > I thought rs->interval and rs->burst were constants...
> >
> > Can you point to the part where they are changed ?
> >
> > Ideally such a patch should also add corresponding WRITE_ONCE(), and
> > comments to pair them,
> >  this would really help reviewing it.
>
> In this case, &net_ratelimit_state.(burst|interval) are directly
> passed to proc_handlers, and exactly the relation is unclear.
>
> As Jakub pointed out, two reads can be inconsistent, so I'll add
> a spin lock in struct ratelimit_state and two dedicated proc
> handlers for each member.

This seems overkill to me... Adding a comment explaining why a race
(or inconsistency) is acceptable is enough I think.

Otherwise, we will have to review all other 'struct ratelimit_state'
which expose
in r/w mode their @interval or @burst field.


>  Then, I'll add few more comments to
> make that relation clear.
>
> Thanks for feedback!
>
>
> > >         unsigned long flags;
> > >         int ret;
> > >
> > > -       if (!rs->interval)
> > > +       if (!interval)
> > >                 return 1;
> > >
> > >         /*
> > > @@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > >         if (!rs->begin)
> > >                 rs->begin = jiffies;
> > >
> > > -       if (time_is_before_jiffies(rs->begin + rs->interval)) {
> > > +       if (time_is_before_jiffies(rs->begin + interval)) {
> > >                 if (rs->missed) {
> > >                         if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
> > >                                 printk_deferred(KERN_WARNING
> > > @@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > >                 rs->begin   = jiffies;
> > >                 rs->printed = 0;
> > >         }
> > > -       if (rs->burst && rs->burst > rs->printed) {
> > > +       if (burst && burst > rs->printed) {
> > >                 rs->printed++;
> > >                 ret = 1;
> > >         } else {
> > > --
> > > 2.30.2
> > >

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit().
  2022-08-22 19:22       ` Eric Dumazet
@ 2022-08-22 19:49         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 28+ messages in thread
From: Kuniyuki Iwashima @ 2022-08-22 19:49 UTC (permalink / raw)
  To: edumazet; +Cc: davem, kuba, kuni1840, kuniyu, netdev, pabeni

From:   Eric Dumazet <edumazet@google.com>
Date:   Mon, 22 Aug 2022 12:22:35 -0700
> On Mon, Aug 22, 2022 at 12:15 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From:   Eric Dumazet <edumazet@google.com>
> > Date:   Mon, 22 Aug 2022 12:00:11 -0700
> > > On Thu, Aug 18, 2022 at 11:29 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> > > >
> > > > While reading rs->interval and rs->burst, they can be changed
> > > > concurrently.  Thus, we need to add READ_ONCE() to their readers.
> > > >
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > > > ---
> > > >  lib/ratelimit.c | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/ratelimit.c b/lib/ratelimit.c
> > > > index e01a93f46f83..b59a1d3d0cc3 100644
> > > > --- a/lib/ratelimit.c
> > > > +++ b/lib/ratelimit.c
> > > > @@ -26,10 +26,12 @@
> > > >   */
> > > >  int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > > >  {
> > > > +       int interval = READ_ONCE(rs->interval);
> > > > +       int burst = READ_ONCE(rs->burst);
> > >
> > > I thought rs->interval and rs->burst were constants...
> > >
> > > Can you point to the part where they are changed ?
> > >
> > > Ideally such a patch should also add corresponding WRITE_ONCE(), and
> > > comments to pair them,
> > >  this would really help reviewing it.
> >
> > In this case, &net_ratelimit_state.(burst|interval) are directly
> > passed to proc_handlers, and exactly the relation is unclear.
> >
> > As Jakub pointed out, two reads can be inconsistent, so I'll add
> > a spin lock in struct ratelimit_state and two dedicated proc
> > handlers for each member.
> 
> This seems overkill to me... Adding a comment explaining why a race
> (or inconsistency) is acceptable is enough I think.

Ok, I'll add a comment like this.

/* Paired with WRITE_ONCE() in .proc_handler(). (see: net_ratelimit_state)
 * Changing two values seperately could be inconsistent and some message
 * could be lost.
 */


> 
> Otherwise, we will have to review all other 'struct ratelimit_state'
> which expose
> in r/w mode their @interval or @burst field.
> 
> 
> >  Then, I'll add few more comments to
> > make that relation clear.
> >
> > Thanks for feedback!
> >
> >
> > > >         unsigned long flags;
> > > >         int ret;
> > > >
> > > > -       if (!rs->interval)
> > > > +       if (!interval)
> > > >                 return 1;
> > > >
> > > >         /*
> > > > @@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > > >         if (!rs->begin)
> > > >                 rs->begin = jiffies;
> > > >
> > > > -       if (time_is_before_jiffies(rs->begin + rs->interval)) {
> > > > +       if (time_is_before_jiffies(rs->begin + interval)) {
> > > >                 if (rs->missed) {
> > > >                         if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
> > > >                                 printk_deferred(KERN_WARNING
> > > > @@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
> > > >                 rs->begin   = jiffies;
> > > >                 rs->printed = 0;
> > > >         }
> > > > -       if (rs->burst && rs->burst > rs->printed) {
> > > > +       if (burst && burst > rs->printed) {
> > > >                 rs->printed++;
> > > >                 ret = 1;
> > > >         } else {
> > > > --
> > > > 2.30.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-08-22 19:49 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-18 18:26 [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 01/17] net: Fix data-races around sysctl_[rw]mem_(max|default) Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 02/17] net: Fix data-races around weight_p and dev_weight_[rt]x_bias Kuniyuki Iwashima
2022-08-20  0:03   ` Jakub Kicinski
2022-08-22 18:30     ` Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 03/17] net: Fix data-races around netdev_max_backlog Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 04/17] net: Fix data-races around netdev_tstamp_prequeue Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 05/17] ratelimit: Fix data-races in ___ratelimit() Kuniyuki Iwashima
2022-08-20  0:04   ` Jakub Kicinski
2022-08-22 18:32     ` Kuniyuki Iwashima
2022-08-22 19:00   ` Eric Dumazet
2022-08-22 19:14     ` Kuniyuki Iwashima
2022-08-22 19:22       ` Eric Dumazet
2022-08-22 19:49         ` Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 06/17] net: Fix data-races around sysctl_optmem_max Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 07/17] net: Fix a data-race around sysctl_tstamp_allow_data Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 08/17] net: Fix a data-race around sysctl_net_busy_poll Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 09/17] net: Fix a data-race around sysctl_net_busy_read Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 10/17] net: Fix a data-race around netdev_budget Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 11/17] net: Fix data-races around sysctl_max_skb_frags Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 12/17] net: Fix a data-race around netdev_budget_usecs Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 13/17] net: Fix data-races around sysctl_fb_tunnels_only_for_init_net Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 14/17] net: Fix data-races around sysctl_devconf_inherit_init_net Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 15/17] net: Fix a data-race around gro_normal_batch Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 16/17] net: Fix a data-race around netdev_unregister_timeout_secs Kuniyuki Iwashima
2022-08-18 18:26 ` [PATCH v3 net 17/17] net: Fix a data-race around sysctl_somaxconn Kuniyuki Iwashima
2022-08-18 19:00 ` [PATCH v3 net 00/17] net: sysctl: Fix data-races around net.core.XXX Jakub Kicinski
2022-08-18 19:09   ` Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).