All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH -RT] net: xfrm: fix compress vs decompress serialization
@ 2020-08-14 17:37 Davidlohr Bueso
  2020-08-17 14:47 ` Sebastian Andrzej Siewior
  2020-08-18 16:20 ` [PATCH -RT 4.x] " Davidlohr Bueso
  0 siblings, 2 replies; 6+ messages in thread
From: Davidlohr Bueso @ 2020-08-14 17:37 UTC (permalink / raw)
  To: bigeasy; +Cc: linux-rt-users, Davidlohr Bueso, Davidlohr Bueso

The following splat was seen when running ltp's 'tcp4_ipsec06' stresser on v4.x
based RT kernels:

[   82.523064] BUG: unable to handle kernel paging request at ffffbfbec0c0bf64
[   82.523085] IP: deflate_slow+0x32/0x400
[   82.523086] PGD 3e10d067 P4D 3e10d067 PUD 3e10e067 PMD 3a5ea067 PTE 0
[   82.523091] Oops: 0000 [#1] PREEMPT SMP PTI
[   82.523101] CPU: 0 PID: 5883 Comm: netstress Not tainted 4.12.14-14.26-rt #1 SLE15-SP1
[   82.523102] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.opensuse.org 04/01/2014
[   82.523105] task: ffff9c703731cc40 task.stack: ffffbfbec0fc4000
[   82.523108] RIP: 0010:deflate_slow+0x32/0x400
[   82.523112] RSP: 0018:ffffbfbec0fc7a48 EFLAGS: 00010202
[   82.523114] RAX: 000000000008581c RBX: ffffbfbec0b85000 RCX: 0000000000000005
[   82.523115] RDX: ffffbfbec0b86748 RSI: 000000000000001a RDI: 000000000008c660
[   82.523116] RBP: 0000000000000005 R08: 0000000000000027 R09: 0000000000000027
[   82.523117] R10: ffffbfbec0b86748 R11: 0000000000000053 R12: 00000000000005c5
[   82.523118] R13: 00000000000005c5 R14: ffffbfbec0b85000 R15: 00000000000005c8
[   82.523120] FS:  00007fc9da6ec700(0000) GS:ffff9c703fc00000(0000) knlGS:0000000000000000
[   82.523121] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.523122] CR2: ffffbfbec0c0bf64 CR3: 000000002fd16000 CR4: 00000000000006f0
[   82.523126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   82.523127] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   82.523127] Call Trace:
[   82.523168]  zlib_deflate+0xae/0x3b0
[   82.523181]  deflate_compress+0x5d/0x90 [deflate]
[   82.523186]  ipcomp_output+0xf0/0x1b0 [xfrm_ipcomp]
[   82.523191]  xfrm_output_resume+0x391/0x480
[   82.523199]  ? skb_mac_gso_segment+0xad/0x110
[   82.523202]  xfrm_output+0xd4/0x1d0
[   82.523206]  xfrm4_output+0x2c/0xc0
[   82.523209]  ? xfrm4_udp_encap_rcv+0x1a0/0x1a0
[   82.523213]  ip_queue_xmit+0x145/0x3e0
[   82.523217]  __tcp_transmit_skb+0x513/0x9c0
[   82.523220]  tcp_write_xmit+0x1ba/0xf00
[   82.523223]  __tcp_push_pending_frames+0x31/0xd0
[   82.523225]  tcp_sendmsg_locked+0x395/0xbd0
[   82.523228]  tcp_sendmsg+0x27/0x40
[   82.523231]  sock_sendmsg+0x36/0x40
[   82.523237]  SYSC_sendto+0x10e/0x140
[   82.523240]  ? sock_setsockopt+0x2aa/0xa30
[   82.523245]  ? kvm_clock_read+0x21/0x50
[   82.523249]  ? ktime_get_ts64+0x4c/0xe0
[   82.523253]  ? SyS_poll+0x70/0x100
[   82.523257]  do_syscall_64+0x74/0x150
[   82.523267]  entry_SYSCALL_64_after_hwframe+0x59/0xbe
[   82.523279] RIP: 0033:0x7fc9db4c348a
[   82.523280] RSP: 002b:00007fc9da6cbd00 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   82.523282] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc9db4c348a
[   82.523283] RDX: 000000000000ffff RSI: 00007fc9da6dbd90 RDI: 0000000000000007
[   82.523284] RBP: 00007fc9da6dbd90 R08: 0000000000000000 R09: 0000000000000000
[   82.523285] R10: 0000000000004000 R11: 0000000000000246 R12: 000000000000ffff
[   82.523286] R13: 0000556d7d69fca4 R14: 0000000000000259 R15: 0000000000000001
[   82.523287] Code: 55 53 89 f5 8b 87 9c 00 00 00 48 89 fb 3d 05 01 00 00 0f 86 04 01 00 00 8b b3 94 00 00 00 48 8b 53 48 8d 46 02 8b 4b 78 23 73 40 <0f> b6 04 02 8b 53 68 d3 e2 31 d0 23 43 74 48 8b 53 60 89 43 68
[   82.523314] Modules linked in: ipcomp xfrm_ipcomp deflate authenc echainiv esp4 des3_ede_x86_64 des_generic xfrm4_mode_tunnel ah4 xfrm4_mode_transport ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_filter ip_tables x_tables xfrm_user xfrm_algo veth af_packet iscsi_ibft iscsi_boot_sysfs snd_hda_codec_generic ledtrig_audio snd_hda_intel ppdev snd_hda_codec snd_hda_core snd_hwdep pcspkr snd_pcm snd_timer joydev virtio_net snd net_failover failover parport_pc soundcore parport qemu_fw_cfg button i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor zstd_decompress zstd_compress xxhash raid6_pq virtio_blk virtio_console virtio_scsi
[   82.523355]  hid_generic usbhid sr_mod cdrom ata_generic bochs_drm drm_kms_helper ata_piix ahci syscopyarea sysfillrect sysimgblt fb_sys_fops ehci_pci libahci ttm ehci_hcd drm libata serio_raw usbcore floppy virtio_pci virtio_ring virtio drm_panel_orientation_quirks sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
[   82.523376] Supported: Yes
[   82.523379] CR2: ffffbfbec0c0bf64
[   82.582750] ---[ end trace 0000000000000002 ]---
[   82.582763] RIP: 0010:deflate_slow+0x32/0x400
[   82.582764] RSP: 0018:ffffbfbec0fc7a48 EFLAGS: 00010202
[   82.582765] RAX: 000000000008581c RBX: ffffbfbec0b85000 RCX: 0000000000000005
[   82.582765] RDX: ffffbfbec0b86748 RSI: 000000000000001a RDI: 000000000008c660
[   82.582766] RBP: 0000000000000005 R08: 0000000000000027 R09: 0000000000000027
[   82.582766] R10: ffffbfbec0b86748 R11: 0000000000000053 R12: 00000000000005c5
[   82.582767] R13: 00000000000005c5 R14: ffffbfbec0b85000 R15: 00000000000005c8
[   82.582768] FS:  00007fc9da6ec700(0000) GS:ffff9c703fc00000(0000) knlGS:0000000000000000
[   82.582769] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.582770] CR2: ffffbfbec0c0bf64 CR3: 000000002fd16000 CR4: 00000000000006f0
[   82.582772] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   82.582773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

ipcomp_compress() will serialize access to the ipcomp_scratches percpu buffer by
disabling BH and preventing a softirq from coming in and running ipcom_decompress(),
which is never called from process context. This of course won't work on RT and
the buffer can get corrupted; there have been similar issues with in the past with
such assumptions, ie: ebf255ed6c44 (net: add back the missing serialization in
ip_send_unicast_reply()).

Similarly, this patch addresses the issue with locallocks allowing RT to have a
percpu spinlock and do the correct serialization.

Addressing such races on an individual basis seemed like a game of whack a mole,
until afaict local_bh_enable() was reworked to use locallocks in 96fac673174
(softirq: Add preemptible softirq) which is why the BUG is not seen in newer kernels.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---

This applies on top of all v4.x based kernels -rt.

 net/xfrm/xfrm_ipcomp.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index ccfdc7115a83..f13871de6e0d 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -20,6 +20,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/locallock.h>
 #include <linux/percpu.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -36,6 +37,7 @@ struct ipcomp_tfms {
 
 static DEFINE_MUTEX(ipcomp_resource_mutex);
 static void * __percpu *ipcomp_scratches;
+static DEFINE_LOCAL_IRQ_LOCK(ipcomp_scratches_lock);
 static int ipcomp_scratch_users;
 static LIST_HEAD(ipcomp_tfms_list);
 
@@ -45,12 +47,15 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	const int plen = skb->len;
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	const u8 *start = skb->data;
-	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
-	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
-	int len;
+	u8 *scratch;
+	struct crypto_comp *tfm;
+	int err, len;
+
+	local_lock(ipcomp_scratches_lock);
 
+	scratch = *this_cpu_ptr(ipcomp_scratches);
+	tfm = *this_cpu_ptr(ipcd->tfms);
+	err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
 	if (err)
 		goto out;
 
@@ -103,7 +108,7 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	err = 0;
 
 out:
-	put_cpu();
+	local_unlock(ipcomp_scratches_lock);
 	return err;
 }
 
@@ -146,6 +151,8 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	int err;
 
 	local_bh_disable();
+	local_lock(ipcomp_scratches_lock);
+
 	scratch = *this_cpu_ptr(ipcomp_scratches);
 	tfm = *this_cpu_ptr(ipcd->tfms);
 	err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
@@ -158,12 +165,14 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	}
 
 	memcpy(start + sizeof(struct ip_comp_hdr), scratch, dlen);
+	local_unlock(ipcomp_scratches_lock);
 	local_bh_enable();
 
 	pskb_trim(skb, dlen + sizeof(struct ip_comp_hdr));
 	return 0;
 
 out:
+	local_unlock(ipcomp_scratches_lock);
 	local_bh_enable();
 	return err;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH -RT] net: xfrm: fix compress vs decompress serialization
  2020-08-14 17:37 [PATCH -RT] net: xfrm: fix compress vs decompress serialization Davidlohr Bueso
@ 2020-08-17 14:47 ` Sebastian Andrzej Siewior
  2020-08-18 15:42   ` Davidlohr Bueso
  2020-08-18 16:20 ` [PATCH -RT 4.x] " Davidlohr Bueso
  1 sibling, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-08-17 14:47 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: linux-rt-users, Davidlohr Bueso

On 2020-08-14 10:37:00 [-0700], Davidlohr Bueso wrote:
> ipcomp_compress() will serialize access to the ipcomp_scratches percpu buffer by
> disabling BH and preventing a softirq from coming in and running ipcom_decompress(),
> which is never called from process context. This of course won't work on RT and
> the buffer can get corrupted; there have been similar issues with in the past with
> such assumptions, ie: ebf255ed6c44 (net: add back the missing serialization in
> ip_send_unicast_reply()).
> 
> Similarly, this patch addresses the issue with locallocks allowing RT to have a
> percpu spinlock and do the correct serialization.
> 
> Addressing such races on an individual basis seemed like a game of whack a mole,
> until afaict local_bh_enable() was reworked to use locallocks in 96fac673174
> (softirq: Add preemptible softirq) which is why the BUG is not seen in newer kernels.

So what you are saying that this patch isn't needed v5.0.19-rt10+
because local_bh_disable() acts there as a lock. Repost is then with Tom
in Cc because he will be the first one to apply to the v4.19 tree. And
please shrink the patch description to a sane amount of information.

For upstream, could you please get rid of get_cpu() in
ipcomp_decompress()? If that function is always called with BH disabled
then something like this_cpu_ptr() would be enough. Also I highly
discourage code where half of the stuff is done in the part of the
function where you define local variables.
Removing get_cpu() would be nice because some implementation of
crypto_comp_decompress() acquire sleeping locks. That would be:
- nx842_crypto_decompress()
- zip_comp_decompress()
- nx842_crypto_decompress()

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
> 
> This applies on top of all v4.x based kernels -rt.

Sebastian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -RT] net: xfrm: fix compress vs decompress serialization
  2020-08-17 14:47 ` Sebastian Andrzej Siewior
@ 2020-08-18 15:42   ` Davidlohr Bueso
  2020-08-18 16:03     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 6+ messages in thread
From: Davidlohr Bueso @ 2020-08-18 15:42 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, Davidlohr Bueso

On Mon, 17 Aug 2020, Sebastian Andrzej Siewior wrote:
>So what you are saying that this patch isn't needed v5.0.19-rt10+
>because local_bh_disable() acts there as a lock. Repost is then with Tom
>in Cc because he will be the first one to apply to the v4.19 tree. And
>please shrink the patch description to a sane amount of information.

Sure, but which Tom are you refering to?

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -RT] net: xfrm: fix compress vs decompress serialization
  2020-08-18 15:42   ` Davidlohr Bueso
@ 2020-08-18 16:03     ` Sebastian Andrzej Siewior
  2020-08-18 16:42       ` Tom Zanussi
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-08-18 16:03 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: linux-rt-users, Davidlohr Bueso, Tom Zanussi

On 2020-08-18 08:42:50 [-0700], Davidlohr Bueso wrote:
> On Mon, 17 Aug 2020, Sebastian Andrzej Siewior wrote:
> > So what you are saying that this patch isn't needed v5.0.19-rt10+
> > because local_bh_disable() acts there as a lock. Repost is then with Tom
> > in Cc because he will be the first one to apply to the v4.19 tree. And
> > please shrink the patch description to a sane amount of information.
> 
> Sure, but which Tom are you refering to?

Tom Zanussi (Cc:). The v4.19 kernel is top most kernel which requires
the change [0]. So it should start with Tom and drop down.

[0] as per https://wiki.linuxfoundation.org/realtime/preempt_rt_versions

> Thanks,
> Davidlohr

Sebastian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH -RT 4.x] net: xfrm: fix compress vs decompress serialization
  2020-08-14 17:37 [PATCH -RT] net: xfrm: fix compress vs decompress serialization Davidlohr Bueso
  2020-08-17 14:47 ` Sebastian Andrzej Siewior
@ 2020-08-18 16:20 ` Davidlohr Bueso
  1 sibling, 0 replies; 6+ messages in thread
From: Davidlohr Bueso @ 2020-08-18 16:20 UTC (permalink / raw)
  To: Tom Zanussi; +Cc: bigeasy, linux-rt-users, Davidlohr Bueso

A crash was seen in xfrm when running ltp's 'tcp4_ipsec06' stresser on v4.x
based RT kernels.

ipcomp_compress() will serialize access to the ipcomp_scratches percpu buffer by
disabling BH and preventing a softirq from coming in and running ipcom_decompress(),
which is never called from process context. This of course won't work on RT and
the buffer can get corrupted; there have been similar issues with in the past with
such assumptions, ie: ebf255ed6c44 (net: add back the missing serialization in
ip_send_unicast_reply()).

Similarly, this patch addresses the issue with locallocks allowing RT to have a
percpu spinlock and do the correct serialization.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
This applies on top of all v4.x based kernels -rt.

 net/xfrm/xfrm_ipcomp.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index ccfdc7115a83..f13871de6e0d 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -20,6 +20,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/locallock.h>
 #include <linux/percpu.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
@@ -36,6 +37,7 @@ struct ipcomp_tfms {
 
 static DEFINE_MUTEX(ipcomp_resource_mutex);
 static void * __percpu *ipcomp_scratches;
+static DEFINE_LOCAL_IRQ_LOCK(ipcomp_scratches_lock);
 static int ipcomp_scratch_users;
 static LIST_HEAD(ipcomp_tfms_list);
 
@@ -45,12 +47,15 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	const int plen = skb->len;
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	const u8 *start = skb->data;
-	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
-	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
-	int len;
+	u8 *scratch;
+	struct crypto_comp *tfm;
+	int err, len;
+
+	local_lock(ipcomp_scratches_lock);
 
+	scratch = *this_cpu_ptr(ipcomp_scratches);
+	tfm = *this_cpu_ptr(ipcd->tfms);
+	err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
 	if (err)
 		goto out;
 
@@ -103,7 +108,7 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 	err = 0;
 
 out:
-	put_cpu();
+	local_unlock(ipcomp_scratches_lock);
 	return err;
 }
 
@@ -146,6 +151,8 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	int err;
 
 	local_bh_disable();
+	local_lock(ipcomp_scratches_lock);
+
 	scratch = *this_cpu_ptr(ipcomp_scratches);
 	tfm = *this_cpu_ptr(ipcd->tfms);
 	err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
@@ -158,12 +165,14 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	}
 
 	memcpy(start + sizeof(struct ip_comp_hdr), scratch, dlen);
+	local_unlock(ipcomp_scratches_lock);
 	local_bh_enable();
 
 	pskb_trim(skb, dlen + sizeof(struct ip_comp_hdr));
 	return 0;
 
 out:
+	local_unlock(ipcomp_scratches_lock);
 	local_bh_enable();
 	return err;
 }
--
2.26.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH -RT] net: xfrm: fix compress vs decompress serialization
  2020-08-18 16:03     ` Sebastian Andrzej Siewior
@ 2020-08-18 16:42       ` Tom Zanussi
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Zanussi @ 2020-08-18 16:42 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Davidlohr Bueso
  Cc: linux-rt-users, Davidlohr Bueso

On Tue, 2020-08-18 at 18:03 +0200, Sebastian Andrzej Siewior wrote:
> On 2020-08-18 08:42:50 [-0700], Davidlohr Bueso wrote:
> > On Mon, 17 Aug 2020, Sebastian Andrzej Siewior wrote:
> > > So what you are saying that this patch isn't needed v5.0.19-rt10+
> > > because local_bh_disable() acts there as a lock. Repost is then
> > > with Tom
> > > in Cc because he will be the first one to apply to the v4.19
> > > tree. And
> > > please shrink the patch description to a sane amount of
> > > information.
> > 
> > Sure, but which Tom are you refering to?
> 
> Tom Zanussi (Cc:). The v4.19 kernel is top most kernel which requires
> the change [0]. So it should start with Tom and drop down.
> 

Thanks, yeah, I'll apply this to the next update as soon as I can get
to it this week..

Tom

> [0] as per 
> https://wiki.linuxfoundation.org/realtime/preempt_rt_versions
> 
> > Thanks,
> > Davidlohr
> 
> Sebastian


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-08-18 16:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-14 17:37 [PATCH -RT] net: xfrm: fix compress vs decompress serialization Davidlohr Bueso
2020-08-17 14:47 ` Sebastian Andrzej Siewior
2020-08-18 15:42   ` Davidlohr Bueso
2020-08-18 16:03     ` Sebastian Andrzej Siewior
2020-08-18 16:42       ` Tom Zanussi
2020-08-18 16:20 ` [PATCH -RT 4.x] " Davidlohr Bueso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.