linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* vmsplice triggering bug in kfree.
@ 2012-06-07  2:51 Dave Jones
  2012-06-07  4:27 ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-06-07  2:51 UTC (permalink / raw)
  To: Linux Kernel; +Cc: axboe

kernel BUG at mm/slub.c:3474!
invalid opcode: 0000 [#1] PREEMPT SMP 
CPU 7 
Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74
RIP: 0010:[<ffffffff811945ce>]  [<ffffffff811945ce>] kfree+0x26e/0x270
RSP: 0018:ffff880104065c48  EFLAGS: 00010246
RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000
RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18
RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000
R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc
R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98
FS:  00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60)
Stack:
 0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2
 0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef
 0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c
Call Trace:
 [<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30
 [<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290
 [<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30
 [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
 [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
 [<ffffffff8108cd97>] ? local_clock+0x47/0x60
 [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
 [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
 [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
 [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
 [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
 [<ffffffff8108cd97>] ? local_clock+0x47/0x60
 [<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310
 [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
 [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
 [<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0
 [<ffffffff811ad4da>] ? fget_light+0x3ca/0x500
 [<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210
 [<ffffffff81655937>] ? sysret_check+0x1b/0x56
 [<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81655912>] system_call_fastpath+0x16/0x1b
Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b 
RIP  [<ffffffff811945ce>] kfree+0x26e/0x270
 RSP <ffff880104065c48>
---[ end trace 77573bf4cc1dedea ]---


That's...


3473         if (unlikely(!PageSlab(page))) {
3474                 BUG_ON(!PageCompound(page));



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: vmsplice triggering bug in kfree.
  2012-06-07  2:51 vmsplice triggering bug in kfree Dave Jones
@ 2012-06-07  4:27 ` Eric Dumazet
  2012-06-07  4:40   ` Eric Dumazet
  2012-06-07 10:36   ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07  4:27 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, axboe

On Wed, 2012-06-06 at 22:51 -0400, Dave Jones wrote:
> kernel BUG at mm/slub.c:3474!
> invalid opcode: 0000 [#1] PREEMPT SMP 
> CPU 7 
> Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
> Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74
> RIP: 0010:[<ffffffff811945ce>]  [<ffffffff811945ce>] kfree+0x26e/0x270
> RSP: 0018:ffff880104065c48  EFLAGS: 00010246
> RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000
> RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18
> RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000
> R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc
> R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98
> FS:  00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60)
> Stack:
>  0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2
>  0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef
>  0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c
> Call Trace:
>  [<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30
>  [<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290
>  [<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30
>  [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
>  [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
>  [<ffffffff8108cd97>] ? local_clock+0x47/0x60
>  [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
>  [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
>  [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
>  [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
>  [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
>  [<ffffffff8108cd97>] ? local_clock+0x47/0x60
>  [<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310
>  [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
>  [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
>  [<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0
>  [<ffffffff811ad4da>] ? fget_light+0x3ca/0x500
>  [<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210
>  [<ffffffff81655937>] ? sysret_check+0x1b/0x56
>  [<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff81655912>] system_call_fastpath+0x16/0x1b
> Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b 
> RIP  [<ffffffff811945ce>] kfree+0x26e/0x270
>  RSP <ffff880104065c48>
> ---[ end trace 77573bf4cc1dedea ]---
> 
> 
> That's...
> 
> 
> 3473         if (unlikely(!PageSlab(page))) {
> 3474                 BUG_ON(!PageCompound(page));
> 

Thanks Dave, I'll take a look today on this report.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: vmsplice triggering bug in kfree.
  2012-06-07  4:27 ` Eric Dumazet
@ 2012-06-07  4:40   ` Eric Dumazet
  2012-06-07  5:52     ` Eric Dumazet
  2012-06-07 10:36   ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07  4:40 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, axboe

On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:

> Thanks Dave, I'll take a look today on this report.
> 

OK, trivial bug, I am testing a fix, thanks again.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: vmsplice triggering bug in kfree.
  2012-06-07  4:40   ` Eric Dumazet
@ 2012-06-07  5:52     ` Eric Dumazet
  2012-06-07  7:47       ` Eric Dumazet
  2012-06-08 18:04       ` Dave Jones
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07  5:52 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert

On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
> 
> > Thanks Dave, I'll take a look today on this report.
> > 
> 
> OK, trivial bug, I am testing a fix, thanks again.
> 

Not sure if you can reproduce this bug easily, if so could you test
following patch ?

Problem is that bipe->buffers can change anytime if we dont lock the
pipe mutex. So we must store in spd a new nr_pages_max field to shadow
pipe->buffers for the duration of the syscall.

Bug introduced in commit 35f3d14dbbc ( pipe: add support for shrinking
and growing pipes )

Thanks

 fs/splice.c            |   35 ++++++++++++++++++++---------------
 include/linux/splice.h |    8 ++++----
 kernel/relay.c         |    5 +++--
 kernel/trace/trace.c   |    6 ++++--
 mm/shmem.c             |    3 ++-
 net/core/skbuff.c      |    1 +
 6 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index c9f1318..7bf08fa 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
  * Check if we need to grow the arrays holding pages and partial page
  * descriptions.
  */
-int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
 {
-	if (pipe->buffers <= PIPE_DEF_BUFFERS)
+	unsigned int buffers = ACCESS_ONCE(pipe->buffers);
+
+	spd->nr_pages_max = buffers;
+	if (buffers <= PIPE_DEF_BUFFERS)
 		return 0;
 
-	spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL);
-	spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL);
+	spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL);
+	spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL);
 
 	if (spd->pages && spd->partial)
 		return 0;
@@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
 	return -ENOMEM;
 }
 
-void splice_shrink_spd(struct pipe_inode_info *pipe,
-		       struct splice_pipe_desc *spd)
+void splice_shrink_spd(struct splice_pipe_desc *spd)
 {
-	if (pipe->buffers <= PIPE_DEF_BUFFERS)
+	if (spd->nr_pages_max <= PIPE_DEF_BUFFERS)
 		return;
 
 	kfree(spd->pages);
@@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	loff = *ppos & ~PAGE_CACHE_MASK;
 	req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
-	nr_pages = min(req_pages, pipe->buffers);
+	nr_pages = min(req_pages, spd.nr_pages_max);
 
 	/*
 	 * Lookup the (hopefully) full range of pages we need.
@@ -497,7 +500,7 @@ fill_it:
 	if (spd.nr_pages)
 		error = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return error;
 }
 
@@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &default_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 
 	res = -ENOMEM;
 	vec = __vec;
-	if (pipe->buffers > PIPE_DEF_BUFFERS) {
-		vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL);
+	if (spd.nr_pages_max > PIPE_DEF_BUFFERS) {
+		vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL);
 		if (!vec)
 			goto shrink_ret;
 	}
@@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	offset = *ppos & ~PAGE_CACHE_MASK;
 	nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
 
-	for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) {
+	for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) {
 		struct page *page;
 
 		page = alloc_page(GFP_USER);
@@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 shrink_ret:
 	if (vec != __vec)
 		kfree(vec);
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return res;
 
 err:
@@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &user_page_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 
 	spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages,
 					    spd.partial, false,
-					    pipe->buffers);
+					    spd.nr_pages_max);
 	if (spd.nr_pages <= 0)
 		ret = spd.nr_pages;
 	else
 		ret = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return ret;
 }
 
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 26e5b61..09a545a 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -51,7 +51,8 @@ struct partial_page {
 struct splice_pipe_desc {
 	struct page **pages;		/* page map */
 	struct partial_page *partial;	/* pages[] may not be contig */
-	int nr_pages;			/* number of pages in map */
+	int nr_pages;			/* number of populated pages in map */
+	unsigned int nr_pages_max;	/* pages[] & partial[] arrays size */
 	unsigned int flags;		/* splice flags */
 	const struct pipe_buf_operations *ops;/* ops associated with output pipe */
 	void (*spd_release)(struct splice_pipe_desc *, unsigned int);
@@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
 /*
  * for dynamic pipe sizing
  */
-extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *);
-extern void splice_shrink_spd(struct pipe_inode_info *,
-				struct splice_pipe_desc *);
+extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
+extern void splice_shrink_spd(struct splice_pipe_desc *);
 extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
 
 extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
diff --git a/kernel/relay.c b/kernel/relay.c
index ab56a17..e8cd202 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.nr_pages = 0,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.partial = partial,
 		.flags = flags,
 		.ops = &relay_pipe_buf_ops,
@@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in,
                 ret += padding;
 
 out:
-	splice_shrink_spd(pipe, &spd);
-        return ret;
+	splice_shrink_spd(&spd);
+	return ret;
 }
 
 static ssize_t relay_file_splice_read(struct file *in,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 68032c6..2884880 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 		.pages		= pages_def,
 		.partial	= partial_def,
 		.nr_pages	= 0, /* This gets updated below. */
+		.nr_pages_max	= PIPE_DEF_BUFFERS,
 		.flags		= flags,
 		.ops		= &tracing_pipe_buf_ops,
 		.spd_release	= tracing_spd_release_pipe,
@@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 
 	ret = splice_to_pipe(pipe, &spd);
 out:
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return ret;
 
 out_err:
@@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages		= pages_def,
 		.partial	= partial_def,
+		.nr_pages_max	= PIPE_DEF_BUFFERS,
 		.flags		= flags,
 		.ops		= &buffer_pipe_buf_ops,
 		.spd_release	= buffer_spd_release,
@@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
 	}
 
 	ret = splice_to_pipe(pipe, &spd);
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 out:
 	return ret;
 }
diff --git a/mm/shmem.c b/mm/shmem.c
index 585bd220..c244e93 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	if (spd.nr_pages)
 		error = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 
 	if (error > 0) {
 		*ppos += error;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 016694d..bac3c57 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = MAX_SKB_FRAGS,
 		.flags = flags,
 		.ops = &sock_pipe_buf_ops,
 		.spd_release = sock_spd_release,



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: vmsplice triggering bug in kfree.
  2012-06-07  5:52     ` Eric Dumazet
@ 2012-06-07  7:47       ` Eric Dumazet
  2012-06-08 18:04       ` Dave Jones
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07  7:47 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert

On Thu, 2012-06-07 at 07:52 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
> > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
> > 
> > > Thanks Dave, I'll take a look today on this report.
> > > 
> > 
> > OK, trivial bug, I am testing a fix, thanks again.
> > 
> 
> Not sure if you can reproduce this bug easily, if so could you test
> following patch ?

By the way, following program triggers the bug instantly :

#define __USE_GNU 1
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>

#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int pfd[2];

void *worker(void *arg)
{
	unsigned int page_size = 4096;

	while (1) {
		fcntl(pfd[1], F_SETPIPE_SZ, 16 * page_size);
		fcntl(pfd[1], F_SETPIPE_SZ, 64 * page_size);
	}
}

char buffer[1024*1024];

int main(int argc, char *argv[])
{
	pthread_t tid;
	int res, fdnull = open("/dev/null", O_WRONLY);

	if (pipe(pfd) == -1) {
		perror("pipe");
		return 1;
	}
	res = pthread_create(&tid, NULL, worker, NULL);
	if (res) {
		errno = res;
		perror("pthread_create");
		return 1;
	}

	while (1) {
		struct iovec iov[1];
		int wr;

		iov[0].iov_base = buffer;
		iov[0].iov_len = 1024*1024;
		wr = vmsplice(pfd[1], iov, 1, SPLICE_F_GIFT);
		if (wr > 0) {
			wr = splice(pfd[0], NULL,
				    fdnull, NULL,
				    wr, SPLICE_F_MOVE | SPLICE_F_MORE);
		}
	}
}



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] splice: fix racy pipe->buffers uses
  2012-06-07  4:27 ` Eric Dumazet
  2012-06-07  4:40   ` Eric Dumazet
@ 2012-06-07 10:36   ` Eric Dumazet
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 10:36 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert, Alexander Viro

From: Eric Dumazet <edumazet@google.com>

Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()

commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.

Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.

Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.

splice_shrink_spd() loses its struct pipe_inode_info argument.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Cc: stable <stable@vger.kernel.org> # 2.6.35
---
 fs/splice.c            |   35 ++++++++++++++++++++---------------
 include/linux/splice.h |    8 ++++----
 kernel/relay.c         |    5 +++--
 kernel/trace/trace.c   |    6 ++++--
 mm/shmem.c             |    3 ++-
 net/core/skbuff.c      |    1 +
 6 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index c9f1318..7bf08fa 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
  * Check if we need to grow the arrays holding pages and partial page
  * descriptions.
  */
-int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
 {
-	if (pipe->buffers <= PIPE_DEF_BUFFERS)
+	unsigned int buffers = ACCESS_ONCE(pipe->buffers);
+
+	spd->nr_pages_max = buffers;
+	if (buffers <= PIPE_DEF_BUFFERS)
 		return 0;
 
-	spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL);
-	spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL);
+	spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL);
+	spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL);
 
 	if (spd->pages && spd->partial)
 		return 0;
@@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
 	return -ENOMEM;
 }
 
-void splice_shrink_spd(struct pipe_inode_info *pipe,
-		       struct splice_pipe_desc *spd)
+void splice_shrink_spd(struct splice_pipe_desc *spd)
 {
-	if (pipe->buffers <= PIPE_DEF_BUFFERS)
+	if (spd->nr_pages_max <= PIPE_DEF_BUFFERS)
 		return;
 
 	kfree(spd->pages);
@@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	loff = *ppos & ~PAGE_CACHE_MASK;
 	req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
-	nr_pages = min(req_pages, pipe->buffers);
+	nr_pages = min(req_pages, spd.nr_pages_max);
 
 	/*
 	 * Lookup the (hopefully) full range of pages we need.
@@ -497,7 +500,7 @@ fill_it:
 	if (spd.nr_pages)
 		error = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return error;
 }
 
@@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &default_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 
 	res = -ENOMEM;
 	vec = __vec;
-	if (pipe->buffers > PIPE_DEF_BUFFERS) {
-		vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL);
+	if (spd.nr_pages_max > PIPE_DEF_BUFFERS) {
+		vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL);
 		if (!vec)
 			goto shrink_ret;
 	}
@@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 	offset = *ppos & ~PAGE_CACHE_MASK;
 	nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
 
-	for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) {
+	for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) {
 		struct page *page;
 
 		page = alloc_page(GFP_USER);
@@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 shrink_ret:
 	if (vec != __vec)
 		kfree(vec);
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return res;
 
 err:
@@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &user_page_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
 
 	spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages,
 					    spd.partial, false,
-					    pipe->buffers);
+					    spd.nr_pages_max);
 	if (spd.nr_pages <= 0)
 		ret = spd.nr_pages;
 	else
 		ret = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return ret;
 }
 
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 26e5b61..09a545a 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -51,7 +51,8 @@ struct partial_page {
 struct splice_pipe_desc {
 	struct page **pages;		/* page map */
 	struct partial_page *partial;	/* pages[] may not be contig */
-	int nr_pages;			/* number of pages in map */
+	int nr_pages;			/* number of populated pages in map */
+	unsigned int nr_pages_max;	/* pages[] & partial[] arrays size */
 	unsigned int flags;		/* splice flags */
 	const struct pipe_buf_operations *ops;/* ops associated with output pipe */
 	void (*spd_release)(struct splice_pipe_desc *, unsigned int);
@@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
 /*
  * for dynamic pipe sizing
  */
-extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *);
-extern void splice_shrink_spd(struct pipe_inode_info *,
-				struct splice_pipe_desc *);
+extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
+extern void splice_shrink_spd(struct splice_pipe_desc *);
 extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
 
 extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
diff --git a/kernel/relay.c b/kernel/relay.c
index ab56a17..e8cd202 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.nr_pages = 0,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.partial = partial,
 		.flags = flags,
 		.ops = &relay_pipe_buf_ops,
@@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in,
                 ret += padding;
 
 out:
-	splice_shrink_spd(pipe, &spd);
-        return ret;
+	splice_shrink_spd(&spd);
+	return ret;
 }
 
 static ssize_t relay_file_splice_read(struct file *in,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 68032c6..2884880 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 		.pages		= pages_def,
 		.partial	= partial_def,
 		.nr_pages	= 0, /* This gets updated below. */
+		.nr_pages_max	= PIPE_DEF_BUFFERS,
 		.flags		= flags,
 		.ops		= &tracing_pipe_buf_ops,
 		.spd_release	= tracing_spd_release_pipe,
@@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
 
 	ret = splice_to_pipe(pipe, &spd);
 out:
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 	return ret;
 
 out_err:
@@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages		= pages_def,
 		.partial	= partial_def,
+		.nr_pages_max	= PIPE_DEF_BUFFERS,
 		.flags		= flags,
 		.ops		= &buffer_pipe_buf_ops,
 		.spd_release	= buffer_spd_release,
@@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
 	}
 
 	ret = splice_to_pipe(pipe, &spd);
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 out:
 	return ret;
 }
diff --git a/mm/shmem.c b/mm/shmem.c
index 585bd220..c244e93 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = PIPE_DEF_BUFFERS,
 		.flags = flags,
 		.ops = &page_cache_pipe_buf_ops,
 		.spd_release = spd_release_page,
@@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	if (spd.nr_pages)
 		error = splice_to_pipe(pipe, &spd);
 
-	splice_shrink_spd(pipe, &spd);
+	splice_shrink_spd(&spd);
 
 	if (error > 0) {
 		*ppos += error;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 016694d..bac3c57 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
 	struct splice_pipe_desc spd = {
 		.pages = pages,
 		.partial = partial,
+		.nr_pages_max = MAX_SKB_FRAGS,
 		.flags = flags,
 		.ops = &sock_pipe_buf_ops,
 		.spd_release = sock_spd_release,



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: vmsplice triggering bug in kfree.
  2012-06-07  5:52     ` Eric Dumazet
  2012-06-07  7:47       ` Eric Dumazet
@ 2012-06-08 18:04       ` Dave Jones
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Jones @ 2012-06-08 18:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel, axboe, Tom Herbert

On Thu, Jun 07, 2012 at 07:52:45AM +0200, Eric Dumazet wrote:
 > On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
 > > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
 > > 
 > > > Thanks Dave, I'll take a look today on this report.
 > > > 
 > > 
 > > OK, trivial bug, I am testing a fix, thanks again.
 > > 
 > 
 > Not sure if you can reproduce this bug easily, if so could you test
 > following patch ?

haven't seen this reoccur since applying this patch, so it looks good to me.

Tested-by: Dave Jones <davej@redhat.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-06-08 18:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-07  2:51 vmsplice triggering bug in kfree Dave Jones
2012-06-07  4:27 ` Eric Dumazet
2012-06-07  4:40   ` Eric Dumazet
2012-06-07  5:52     ` Eric Dumazet
2012-06-07  7:47       ` Eric Dumazet
2012-06-08 18:04       ` Dave Jones
2012-06-07 10:36   ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).