* fs: out of bounds on stack in iov_iter_advance
@ 2015-08-12 14:13 Sasha Levin
  2015-08-15 20:13 ` Chuck Ebbert
  0 siblings, 1 reply; 35+ messages in thread
From: Sasha Levin @ 2015-08-12 14:13 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, LKML

Hi all,

While fuzzing with trinity inside a KVM tools guest running -next I've stumbled on the following:

[64092.216447] ==================================================================
[64092.217840] BUG: KASan: out of bounds on stack in iov_iter_advance+0x3b7/0x480 at addr ffff88040506fd48
[64092.219314] Read of size 8 by task trinity-c194/11387
[64092.220114] page:ffffea0010141bc0 count:0 mapcount:0 mapping:          (null) index:0x2
[64092.221354] flags: 0x46fffff80000000()
[64092.221998] page dumped because: kasan: bad access detected
[64092.222879] CPU: 4 PID: 11387 Comm: trinity-c194 Not tainted 4.2.0-rc6-next-20150810-sasha-00040-g12ad0db3-dirty #2427
[64092.224537]  ffff88040506fd30 ffff88040506fa88 ffffffff9ce7763b ffff88040506fb10
[64092.225763]  ffff88040506fb00 ffffffff9376b1be 0000000000000000 ffff880270108600
[64092.226992]  0000000000000282 0000000000000000 0000000000000000 0000000000000000
[64092.228221] Call Trace:
[64092.228679] dump_stack (lib/dump_stack.c:52)
[64092.231252] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[64092.232219] __asan_report_load8_noabort (mm/kasan/report.c:251)
[64092.234167] iov_iter_advance (lib/iov_iter.c:511)
[64092.235105] generic_file_read_iter (mm/filemap.c:1743)
[64092.241532] blkdev_read_iter (fs/block_dev.c:1649)
[64092.242448] __vfs_read (fs/read_write.c:423 fs/read_write.c:434)
[64092.246949] vfs_read (fs/read_write.c:454)
[64092.247743] SyS_pread64 (fs/read_write.c:607 fs/read_write.c:594)
[64092.250445] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
[64092.251440] Memory state around the buggy address:
[64092.252221]  ffff88040506fc00: 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f4 f4 f4 f3
[64092.253340]  ffff88040506fc80: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00
[64092.254456] >ffff88040506fd00: 00 00 f1 f1 f1 f1 00 00 f4 f4 f2 f2 f2 f2 00 00
[64092.255566]                                               ^
[64092.256432]  ffff88040506fd80: 00 00 00 f4 f4 f4 f2 f2 f2 f2 00 00 00 00 00 f4
[64092.257557]  ffff88040506fe00: f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
[64092.258684] ==================================================================


Thanks,
Sasha


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-08-12 14:13 fs: out of bounds on stack in iov_iter_advance Sasha Levin
@ 2015-08-15 20:13 ` Chuck Ebbert
  2015-08-17  9:18   ` Andrey Ryabinin
  0 siblings, 1 reply; 35+ messages in thread
From: Chuck Ebbert @ 2015-08-15 20:13 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Al Viro, linux-fsdevel, LKML, Andrey Ryabinin

On Wed, 12 Aug 2015 10:13:24 -0400
Sasha Levin <sasha.levin@oracle.com> wrote:

> While fuzzing with trinity inside a KVM tools guest running -next I've stumbled on the following:
> 
> [64092.216447] ==================================================================
> [64092.217840] BUG: KASan: out of bounds on stack in iov_iter_advance+0x3b7/0x480 at addr ffff88040506fd48
> [64092.219314] Read of size 8 by task trinity-c194/11387
> [64092.220114] page:ffffea0010141bc0 count:0 mapcount:0 mapping:          (null) index:0x2
> [64092.221354] flags: 0x46fffff80000000()
> [64092.221998] page dumped because: kasan: bad access detected
> [64092.222879] CPU: 4 PID: 11387 Comm: trinity-c194 Not tainted 4.2.0-rc6-next-20150810-sasha-00040-g12ad0db3-dirty #2427
> [64092.224537]  ffff88040506fd30 ffff88040506fa88 ffffffff9ce7763b ffff88040506fb10
> [64092.225763]  ffff88040506fb00 ffffffff9376b1be 0000000000000000 ffff880270108600
> [64092.226992]  0000000000000282 0000000000000000 0000000000000000 0000000000000000
> [64092.228221] Call Trace:
> [64092.228679] dump_stack (lib/dump_stack.c:52)
> [64092.231252] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
> [64092.232219] __asan_report_load8_noabort (mm/kasan/report.c:251)
> [64092.234167] iov_iter_advance (lib/iov_iter.c:511)
> [64092.235105] generic_file_read_iter (mm/filemap.c:1743)
> [64092.241532] blkdev_read_iter (fs/block_dev.c:1649)
> [64092.242448] __vfs_read (fs/read_write.c:423 fs/read_write.c:434)
> [64092.246949] vfs_read (fs/read_write.c:454)
> [64092.247743] SyS_pread64 (fs/read_write.c:607 fs/read_write.c:594)
> [64092.250445] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
> [64092.251440] Memory state around the buggy address:
> [64092.252221]  ffff88040506fc00: 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f4 f4 f4 f3
> [64092.253340]  ffff88040506fc80: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00
> [64092.254456] >ffff88040506fd00: 00 00 f1 f1 f1 f1 00 00 f4 f4 f2 f2 f2 f2 00 00
> [64092.255566]                                               ^
> [64092.256432]  ffff88040506fd80: 00 00 00 f4 f4 f4 f2 f2 f2 f2 00 00 00 00 00 f4
> [64092.257557]  ffff88040506fe00: f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
> [64092.258684] ==================================================================
> 

I tried to debug this but kasan doesn't print much useful information
for stack out of bounds access. It shows the address that's being
accessed but it doesn't show the value of the boundary that was
exceeded. And the stack dump doesn't show any addresses either - just
contents. It would be nice to see a full stack frame dump showing
where all the parent frames are too. Also, the file and line number
(lib/iov_iter.c:511) are completely useless because of inlining,
though that's not kasan's fault.


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-08-15 20:13 ` Chuck Ebbert
@ 2015-08-17  9:18   ` Andrey Ryabinin
  2015-08-19  5:46     ` Al Viro
  0 siblings, 1 reply; 35+ messages in thread
From: Andrey Ryabinin @ 2015-08-17  9:18 UTC (permalink / raw)
  To: Chuck Ebbert, Sasha Levin; +Cc: Al Viro, linux-fsdevel, LKML



On 08/15/2015 11:13 PM, Chuck Ebbert wrote:
> On Wed, 12 Aug 2015 10:13:24 -0400
> Sasha Levin <sasha.levin@oracle.com> wrote:
> 
>> While fuzzing with trinity inside a KVM tools guest running -next I've stumbled on the following:
>>
>> [64092.216447] ==================================================================
>> [64092.217840] BUG: KASan: out of bounds on stack in iov_iter_advance+0x3b7/0x480 at addr ffff88040506fd48
>> [64092.219314] Read of size 8 by task trinity-c194/11387
>> [64092.220114] page:ffffea0010141bc0 count:0 mapcount:0 mapping:          (null) index:0x2
>> [64092.221354] flags: 0x46fffff80000000()
>> [64092.221998] page dumped because: kasan: bad access detected
>> [64092.222879] CPU: 4 PID: 11387 Comm: trinity-c194 Not tainted 4.2.0-rc6-next-20150810-sasha-00040-g12ad0db3-dirty #2427
>> [64092.224537]  ffff88040506fd30 ffff88040506fa88 ffffffff9ce7763b ffff88040506fb10
>> [64092.225763]  ffff88040506fb00 ffffffff9376b1be 0000000000000000 ffff880270108600
>> [64092.226992]  0000000000000282 0000000000000000 0000000000000000 0000000000000000
>> [64092.228221] Call Trace:
>> [64092.228679] dump_stack (lib/dump_stack.c:52)
>> [64092.231252] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
>> [64092.232219] __asan_report_load8_noabort (mm/kasan/report.c:251)
>> [64092.234167] iov_iter_advance (lib/iov_iter.c:511)
>> [64092.235105] generic_file_read_iter (mm/filemap.c:1743)
>> [64092.241532] blkdev_read_iter (fs/block_dev.c:1649)
>> [64092.242448] __vfs_read (fs/read_write.c:423 fs/read_write.c:434)
>> [64092.246949] vfs_read (fs/read_write.c:454)
>> [64092.247743] SyS_pread64 (fs/read_write.c:607 fs/read_write.c:594)
>> [64092.250445] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
>> [64092.251440] Memory state around the buggy address:
>> [64092.252221]  ffff88040506fc00: 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f4 f4 f4 f3
>> [64092.253340]  ffff88040506fc80: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [64092.254456] >ffff88040506fd00: 00 00 f1 f1 f1 f1 00 00 f4 f4 f2 f2 f2 f2 00 00
>> [64092.255566]                                               ^
>> [64092.256432]  ffff88040506fd80: 00 00 00 f4 f4 f4 f2 f2 f2 f2 00 00 00 00 00 f4
>> [64092.257557]  ffff88040506fe00: f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
>> [64092.258684] ==================================================================
>>
> 
> I tried to debug this but kasan doesn't print much useful information
> for stack out of bounds access. It shows the address that's being
> accessed but it doesn't show the value of the boundary that was
> exceeded.

This could be estimated by looking at the shadow memory:

	ffff88040506fd00: 00 00 f1 f1 f1 f1 00 00 f4 [f4] f2 f2 f2 f2 00 00

Each byte in shadow represents 8 bytes of memory. So f1 is the left redzone of the stack frame.
The two zeroes are probably 'struct iovec iov' defined in new_sync_read(). The next two f4 bytes are a redzone.
We hit the second f4, which means that we accessed iov[1].iov_len.
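
The arithmetic behind that reading can be checked with a small user-space helper. This is only a sketch: shadow_entry_addr() and SHADOW_GRANULE are names made up for illustration, and the 8-bytes-per-shadow-byte ratio is the one stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* One shadow byte describes 8 bytes of memory (ratio taken from the text above). */
#define SHADOW_GRANULE 8

/*
 * Hypothetical helper: address of the first byte described by the idx-th
 * (0-based) shadow byte in a dump row that starts at row_start.
 */
static uint64_t shadow_entry_addr(uint64_t row_start, size_t idx)
{
	return row_start + (uint64_t)idx * SHADOW_GRANULE;
}
```

The bracketed f4 is entry 9 of the row starting at ffff88040506fd00, which gives ffff88040506fd48, the address in the report. The two zero bytes (entries 6 and 7) cover ffff88040506fd30..fd3f, i.e. 16 bytes, the size of one struct iovec on x86-64, so iov[1].iov_len would sit at fd30 + 16 + 8 = fd48.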

This bug is similar to a recently found bug in 9p: http://thread.gmane.org/gmane.linux.kernel/1931799/focus=1936542

Such a report could be produced if retval > count.

generic_file_read_iter():
...
	size_t count = iov_iter_count(iter);
...
	if (!count)
		goto out; /* skip atime */
	size = i_size_read(inode);
	retval = filemap_write_and_wait_range(mapping, pos,
				pos + count - 1);
	if (!retval) {
		struct iov_iter data = *iter;
		retval = mapping->a_ops->direct_IO(iocb, &data, pos);
	}

	if (retval > 0) {
		*ppos = pos + retval;
		iov_iter_advance(iter, retval);


So either filemap_write_and_wait_range() or mapping->a_ops->direct_IO() returned more
than 'count'.
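
To illustrate why retval > count would end up touching iov[1], here is a rough user-space model. It is not the kernel's iov_iter_advance(): struct seg and overrun_index() are hypothetical, and the model assumes the walk simply moves from segment to segment until 'size' bytes are consumed. Unlike an unchecked walk, it stops and reports the index it would have had to read.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct iovec; only the length matters here. */
struct seg {
	size_t len;
};

/*
 * Walk the segments as an advance of 'size' bytes would.  Returns -1 if the
 * request fits inside the nr_segs segments, otherwise the index of the first
 * segment past the end of the array that the walk would need to look at.
 */
static long overrun_index(const struct seg *v, size_t nr_segs, size_t size)
{
	size_t i = 0;

	while (size > 0) {
		if (i == nr_segs)
			return (long)i;	/* an unchecked walk would read v[i] here */
		size_t step = size < v[i].len ? size : v[i].len;
		size -= step;
		i++;
	}
	return -1;	/* request fits inside the described segments */
}
```

With a single segment, any size larger than its length makes the model report index 1, which matches the iov[1].iov_len access deduced from the shadow dump.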


> And the stack dump doesn't show any addresses either - just
> contents. It would be nice to see a full stack frame dump showing
> where all the parent frames are too. 

Yes, I think it might be helpful to dump some portion of stack around the access address.

> Also, the file and line number
> (lib/iov_iter.c:511) are completely useless because of inlining,
> though that's not kasan's fault.
> 


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-08-17  9:18   ` Andrey Ryabinin
@ 2015-08-19  5:46     ` Al Viro
  2015-09-02 20:00       ` Sasha Levin
  2015-09-18  2:24       ` Sasha Levin
  0 siblings, 2 replies; 35+ messages in thread
From: Al Viro @ 2015-08-19  5:46 UTC (permalink / raw)
  To: Andrey Ryabinin; +Cc: Chuck Ebbert, Sasha Levin, linux-fsdevel, LKML

On Mon, Aug 17, 2015 at 12:18:12PM +0300, Andrey Ryabinin wrote:

> This bug is similar to a recently found bug in 9p: http://thread.gmane.org/gmane.linux.kernel/1931799/focus=1936542

Ow.  For those who'd missed that fun: the bug in question had turned out to
be caused by improper reuse of request ids, _not_ in the call chain of
the triggering syscall.

> 	if (!retval) {
> 		struct iov_iter data = *iter;
> 		retval = mapping->a_ops->direct_IO(iocb, &data, pos);
> 	}
> 
> 	if (retval > 0) {
> 		*ppos = pos + retval;
> 		iov_iter_advance(iter, retval);
> 
> 
> So either filemap_write_and_wait_range()
	Shouldn't - it's supposed to return 0 or -E...

> or mapping->a_ops->direct_IO() returned more
> than 'count'.

	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
which takes rather different paths in those cases...

> > Also, the file and line number
> > (lib/iov_iter.c:511) are completely useless because of inlining,
> > though that's not kasan's fault.

Might make sense to slap
	if (WARN_ON(size > iov_iter_count(i)))
		print size and *i
and see if it triggers...


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-08-19  5:46     ` Al Viro
@ 2015-09-02 20:00       ` Sasha Levin
  2015-09-18  2:24       ` Sasha Levin
  1 sibling, 0 replies; 35+ messages in thread
From: Sasha Levin @ 2015-09-02 20:00 UTC (permalink / raw)
  To: Al Viro, Andrey Ryabinin; +Cc: Chuck Ebbert, linux-fsdevel, LKML

On 08/19/2015 01:46 AM, Al Viro wrote:
> On Mon, Aug 17, 2015 at 12:18:12PM +0300, Andrey Ryabinin wrote:
> 
>> This bug is similar to a recently found bug in 9p: http://thread.gmane.org/gmane.linux.kernel/1931799/focus=1936542
> 
> Ow.  For those who'd missed that fun: the bug in question had turned out to
> be caused by improper reuse of request ids, _not_ in the call chain of
> the triggering syscall.
> 
>> 	if (!retval) {
>> 		struct iov_iter data = *iter;
>> 		retval = mapping->a_ops->direct_IO(iocb, &data, pos);
>> 	}
>>
>> 	if (retval > 0) {
>> 		*ppos = pos + retval;
>> 		iov_iter_advance(iter, retval);
>>
>>
>> So either filemap_write_and_wait_range()
> 	Shouldn't - it's supposed to return 0 or -E...
> 
>> or mapping->a_ops->direct_IO() returned more
>> than 'count'.
> 
> 	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
> which takes rather different paths in those cases...

I don't think so, at least I didn't configure it in.

>>> Also, the file and line number
>>> (lib/iov_iter.c:511) are completely useless because of inlining,
>>> though that's not kasan's fault.
> 
> Might make sense to slap
> 	if (WARN_ON(size > iov_iter_count(i)))
> 		print size and *i
> and see if it triggers...

It finally reproduced. size == 0x1000000, iov_iter_count(iter) == 0x1234.


Thanks,
Sasha



* Re: fs: out of bounds on stack in iov_iter_advance
  2015-08-19  5:46     ` Al Viro
  2015-09-02 20:00       ` Sasha Levin
@ 2015-09-18  2:24       ` Sasha Levin
  2015-09-30 21:30         ` Sasha Levin
  1 sibling, 1 reply; 35+ messages in thread
From: Sasha Levin @ 2015-09-18  2:24 UTC (permalink / raw)
  To: Al Viro, Andrey Ryabinin, willy; +Cc: Chuck Ebbert, linux-fsdevel, LKML

On 08/19/2015 01:46 AM, Al Viro wrote:
>> or mapping->a_ops->direct_IO() returned more
>> > than 'count'.
> 	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
> which takes rather different paths in those cases...
> 

So I've traced this all the way back to dax_io(). I can trigger this with:

diff --git a/fs/dax.c b/fs/dax.c
index 93bf2f9..2cdb8a5 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
        if (need_wmb)
                wmb_pmem();

+       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
        return (pos == start) ? retval : pos - start;
 }

So it seems that iter gets moved twice here: once in dax_io(), and once again
back at generic_file_read_iter().

I don't see how it ever worked. Am I missing something?


Thanks,
Sasha


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-09-18  2:24       ` Sasha Levin
@ 2015-09-30 21:30         ` Sasha Levin
  2015-10-17 19:22           ` Sasha Levin
  2015-11-06  1:34           ` Al Viro
  0 siblings, 2 replies; 35+ messages in thread
From: Sasha Levin @ 2015-09-30 21:30 UTC (permalink / raw)
  To: Al Viro, Andrey Ryabinin, willy; +Cc: Chuck Ebbert, linux-fsdevel, LKML

On 09/17/2015 10:24 PM, Sasha Levin wrote:
> On 08/19/2015 01:46 AM, Al Viro wrote:
>>> or mapping->a_ops->direct_IO() returned more
>>>> than 'count'.
>> 	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
>> which takes rather different paths in those cases...
>>
> 
> So I've traced this all the way back to dax_io(). I can trigger this with:
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 93bf2f9..2cdb8a5 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
>         if (need_wmb)
>                 wmb_pmem();
> 
> +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
>         return (pos == start) ? retval : pos - start;
>  }
> 
> So it seems that iter gets moved twice here: once in dax_io(), and once again
> back at generic_file_read_iter().
> 
> I don't see how it ever worked. Am I missing something?

Ping?


Thanks,
Sasha



* Re: fs: out of bounds on stack in iov_iter_advance
  2015-09-30 21:30         ` Sasha Levin
@ 2015-10-17 19:22           ` Sasha Levin
  2015-10-18  4:17             ` Ross Zwisler
  2015-11-06  1:34           ` Al Viro
  1 sibling, 1 reply; 35+ messages in thread
From: Sasha Levin @ 2015-10-17 19:22 UTC (permalink / raw)
  To: Al Viro, Andrey Ryabinin, willy; +Cc: Chuck Ebbert, linux-fsdevel, LKML

On 09/30/2015 05:30 PM, Sasha Levin wrote:
> On 09/17/2015 10:24 PM, Sasha Levin wrote:
>> On 08/19/2015 01:46 AM, Al Viro wrote:
>>>> or mapping->a_ops->direct_IO() returned more
>>>>> than 'count'.
>>> 	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
>>> which takes rather different paths in those cases...
>>>
>>
>> So I've traced this all the way back to dax_io(). I can trigger this with:
>>
>> diff --git a/fs/dax.c b/fs/dax.c
>> index 93bf2f9..2cdb8a5 100644
>> --- a/fs/dax.c
>> +++ b/fs/dax.c
>> @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
>>         if (need_wmb)
>>                 wmb_pmem();
>>
>> +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
>>         return (pos == start) ? retval : pos - start;
>>  }
>>
>> So it seems that iter gets moved twice here: once in dax_io(), and once again
>> back at generic_file_read_iter().
>>
>> I don't see how it ever worked. Am I missing something?
> 
> Ping?

Ping?


Thanks,
Sasha



* Re: fs: out of bounds on stack in iov_iter_advance
  2015-10-17 19:22           ` Sasha Levin
@ 2015-10-18  4:17             ` Ross Zwisler
  2015-10-19 23:34               ` Sasha Levin
  0 siblings, 1 reply; 35+ messages in thread
From: Ross Zwisler @ 2015-10-18  4:17 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Al Viro, Andrey Ryabinin, willy, Chuck Ebbert, linux-fsdevel, LKML

On Sat, Oct 17, 2015 at 03:22:19PM -0400, Sasha Levin wrote:
> On 09/30/2015 05:30 PM, Sasha Levin wrote:
> > On 09/17/2015 10:24 PM, Sasha Levin wrote:
> >> On 08/19/2015 01:46 AM, Al Viro wrote:
> >>>> or mapping->a_ops->direct_IO() returned more
> >>>>> than 'count'.
> >>> 	Was there DAX involved?  ->direct_IO() in there is blkdev_direct_IO(),
> >>> which takes rather different paths in those cases...
> >>>
> >>
> >> So I've traced this all the way back to dax_io(). I can trigger this with:
> >>
> >> diff --git a/fs/dax.c b/fs/dax.c
> >> index 93bf2f9..2cdb8a5 100644
> >> --- a/fs/dax.c
> >> +++ b/fs/dax.c
> >> @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
> >>         if (need_wmb)
> >>                 wmb_pmem();
> >>
> >> +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
> >>         return (pos == start) ? retval : pos - start;
> >>  }
> >>
> >> So it seems that iter gets moved twice here: once in dax_io(), and once again
> >> back at generic_file_read_iter().
> >>
> >> I don't see how it ever worked. Am I missing something?
> > 
> > Ping?
> 
> Ping?

I'll try and find time to look at this issue this week.  Sasha, do you have a
more targeted reproducer, or is it still just the trinity fuzzer?


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-10-18  4:17             ` Ross Zwisler
@ 2015-10-19 23:34               ` Sasha Levin
  0 siblings, 0 replies; 35+ messages in thread
From: Sasha Levin @ 2015-10-19 23:34 UTC (permalink / raw)
  To: Ross Zwisler, Al Viro, Andrey Ryabinin, willy, Chuck Ebbert,
	linux-fsdevel, LKML

On 10/18/2015 12:17 AM, Ross Zwisler wrote:
> I'll try and find time to look at this issue this week.  Sasha, do you have a
> more targeted reproducer, or is it still just the trinity fuzzer?

Nope, I haven't looked at it much beyond looking into dax_io().


Thanks,
Sasha


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-09-30 21:30         ` Sasha Levin
  2015-10-17 19:22           ` Sasha Levin
@ 2015-11-06  1:34           ` Al Viro
  2015-11-06  2:19             ` Al Viro
  1 sibling, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-06  1:34 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Andrey Ryabinin, willy, Chuck Ebbert, linux-fsdevel, LKML

On Wed, Sep 30, 2015 at 05:30:17PM -0400, Sasha Levin wrote:

> > So I've traced this all the way back to dax_io(). I can trigger this with:
> > 
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 93bf2f9..2cdb8a5 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
> >         if (need_wmb)
> >                 wmb_pmem();
> > 
> > +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
> >         return (pos == start) ? retval : pos - start;
> >  }
> > 
> > So it seems that iter gets moved twice here: once in dax_io(), and once again
> > back at generic_file_read_iter().
> > 
> > I don't see how it ever worked. Am I missing something?

This:
                        struct iov_iter data = *iter;
                        retval = mapping->a_ops->direct_IO(iocb, &data, pos);
                }

                if (retval > 0) {
                        *ppos = pos + retval;
                        iov_iter_advance(iter, retval);

The iterator advanced in ->direct_IO() is a _copy_, not the original.
The contents of *iter as seen by generic_file_read_iter() are not
modifiable by ->direct_IO(), simply because its address is nowhere to
be found.  And checking iov_iter_count(iter) at the end of dax_io() is
pointless - from the POV of generic_file_read_iter() it's data.count,
and while it used to be equal to iter->count, it's already been modified.
By the time we call iov_iter_advance() in generic_file_read_iter() that
value will already be discarded, along with the rest of struct iov_iter data.
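
The copy semantics can be seen in a stripped-down user-space sketch. struct iter, consume() and count_after_callee() are hypothetical stand-ins, not kernel code; only the "copy the struct, pass the address of the copy" shape is borrowed from the snippet quoted above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct iov_iter; only the byte count is modelled. */
struct iter {
	size_t count;
};

/* Plays the role of the callee that advances whatever iterator it is handed. */
static void consume(struct iter *it, size_t n)
{
	it->count -= n;
}

/*
 * Same shape as the caller: the callee only ever sees the address of a local
 * copy, so the count left in *orig is unchanged when it returns.
 */
static size_t count_after_callee(struct iter *orig, size_t n)
{
	struct iter data = *orig;	/* copy, as in "struct iov_iter data = *iter" */

	consume(&data, n);		/* advances the copy only */
	return orig->count;
}
```

Whatever the callee does to the copy, the caller still has to advance the original itself, which is what the single iov_iter_advance(iter, retval) call is for.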

Wait a minute - you are triggering _what_???
> > +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
With '&&'?  iov_iter_count() is size_t, while pos and start are loff_t,
so you are seeing equal values in pos and start (as integers) *and*
(loff_t)0 > (size_t)something.  loff_t is a signed type, size_t is unsigned.
6.3.1.8[1] of the C standard (usual arithmetic conversions) says that
	* if the rank of size_t is greater than or equal to the rank of loff_t,
the latter gets converted to size_t.  And conversion of zero should be zero,
i.e. (size_t)0 > (size_t)something, which is impossible (we compare them
as non-negative integers).
	* if loff_t can represent all values of size_t, the size_t value gets
converted to loff_t.  The result of the conversion should have the same (in
particular, non-negative) value.  Again, the comparison can't be true.
	* otherwise both values are converted to the unsigned counterpart of
loff_t.  Again, zero converts to 0 and in any unsigned type 0 > x is
impossible.

I don't see any way for that condition to evaluate true.
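
A quick user-space check of that reasoning, as a sketch only: loff_model_t is a made-up stand-in that assumes loff_t is a signed 64-bit integer, and zero_offset_exceeds() is a hypothetical helper that evaluates the right-hand side of the '&&' for the pos == start case.

```c
#include <stddef.h>

/* Assumption: loff_t behaves like a signed 64-bit integer. */
typedef long long loff_model_t;

/*
 * With pos == start the left operand is a signed zero.  Whichever of the
 * three conversion cases applies on the platform, zero stays zero and
 * cannot compare greater than a non-negative count.
 */
static int zero_offset_exceeds(size_t count)
{
	loff_model_t pos = 42, start = 42;

	return (pos - start) > count;
}
```

The helper returns 0 for every count, including 0 and the largest size_t value, so the WARN_ON as quoted cannot fire.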

Assuming that it's a misquoted ||...  I don't see any way for pos to
get greater than start + original iov_iter_count().  However, I *do*
see a way for bad things to happen in a different way.  Look:
	// first pass through the loop, pos == start (and so's max)
                                retval = dax_get_addr(bh, &addr, blkbits);
	// got a positive value
                                if (retval < 0)
                                        break;
	// nope, keep going
                                if (buffer_unwritten(bh) || buffer_new(bh)) {
                                        dax_new_buf(addr, retval, first, pos,
                                                                        end);
                                        need_wmb = true;
                                }
                                addr += first;
                                size = retval - first;
	// OK...
                        }
                        max = min(pos + size, end);
	// OK...
                }

                if (iov_iter_rw(iter) == WRITE) {
                        len = copy_from_iter_pmem(addr, max - pos, iter);
                        need_wmb = true;
                } else if (!hole)
                        len = copy_to_iter((void __force *)addr, max - pos,
                                        iter);
                else
                        len = iov_iter_zero(max - pos, iter);
	// too bad - we'd hit an unmapped memory area.  len is 0...
	// and retval is fucking positive.
                if (!len)
                        break;

	return (pos == start) ? retval : pos - start;
	// will return a bloody big positive value

Could you try to reproduce it with this:

dax_io(): don't let non-error value escape via retval instead of EFAULT

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/dax.c b/fs/dax.c
index a86d3cc..7b653e9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 		else
 			len = iov_iter_zero(max - pos, iter);
 
-		if (!len)
+		if (!len) {
+			retval = -EFAULT;
 			break;
+		}
 
 		pos += len;
 		addr += len;
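
For anyone who wants to see the effect without a DAX setup, the first-iteration failure and the effect of the patch can be modelled in user space. This is a hypothetical reduction, not the real dax_io(): dax_io_model() and its parameters are invented, 'mapped' plays the positive value stored in retval by the address lookup, and 'copied' plays len from the copy helper.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical one-iteration model of the tail of dax_io().
 *   mapped: positive value the address lookup left in retval
 *   copied: bytes the copy helper transferred (0 models a fault)
 *   fixed:  non-zero applies the "retval = -EFAULT" change
 */
static long dax_io_model(long mapped, size_t copied, int fixed)
{
	long long start = 0, pos = start;
	long retval = mapped;	/* "got a positive value" */
	size_t len = copied;

	if (!len) {
		if (fixed)
			retval = -EFAULT;	/* the patch: report the fault */
		/* without the patch the stale positive retval survives */
	} else {
		pos += len;
	}

	return (pos == start) ? retval : (long)(pos - start);
}
```

With mapped = 0x1000000 (the size from Sasha's report, used here purely as an illustrative value) and copied = 0, the unpatched model returns 0x1000000 and the patched one returns -EFAULT; a successful copy returns the number of bytes copied either way.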



* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-06  1:34           ` Al Viro
@ 2015-11-06  2:19             ` Al Viro
  2015-11-06  3:38               ` Linus Torvalds
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-06  2:19 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Andrey Ryabinin, willy, Chuck Ebbert, linux-fsdevel, LKML,
	Jens Axboe, Linus Torvalds, Dan Williams

On Fri, Nov 06, 2015 at 01:34:02AM +0000, Al Viro wrote:

> Could you try to reproduce it with this:
> 
> dax_io(): don't let non-error value escape via retval instead of EFAULT
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/fs/dax.c b/fs/dax.c
> index a86d3cc..7b653e9 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
>  		else
>  			len = iov_iter_zero(max - pos, iter);
>  
> -		if (!len)
> +		if (!len) {
> +			retval = -EFAULT;
>  			break;
> +		}
>  
>  		pos += len;
>  		addr += len;
> 

PS: the "block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()"
patch Dan Williams posted a while ago does change things a bit, but
AFAICS only in turning "return a bogus positive value" into "return an
uninitialized value"; if applying this fix on top of it, s/retval/rc/ in
the above.  And whether or not it fixes the bug Sasha had been able to trigger,
the bug is real and needs fixing - it's been there since 4.0 when fs/dax.c
went into the tree.

How are we going to handle that one?  I can put it into mainline pull
request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
via the block tree, I'll be glad to leave it for him to deal with.


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-06  2:19             ` Al Viro
@ 2015-11-06  3:38               ` Linus Torvalds
  2015-11-06 16:06                 ` Jens Axboe
  2015-11-11  2:21                 ` Linus Torvalds
  0 siblings, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2015-11-06  3:38 UTC (permalink / raw)
  To: Al Viro
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> How are we going to handle that one?  I can put it into mainline pull
> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
> via the block tree, I'll be glad to leave it for him to deal with.

Put it in the vfs tree (I'm hoping for a pull request soon..)

I pulled the block trees from Jens yesterday, so there is presumably
nothing pending there right now.

              Linus


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-06  3:38               ` Linus Torvalds
@ 2015-11-06 16:06                 ` Jens Axboe
  2015-11-11  2:21                 ` Linus Torvalds
  1 sibling, 0 replies; 35+ messages in thread
From: Jens Axboe @ 2015-11-06 16:06 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Dan Williams

On 11/05/2015 08:38 PM, Linus Torvalds wrote:
> On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> How are we going to handle that one?  I can put it into mainline pull
>> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
>> via the block tree, I'll be glad to leave it for him to deal with.
>
> Put it in the vfs tree (I'm hoping for a pull request soon..)
>
> I pulled the block trees from Jens yesterday, so there is presumably
> nothing pending there right now.

Either way is obviously fine with me. I have 4 patches pending, but 
unless more urgent things show up, I was going to continue collecting 
fixes and submit that post -rc1.

-- 
Jens Axboe



* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-06  3:38               ` Linus Torvalds
  2015-11-06 16:06                 ` Jens Axboe
@ 2015-11-11  2:21                 ` Linus Torvalds
  2015-11-11  2:25                   ` Jens Axboe
  2015-11-11  2:56                   ` Al Viro
  1 sibling, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2015-11-11  2:21 UTC (permalink / raw)
  To: Al Viro
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

Al, ping?

On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> How are we going to handle that one?  I can put it into mainline pull
>> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
>> via the block tree, I'll be glad to leave it for him to deal with.
>
> Put it in the vfs tree (I'm hoping for a pull request soon..)
>
> I pulled the block trees from Jens yesterday, so there is presumably
> nothing pending there right now.

Apparently my "hoping for a pull request soon" was ridiculously optimistic.

Al, looking at the most recent linux-next, most of the vfs commits
there seem to be committed in the last day or two. I'm getting the
feeling that that is all 4.5 material by now.

Should I just take the iov patch as-is, since apparently no vfs pull
request is happening this merge cycle? And no, I'm not taking
"developed during the second week of the merge window, and sent in the
last few days of it". I'm done with that.

                    Linus


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:21                 ` Linus Torvalds
@ 2015-11-11  2:25                   ` Jens Axboe
  2015-11-11  2:31                     ` Linus Torvalds
  2015-11-11  2:56                   ` Al Viro
  1 sibling, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2015-11-11  2:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On Tue, Nov 10 2015, Linus Torvalds wrote:
> Al, ping?
> 
> On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >>
> >> How are we going to handle that one?  I can put it into mainline pull
> >> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
> >> via the block tree, I'll be glad to leave it for him to deal with.
> >
> > Put it in the vfs tree (I'm hoping for a pull request soon..)
> >
> > I pulled the block trees from Jens yesterday, so there is presumably
> > nothing pending there right now.
> 
> Apparently my "hoping for a pull request soon" was ridiculously optimistic.
> 
> Al, looking at the most recent linux-next, most of the vfs commits
> there seem to be committed in the last day or two. I'm getting the
> feeling that that is all 4.5 material by now.
> 
> Should I just take the iov patch as-is, since apparently no vfs pull
> request is happening this merge cycle? And no, I'm not taking
> "developed during the second week of the merge window, and sent in the
> last few days of it". I'm done with that.

I've got 8 other patches pending for a post core merge, just waiting for
the last core pull request to go in. I haven't seen this iov iter fix,
though.



  git://git.kernel.dk/linux-block.git for-linus


----------------------------------------------------------------
Jan Kara (1):
      brd: Refuse improperly aligned discard requests

Jens Axboe (2):
      MAINTAINERS: add reference to new linux-block list
      blk-mq: mark __blk_mq_complete_request() static

Randy Dunlap (1):
      block: fix blk-core.c kernel-doc warning

Sathyavathi M (1):
      NVMe: Increase the max transfer size when mdts is 0

Stephan Günther (2):
      NVMe: use split lo_hi_{read,write}q
      NVMe: add support for Apple NVMe controller

Vivek Goyal (1):
      fs/block_dev.c: Remove WARN_ON() when inode writeback fails

 MAINTAINERS             |  1 +
 block/blk-core.c        |  3 +++
 block/blk-mq.c          |  2 +-
 block/blk-mq.h          |  1 -
 drivers/block/brd.c     |  3 +++
 drivers/nvme/host/pci.c | 15 +++++++++------
 fs/block_dev.c          | 15 ++++++++++++---
 7 files changed, 29 insertions(+), 11 deletions(-)

-- 
Jens Axboe


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:25                   ` Jens Axboe
@ 2015-11-11  2:31                     ` Linus Torvalds
  2015-11-11  2:40                       ` Jens Axboe
  2015-11-11  3:20                       ` Sasha Levin
  0 siblings, 2 replies; 35+ messages in thread
From: Linus Torvalds @ 2015-11-11  2:31 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Al Viro, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On Tue, Nov 10, 2015 at 6:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
> On Tue, Nov 10 2015, Linus Torvalds wrote:
>> Al, ping?
>>
>> On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> >>
>> >> How are we going to handle that one?  I can put it into mainline pull
>> >> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
>> >> via the block tree, I'll be glad to leave it for him to deal with.
>> >
>> > Put it in the vfs tree (I'm hoping for a pull request soon..)
>> >
>> > I pulled the block trees from Jens yesterday, so there is presumably
>> > nothing pending there right now.
>>
>> Apparently my "hoping for a pull request soon" was ridiculously optimistic.
>>
>> Al, looking at the most recent linux-next, most of the vfs commits
>> there seem to be committed in the last day or two. I'm getting the
>> feeling that that is all 4.5 material by now.
>>
>> Should I just take the iov patch as-is, since apparently no vfs pull
>> request is happening this merge cycle? And no, I'm not taking
>> "developed during the second week of the merge window, and sent in the
>> last few days of it". I'm done with that.
>
> I've got 8 other patches pending for a post core merge, just waiting for
> the last core pull request to go in. I haven't seen this iov iter fix,
> though.

It was in this thread, looked like this (without the whitespace damage):

    dax_io(): don't let non-error value escape via retval instead of EFAULT

    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    ---
    diff --git a/fs/dax.c b/fs/dax.c
    index a86d3cc..7b653e9 100644
    --- a/fs/dax.c
    +++ b/fs/dax.c
    @@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
                    else
                            len = iov_iter_zero(max - pos, iter);

    -               if (!len)
    +               if (!len) {
    +                       retval = -EFAULT;
                            break;
    +               }

                    pos += len;
                    addr += len;


although I don't think I saw a confirmation that that was what Sasha
actually hit (but Sasha had narrowed it down to DAX, so it looks
possible/likely)

                    Linus
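
For readers following the patch, the failure mode can be modelled in user
space. The sketch below is a simplified illustration, not the kernel code:
model_io(), map_extent() and copy_chunk() are hypothetical stand-ins, and it
assumes (as the patch title suggests) that retval may still hold a positive
value from an earlier step when the copy makes no progress.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the mapping step: returns a positive byte count. */
static long map_extent(void)
{
	return 4096;
}

/* Hypothetical stand-in for the copy helpers: bytes copied, 0 on a fault. */
static size_t copy_chunk(size_t want, int fault)
{
	return fault ? 0 : want;
}

/*
 * Model of the loop touched by the patch. With patched == 0 a zero-length
 * copy leaves the stale positive retval in place; with patched == 1 it is
 * replaced by -EFAULT, as in the diff above.
 */
static long model_io(size_t start, size_t end, int fault, int patched)
{
	long retval = 0;
	size_t pos = start, max = start;

	while (pos < end) {
		size_t len;

		if (pos == max) {
			retval = map_extent();
			if (retval < 0)
				break;	/* a real error is kept as-is */
			max = pos + (size_t)retval;
			if (max > end)
				max = end;
		}

		len = copy_chunk(max - pos, fault);
		if (!len) {
			if (patched)
				retval = -EFAULT;
			break;
		}
		pos += len;
	}

	/* No progress at all: report retval; otherwise report bytes done. */
	return (pos == start) ? retval : (long)(pos - start);
}
```

With a faulting copy, model_io(0, 512, 1, 0) returns the stale positive 4096
although nothing was copied, while model_io(0, 512, 1, 1) returns -EFAULT. A
successful copy returns 512 either way. A caller that trusts the positive
value would advance its iterator past the end of the request, which is
consistent with the out-of-bounds report that started this thread.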

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:31                     ` Linus Torvalds
@ 2015-11-11  2:40                       ` Jens Axboe
  2015-11-11  2:41                         ` Jens Axboe
  2015-11-11  3:20                       ` Sasha Levin
  1 sibling, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2015-11-11  2:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On 11/10/2015 07:31 PM, Linus Torvalds wrote:
> On Tue, Nov 10, 2015 at 6:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
>> On Tue, Nov 10 2015, Linus Torvalds wrote:
>>> Al, ping?
>>>
>>> On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>> On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>>>>
>>>>> How are we going to handle that one?  I can put it into mainline pull
>>>>> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to take it
>>>>> via the block tree, I'll be glad to leave it for him to deal with.
>>>>
>>>> Put it in the vfs tree (I'm hoping for a pull request soon..)
>>>>
>>>> I pulled the block trees from Jens yesterday, so there is presumably
>>>> nothing pending there right now.
>>>
>>> Apparently my "hoping for a pull request soon" was ridiculously optimistic.
>>>
>>> Al, looking at the most recent linux-next, most of the vfs commits
>>> there seem to be committed in the last day or two. I'm getting the
>>> feeling that that is all 4.5 material by now.
>>>
>>> Should I just take the iov patch as-is, since apparently no vfs pull
>>> request is happening this merge cycle? And no, I'm not taking
>>> "developed during the second week of the merge window, and sent in the
>>> last few days of it". I'm done with that.
>>
>> I've got 8 other patches pending for a post core merge, just waiting for
>> the last core pull request to go in. I haven't seen this iov iter fix,
>> though.
>
> It was in this thread, looked like this (without the whitespace damage):
>
>      dax_io(): don't let non-error value escape via retval instead of EFAULT
>
>      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>      ---
>      diff --git a/fs/dax.c b/fs/dax.c
>      index a86d3cc..7b653e9 100644
>      --- a/fs/dax.c
>      +++ b/fs/dax.c
>      @@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode,
> struct iov_iter *iter,
>                      else
>                              len = iov_iter_zero(max - pos, iter);
>
>      -               if (!len)
>      +               if (!len) {
>      +                       retval = -EFAULT;
>                              break;
>      +               }
>
>                      pos += len;
>                      addr += len;
>
>
> although I don't think I saw a confirmation that that was what Sasha
> actually hit (but Sasha had narrowed it down to DAX, so it looks
> possible/likely)

I found it right after sending that email. Patch looks pretty straight 
forward, at least from the case of max - pos != 0 and len == 0 on 
return. Might be cleaner to add a

if (retval < 0)
     break;

check, that should be the case where max == pos anyway. But we'd 
potentially turn -Exx into -EFAULT for that case with the patch.

Hmm?

-- 
Jens Axboe


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:40                       ` Jens Axboe
@ 2015-11-11  2:41                         ` Jens Axboe
  2015-11-11  2:44                           ` Jens Axboe
  0 siblings, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2015-11-11  2:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On 11/10/2015 07:40 PM, Jens Axboe wrote:
> On 11/10/2015 07:31 PM, Linus Torvalds wrote:
>> On Tue, Nov 10, 2015 at 6:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>> On Tue, Nov 10 2015, Linus Torvalds wrote:
>>>> Al, ping?
>>>>
>>>> On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
>>>> <torvalds@linux-foundation.org> wrote:
>>>>> On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk>
>>>>> wrote:
>>>>>>
>>>>>> How are we going to handle that one?  I can put it into mainline pull
>>>>>> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to
>>>>>> take it
>>>>>> via the block tree, I'll be glad to leave it for him to deal with.
>>>>>
>>>>> Put it in the vfs tree (I'm hoping for a pull request soon..)
>>>>>
>>>>> I pulled the block trees from Jens yesterday, so there is presumably
>>>>> nothing pending there right now.
>>>>
>>>> Apparently my "hoping for a pull request soon" was ridiculously
>>>> optimistic.
>>>>
>>>> Al, looking at the most recent linux-next, most of the vfs commits
>>>> there seem to be committed in the last day or two. I'm getting the
>>>> feeling that that is all 4.5 material by now.
>>>>
>>>> Should I just take the iov patch as-is, since apparently no vfs pull
>>>> request is happening this merge cycle? And no, I'm not taking
>>>> "developed during the second week of the merge window, and sent in the
>>>> last few days of it". I'm done with that.
>>>
>>> I've got 8 other patches pending for a post core merge, just waiting for
>>> the last core pull request to go in. I haven't seen this iov iter fix,
>>> though.
>>
>> It was in this thread, looked like this (without the whitespace damage):
>>
>>      dax_io(): don't let non-error value escape via retval instead of
>> EFAULT
>>
>>      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>>      ---
>>      diff --git a/fs/dax.c b/fs/dax.c
>>      index a86d3cc..7b653e9 100644
>>      --- a/fs/dax.c
>>      +++ b/fs/dax.c
>>      @@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode,
>> struct iov_iter *iter,
>>                      else
>>                              len = iov_iter_zero(max - pos, iter);
>>
>>      -               if (!len)
>>      +               if (!len) {
>>      +                       retval = -EFAULT;
>>                              break;
>>      +               }
>>
>>                      pos += len;
>>                      addr += len;
>>
>>
>> although I don't think I saw a confirmation that that was what Sasha
>> actually hit (but Sasha had narrowed it down to DAX, so it looks
>> possible/likely)
>
> I found it right after sending that email. Patch looks pretty straight
> forward, at least from the case of max - pos != 0 and len == 0 on
> return. Might be cleaner to add a
>
> if (retval < 0)
>      break;
>
> check, that should be the case where max == pos anyway. But we'd
> potentially return -Exx into -EFAULT for that case with the patch.
>
> Hmm?

So we already do that, in the 'if' above. I think the patch looks fine.

-- 
Jens Axboe


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:41                         ` Jens Axboe
@ 2015-11-11  2:44                           ` Jens Axboe
  2015-11-11  3:06                             ` Al Viro
  0 siblings, 1 reply; 35+ messages in thread
From: Jens Axboe @ 2015-11-11  2:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On 11/10/2015 07:41 PM, Jens Axboe wrote:
> On 11/10/2015 07:40 PM, Jens Axboe wrote:
>> On 11/10/2015 07:31 PM, Linus Torvalds wrote:
>>> On Tue, Nov 10, 2015 at 6:25 PM, Jens Axboe <axboe@kernel.dk> wrote:
>>>> On Tue, Nov 10 2015, Linus Torvalds wrote:
>>>>> Al, ping?
>>>>>
>>>>> On Thu, Nov 5, 2015 at 7:38 PM, Linus Torvalds
>>>>> <torvalds@linux-foundation.org> wrote:
>>>>>> On Thu, Nov 5, 2015 at 6:19 PM, Al Viro <viro@zeniv.linux.org.uk>
>>>>>> wrote:
>>>>>>>
>>>>>>> How are we going to handle that one?  I can put it into mainline
>>>>>>> pull
>>>>>>> request via vfs.git, with Cc: stable, but if e.g. Jens prefers to
>>>>>>> take it
>>>>>>> via the block tree, I'll be glad to leave it for him to deal with.
>>>>>>
>>>>>> Put it in the vfs tree (I'm hoping for a pull request soon..)
>>>>>>
>>>>>> I pulled the block trees from Jens yesterday, so there is presumably
>>>>>> nothing pending there right now.
>>>>>
>>>>> Apparently my "hoping for a pull request soon" was ridiculously
>>>>> optimistic.
>>>>>
>>>>> Al, looking at the most recent linux-next, most of the vfs commits
>>>>> there seem to be committed in the last day or two. I'm getting the
>>>>> feeling that that is all 4.5 material by now.
>>>>>
>>>>> Should I just take the iov patch as-is, since apparently no vfs pull
>>>>> request is happening this merge cycle? And no, I'm not taking
>>>>> "developed during the second week of the merge window, and sent in the
>>>>> last few days of it". I'm done with that.
>>>>
>>>> I've got 8 other patches pending for a post core merge, just waiting
>>>> for
>>>> the last core pull request to go in. I haven't seen this iov iter fix,
>>>> though.
>>>
>>> It was in this thread, looked like this (without the whitespace damage):
>>>
>>>      dax_io(): don't let non-error value escape via retval instead of
>>> EFAULT
>>>
>>>      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>>>      ---
>>>      diff --git a/fs/dax.c b/fs/dax.c
>>>      index a86d3cc..7b653e9 100644
>>>      --- a/fs/dax.c
>>>      +++ b/fs/dax.c
>>>      @@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode,
>>> struct iov_iter *iter,
>>>                      else
>>>                              len = iov_iter_zero(max - pos, iter);
>>>
>>>      -               if (!len)
>>>      +               if (!len) {
>>>      +                       retval = -EFAULT;
>>>                              break;
>>>      +               }
>>>
>>>                      pos += len;
>>>                      addr += len;
>>>
>>>
>>> although I don't think I saw a confirmation that that was what Sasha
>>> actually hit (but Sasha had narrowed it down to DAX, so it looks
>>> possible/likely)
>>
>> I found it right after sending that email. Patch looks pretty straight
>> forward, at least from the case of max - pos != 0 and len == 0 on
>> return. Might be cleaner to add a
>>
>> if (retval < 0)
>>      break;
>>
>> check, that should be the case where max == pos anyway. But we'd
>> potentially return -Exx into -EFAULT for that case with the patch.
>>
>> Hmm?
>
> So we already do that, in the 'if' above. I think the patch looks fine.

Queued up. Unless Al objects, it'll be part of the 'for-linus' pull 
later this week.

-- 
Jens Axboe


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:21                 ` Linus Torvalds
  2015-11-11  2:25                   ` Jens Axboe
@ 2015-11-11  2:56                   ` Al Viro
  2015-11-11  3:30                     ` Al Viro
  1 sibling, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11  2:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Tue, Nov 10, 2015 at 06:21:47PM -0800, Linus Torvalds wrote:

> Al, looking at the most recent linux-next, most of the vfs commits
> there seem to be committed in the last day or two. I'm getting the
> feeling that that is all 4.5 material by now.
> 
> Should I just take the iov patch as-is, since apparently no vfs pull
> request is happening this merge cycle? And no, I'm not taking
> "developed during the second week of the merge window, and sent in the
> last few days of it". I'm done with that.

s/developed/rebased/, actually, but... point taken.  Mea culpa, and what
to do with those patches is for you to decide; some of those are simply
-stable fodder and probably ought to go one-by-one at any point you would
consider convenient, some are of the "remove stale comment" variety (obviously
can sit around until the next cycle, or go in one-by-one at any point - the
things like
-
-       /* WARNING: probably going away soon, do not use! */
in inode_operations; the comment used to be about the method removed last
cycle and should've been gone with it; etc.)

There's a large pile not in those two classes - xattr+richacl stuff.  I'm more
confident about the first part, but strictly speaking neither qualifies as
fixes.

FWIW, the stuff that had been _developed_ during the merge window is not there
- a patch series around the descriptor bitmaps.  Doesn't change the situation;
I'd fucked up this cycle ;-/

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:44                           ` Jens Axboe
@ 2015-11-11  3:06                             ` Al Viro
  2015-11-11  3:07                               ` Jens Axboe
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11  3:06 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On Tue, Nov 10, 2015 at 07:44:14PM -0700, Jens Axboe wrote:

> Queued up. Unless Al objects, it'll be part of the 'for-linus' pull
> later this week.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # 4.0+

probably ought to be there...

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  3:06                             ` Al Viro
@ 2015-11-11  3:07                               ` Jens Axboe
  0 siblings, 0 replies; 35+ messages in thread
From: Jens Axboe @ 2015-11-11  3:07 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Dan Williams

On 11/10/2015 08:06 PM, Al Viro wrote:
> On Tue, Nov 10, 2015 at 07:44:14PM -0700, Jens Axboe wrote:
>
>> Queued up. Unless Al objects, it'll be part of the 'for-linus' pull
>> later this week.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Cc: stable@vger.kernel.org # 4.0+
>
> probably ought to be there...

Agree, done.


-- 
Jens Axboe


* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:31                     ` Linus Torvalds
  2015-11-11  2:40                       ` Jens Axboe
@ 2015-11-11  3:20                       ` Sasha Levin
  1 sibling, 0 replies; 35+ messages in thread
From: Sasha Levin @ 2015-11-11  3:20 UTC (permalink / raw)
  To: Linus Torvalds, Jens Axboe
  Cc: Al Viro, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Dan Williams

On 11/10/2015 09:31 PM, Linus Torvalds wrote:
> although I don't think I saw a confirmation that that was what Sasha
> actually hit (but Sasha had narrowed it down to DAX, so it looks
> possible/likely)

Yup, that indeed fixed the problem I was seeing.

Thanks,
Sasha

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  2:56                   ` Al Viro
@ 2015-11-11  3:30                     ` Al Viro
  2015-11-11  4:36                       ` Linus Torvalds
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11  3:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Wed, Nov 11, 2015 at 02:56:47AM +0000, Al Viro wrote:
> s/developed/rebased/, actually, but... point taken.  Mea culpa, and what
> to do with those patches is for you to decide; some of those are simply
> -stable fodder and probably ought to go one-by-one at any point you would
> consider convenient, some are of the "remove stale comment" variety (obviously
> can sit around until the next cycle, or go in one-by-one at any point - the
> things like
> -
> -       /* WARNING: probably going away soon, do not use! */
> in inode_operations; the comment used to be about the method removed last
> cycle and should've been gone with it; etc.)

FWIW, here's what's in there:
	dax_io fix
Jens has just taken it
	fs: fix inode.c kernel-doc warning
	fs: fix writeback.c kernel-doc warnings
trivial comment patches
	overlayfs: move super block magic number to magic.h
got picked into overlayfs tree yesterday
	debugfs: fix refcount imbalance in start_creating
old fix, -stable fodder (had been first posted in October, IIRC)
	vfs: Check attribute names in posix acl xattr handers
	vfs: Fix the posix_acl_xattr_list return value
	ubifs: Remove unused security xattr handler
	hfsplus: Remove unused xattr handler list operations
	jffs2: Add missing capability check for listing trusted xattrs
	xattr handlers: Pass handler to operations instead of flags
	9p: xattr simplifications
	squashfs: xattr simplifications
	f2fs: xattr simplifications
xattr series; the first two are arguably fixes, and whatever happens in this
window, I'm taking the rest into -next for 4.5.  Series makes sense and
cleans the things nicely, IMO.
	FS-Cache: Increase reference of parent after registering, netfs success
	FS-Cache: Don't override netfs's primary_index if registering failed
	cachefiles: perform test on s_blocksize when opening cache file.
	FS-Cache: Handle a write to the page immediately beyond the EOF marker
1, 2 and 4 are simply -stable fodder, 3 is an obvious optimization.
	binfmt_elf: Don't clobber passed executable's file header
	binfmt_elf: Correct `arch_check_elf's description
-stable fodder.
	fs/pipe.c: preserve alloc_file() error code
	fs/pipe.c: return error code rather than 0 in pipe_write()
-stable fodder.
	vfs: remove unused wrapper block_page_mkwrite()
	vfs: remove stale comment in inode_operations
dead code and stale comment removal.  Can go at any point.
	fs: 9p: cache.h: Add #define of include guard
trivial, can go at any point, or stay until the next cycle.
	richacl series
probably misses the window - I'd really like to hear more detailed variant
of Christoph's objections in any case.

Again, my apologies to everyone involved - I'd fucked up, badly.  The only
question is how much PITA it will end up causing.  I can put those into
separate branches and/or mail directly; what ends up missing the window
will go into vfs.git#for-next as soon as -rc1 is out there (with the
possible exception of richacl stuff - I really want to hear from Christoph
and in more details than "it's all been said some iterations ago").

Linus, what would be your preference wrt that stuff?  Besides the "don't
ever do that kind of shit again", that is - that much is obvious.

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  3:30                     ` Al Viro
@ 2015-11-11  4:36                       ` Linus Torvalds
  2015-11-11  7:43                         ` Al Viro
  0 siblings, 1 reply; 35+ messages in thread
From: Linus Torvalds @ 2015-11-11  4:36 UTC (permalink / raw)
  To: Al Viro
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Tue, Nov 10, 2015 at 7:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Linus, what would be your preference wrt that stuff?

If you can just create a branch with the stuff that is obvious and
clearly worth it (ie stuff that would basically be stable material
anyway), I'll just merge it.  Assuming it's all done in some
reasonable timeframe..

               Linus

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  4:36                       ` Linus Torvalds
@ 2015-11-11  7:43                         ` Al Viro
  2015-11-11  8:16                           ` Stephen Rothwell
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11  7:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sasha Levin, Andrey Ryabinin, Matthew Wilcox, Chuck Ebbert,
	linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Tue, Nov 10, 2015 at 08:36:48PM -0800, Linus Torvalds wrote:
> On Tue, Nov 10, 2015 at 7:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Linus, what would be your preference wrt that stuff?
> 
> If you can just create a branch with the stuff that is obvious and
> clearly worth it (ie stuff that would basically be stable material
> anyway), I'll just merge it.  Assuming it's all done in some
> reasonable timeframe..

OK...  Right now I have #for-linus-stable and #for-linus-2 on top
of it, the latter adding several comment fixes, etc., the most serious
change among which is the removal of never used block_page_mkwrite().

dax_io fix isn't there, neither is overlayfs magic.h patch - both are
already in other trees.  I would like to get xattr series in as well,
but that's a separate pull request, if you'd accept them in this window in
the first place.  richacl stuff isn't there as well, and I think that one
is clear "leave it for 4.5" fodder.

Anyway, for
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus-2
(both -stable fodder and trivial patches)

Shortlog:
Daniel Borkmann (1):
      debugfs: fix refcount imbalance in start_creating

David Howells (1):
      FS-Cache: Handle a write to the page immediately beyond the EOF marker

Eric Biggers (2):
      fs/pipe.c: preserve alloc_file() error code
      fs/pipe.c: return error code rather than 0 in pipe_write()

Kinglong Mee (2):
      FS-Cache: Increase reference of parent after registering, netfs success
      FS-Cache: Don't override netfs's primary_index if registering failed

Maciej W. Rozycki (2):
      binfmt_elf: Don't clobber passed executable's file header
      binfmt_elf: Correct `arch_check_elf's description

NeilBrown (1):
      cachefiles: perform test on s_blocksize when opening cache file.

Randy Dunlap (2):
      fs: fix inode.c kernel-doc warning
      fs: fix writeback.c kernel-doc warnings

Ross Zwisler (2):
      vfs: remove unused wrapper block_page_mkwrite()
      vfs: remove stale comment in inode_operations

Tzvetelin Katchov (1):
      fs: 9p: cache.h: Add #define of include guard

Diffstat:
 fs/9p/cache.h               |  1 +
 fs/binfmt_elf.c             | 12 ++++----
 fs/buffer.c                 | 24 ++-------------
 fs/cachefiles/namei.c       |  2 ++
 fs/cachefiles/rdwr.c        | 73 +++++++++++++++++++++++----------------------
 fs/debugfs/inode.c          |  6 +++-
 fs/ext4/inode.c             |  4 +--
 fs/fs-writeback.c           |  3 +-
 fs/fscache/netfs.c          | 38 +++++++++++------------
 fs/fscache/page.c           |  2 +-
 fs/inode.c                  |  1 +
 fs/nilfs2/file.c            |  2 +-
 fs/pipe.c                   | 18 ++++++-----
 fs/xfs/xfs_file.c           |  2 +-
 include/linux/buffer_head.h |  2 --
 include/linux/fs.h          |  2 --
 16 files changed, 89 insertions(+), 103 deletions(-)

If you'd prefer to do that in two separate pulls - yell (or just pull
#for-linus-stable first, then this on top of it).  I'd reordered
#for-next so that it continues #for-linus-2; tree of its tip being the
same as yesterday.

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  7:43                         ` Al Viro
@ 2015-11-11  8:16                           ` Stephen Rothwell
  2015-11-11 10:19                             ` Al Viro
  0 siblings, 1 reply; 35+ messages in thread
From: Stephen Rothwell @ 2015-11-11  8:16 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe, Dan Williams

Hi Al,

On Wed, 11 Nov 2015 07:43:30 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
> dax_io fix isn't there, neither is overlayfs magic.h patch - both are
> already in other trees.  I would like to get xattr series in as well,
> but that's a separate pull request, if you'd accept them in this window in
> the first place.  richacl stuff isn't there as well, and I think that one
> is clear "leave it for 4.5" fodder.

So could you please remove the 4.5 stuff from your for-next branch
until after the merge window closes.

Also, I noticed these new warnings today:

fs/orangefs/xattr.c:509:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
  .get = pvfs2_xattr_get_trusted,
         ^
fs/orangefs/xattr.c:509:9: note: (near initialization for 'pvfs2_xattr_trusted_handler.get')
fs/orangefs/xattr.c:510:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
  .set = pvfs2_xattr_set_trusted,
         ^
fs/orangefs/xattr.c:510:9: note: (near initialization for 'pvfs2_xattr_trusted_handler.set')
fs/orangefs/xattr.c:520:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
  .get = pvfs2_xattr_get_default,
         ^
fs/orangefs/xattr.c:520:9: note: (near initialization for 'pvfs2_xattr_default_handler.get')
fs/orangefs/xattr.c:521:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
  .set = pvfs2_xattr_set_default,
         ^
fs/orangefs/xattr.c:521:9: note: (near initialization for 'pvfs2_xattr_default_handler.set')

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11  8:16                           ` Stephen Rothwell
@ 2015-11-11 10:19                             ` Al Viro
  2015-11-11 10:28                               ` Stephen Rothwell
  2015-11-11 16:33                               ` Al Viro
  0 siblings, 2 replies; 35+ messages in thread
From: Al Viro @ 2015-11-11 10:19 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Wed, Nov 11, 2015 at 07:16:36PM +1100, Stephen Rothwell wrote:
> Hi Al,
> 
> On Wed, 11 Nov 2015 07:43:30 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> >
> > dax_io fix isn't there, neither is overlayfs magic.h patch - both are
> > already in other trees.  I would like to get xattr series in as well,
> > but that's a separate pull request, if you'd accept them in this window in
> > the first place.  richacl stuff isn't there as well, and I think that one
> > is clear "leave it for 4.5" fodder.
> 
> So could you please remove the 4.5 stuff from your for-next branch
> until after the merge window closes.

Done.

> Also, I noticed these new warnings today:
> 
> fs/orangefs/xattr.c:509:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>   .get = pvfs2_xattr_get_trusted,
>          ^
> fs/orangefs/xattr.c:509:9: note: (near initialization for 'pvfs2_xattr_trusted_handler.get')
> fs/orangefs/xattr.c:510:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>   .set = pvfs2_xattr_set_trusted,
>          ^
> fs/orangefs/xattr.c:510:9: note: (near initialization for 'pvfs2_xattr_trusted_handler.set')
> fs/orangefs/xattr.c:520:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>   .get = pvfs2_xattr_get_default,
>          ^
> fs/orangefs/xattr.c:520:9: note: (near initialization for 'pvfs2_xattr_default_handler.get')
> fs/orangefs/xattr.c:521:9: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
>   .set = pvfs2_xattr_set_default,
>          ^
> fs/orangefs/xattr.c:521:9: note: (near initialization for 'pvfs2_xattr_default_handler.set')

That's "xattr handlers: Pass handler to operations instead of flags" fallout,
trivially adjusted (typical change is
-ext2_xattr_security_list(struct dentry *dentry, char *list, size_t list_size,
-                        const char *name, size_t name_len, int type)
+ext2_xattr_security_list(const struct xattr_handler *handler,
+                        struct dentry *dentry, char *list, size_t list_size,
+                        const char *name, size_t name_len)
with type replaced with handler->flags if it's used anywhere in the body;
AFAICS, none of orangefs instances use it at all, so it's just a matter of
changing the argument lists in pvfs2_xattr_[gs]et_{default,trusted},
adding const struct xattr_handler *handler in the beginning and removing
the last argument; callers in pvfs2_ioctl() should simply use
pvfs2_inode_[gs]etxattr()).
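
As a rough illustration of the conversion described above, here is a minimal userspace sketch of a new-style ->get() callback. Everything prefixed with demo_ is a hypothetical stand-in, as are the cut-down struct dentry and struct xattr_handler; the real kernel definitions carry more fields. The only point is the argument list: the handler comes first and the trailing int is gone.

```c
#include <stddef.h>
#include <string.h>
#include <errno.h>

/* Userspace stand-ins for kernel types; hypothetical, for illustration only. */
struct dentry { const char *d_name; };

struct xattr_handler {
	const char *prefix;
	int flags;
	/* New-style ->get(): the handler itself is passed first and the
	 * trailing "int handler_flags" argument is gone. */
	int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
		   const char *name, void *buffer, size_t size);
};

/* Stand-in for pvfs2_inode_getxattr(); always returns the same value. */
static int demo_inode_getxattr(struct dentry *dentry, const char *prefix,
			       const char *name, void *buffer, size_t size)
{
	static const char value[] = "demo";

	(void)dentry;
	(void)prefix;
	(void)name;
	if (size < sizeof(value))
		return -ERANGE;
	memcpy(buffer, value, sizeof(value));
	return (int)sizeof(value);
}

static int demo_xattr_get_default(const struct xattr_handler *handler,
				  struct dentry *dentry, const char *name,
				  void *buffer, size_t size)
{
	/* A body that used the old "type" argument would read handler->flags. */
	return demo_inode_getxattr(dentry, handler->prefix, name, buffer, size);
}

static const struct xattr_handler demo_default_handler = {
	.prefix = "",
	.flags = 0,
	.get = demo_xattr_get_default,
};
```

With the signature matching the function pointer type in the struct, the initializer no longer triggers -Wincompatible-pointer-types.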

Note, however, that orangefs in linux-next lacks a lot of fixes (see
vfs.git#orangefs-untested for some; AFAICS, those are missing from all
branches in orangefs git tree) and there are problems I don't know
how to fix, mostly due to the lack of documentation.  The last I've
heard from them was that they were putting such docs together; hopefully
once that get done we'll be able to sort the rest of that thing out.
It'll be after -rc1, though.

So xattr conflicts are the least of the problems there; those are easy
to adjust for, there are more serious issues in the entire thing ;-/
BTW, while we are at it - pvfs2_listxattr() doesn't even validate
resp.listxattr.returned_count, so a bogus response from buggered
server will do really interesting things to the kernel.
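
For what it's worth, the kind of check that is missing could look roughly like the sketch below. The response layout, the limit DEMO_MAX_XATTR_LISTLEN and the function name are all assumptions made up for the example, not the actual orangefs definitions; the idea is simply to refuse a server-supplied count before using it as a loop bound.

```c
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Hypothetical limit and response layout; the real orangefs ones may differ. */
#define DEMO_MAX_XATTR_LISTLEN 8

struct demo_listxattr_resp {
	int32_t returned_count;                  /* filled in by the server */
	int32_t lengths[DEMO_MAX_XATTR_LISTLEN]; /* per-key lengths */
};

/* Return the total key length, or -EIO if the response is out of range. */
static long demo_listxattr_total(const struct demo_listxattr_resp *resp)
{
	long total = 0;
	int32_t i;

	/* Never trust the count: it indexes a fixed-size array below. */
	if (resp->returned_count < 0 ||
	    resp->returned_count > DEMO_MAX_XATTR_LISTLEN)
		return -EIO;

	for (i = 0; i < resp->returned_count; i++) {
		if (resp->lengths[i] < 0)
			return -EIO;
		total += resp->lengths[i];
	}
	return total;
}
```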

I'll cook the minimal fixup for API change after I get some sleep and
send it your way, unless somebody gets there first...

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 10:19                             ` Al Viro
@ 2015-11-11 10:28                               ` Stephen Rothwell
  2015-11-11 16:25                                 ` Mike Marshall
  2015-11-11 16:33                               ` Al Viro
  1 sibling, 1 reply; 35+ messages in thread
From: Stephen Rothwell @ 2015-11-11 10:28 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe, Dan Williams

Hi Al,

On Wed, 11 Nov 2015 10:19:48 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
> On Wed, Nov 11, 2015 at 07:16:36PM +1100, Stephen Rothwell wrote:
> > 
> > So could you please remove the 4.5 stuff from your for-next branch
> > until after the merge window closes.  
> 
> Done.

Thanks.

> > Also, I noticed these new warnings today:
> > 
> I'll cook the minimal fixup for API change after I get some sleep and
> send it your way, unless somebody gets there first...

Thanks again.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 10:28                               ` Stephen Rothwell
@ 2015-11-11 16:25                                 ` Mike Marshall
  2015-11-11 16:36                                   ` Al Viro
  0 siblings, 1 reply; 35+ messages in thread
From: Mike Marshall @ 2015-11-11 16:25 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Al Viro, Linus Torvalds, Sasha Levin, Andrey Ryabinin,
	Matthew Wilcox, Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe,
	Dan Williams

I'm the Orangefs guy...

If the orangefs warnings that people see because of what's in
linux-next are annoying, I could focus on quieting them down...

We've been focusing on code review and documentation ever
since our last big exchange with Al and Linus...

-Mike

On Wed, Nov 11, 2015 at 5:28 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> Hi Al,
>
> On Wed, 11 Nov 2015 10:19:48 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>>
>> On Wed, Nov 11, 2015 at 07:16:36PM +1100, Stephen Rothwell wrote:
>> >
>> > So could you please remove the 4.5 stuff from your for-next branch
>> > until after the merge window closes.
>>
>> Done.
>
> Thanks.
>
>> > Also, I noticed these new warnings today:
>> >
>> I'll cook the minimal fixup for API change after I get some sleep and
>> send it your way, unless somebody gets there first...
>
> Thanks again.
>
> --
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 10:19                             ` Al Viro
  2015-11-11 10:28                               ` Stephen Rothwell
@ 2015-11-11 16:33                               ` Al Viro
  2015-11-11 21:47                                 ` Stephen Rothwell
  1 sibling, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11 16:33 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe, Dan Williams

On Wed, Nov 11, 2015 at 10:19:48AM +0000, Al Viro wrote:

> I'll cook the minimal fixup for API change after I get some sleep and
> send it your way, unless somebody gets there first...

This should do it - switches ->ioctl() to pvfs2_inode_[gs]etxattr() and
converts xattr_handler ->[gs]et() to new API.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c
index feb1764..3d6ffe0 100644
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -793,11 +793,10 @@ static long pvfs2_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	 */
 	if (cmd == FS_IOC_GETFLAGS) {
 		val = 0;
-		ret = pvfs2_xattr_get_default(file->f_path.dentry,
-					      "user.pvfs2.meta_hint",
-					      &val,
-					      sizeof(val),
-					      0);
+		ret = pvfs2_inode_getxattr(file_inode(file),
+					   PVFS2_XATTR_NAME_DEFAULT_PREFIX,
+					   "user.pvfs2.meta_hint",
+					   &val, sizeof(val));
 		if (ret < 0 && ret != -ENODATA)
 			return ret;
 		else if (ret == -ENODATA)
@@ -827,12 +826,10 @@ static long pvfs2_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		gossip_debug(GOSSIP_FILE_DEBUG,
 			     "pvfs2_ioctl: FS_IOC_SETFLAGS: %llu\n",
 			     (unsigned long long)val);
-		ret = pvfs2_xattr_set_default(file->f_path.dentry,
-					      "user.pvfs2.meta_hint",
-					      &val,
-					      sizeof(val),
-					      0,
-					      0);
+		ret = pvfs2_inode_setxattr(file_inode(file),
+					   PVFS2_XATTR_NAME_DEFAULT_PREFIX,
+					   "user.pvfs2.meta_hint",
+					   &val, sizeof(val), 0);
 	}
 
 	return ret;
diff --git a/fs/orangefs/pvfs2-kernel.h b/fs/orangefs/pvfs2-kernel.h
index 29b4a48..43339c6 100644
--- a/fs/orangefs/pvfs2-kernel.h
+++ b/fs/orangefs/pvfs2-kernel.h
@@ -237,19 +237,6 @@ extern const struct xattr_handler *pvfs2_xattr_handlers[];
 extern struct posix_acl *pvfs2_get_acl(struct inode *inode, int type);
 extern int pvfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type);
 
-int pvfs2_xattr_set_default(struct dentry *dentry,
-			    const char *name,
-			    const void *buffer,
-			    size_t size,
-			    int flags,
-			    int handler_flags);
-
-int pvfs2_xattr_get_default(struct dentry *dentry,
-			    const char *name,
-			    void *buffer,
-			    size_t size,
-			    int handler_flags);
-
 /*
  * Redefine xtvec structure so that we could move helper functions out of
  * the define
diff --git a/fs/orangefs/xattr.c b/fs/orangefs/xattr.c
index 227eaa4..b683daa 100644
--- a/fs/orangefs/xattr.c
+++ b/fs/orangefs/xattr.c
@@ -447,12 +447,12 @@ out_unlock:
 	return ret;
 }
 
-int pvfs2_xattr_set_default(struct dentry *dentry,
-			    const char *name,
-			    const void *buffer,
-			    size_t size,
-			    int flags,
-			    int handler_flags)
+static int pvfs2_xattr_set_default(const struct xattr_handler *handler,
+				   struct dentry *dentry,
+				   const char *name,
+				   const void *buffer,
+				   size_t size,
+				   int flags)
 {
 	return pvfs2_inode_setxattr(dentry->d_inode,
 				    PVFS2_XATTR_NAME_DEFAULT_PREFIX,
@@ -462,11 +462,11 @@ int pvfs2_xattr_set_default(struct dentry *dentry,
 				    flags);
 }
 
-int pvfs2_xattr_get_default(struct dentry *dentry,
-			    const char *name,
-			    void *buffer,
-			    size_t size,
-			    int handler_flags)
+static int pvfs2_xattr_get_default(const struct xattr_handler *handler,
+				   struct dentry *dentry,
+				   const char *name,
+				   void *buffer,
+				   size_t size)
 {
 	return pvfs2_inode_getxattr(dentry->d_inode,
 				    PVFS2_XATTR_NAME_DEFAULT_PREFIX,
@@ -476,12 +476,12 @@ int pvfs2_xattr_get_default(struct dentry *dentry,
 
 }
 
-static int pvfs2_xattr_set_trusted(struct dentry *dentry,
-			    const char *name,
-			    const void *buffer,
-			    size_t size,
-			    int flags,
-			    int handler_flags)
+static int pvfs2_xattr_set_trusted(const struct xattr_handler *handler,
+				   struct dentry *dentry,
+				   const char *name,
+				   const void *buffer,
+				   size_t size,
+				   int flags)
 {
 	return pvfs2_inode_setxattr(dentry->d_inode,
 				    PVFS2_XATTR_NAME_TRUSTED_PREFIX,
@@ -491,11 +491,11 @@ static int pvfs2_xattr_set_trusted(struct dentry *dentry,
 				    flags);
 }
 
-static int pvfs2_xattr_get_trusted(struct dentry *dentry,
-			    const char *name,
-			    void *buffer,
-			    size_t size,
-			    int handler_flags)
+static int pvfs2_xattr_get_trusted(const struct xattr_handler *handler,
+				   struct dentry *dentry,
+				   const char *name,
+				   void *buffer,
+				   size_t size)
 {
 	return pvfs2_inode_getxattr(dentry->d_inode,
 				    PVFS2_XATTR_NAME_TRUSTED_PREFIX,

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 16:25                                 ` Mike Marshall
@ 2015-11-11 16:36                                   ` Al Viro
  2015-11-11 16:56                                     ` Mike Marshall
  0 siblings, 1 reply; 35+ messages in thread
From: Al Viro @ 2015-11-11 16:36 UTC (permalink / raw)
  To: Mike Marshall
  Cc: Stephen Rothwell, Linus Torvalds, Sasha Levin, Andrey Ryabinin,
	Matthew Wilcox, Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe,
	Dan Williams

On Wed, Nov 11, 2015 at 11:25:17AM -0500, Mike Marshall wrote:
> I'm the Orangefs guy...
> 
> If the orangefs warnings that people see because of what's in
> linux-next are annoying, I could focus on quieting them down...

See the fixup just posted in this thread.

> We've been focusing on code review and documentation ever
> since our last big exchange with Al and Linus...

BTW, could you put the current state of the docs someplace public?

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 16:36                                   ` Al Viro
@ 2015-11-11 16:56                                     ` Mike Marshall
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Marshall @ 2015-11-11 16:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Stephen Rothwell, Linus Torvalds, Sasha Levin, Andrey Ryabinin,
	Matthew Wilcox, Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe,
	Dan Williams

 > BTW, could you put the current state of the docs someplace public?

The documentation will eventually end up in
Documentation/filesystems/orangefs.txt.

This part about the creation of the shared memory between userspace and
the kernel module seems complete and accurate to me so far. This "bufmap"
data structure is central to the protocol between userspace and the kernel
module. This describes the creation of the bufmap; details on how it is used
in exchanges are what I am working on now...

-----------------------------------------------------------------------------------------------------------

Orangefs is a user space filesystem and an associated kernel module.
We'll just refer to the user space part of Orangefs as "userspace"
from here on out...

The kernel module implements a pseudo device that userspace
can read from and write to. Userspace can also manipulate the
kernel module through the pseudo device with ioctl.

At startup userspace allocates two page-size-aligned (posix_memalign),
mlocked memory blocks: one is used for IO and the other for readdir
operations. The IO block is 41943040 bytes and the readdir block is
4194304 bytes. Each block is divided into logical chunks. A pointer to each
block is stored in its own PVFS_dev_map_desc structure, which also records
the block's total size, as well as the size and number of its logical chunks.
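
A minimal sketch of that allocation step, assuming a POSIX system. The demo_ names and the two size macros are made up for the example (the sizes themselves are the ones quoted above); this is not the actual Orangefs client code. mlock() can fail when RLIMIT_MEMLOCK is low, so the sketch reports that instead of treating it as fatal.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define DEMO_IO_BLOCK_SIZE      41943040UL
#define DEMO_READDIR_BLOCK_SIZE 4194304UL

/* Allocate a page-aligned block and try to lock it into memory.
 * Returns NULL if allocation fails; *locked says whether mlock() worked. */
static void *demo_alloc_block(size_t size, int *locked)
{
	void *ptr = NULL;
	long page_size = sysconf(_SC_PAGESIZE);

	if (page_size <= 0 ||
	    posix_memalign(&ptr, (size_t)page_size, size) != 0)
		return NULL;
	*locked = (mlock(ptr, size) == 0);
	return ptr;
}

/* Undo demo_alloc_block(). */
static void demo_free_block(void *ptr, size_t size, int locked)
{
	if (locked)
		munlock(ptr, size);
	free(ptr);
}
```

A caller would invoke demo_alloc_block() once with DEMO_IO_BLOCK_SIZE and once with DEMO_READDIR_BLOCK_SIZE, then record each pointer and size in the descriptor that is handed to the kernel module.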

A pointer to the IO block's PVFS_dev_map_desc structure is sent to a
mapping routine in the kernel module with an ioctl. The structure is
copied from user space to kernel space with copy_from_user and is used
to initialize the kernel module's "bufmap" (struct pvfs2_bufmap), which
then contains:

  * refcnt - a reference counter
  * desc_size - PVFS2_BUFMAP_DEFAULT_DESC_SIZE (4194304), the IO block's
    logical chunk size, which represents the filesystem's block size and
    is used for s_blocksize in super blocks.
  * desc_count - PVFS2_BUFMAP_DEFAULT_DESC_COUNT (10), the number of
    logical chunks in the IO block.
  * desc_shift - log2(desc_size), used for s_blocksize_bits in super blocks.
  * total_size - the total size of the IO block.
  * page_count - the number of 4096-byte pages in the IO block.
  * page_array - a pointer to page_count * (sizeof(struct page*)) bytes
    of kcalloced memory. This memory is used as an array of pointers
    to each of the pages in the IO block, filled in through a call to
    get_user_pages.
  * desc_array - a pointer to desc_count * (sizeof(struct pvfs_bufmap_desc))
    bytes of kcalloced memory. This memory is further initialized:

      user_desc is the kernel's copy of the IO block's PVFS_dev_map_desc
      structure. user_desc->ptr points to the IO block.

      pages_per_desc = bufmap->desc_size / PAGE_SIZE
      offset = 0

        bufmap->desc_array[0].page_array = &bufmap->page_array[offset]
        bufmap->desc_array[0].array_count = pages_per_desc = 1024
        bufmap->desc_array[0].uaddr = (user_desc->ptr) + (0 * 1024 * 4096)
        offset += 1024
                           .
                           .
                           .
        bufmap->desc_array[9].page_array = &bufmap->page_array[offset]
        bufmap->desc_array[9].array_count = pages_per_desc = 1024
        bufmap->desc_array[9].uaddr = (user_desc->ptr) +
                                               (9 * 1024 * 4096)
        offset += 1024

  * buffer_index_array - a desc_count sized array of ints, used to
    indicate which of the IO block's chunks are available to use.
  * buffer_index_lock - a spinlock to protect buffer_index_array during update.
  * readdir_index_array - a five (PVFS2_READDIR_DEFAULT_DESC_COUNT) element
    int array used to indicate which of the readdir block's chunks are
    available to use.
  * readdir_index_lock - a spinlock to protect readdir_index_array during
    update.
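
The desc_array arithmetic above can be sketched in plain userspace C as follows. The demo_ structures are simplified, hypothetical stand-ins for the kernel ones (no locking, no get_user_pages, calloc instead of kcalloc), and the user address is kept as an unsigned long; the sketch only shows how each descriptor ends up pointing at its own 1024-page slice of page_array and its own 4194304-byte slice of the IO block.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE  4096UL
#define DEMO_DESC_SIZE  4194304UL   /* chunk size quoted in the text */
#define DEMO_DESC_COUNT 10

struct demo_page;                   /* opaque stand-in for struct page */

struct demo_bufmap_desc {
	unsigned long uaddr;            /* user address of this chunk */
	struct demo_page **page_array;  /* slice of the bufmap's page_array */
	size_t array_count;             /* pages in this chunk */
};

struct demo_bufmap {
	size_t desc_size, total_size, page_count;
	int desc_count, desc_shift;
	struct demo_page **page_array;
	struct demo_bufmap_desc *desc_array;
};

/* Integer log2 for a power-of-two value. */
static int demo_log2(size_t v)
{
	int shift = 0;

	while (v > 1) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/* Fill in the sizes and carve page_array into per-descriptor slices. */
static int demo_bufmap_init(struct demo_bufmap *b, unsigned long user_ptr)
{
	size_t pages_per_desc = DEMO_DESC_SIZE / DEMO_PAGE_SIZE;
	size_t offset = 0;
	int i;

	b->desc_size = DEMO_DESC_SIZE;
	b->desc_count = DEMO_DESC_COUNT;
	b->desc_shift = demo_log2(b->desc_size);
	b->total_size = b->desc_size * (size_t)b->desc_count;
	b->page_count = b->total_size / DEMO_PAGE_SIZE;
	b->page_array = calloc(b->page_count, sizeof(*b->page_array));
	b->desc_array = calloc((size_t)b->desc_count, sizeof(*b->desc_array));
	if (!b->page_array || !b->desc_array) {
		free(b->page_array);
		free(b->desc_array);
		return -1;
	}
	for (i = 0; i < b->desc_count; i++) {
		b->desc_array[i].page_array = &b->page_array[offset];
		b->desc_array[i].array_count = pages_per_desc;
		b->desc_array[i].uaddr = user_ptr +
			(unsigned long)i * pages_per_desc * DEMO_PAGE_SIZE;
		offset += pages_per_desc;
	}
	return 0;
}

static void demo_bufmap_free(struct demo_bufmap *b)
{
	free(b->desc_array);
	free(b->page_array);
}
```

With the numbers quoted above this gives page_count = 10240, desc_shift = 22, and a last descriptor starting 9 * 1024 pages into page_array.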

On Wed, Nov 11, 2015 at 11:36 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Nov 11, 2015 at 11:25:17AM -0500, Mike Marshall wrote:
>> I'm the Orangefs guy...
>>
>> If the orangefs warnings that people see because of what's in
>> linux-next are annoying, I could focus on quieting them down...
>
> See the fixup just posted in this thread.
>
>> We've been focusing on code review and documentation ever
>> since our last big exchange with Al and Linus...
>
> BTW, could you put the current state of the docs someplace public?

* Re: fs: out of bounds on stack in iov_iter_advance
  2015-11-11 16:33                               ` Al Viro
@ 2015-11-11 21:47                                 ` Stephen Rothwell
  0 siblings, 0 replies; 35+ messages in thread
From: Stephen Rothwell @ 2015-11-11 21:47 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Sasha Levin, Andrey Ryabinin, Matthew Wilcox,
	Chuck Ebbert, linux-fsdevel, LKML, Jens Axboe, Dan Williams

Hi Al,

On Wed, 11 Nov 2015 16:33:39 +0000 Al Viro <viro@ZenIV.linux.org.uk> wrote:
>
> On Wed, Nov 11, 2015 at 10:19:48AM +0000, Al Viro wrote:
> 
> > I'll cook the minimal fixup for API change after I get some sleep and
> > send it your way, unless somebody gets there first...  
> 
> This should do it - switches ->ioctl() to pvfs2_inode_[gs]etxattr() and
> converts xattr_handler ->[gs]et() to new API.

Thanks, I will use that as a merge conflict fix patch from today.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

Thread overview: 35+ messages
2015-08-12 14:13 fs: out of bounds on stack in iov_iter_advance Sasha Levin
2015-08-15 20:13 ` Chuck Ebbert
2015-08-17  9:18   ` Andrey Ryabinin
2015-08-19  5:46     ` Al Viro
2015-09-02 20:00       ` Sasha Levin
2015-09-18  2:24       ` Sasha Levin
2015-09-30 21:30         ` Sasha Levin
2015-10-17 19:22           ` Sasha Levin
2015-10-18  4:17             ` Ross Zwisler
2015-10-19 23:34               ` Sasha Levin
2015-11-06  1:34           ` Al Viro
2015-11-06  2:19             ` Al Viro
2015-11-06  3:38               ` Linus Torvalds
2015-11-06 16:06                 ` Jens Axboe
2015-11-11  2:21                 ` Linus Torvalds
2015-11-11  2:25                   ` Jens Axboe
2015-11-11  2:31                     ` Linus Torvalds
2015-11-11  2:40                       ` Jens Axboe
2015-11-11  2:41                         ` Jens Axboe
2015-11-11  2:44                           ` Jens Axboe
2015-11-11  3:06                             ` Al Viro
2015-11-11  3:07                               ` Jens Axboe
2015-11-11  3:20                       ` Sasha Levin
2015-11-11  2:56                   ` Al Viro
2015-11-11  3:30                     ` Al Viro
2015-11-11  4:36                       ` Linus Torvalds
2015-11-11  7:43                         ` Al Viro
2015-11-11  8:16                           ` Stephen Rothwell
2015-11-11 10:19                             ` Al Viro
2015-11-11 10:28                               ` Stephen Rothwell
2015-11-11 16:25                                 ` Mike Marshall
2015-11-11 16:36                                   ` Al Viro
2015-11-11 16:56                                     ` Mike Marshall
2015-11-11 16:33                               ` Al Viro
2015-11-11 21:47                                 ` Stephen Rothwell