All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] sysfs: fix use after free in sysfs_readdir()
@ 2013-03-20 15:25 Ming Lei
  2013-03-20 15:25 ` [PATCH 1/2] sysfs: fix race between readdir and lseek Ming Lei
  2013-03-20 15:25 ` [PATCH 2/2] sysfs: handle failure path correctly for readdir() Ming Lei
  0 siblings, 2 replies; 14+ messages in thread
From: Ming Lei @ 2013-03-20 15:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel

Hi,

These two patches fix two bugs inside sysfs_readdir(), and both
the two bugs may cause use after free problem, which is reported
by Dave on trinity fuzz test on syscall.


Thanks,
--
Ming Lei


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-20 15:25 [PATCH 0/2] sysfs: fix use after free in sysfs_readdir() Ming Lei
@ 2013-03-20 15:25 ` Ming Lei
  2013-03-21  2:41   ` Li Zefan
  2013-03-20 15:25 ` [PATCH 2/2] sysfs: handle failure path correctly for readdir() Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-20 15:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ming Lei, stable

While readdir() is running, lseek() may set filp->f_pos as zero,
then may leave filp->private_data pointing to one sysfs_dirent
object without holding its reference counter, so the sysfs_dirent
object may be used after free in next readdir().

This patch holds inode->i_mutex to avoid the problem since
the lock is always held in readdir path.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 fs/sysfs/dir.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 2fbdff6..c9e1660 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -1058,10 +1058,21 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 	return 0;
 }
 
+static loff_t sysfs_dir_llseek(struct file *file, loff_t offset, int whence)
+{
+	struct inode *inode = file_inode(file);
+	loff_t ret;
+
+	mutex_lock(&inode->i_mutex);
+	ret = generic_file_llseek(file, offset, whence);
+	mutex_unlock(&inode->i_mutex);
+
+	return ret;
+}
 
 const struct file_operations sysfs_dir_operations = {
 	.read		= generic_read_dir,
 	.readdir	= sysfs_readdir,
 	.release	= sysfs_dir_release,
-	.llseek		= generic_file_llseek,
+	.llseek		= sysfs_dir_llseek,
 };
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] sysfs: handle failure path correctly for readdir()
  2013-03-20 15:25 [PATCH 0/2] sysfs: fix use after free in sysfs_readdir() Ming Lei
  2013-03-20 15:25 ` [PATCH 1/2] sysfs: fix race between readdir and lseek Ming Lei
@ 2013-03-20 15:25 ` Ming Lei
  2013-03-20 16:26   ` Shuah Khan
  1 sibling, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-20 15:25 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Ming Lei, stable

In case of 'if (filp->f_pos ==  0 or 1)' of sysfs_readdir(),
the failure from filldir() isn't handled, and the reference counter
of the sysfs_dirent object pointed by filp->private_data will be
released without clearing filp->private_data, so use after free
bug will be triggered later.

This patch returns immeadiately under the situation for fixing the bug,
and it is reasonable to return from readdir() when filldir() fails.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 fs/sysfs/dir.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index c9e1660..e145126 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -1020,6 +1020,8 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 		ino = parent_sd->s_ino;
 		if (filldir(dirent, ".", 1, filp->f_pos, ino, DT_DIR) == 0)
 			filp->f_pos++;
+		else
+			return 0;
 	}
 	if (filp->f_pos == 1) {
 		if (parent_sd->s_parent)
@@ -1028,6 +1030,8 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 			ino = parent_sd->s_ino;
 		if (filldir(dirent, "..", 2, filp->f_pos, ino, DT_DIR) == 0)
 			filp->f_pos++;
+		else
+			return 0;
 	}
 	mutex_lock(&sysfs_mutex);
 	for (pos = sysfs_dir_pos(ns, parent_sd, filp->f_pos, pos);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] sysfs: handle failure path correctly for readdir()
  2013-03-20 15:25 ` [PATCH 2/2] sysfs: handle failure path correctly for readdir() Ming Lei
@ 2013-03-20 16:26   ` Shuah Khan
  0 siblings, 0 replies; 14+ messages in thread
From: Shuah Khan @ 2013-03-20 16:26 UTC (permalink / raw)
  To: Ming Lei; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On Wed, Mar 20, 2013 at 9:25 AM, Ming Lei <ming.lei@canonical.com> wrote:
> In case of 'if (filp->f_pos ==  0 or 1)' of sysfs_readdir(),
> the failure from filldir() isn't handled, and the reference counter
> of the sysfs_dirent object pointed by filp->private_data will be
> released without clearing filp->private_data, so use after free
> bug will be triggered later.
>
> This patch returns immeadiately under the situation for fixing the bug,
> and it is reasonable to return from readdir() when filldir() fails.
>
> Reported-by: Dave Jones <davej@redhat.com>
> Tested-by: Sasha Levin <levinsasha928@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
>  fs/sysfs/dir.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index c9e1660..e145126 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -1020,6 +1020,8 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
>                 ino = parent_sd->s_ino;
>                 if (filldir(dirent, ".", 1, filp->f_pos, ino, DT_DIR) == 0)
>                         filp->f_pos++;
> +               else
> +                       return 0;
>         }
>         if (filp->f_pos == 1) {
>                 if (parent_sd->s_parent)
> @@ -1028,6 +1030,8 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
>                         ino = parent_sd->s_ino;
>                 if (filldir(dirent, "..", 2, filp->f_pos, ino, DT_DIR) == 0)
>                         filp->f_pos++;
> +               else
> +                       return 0;
>         }

Looks good to me. This is just an observation. readdir callers are
checking against NULL as opposed 0. Not a problem really probably
since NULL is defined as 0.

-- Shuah

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-20 15:25 ` [PATCH 1/2] sysfs: fix race between readdir and lseek Ming Lei
@ 2013-03-21  2:41   ` Li Zefan
  2013-03-21  3:17     ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Li Zefan @ 2013-03-21  2:41 UTC (permalink / raw)
  To: Ming Lei; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On 2013/3/20 23:25, Ming Lei wrote:
> While readdir() is running, lseek() may set filp->f_pos as zero,
> then may leave filp->private_data pointing to one sysfs_dirent
> object without holding its reference counter, so the sysfs_dirent
> object may be used after free in next readdir().
> 
> This patch holds inode->i_mutex to avoid the problem since
> the lock is always held in readdir path.
> 

In fact the same race exists between readdir() and read()/write()...

> Reported-by: Dave Jones <davej@redhat.com>
> Tested-by: Sasha Levin <levinsasha928@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
>  fs/sysfs/dir.c |   13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-21  2:41   ` Li Zefan
@ 2013-03-21  3:17     ` Ming Lei
  2013-03-21  3:28       ` Li Zefan
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-21  3:17 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On Thu, Mar 21, 2013 at 10:41 AM, Li Zefan <lizefan@huawei.com> wrote:
>
> In fact the same race exists between readdir() and read()/write()...

Fortunately, no read()/write() are implemented on sysfs directory, :-)

Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-21  3:17     ` Ming Lei
@ 2013-03-21  3:28       ` Li Zefan
  2013-03-21  4:48         ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Li Zefan @ 2013-03-21  3:28 UTC (permalink / raw)
  To: Ming Lei; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On 2013/3/21 11:17, Ming Lei wrote:
> On Thu, Mar 21, 2013 at 10:41 AM, Li Zefan <lizefan@huawei.com> wrote:
>>
>> In fact the same race exists between readdir() and read()/write()...
> 
> Fortunately, no read()/write() are implemented on sysfs directory, :-)
> 

That's irrelevant...

See my report:

https://patchwork.kernel.org/patch/2160771/


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-21  3:28       ` Li Zefan
@ 2013-03-21  4:48         ` Ming Lei
  2013-03-22  5:48           ` Li Zefan
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-21  4:48 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On Thu, Mar 21, 2013 at 11:28 AM, Li Zefan <lizefan@huawei.com> wrote:
> On 2013/3/21 11:17, Ming Lei wrote:
>> On Thu, Mar 21, 2013 at 10:41 AM, Li Zefan <lizefan@huawei.com> wrote:
>>>
>>> In fact the same race exists between readdir() and read()/write()...
>>
>> Fortunately, no read()/write() are implemented on sysfs directory, :-)
>>
>
> That's irrelevant...

As far as sysfs is concerned, the filp->f_ops can't be changed in
read/write path.

>
> See my report:
>
> https://patchwork.kernel.org/patch/2160771/

Yes, I know there might be some mess after the commit ef3d0fd2
(vfs: do (nearly) lockless generic_file_llseek).

Also looks it has been stated in Documentation/filesystems/Locking:

->llseek() locking has moved from llseek to the individual llseek
implementations.  If your fs is not using generic_file_llseek, you
need to acquire and release the appropriate locks in your ->llseek().
For many filesystems, it is probably safe to acquire the inode
mutex or just to use i_size_read() instead.
Note: this does not protect the file->f_pos against concurrent modifications
since this is something the userspace has to take care about.


Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-21  4:48         ` Ming Lei
@ 2013-03-22  5:48           ` Li Zefan
  2013-03-22  9:31             ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Li Zefan @ 2013-03-22  5:48 UTC (permalink / raw)
  To: Ming Lei; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On 2013/3/21 12:48, Ming Lei wrote:
> On Thu, Mar 21, 2013 at 11:28 AM, Li Zefan <lizefan@huawei.com> wrote:
>> On 2013/3/21 11:17, Ming Lei wrote:
>>> On Thu, Mar 21, 2013 at 10:41 AM, Li Zefan <lizefan@huawei.com> wrote:
>>>>
>>>> In fact the same race exists between readdir() and read()/write()...
>>>
>>> Fortunately, no read()/write() are implemented on sysfs directory, :-)
>>>
>>
>> That's irrelevant...
> 
> As far as sysfs is concerned, the filp->f_ops can't be changed in
> read/write path.
> 

Yes, it can...As I said, it's irrelevant, because it's vfs that changes
file->f_pos.

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
        struct fd f = fdget(fd);
        ssize_t ret = -EBADF;

        if (f.file) {
                loff_t pos = file_pos_read(f.file);		<--- read f_pos
                ret = vfs_read(f.file, buf, count, &pos);	<--- return -EISDIR
                file_pos_write(f.file, pos);			<--- write f_pos
                fdput(f);
        }
        return ret;
}

>>
>> See my report:
>>
>> https://patchwork.kernel.org/patch/2160771/
> 
> Yes, I know there might be some mess after the commit ef3d0fd2
> (vfs: do (nearly) lockless generic_file_llseek).
> 
> Also looks it has been stated in Documentation/filesystems/Locking:
> 
> ->llseek() locking has moved from llseek to the individual llseek
> implementations.  If your fs is not using generic_file_llseek, you
> need to acquire and release the appropriate locks in your ->llseek().
> For many filesystems, it is probably safe to acquire the inode
> mutex or just to use i_size_read() instead.
> Note: this does not protect the file->f_pos against concurrent modifications
> since this is something the userspace has to take care about.
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-22  5:48           ` Li Zefan
@ 2013-03-22  9:31             ` Ming Lei
  2013-03-26  7:30               ` Li Zefan
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-22  9:31 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On Fri, Mar 22, 2013 at 1:48 PM, Li Zefan <lizefan@huawei.com> wrote:
> On 2013/3/21 12:48, Ming Lei wrote:
>
> Yes, it can...As I said, it's irrelevant, because it's vfs that changes
> file->f_pos.
>
> SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
> {
>         struct fd f = fdget(fd);
>         ssize_t ret = -EBADF;
>
>         if (f.file) {
>                 loff_t pos = file_pos_read(f.file);             <--- read f_pos
>                 ret = vfs_read(f.file, buf, count, &pos);       <--- return -EISDIR
>                 file_pos_write(f.file, pos);                    <--- write f_pos

Considered that f_pos of sysfs directory is always less than INT_MAX,
we need't worry about atomic writing it in file_pos_write().

The only probable problem on sysfs is below scenario in read()/write():

-  pos is read as less than 2 in file_pos_read(f.file)
-  ret = vfs_read(f.file, buf, count, &pos)
          ---> readdir() in another path
-  file_pos_write(pos)
         ---> readdir() found f_pos becomes 0 or 1, and may cause
use-after-free problem

Considered that vfs_read()/vfs_write on sysfs dir is almost doing nothing, the
above problem may only exist in theory. The patch[1] can't avoid it too.
I am wondering if it need to be fixed, but I will try to figure out how to
avoid it in sysfs_readdir() if possible.

[1], https://patchwork.kernel.org/patch/2160771/

Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-22  9:31             ` Ming Lei
@ 2013-03-26  7:30               ` Li Zefan
  2013-03-26  8:45                 ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Li Zefan @ 2013-03-26  7:30 UTC (permalink / raw)
  To: Ming Lei; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On 2013/3/22 17:31, Ming Lei wrote:
> On Fri, Mar 22, 2013 at 1:48 PM, Li Zefan <lizefan@huawei.com> wrote:
>> On 2013/3/21 12:48, Ming Lei wrote:
>>
>> Yes, it can...As I said, it's irrelevant, because it's vfs that changes
>> file->f_pos.
>>
>> SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
>> {
>>         struct fd f = fdget(fd);
>>         ssize_t ret = -EBADF;
>>
>>         if (f.file) {
>>                 loff_t pos = file_pos_read(f.file);             <--- read f_pos
>>                 ret = vfs_read(f.file, buf, count, &pos);       <--- return -EISDIR
>>                 file_pos_write(f.file, pos);                    <--- write f_pos
> 
> Considered that f_pos of sysfs directory is always less than INT_MAX,
> we need't worry about atomic writing it in file_pos_write().
> 
> The only probable problem on sysfs is below scenario in read()/write():
> 
> -  pos is read as less than 2 in file_pos_read(f.file)
> -  ret = vfs_read(f.file, buf, count, &pos)
>           ---> readdir() in another path
> -  file_pos_write(pos)
>          ---> readdir() found f_pos becomes 0 or 1, and may cause
> use-after-free problem
> 
> Considered that vfs_read()/vfs_write on sysfs dir is almost doing nothing, the
> above problem may only exist in theory.

The read() vs readdir() race in sysfs directory doesn't exist in theory only.

Mar 25 11:16:57 lxc34 kernel: [ 3581.923110] ------------[ cut here ]------------
Mar 25 11:16:57 lxc34 kernel: [ 3581.923124] WARNING: at fs/sysfs/sysfs.h:195 sysfs_readdir+0x277/0x290()
Mar 25 11:16:57 lxc34 kernel: [ 3581.923131] Hardware name: Tecal RH2285
Mar 25 11:16:57 lxc34 kernel: [ 3581.923136] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_i
scsi bridge ipv6 stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave binfmt_misc fuse loop dm_mod c
oretemp acpi_cpufreq mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf
128mul bnx2 iTCO_wdt iTCO_vendor_support sg i2c_i801 ehci_pci mptctl tpm_tis tpm tpm_bios serio_raw microcode lp
c_ich i2c_core hid_generic mfd_core button usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif edd
 ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_generic ata_piix libata mptsas mptscsih mptbase scs
i_transport_sas scsi_mod thermal thermal_sys hwmon
Mar 25 11:16:57 lxc34 kernel: [ 3581.923238] Pid: 13289, comm: a.out Not tainted 3.9.0-rc1-0.7-default+ #38
Mar 25 11:16:57 lxc34 kernel: [ 3581.923245] Call Trace:
Mar 25 11:16:57 lxc34 kernel: [ 3581.923251]  [<ffffffff8120b137>] ? sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923258]  [<ffffffff8120b137>] ? sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923273]  [<ffffffff81042d3f>] warn_slowpath_common+0x7f/0xc0
Mar 25 11:16:57 lxc34 kernel: [ 3581.923281]  [<ffffffff81042d9a>] warn_slowpath_null+0x1a/0x20
Mar 25 11:16:57 lxc34 kernel: [ 3581.923288]  [<ffffffff8120b137>] sysfs_readdir+0x277/0x290
Mar 25 11:16:57 lxc34 kernel: [ 3581.923296]  [<ffffffff811a11a0>] ? sys_ioctl+0x90/0x90
Mar 25 11:16:57 lxc34 kernel: [ 3581.923303]  [<ffffffff811a11a0>] ? sys_ioctl+0x90/0x90
Mar 25 11:16:57 lxc34 kernel: [ 3581.923310]  [<ffffffff811a1589>] vfs_readdir+0xa9/0xc0
Mar 25 11:16:57 lxc34 kernel: [ 3581.923317]  [<ffffffff811a163b>] sys_getdents64+0x9b/0x110
Mar 25 11:16:57 lxc34 kernel: [ 3581.923327]  [<ffffffff814bb5d9>] system_call_fastpath+0x16/0x1b
Mar 25 11:16:57 lxc34 kernel: [ 3581.923333] ---[ end trace a995ae360b2301bd ]---
Mar 25 11:16:57 lxc34 kernel: [ 3581.923339] ida_remove called for id=19055 which is not allocated.
...

And finally my kernel crashed.

I'm not asking you to fix it, but just let you know about this bug. Al proposed a fix but I don't know
if he'll work on it.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-26  7:30               ` Li Zefan
@ 2013-03-26  8:45                 ` Ming Lei
  2013-03-26 14:03                   ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-26  8:45 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

On Tue, Mar 26, 2013 at 3:30 PM, Li Zefan <lizefan@huawei.com> wrote:
>> Considered that vfs_read()/vfs_write on sysfs dir is almost doing nothing, the
>> above problem may only exist in theory.
>
> The read() vs readdir() race in sysfs directory doesn't exist in theory only.

Could you let me know if you have applied the two patches on your test?

https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=e5110f411d2ee35bf8d202ccca2e89c633060dca

https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=991f76f837bf22c5bb07261cfd86525a0a96650c

Also I appreciate it if you may share your test case...

Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-26  8:45                 ` Ming Lei
@ 2013-03-26 14:03                   ` Ming Lei
  2013-03-26 15:59                     ` Ming Lei
  0 siblings, 1 reply; 14+ messages in thread
From: Ming Lei @ 2013-03-26 14:03 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

Hi Zefan,

On Tue, Mar 26, 2013 at 4:45 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Tue, Mar 26, 2013 at 3:30 PM, Li Zefan <lizefan@huawei.com> wrote:
>>> Considered that vfs_read()/vfs_write on sysfs dir is almost doing nothing, the
>>> above problem may only exist in theory.
>>
>> The read() vs readdir() race in sysfs directory doesn't exist in theory only.
>
> Could you let me know if you have applied the two patches on your test?
>
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=e5110f411d2ee35bf8d202ccca2e89c633060dca
>
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=991f76f837bf22c5bb07261cfd86525a0a96650c
>
> Also I appreciate it if you may share your test case...

If you mean the test code on link[1], I can't reproduce the
warning with the two sysfs fix patches in 4 hours's test.

[1], https://patchwork.kernel.org/patch/2160771/

Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] sysfs: fix race between readdir and lseek
  2013-03-26 14:03                   ` Ming Lei
@ 2013-03-26 15:59                     ` Ming Lei
  0 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2013-03-26 15:59 UTC (permalink / raw)
  To: Li Zefan; +Cc: Greg Kroah-Hartman, linux-kernel, stable

[-- Attachment #1: Type: text/plain, Size: 2618 bytes --]

On Tue, Mar 26, 2013 at 10:03 PM, Ming Lei <ming.lei@canonical.com> wrote:
>
> If you mean the test code on link[1], I can't reproduce the
> warning with the two sysfs fix patches in 4 hours's test.
>
> [1], https://patchwork.kernel.org/patch/2160771/

You are right, looks it is not a problem just in theory, and I can
reproduce it now with your test code by the following steps:

- load all modules
- run your test code on the directory of '/sys/module'
- then can observe the use after free after minutes(a bit easier to
add below debug code[1])

Previously, I can't reproduce because I just test on one specific
unused module directory.

[1], debug code
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -280,6 +280,11 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 	 * sd->s_parent won't change beneath us.
 	 */
 	parent_sd = sd->s_parent;
+	if(!(sd->s_flags & SYSFS_FLAG_REMOVED)) {
+		printk("%s-%d sysfs_dirent use after free: %s-%s\n",
+			__func__, __LINE__, parent_sd->s_name, sd->s_name);
+		dump_stack();
+	}


The below patch(also attached) can fix the issue.
--
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 79a0fd2..484f25e 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -1022,6 +1022,7 @@ static int sysfs_readdir(struct file * filp,
void * dirent, filldir_t filldir)
 	enum kobj_ns_type type;
 	const void *ns;
 	ino_t ino;
+	loff_t off;

 	type = sysfs_ns_type(parent_sd);
 	ns = sysfs_info(dentry->d_sb)->ns[type];
@@ -1044,6 +1045,7 @@ static int sysfs_readdir(struct file * filp,
void * dirent, filldir_t filldir)
 			return 0;
 	}
 	mutex_lock(&sysfs_mutex);
+	off = filp->f_pos;
 	for (pos = sysfs_dir_pos(ns, parent_sd, filp->f_pos, pos);
 	     pos;
 	     pos = sysfs_dir_next_pos(ns, parent_sd, filp->f_pos, pos)) {
@@ -1055,19 +1057,24 @@ static int sysfs_readdir(struct file * filp,
void * dirent, filldir_t filldir)
 		len = strlen(name);
 		ino = pos->s_ino;
 		type = dt_type(pos);
-		filp->f_pos = pos->s_hash;
+		off = filp->f_pos = pos->s_hash;
 		filp->private_data = sysfs_get(pos);

 		mutex_unlock(&sysfs_mutex);
-		ret = filldir(dirent, name, len, filp->f_pos, ino, type);
+		ret = filldir(dirent, name, len, off, ino, type);
 		mutex_lock(&sysfs_mutex);
 		if (ret < 0)
 			break;
 	}
 	mutex_unlock(&sysfs_mutex);
-	if ((filp->f_pos > 1) && !pos) { /* EOF */
-		filp->f_pos = INT_MAX;
+
+	/* don't reference last entry if its refcount is dropped */
+	if (!pos) {
 		filp->private_data = NULL;
+
+		/* EOF and not changed as 0 or 1 in read/write path */
+		if (off == filp->f_pos && off > 1)
+			filp->f_pos = INT_MAX;
 	}
 	return 0;
 }



Thanks,
--
Ming Lei

[-- Attachment #2: sysfs-fix-readdir-v5.patch --]
[-- Type: application/octet-stream, Size: 1515 bytes --]

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 79a0fd2..484f25e 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -1022,6 +1022,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 	enum kobj_ns_type type;
 	const void *ns;
 	ino_t ino;
+	loff_t off;
 
 	type = sysfs_ns_type(parent_sd);
 	ns = sysfs_info(dentry->d_sb)->ns[type];
@@ -1044,6 +1045,7 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 			return 0;
 	}
 	mutex_lock(&sysfs_mutex);
+	off = filp->f_pos;
 	for (pos = sysfs_dir_pos(ns, parent_sd, filp->f_pos, pos);
 	     pos;
 	     pos = sysfs_dir_next_pos(ns, parent_sd, filp->f_pos, pos)) {
@@ -1055,19 +1057,24 @@ static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 		len = strlen(name);
 		ino = pos->s_ino;
 		type = dt_type(pos);
-		filp->f_pos = pos->s_hash;
+		off = filp->f_pos = pos->s_hash;
 		filp->private_data = sysfs_get(pos);
 
 		mutex_unlock(&sysfs_mutex);
-		ret = filldir(dirent, name, len, filp->f_pos, ino, type);
+		ret = filldir(dirent, name, len, off, ino, type);
 		mutex_lock(&sysfs_mutex);
 		if (ret < 0)
 			break;
 	}
 	mutex_unlock(&sysfs_mutex);
-	if ((filp->f_pos > 1) && !pos) { /* EOF */
-		filp->f_pos = INT_MAX;
+
+	/* don't reference last entry if its refcount is dropped */
+	if (!pos) {
 		filp->private_data = NULL;
+
+		/* EOF and not changed as 0 or 1 in read/write path */
+		if (off == filp->f_pos && off > 1)
+			filp->f_pos = INT_MAX;
 	}
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-03-26 15:59 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-20 15:25 [PATCH 0/2] sysfs: fix use after free in sysfs_readdir() Ming Lei
2013-03-20 15:25 ` [PATCH 1/2] sysfs: fix race between readdir and lseek Ming Lei
2013-03-21  2:41   ` Li Zefan
2013-03-21  3:17     ` Ming Lei
2013-03-21  3:28       ` Li Zefan
2013-03-21  4:48         ` Ming Lei
2013-03-22  5:48           ` Li Zefan
2013-03-22  9:31             ` Ming Lei
2013-03-26  7:30               ` Li Zefan
2013-03-26  8:45                 ` Ming Lei
2013-03-26 14:03                   ` Ming Lei
2013-03-26 15:59                     ` Ming Lei
2013-03-20 15:25 ` [PATCH 2/2] sysfs: handle failure path correctly for readdir() Ming Lei
2013-03-20 16:26   ` Shuah Khan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.