Linux-ext4 Archive on lore.kernel.org
* potential data race on ext_inode_hdr(inode)->eh_depth, ext_inode_hdr(inode)->eh_max between a creat and unlink syscall
From: Meng Xu @ 2019-11-28 17:03 UTC
  To: linux-ext4

Hi Ext4 Developers,

I noticed a potential data race on ext_inode_hdr(inode)->eh_depth and
ext_inode_hdr(inode)->eh_max between a creat and an unlink syscall.
The trace follows:

[Setup]
mkdir("foo", 511) = 0;
open("foo", 65536, 511) = 3;
creat("bar", 511) = 4;
symlink("foo", "sym_foo") = 0;
open("sym_foo", 65536, 511) = 5;

[Thread 1]
creat("bar", 438);

__do_sys_creat
  ksys_open
    do_filp_open
      path_openat
        do_last
          handle_truncate
            do_truncate
              notify_change
                ext4_setattr
                  ext4_truncate
                    ext4_ext_truncate
                      ext4_ext_remove_space
                        [WRITE, 2 bytes] ext_inode_hdr(inode)->eh_depth = 0;
                        [WRITE, 2 bytes] ext_inode_hdr(inode)->eh_max = cpu_to_le16(ext4_ext_space_root(inode, 0));

[Thread 2]
unlink("sym_foo");

__do_sys_unlink
  do_unlinkat
    iput
      iput_final
        evict
          ext4_evict_inode
            ext4_orphan_del
              ext4_mark_iloc_dirty
                ext4_do_update_inode
                  [READ, 4 bytes] raw_inode->i_block[block] = ei->i_data[block];


I observed that the order of the READ and the WRITEs is not
deterministic, and I am curious what happens if the READ takes
place between the two WRITEs. Does it cause any damage or
violation?

Best Regards,
Meng
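
For readers following the trace, here is a minimal userspace sketch of why
two 2-byte stores and one 4-byte load can touch the same address. It assumes
the extent header layout from fs/ext4/ext4_extents.h (eh_magic, eh_entries,
eh_max, eh_depth, eh_generation) and that ext_inode_hdr(inode) overlays the
start of i_data. The struct name, the helper names, and the starting values
are hypothetical and only illustrate the overlap; this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of struct ext4_extent_header (fields are __le16/__le32
 * in the kernel; host-endian integers are enough to show the layout). */
struct extent_header {
    uint16_t eh_magic;
    uint16_t eh_entries;
    uint16_t eh_max;
    uint16_t eh_depth;
    uint32_t eh_generation;
};

#define N_BLOCKS 15 /* i_data is an array of 15 32-bit words */

/* Index of the 32-bit i_data word that contains byte offset `off`. */
static size_t i_data_word(size_t off)
{
    return off / sizeof(uint32_t);
}

/*
 * Value that a 4-byte copy of i_data[1] would observe after `writes_done`
 * (0, 1 or 2) of the two 2-byte stores have landed. The starting values
 * (eh_max = 4, eh_depth = 1) are illustrative, not taken from the report.
 */
static uint32_t observe_word1(unsigned writes_done, uint16_t new_depth,
                              uint16_t new_max)
{
    struct extent_header hdr = { 0xF30A, 0, 4, 1, 0 };
    unsigned char raw[N_BLOCKS * sizeof(uint32_t)] = { 0 };
    uint32_t word;

    assert(writes_done <= 2);
    memcpy(raw, &hdr, sizeof(hdr));
    if (writes_done >= 1) /* first store in the trace: eh_depth */
        memcpy(raw + offsetof(struct extent_header, eh_depth),
               &new_depth, sizeof(new_depth));
    if (writes_done >= 2) /* second store in the trace: eh_max */
        memcpy(raw + offsetof(struct extent_header, eh_max),
               &new_max, sizeof(new_max));
    memcpy(&word, raw + sizeof(uint32_t), sizeof(word));
    return word;
}
```

Under these assumptions eh_max sits at byte offset 4 and eh_depth at byte
offset 6, so both fall inside i_data[1], and observe_word1(1, ...) models a
READ that lands between the two WRITEs.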


* Re: potential data race on ext_inode_hdr(inode)->eh_depth, ext_inode_hdr(inode)->eh_max between a creat and unlink syscall
From: Theodore Y. Ts'o @ 2019-11-28 23:19 UTC
  To: Meng Xu; +Cc: linux-ext4

This makes no sense.  The inodes corresponding to "sym_foo" and "bar"
are completely different.  So why would there be a data race?

How are you concluding that there is, in fact, a data race?

    	    	       	    	       - Ted


* Re: potential data race on ext_inode_hdr(inode)->eh_depth, ext_inode_hdr(inode)->eh_max between a creat and unlink syscall
From: Meng Xu @ 2019-11-29  4:43 UTC
  To: Theodore Y. Ts'o; +Cc: linux-ext4

Hi Ted,

First, thank you for checking this out.

I hook every memory access in the kernel, so I know that the [READ] and
[WRITE] access the exact same memory address. Moreover, the accesses
cannot come from two separately allocated inodes that reuse the same
memory, because we replaced kfree with a quarantine scheme like KASAN's,
so the two inodes must have different addresses. This is what confused
me too.

In addition, in case it makes a difference, an fsync is running on
another thread at the same time. The three threads are:

[Setup]
mkdir("foo", 511) = 0;
open("foo", 65536, 511) = 3;
creat("bar", 511) = 4;
symlink("foo", "sym_foo") = 0;
open("sym_foo", 65536, 511) = 5;
dup2(5, 195) = 195;

[Thread 0: fsync(195)]
[Thread 1: creat("bar", 438)]
[Thread 2: unlink("sym_foo")]

Or, in the order observed at runtime:
Enter fsync(195);
Enter unlink("sym_foo");
Enter creat("bar", 438);

Exit unlink("sym_foo");
Exit creat("bar", 438);
Exit fsync(195);

I can provide more information (e.g., other function calls in the trace
or memory access logs) if that would help in checking this case. And
I am sorry for wasting your time if this case does not make sense.

Best regards,
Meng
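
The setup and the three threads above can be sketched as a small userspace
program. This is only a sketch under assumptions, not the tool that produced
the trace: it reads 511 as mode 0777, 438 as mode 0666, and 65536 as
O_DIRECTORY (its value on x86; other architectures differ). The helper names
and the mkdtemp scratch directory are hypothetical, and /tmp is not
necessarily on ext4, so the directory would need to live on an ext4 mount to
reach the code paths in the trace.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define SYNC_FD 195 /* target of dup2(5, 195) in the report */

/* Create a fresh scratch directory and make it the working directory. */
static int enter_scratch_dir(void)
{
    char tmpl[] = "/tmp/ext4-race-XXXXXX";

    if (mkdtemp(tmpl) == NULL)
        return -1;
    return chdir(tmpl);
}

/* Replay the [Setup] calls; returns SYNC_FD on success, -1 on failure.
 * The mode argument of open() is dropped: it is ignored without O_CREAT. */
static int setup_files(void)
{
    int fd;

    if (mkdir("foo", 0777) != 0)
        return -1;
    if (open("foo", O_RDONLY | O_DIRECTORY) < 0)
        return -1;
    if (creat("bar", 0777) < 0)
        return -1;
    if (symlink("foo", "sym_foo") != 0)
        return -1;
    fd = open("sym_foo", O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;
    return dup2(fd, SYNC_FD);
}

static void *t_fsync(void *arg)
{
    (void)arg;
    return (void *)(intptr_t)fsync(SYNC_FD);
}

static void *t_unlink(void *arg)
{
    (void)arg;
    return (void *)(intptr_t)unlink("sym_foo");
}

static void *t_creat(void *arg)
{
    int fd = creat("bar", 0666); /* truncates the existing "bar" */

    (void)arg;
    if (fd >= 0)
        close(fd);
    return (void *)(intptr_t)fd;
}

/* Start the three syscalls concurrently; the interleaving is up to the
 * scheduler. Returns 0 if all three threads were started and joined. */
static int run_threads(void)
{
    void *(*fns[3])(void *) = { t_fsync, t_unlink, t_creat };
    pthread_t tids[3];
    int i, started = 0;

    for (i = 0; i < 3; i++) {
        if (pthread_create(&tids[i], NULL, fns[i], NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++) {
        int rc = pthread_join(tids[i], NULL);

        assert(rc == 0);
        (void)rc;
    }
    return started == 3 ? 0 : -1;
}
```

A caller would run enter_scratch_dir(), setup_files(), and run_threads() in
that order. A single run is unlikely to hit any particular interleaving, so
the sequence would normally be repeated many times in fresh directories.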
