All of lore.kernel.org
 help / color / mirror / Atom feed
* Question regarding CIFS cache=loose behavior.
@ 2014-03-20 12:03 Tetsuo Handa
       [not found] ` <201403202103.EIE87056.QVtLHSFMFOOFOJ-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2014-03-20 12:03 UTC (permalink / raw)
  To: linux-cifs, linux-fsdevel

Is the behavior described below ( "read() returns 0 unless stat() detects the
growth of file size" and "read() returns non-0 when stat() detects the growth
of file size but the data actually read is '\0'" ) a feature of cache=loose
option?

Steps to reproduce:

---------- Start of writer1.c ----------
#include <unistd.h>
int main(int argc, char *argv[])
{
    while (write(1, ".", 1) == 1)
        sleep(1);
    return 0;
}
---------- End of writer1.c ----------

---------- Start of reader1.c ----------
#include <unistd.h>
int main(int argc, char *argv[])
{
    while (1) {
        char c;
        if (read(0, &c, 1) == 1) {
            if (c)
                write(1, &c, 1);
            else
                write(1, "!", 1);
        } else {
            sleep(1);
        }
    }
    return 0;
}
---------- End of reader1.c ----------

---------- Start of reader2.c ----------
#include <sys/stat.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
    while (1) {
        struct stat buf;
        char c;
        fstat(0, &buf);
        if (read(0, &c, 1) == 1) {
            if (c)
                write(1, &c, 1);
            else
                write(1, "!", 1);
        } else {
            sleep(1);
        }
    }
    return 0;
}
---------- End of reader2.c ----------

Run the following commands on the Samba server side.

  $ cc -Wall -O3 -o writer1 writer1.c
  $ ./writer1 > shared_file

Run the following commands on the Samba client side.

  $ cc -Wall -O3 -o reader1 reader1.c
  $ strace -tt -o /tmp/log1 ./reader1 < shared_file
  ..........     (<= Makes no progress by default)

The shared_file shown above is a regular file shared via CIFS mounted with
cache=loose option. (e.g. "mount -t cifs -o cache=loose //server/path /mnt" )

I expect that the reader1 process reads one byte whenever the writer1
process writes one byte. But we can observe via /tmp/log1 that the read()
continues returning 0 until the growth of file size of shared_file is
recognized by stat() using programs like /bin/ls and /usr/bin/stat .

Then, terminate the reader1 process using Ctrl-C and run the following
commands on the Samba client side.

  $ cc -Wall -O3 -o reader2 reader2.c
  $ strace -tt -o /tmp/log2 ./reader2 < shared_file
  ......................!!!!!!     (<= Makes progress with wrong data)

We can observe via /tmp/log2 that the read() returns 1 because fstat()
detects the growth of file size of shared_file . However, the data
returned by read() is not '.' but '\0'.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found] ` <201403202103.EIE87056.QVtLHSFMFOOFOJ-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
@ 2014-03-20 20:19   ` Jeff Layton
       [not found]     ` <20140320161943.60890dd4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
  2014-03-31 17:48   ` Jeff Layton
  1 sibling, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2014-03-20 20:19 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Thu, 20 Mar 2014 21:03:03 +0900
Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:

> Is the behavior described below ( "read() returns 0 unless stat() detects the
> growth of file size" and "read() returns non-0 when stat() detects the growth
> of file size but the data actually read is '\0'" ) a feature of cache=loose
> option?
> 
> Steps to reproduce:
> 
> ---------- Start of writer1.c ----------
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (write(1, ".", 1) == 1)
>         sleep(1);
>     return 0;
> }
> ---------- End of writer1.c ----------
> 
> ---------- Start of reader1.c ----------
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (1) {
>         char c;
>         if (read(0, &c, 1) == 1) {
>             if (c)
>                 write(1, &c, 1);
>             else
>                 write(1, "!", 1);
>         } else {
>             sleep(1);
>         }
>     }
>     return 0;
> }
> ---------- End of reader1.c ----------
> 
> ---------- Start of reader2.c ----------
> #include <sys/stat.h>
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (1) {
>         struct stat buf;
>         char c;
>         fstat(0, &buf);
>         if (read(0, &c, 1) == 1) {
>             if (c)
>                 write(1, &c, 1);
>             else
>                 write(1, "!", 1);
>         } else {
>             sleep(1);
>         }
>     }
>     return 0;
> }
> ---------- End of reader2.c ----------
> 
> Run the following commands on the Samba server side.
> 
>   $ cc -Wall -O3 -o writer1 writer1.c
>   $ ./writer1 > shared_file
> 
> Run the following commands on the Samba client side.
> 
>   $ cc -Wall -O3 -o reader1 reader1.c
>   $ strace -tt -o /tmp/log1 ./reader1 < shared_file
>   ..........     (<= Makes no progress by default)
> 
> The shared_file shown above is a regular file shared via CIFS mounted with
> cache=loose option. (e.g. "mount -t cifs -o cache=loose //server/path /mnt" )
> 
> I expect that the reader1 process reads one byte whenever the writer1
> process writes one byte. But we can observe via /tmp/log1 that the read()
> continues returning 0 until the growth of file size of shared_file is
> recognized by stat() using programs like /bin/ls and /usr/bin/stat .
> 
> Then, terminate the reader1 process using Ctrl-C and run the following
> commands on the Samba client side.
> 
>   $ cc -Wall -O3 -o reader2 reader2.c
>   $ strace -tt -o /tmp/log2 ./reader2 < shared_file
>   ......................!!!!!!     (<= Makes progress with wrong data)
> 
> We can observe via /tmp/log2 that the read() returns 1 because fstat()
> detects the growth of file size of shared_file . However, the data
> returned by read() is not '.' but '\0'.

That's expected behavior. The kernel believes that the file is frozen in
length so it returns short read() calls until the size is updated.

cache=loose is very much not recommended for use when you have multiple
hosts accessing files on the server (or access by processes on the
server itself). It only gives you "loose" cache coherency. The whole
point of it is to allow the client to cache data even when the protocol
says that it shouldn't.

cache=strict is what is recommended and that's the default these days.
-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]     ` <20140320161943.60890dd4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2014-03-20 20:56       ` Tetsuo Handa
       [not found]         ` <201403210556.FEI21354.FLJOQMHtSVOFFO-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2014-03-20 20:56 UTC (permalink / raw)
  To: jlayton-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

Jeff Layton wrote:
> That's expected behavior. The kernel believes that the file is frozen in
> length so it returns short read() calls until the size is updated.

The "size is updated" means "stat() detects the growth of file size",
doesn't it? Then, the former is expected behavior.

> 
> cache=loose is very much not recommended for use when you have multiple
> hosts accessing files on the server (or access by processes on the
> server itself). It only gives you "loose" cache coherency. The whole
> point of it is to allow the client to cache data even when the protocol
> says that it shouldn't.

But why is the latter ( "read() returns non-0 when stat() detects the growth
of file size but the data actually read is '\0'" ) is expected behavior?
It sounds like a bug that the client caches '\0' (data nobody has ever wrote)
instead of '.' (data somebody wrote when the file size grew).

> 
> cache=strict is what is recommended and that's the default these days.

Yes, I suggested a customer using CIFS on RHEL6 to use cache=none or
cache=strict , for that customer's system does not work as expected due to
the latter behavior when using "tail -f" for checking for appended log.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]         ` <201403210556.FEI21354.FLJOQMHtSVOFFO-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
@ 2014-03-20 21:04           ` Steve French
  2014-03-20 23:10           ` Jeff Layton
  1 sibling, 0 replies; 14+ messages in thread
From: Steve French @ 2014-03-20 21:04 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Jeff Layton, linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel

On Thu, Mar 20, 2014 at 3:56 PM, Tetsuo Handa
<penguin-kernel-1yMVhJb1mP/7nzcFbJAaVXf5DAMn2ifp@public.gmane.org> wrote:
> Jeff Layton wrote:
>> That's expected behavior. The kernel believes that the file is frozen in
>> length so it returns short read() calls until the size is updated.
>
> The "size is updated" means "stat() detects the growth of file size",
> doesn't it? Then, the former is expected behavior.

Note that setting cifs mount option "actimeo=0" will cause size not to
be cached and may help.

>> cache=loose is very much not recommended for use when you have multiple
>> hosts accessing files on the server (or access by processes on the
>> server itself). It only gives you "loose" cache coherency. The whole
>> point of it is to allow the client to cache data even when the protocol
>> says that it shouldn't.
>
> But why is the latter ( "read() returns non-0 when stat() detects the growth
> of file size but the data actually read is '\0'" ) is expected behavior?
> It sounds like a bug that the client caches '\0' (data nobody has ever wrote)
> instead of '.' (data somebody wrote when the file size grew).
>
>>
>> cache=strict is what is recommended and that's the default these days.
>
> Yes, I suggested a customer using CIFS on RHEL6 to use cache=none or
> cache=strict , for that customer's system does not work as expected due to
> the latter behavior when using "tail -f" for checking for appended log.



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]         ` <201403210556.FEI21354.FLJOQMHtSVOFFO-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  2014-03-20 21:04           ` Steve French
@ 2014-03-20 23:10           ` Jeff Layton
       [not found]             ` <20140320191013.110276c0-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Jeff Layton @ 2014-03-20 23:10 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Fri, 21 Mar 2014 05:56:04 +0900
Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:

> Jeff Layton wrote:
> > That's expected behavior. The kernel believes that the file is frozen in
> > length so it returns short read() calls until the size is updated.
> 
> The "size is updated" means "stat() detects the growth of file size",
> doesn't it? Then, the former is expected behavior.
> 
> > 
> > cache=loose is very much not recommended for use when you have multiple
> > hosts accessing files on the server (or access by processes on the
> > server itself). It only gives you "loose" cache coherency. The whole
> > point of it is to allow the client to cache data even when the protocol
> > says that it shouldn't.
> 
> But why is the latter ( "read() returns non-0 when stat() detects the growth
> of file size but the data actually read is '\0'" ) is expected behavior?
> It sounds like a bug that the client caches '\0' (data nobody has ever wrote)
> instead of '.' (data somebody wrote when the file size grew).
> 

Yeah, that sounds wrong. What should happen is that the cache is
invalidated when the size changes. It's possible there is a race in
that code however. The locking around it is pretty sloppy...

> > 
> > cache=strict is what is recommended and that's the default these days.
> 
> Yes, I suggested a customer using CIFS on RHEL6 to use cache=none or
> cache=strict , for that customer's system does not work as expected due to
> the latter behavior when using "tail -f" for checking for appended log.

Yeah, cache=strict should work much better for that sort of use-case.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]             ` <20140320191013.110276c0-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2014-03-21  8:56               ` Pavel Shilovsky
       [not found]                 ` <CAKywueTNDHSGia58AcKwuDtZA2zc_yOQR-iFTPq3XTrYm6bNLQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2014-03-26 12:04               ` Tetsuo Handa
  1 sibling, 1 reply; 14+ messages in thread
From: Pavel Shilovsky @ 2014-03-21  8:56 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Tetsuo Handa, linux-cifs, linux-fsdevel

2014-03-21 3:10 GMT+04:00 Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>:
> On Fri, 21 Mar 2014 05:56:04 +0900
> Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:
>
>> Jeff Layton wrote:
>> > That's expected behavior. The kernel believes that the file is frozen in
>> > length so it returns short read() calls until the size is updated.
>>
>> The "size is updated" means "stat() detects the growth of file size",
>> doesn't it? Then, the former is expected behavior.
>>
>> >
>> > cache=loose is very much not recommended for use when you have multiple
>> > hosts accessing files on the server (or access by processes on the
>> > server itself). It only gives you "loose" cache coherency. The whole
>> > point of it is to allow the client to cache data even when the protocol
>> > says that it shouldn't.
>>
>> But why is the latter ( "read() returns non-0 when stat() detects the growth
>> of file size but the data actually read is '\0'" ) is expected behavior?
>> It sounds like a bug that the client caches '\0' (data nobody has ever wrote)
>> instead of '.' (data somebody wrote when the file size grew).
>>
>
> Yeah, that sounds wrong. What should happen is that the cache is
> invalidated when the size changes. It's possible there is a race in
> that code however. The locking around it is pretty sloppy...

When fstat() get a new file size it sets
CIFS_I(inode)->invalid_mapping to true but do not revalidate the
cache. Then generic_file_aio_read() reads the wrong data. I think we
need to check if CIFS_I(inode)->invalid_mapping is true and revalidate
the cache before calling generic_file_aio_read() in
file->f_ops->aio_read(). Now cache revalidation happens in lookup/open
and mmap codepaths only for cache=loose.

Of course, cache=loose is not recommended for this sort of work flow
and cache=strict should be used to provide a data coherency between
several machines.

-- 
Best regards,
Pavel Shilovsky.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]                 ` <CAKywueTNDHSGia58AcKwuDtZA2zc_yOQR-iFTPq3XTrYm6bNLQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-03-21 11:40                   ` Jeff Layton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2014-03-21 11:40 UTC (permalink / raw)
  To: Pavel Shilovsky; +Cc: Tetsuo Handa, linux-cifs, linux-fsdevel

On Fri, 21 Mar 2014 12:56:34 +0400
Pavel Shilovsky <piastryyy-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> 2014-03-21 3:10 GMT+04:00 Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>:
> > On Fri, 21 Mar 2014 05:56:04 +0900
> > Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:
> >
> >> Jeff Layton wrote:
> >> > That's expected behavior. The kernel believes that the file is frozen in
> >> > length so it returns short read() calls until the size is updated.
> >>
> >> The "size is updated" means "stat() detects the growth of file size",
> >> doesn't it? Then, the former is expected behavior.
> >>
> >> >
> >> > cache=loose is very much not recommended for use when you have multiple
> >> > hosts accessing files on the server (or access by processes on the
> >> > server itself). It only gives you "loose" cache coherency. The whole
> >> > point of it is to allow the client to cache data even when the protocol
> >> > says that it shouldn't.
> >>
> >> But why is the latter ( "read() returns non-0 when stat() detects the growth
> >> of file size but the data actually read is '\0'" ) is expected behavior?
> >> It sounds like a bug that the client caches '\0' (data nobody has ever wrote)
> >> instead of '.' (data somebody wrote when the file size grew).
> >>
> >
> > Yeah, that sounds wrong. What should happen is that the cache is
> > invalidated when the size changes. It's possible there is a race in
> > that code however. The locking around it is pretty sloppy...
> 
> When fstat() get a new file size it sets
> CIFS_I(inode)->invalid_mapping to true but do not revalidate the
> cache. Then generic_file_aio_read() reads the wrong data. I think we
> need to check if CIFS_I(inode)->invalid_mapping is true and revalidate
> the cache before calling generic_file_aio_read() in
> file->f_ops->aio_read(). Now cache revalidation happens in lookup/open
> and mmap codepaths only for cache=loose.
> 
> Of course, cache=loose is not recommended for this sort of work flow
> and cache=strict should be used to provide a data coherency between
> several machines.
> 

Ahh right, you're quite correct that we need to revalidate the cache
before doing an aio_read. We should have cifs do something like
nfs_file_read() does...

I also suspect that we have a problem with the invalid_mapping flag
similar to the one NFS had until recently. That was fixed by commit
d529ef83c355. We probably ought to do something similar for cifs.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]             ` <20140320191013.110276c0-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
  2014-03-21  8:56               ` Pavel Shilovsky
@ 2014-03-26 12:04               ` Tetsuo Handa
       [not found]                 ` <201403262104.AGG86982.StVMOJOHFOQFLF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2014-03-26 12:04 UTC (permalink / raw)
  To: jlayton-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

Jeff Layton wrote:
> > Yes, I suggested a customer using CIFS on RHEL6 to use cache=none or
> > cache=strict , for that customer's system does not work as expected due to
> > the latter behavior when using "tail -f" for checking for appended log.
> 
> Yeah, cache=strict should work much better for that sort of use-case.
> 
I got another problem where none of cache=none cache=strict actimeo=0 helps.

Since "tail -f" does "repeat calling sleep() and fstat() until the file size
increases" -> "do read() until EOF" -> "call fstat() again", "tail -f" prints
"file truncated" message and prints some portion of already printed lines
even though the file size on the Samba server never decreased.
I expect that the latest fstat() returns the up-to-date file size.
This behavior is observed on 2.6.32-431.11.2.el6.x86_64 and 3.14-rc8.

Steps to reproduce:

---------- Start of testprog.c ----------
/**
 * stdin and stdout must refer a regular file which is shared via CIFS.
 * An example is shown below.
 *
 * # mount -t cifs -o cache=strict,actimeo=0 //127.0.0.1/path/to/export /path/to/mount
 * $ cc -Wall -O3 -o a.out this_file.c
 * $ ./a.out > /path/to/export/shared_file < /path/to/mount/shared_file
*/
#include <stdio.h>
#include <stdlib.h>
#include <poll.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	unsigned int delay = argc > 1 ? atoi(argv[1]) : 0;
	size_t write_pos = 0;
	size_t read_pos = 0;
	if (ftruncate(1, 0))
		return 1;
	while (1) {
		struct stat buf;
		char c;
		if (write(1, ".", 1) != 1)
			return 2;
		write_pos++;
		if (fstat(1, &buf))
			return 3;
		if (buf.st_size != write_pos)
			return 4;
		if (read(0, &c, 1) != 1)
			return 5;
		if (c != '.')
			return 6;
		read_pos++;
		if (delay)
			poll(NULL, 0, delay);
		if (fstat(0, &buf))
			return 7;
		if (buf.st_size < read_pos)
			fprintf(stderr, "read(%lu) - stat(%lu) = %lu\n",
				read_pos, buf.st_size, read_pos - buf.st_size);
	}
}
---------- End of testprog.c ----------

Make a CIFS mount on localhost where the Samba server process is running.
Compile testprog.c and run testprog, with stdin redirected to a regular file
which is accessed via CIFS and stdout redirected to the same regular file
which is accessed as a local filesystem. An example is shown below.

  # mount -t cifs -o cache=none,actimeo=0 //127.0.0.1/home /mnt
  $ cc -Wall -O3 -o testprog testprog.c
  $ ./testprog > ~/shared_file < /mnt/shared_file

According to what testprog.c does, it is expected that nothing is printed by
testprog process because the bytes read via read() equals the bytes obtained
via fstat(). However, testprog process prints that the bytes read via read()
is larger than the bytes obtained via fstat().

The strace on the smbd process reports that CIFS is omitting lstat() request
after pread() request if lstat() request was done very recently. Specifying
delay as the command line argument for testprog process reduces the possibility
of omitting lstat() request.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]                 ` <201403262104.AGG86982.StVMOJOHFOQFLF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
@ 2014-03-26 12:21                   ` Jeffrey Layton
       [not found]                     ` <20140326052119.32405279-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jeffrey Layton @ 2014-03-26 12:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	samba-technical-w/Ol4Ecudpl8XjKLYN78aQ

On Wed, 26 Mar 2014 21:04:15 +0900
Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:

> Jeff Layton wrote:
> > > Yes, I suggested a customer using CIFS on RHEL6 to use cache=none
> > > or cache=strict , for that customer's system does not work as
> > > expected due to the latter behavior when using "tail -f" for
> > > checking for appended log.
> > 
> > Yeah, cache=strict should work much better for that sort of
> > use-case.
> > 
> I got another problem where none of cache=none cache=strict actimeo=0
> helps.
> 
> Since "tail -f" does "repeat calling sleep() and fstat() until the
> file size increases" -> "do read() until EOF" -> "call fstat()
> again", "tail -f" prints "file truncated" message and prints some
> portion of already printed lines even though the file size on the
> Samba server never decreased. I expect that the latest fstat()
> returns the up-to-date file size. This behavior is observed on
> 2.6.32-431.11.2.el6.x86_64 and 3.14-rc8.
> 
> Steps to reproduce:
> 
> ---------- Start of testprog.c ----------
> /**
>  * stdin and stdout must refer a regular file which is shared via
> CIFS.
>  * An example is shown below.
>  *
>  * # mount -t cifs -o
> cache=strict,actimeo=0 //127.0.0.1/path/to/export /path/to/mount
>  * $ cc -Wall -O3 -o a.out this_file.c
>  * $ ./a.out > /path/to/export/shared_file
> < /path/to/mount/shared_file */
> #include <stdio.h>
> #include <stdlib.h>
> #include <poll.h>
> #include <sys/stat.h>
> #include <unistd.h>
> 
> int main(int argc, char *argv[])
> {
> 	unsigned int delay = argc > 1 ? atoi(argv[1]) : 0;
> 	size_t write_pos = 0;
> 	size_t read_pos = 0;
> 	if (ftruncate(1, 0))
> 		return 1;
> 	while (1) {
> 		struct stat buf;
> 		char c;
> 		if (write(1, ".", 1) != 1)
> 			return 2;
> 		write_pos++;
> 		if (fstat(1, &buf))
> 			return 3;
> 		if (buf.st_size != write_pos)
> 			return 4;
> 		if (read(0, &c, 1) != 1)
> 			return 5;
> 		if (c != '.')
> 			return 6;
> 		read_pos++;
> 		if (delay)
> 			poll(NULL, 0, delay);
> 		if (fstat(0, &buf))
> 			return 7;
> 		if (buf.st_size < read_pos)
> 			fprintf(stderr, "read(%lu) - stat(%lu) =
> %lu\n", read_pos, buf.st_size, read_pos - buf.st_size);
> 	}
> }
> ---------- End of testprog.c ----------
> 
> Make a CIFS mount on localhost where the Samba server process is
> running. Compile testprog.c and run testprog, with stdin redirected
> to a regular file which is accessed via CIFS and stdout redirected to
> the same regular file which is accessed as a local filesystem. An
> example is shown below.
> 
>   # mount -t cifs -o cache=none,actimeo=0 //127.0.0.1/home /mnt
>   $ cc -Wall -O3 -o testprog testprog.c
>   $ ./testprog > ~/shared_file < /mnt/shared_file
> 
> According to what testprog.c does, it is expected that nothing is
> printed by testprog process because the bytes read via read() equals
> the bytes obtained via fstat(). However, testprog process prints that
> the bytes read via read() is larger than the bytes obtained via
> fstat().
> 
> The strace on the smbd process reports that CIFS is omitting lstat()
> request after pread() request if lstat() request was done very
> recently. Specifying delay as the command line argument for testprog
> process reduces the possibility of omitting lstat() request.

(cc'ing samba-technical)

...or that samba is omitting it and is sending cached info instead of
doing the lstat() call? I'm not sure if it does that, but I don't
think you should draw too many conclusions about the behavior of cifs.ko
from stracing smbd.

What may make more sense is to get network captures and analyze the
behavior from that perspective.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]                     ` <20140326052119.32405279-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
@ 2014-03-26 13:03                       ` Jeffrey Layton
  2014-03-26 13:06                         ` Tetsuo Handa
       [not found]                         ` <20140326060305.3226a08c-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
  0 siblings, 2 replies; 14+ messages in thread
From: Jeffrey Layton @ 2014-03-26 13:03 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	samba-technical-w/Ol4Ecudpl8XjKLYN78aQ

On Wed, 26 Mar 2014 05:21:19 -0700
Jeffrey Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> On Wed, 26 Mar 2014 21:04:15 +0900
> Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:
> 
> > Jeff Layton wrote:
> > > > Yes, I suggested a customer using CIFS on RHEL6 to use
> > > > cache=none or cache=strict , for that customer's system does
> > > > not work as expected due to the latter behavior when using
> > > > "tail -f" for checking for appended log.
> > > 
> > > Yeah, cache=strict should work much better for that sort of
> > > use-case.
> > > 
> > I got another problem where none of cache=none cache=strict
> > actimeo=0 helps.
> > 
> > Since "tail -f" does "repeat calling sleep() and fstat() until the
> > file size increases" -> "do read() until EOF" -> "call fstat()
> > again", "tail -f" prints "file truncated" message and prints some
> > portion of already printed lines even though the file size on the
> > Samba server never decreased. I expect that the latest fstat()
> > returns the up-to-date file size. This behavior is observed on
> > 2.6.32-431.11.2.el6.x86_64 and 3.14-rc8.
> > 
> > Steps to reproduce:
> > 
> > ---------- Start of testprog.c ----------
> > /**
> >  * stdin and stdout must refer a regular file which is shared via
> > CIFS.
> >  * An example is shown below.
> >  *
> >  * # mount -t cifs -o
> > cache=strict,actimeo=0 //127.0.0.1/path/to/export /path/to/mount
> >  * $ cc -Wall -O3 -o a.out this_file.c
> >  * $ ./a.out > /path/to/export/shared_file
> > < /path/to/mount/shared_file */
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <poll.h>
> > #include <sys/stat.h>
> > #include <unistd.h>
> > 
> > int main(int argc, char *argv[])
> > {
> > 	unsigned int delay = argc > 1 ? atoi(argv[1]) : 0;
> > 	size_t write_pos = 0;
> > 	size_t read_pos = 0;
> > 	if (ftruncate(1, 0))
> > 		return 1;
> > 	while (1) {
> > 		struct stat buf;
> > 		char c;
> > 		if (write(1, ".", 1) != 1)
> > 			return 2;
> > 		write_pos++;
> > 		if (fstat(1, &buf))
> > 			return 3;
> > 		if (buf.st_size != write_pos)
> > 			return 4;
> > 		if (read(0, &c, 1) != 1)
> > 			return 5;
> > 		if (c != '.')
> > 			return 6;
> > 		read_pos++;
> > 		if (delay)
> > 			poll(NULL, 0, delay);
> > 		if (fstat(0, &buf))
> > 			return 7;
> > 		if (buf.st_size < read_pos)
> > 			fprintf(stderr, "read(%lu) - stat(%lu) =
> > %lu\n", read_pos, buf.st_size, read_pos - buf.st_size);
> > 	}
> > }
> > ---------- End of testprog.c ----------
> > 
> > Make a CIFS mount on localhost where the Samba server process is
> > running. Compile testprog.c and run testprog, with stdin redirected
> > to a regular file which is accessed via CIFS and stdout redirected
> > to the same regular file which is accessed as a local filesystem. An
> > example is shown below.
> > 
> >   # mount -t cifs -o cache=none,actimeo=0 //127.0.0.1/home /mnt
> >   $ cc -Wall -O3 -o testprog testprog.c
> >   $ ./testprog > ~/shared_file < /mnt/shared_file
> > 
> > According to what testprog.c does, it is expected that nothing is
> > printed by testprog process because the bytes read via read() equals
> > the bytes obtained via fstat(). However, testprog process prints
> > that the bytes read via read() is larger than the bytes obtained via
> > fstat().
> > 
> > The strace on the smbd process reports that CIFS is omitting lstat()
> > request after pread() request if lstat() request was done very
> > recently. Specifying delay as the command line argument for testprog
> > process reduces the possibility of omitting lstat() request.
> 
> (cc'ing samba-technical)
> 
> ...or that samba is omitting it and is sending cached info instead of
> doing the lstat() call? I'm not sure if it does that, but I don't
> think you should draw too many conclusions about the behavior of
> cifs.ko from stracing smbd.
> 
> What may make more sense is to get network captures and analyze the
> behavior from that perspective.
> 

...or maybe it *is* cifs.ko. From cifs_inode_needs_reval:


        if (!time_in_range(jiffies, cifs_i->time,
                                cifs_i->time + cifs_sb->actimeo))
                return true;


...I think though that if cifs_i->time == jiffies and actimeo=0, then
that condition will be false. As a quick check, it might be good to add
something like this before that if statement and then rerun your test:

	if (!cifs_sb->actimeo)
		return true;

That should get rid of that particular corner case.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
  2014-03-26 13:03                       ` Jeffrey Layton
@ 2014-03-26 13:06                         ` Tetsuo Handa
       [not found]                         ` <20140326060305.3226a08c-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
  1 sibling, 0 replies; 14+ messages in thread
From: Tetsuo Handa @ 2014-03-26 13:06 UTC (permalink / raw)
  To: jlayton; +Cc: linux-cifs, linux-fsdevel, samba-technical

Jeffrey Layton wrote:
> ...or maybe it *is* cifs.ko.

The strace log is showing that smbd process receives "\377SMB." for read()
request and "\377SMB2" for fstat() request.
We can see that there is only one "\377SMB." between two "\377SMB2" for
delay = 1000 ms case while there are multiple "\377SMB." between two
"\377SMB2" for delay = 0 ms case.

Therefore, I think it *is* cifs.ko.

---------- strace of normal case ----------
21:38:47.725757 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 53516) = 1 ([{fd=25, revents=POLLIN}])
21:38:47.725847 read(25, "\0\0\0;", 4)  = 4
21:38:47.725901 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0c\32"..., 59) = 59
21:38:47.725970 fstat(29, {st_mode=S_IFREG|0644, st_size=1, ...}) = 0
21:38:47.726039 pread(29, ".", 1, 0)    = 1
21:38:47.726095 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 64}], 1) = 64
21:38:47.726187 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 53514) = 1 ([{fd=25, revents=POLLIN}])
21:38:48.739627 read(25, "\0\0\0b", 4)  = 4
21:38:48.739725 read(25, "\377SMB2\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0d\32"..., 98) = 98
21:38:48.739811 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=1, ...}) = 0
21:38:48.739902 getcwd("/root", 4096)   = 6
21:38:48.739956 lstat("/root/shared_file", {st_mode=S_IFREG|0644, st_size=1, ...}) = 0
21:38:48.740039 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=1, ...}) = 0
21:38:48.740111 fcntl(13, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:48.740171 fcntl(13, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:48.740241 writev(25, [{"\0\0\0\240\377SMB2\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 164}], 1) = 164
21:38:48.740341 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 52500) = 1 ([{fd=25, revents=POLLIN}])
21:38:48.740459 read(25, "\0\0\0;", 4)  = 4
21:38:48.740515 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0e\32"..., 59) = 59
21:38:48.740578 fstat(29, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
21:38:48.740643 pread(29, ".", 1, 1)    = 1
21:38:48.740698 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 64}], 1) = 64
21:38:48.740789 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 52499) = 1 ([{fd=25, revents=POLLIN}])
21:38:49.755256 read(25, "\0\0\0b", 4)  = 4
21:38:49.755330 read(25, "\377SMB2\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0f\32"..., 98) = 98
21:38:49.755412 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
21:38:49.755484 getcwd("/root", 4096)   = 6
21:38:49.755538 lstat("/root/shared_file", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
21:38:49.755603 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
21:38:49.755674 fcntl(13, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:49.755733 fcntl(13, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:49.755802 writev(25, [{"\0\0\0\240\377SMB2\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 164}], 1) = 164
21:38:49.755945 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 51484) = 1 ([{fd=25, revents=POLLIN}])
21:38:49.756007 read(25, "\0\0\0;", 4)  = 4
21:38:49.756058 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0g\32"..., 59) = 59
21:38:49.756120 fstat(29, {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
21:38:49.756185 pread(29, ".", 1, 2)    = 1
21:38:49.756240 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 64}], 1) = 64
21:38:49.756331 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 51484) = 1 ([{fd=25, revents=POLLIN}])
21:38:50.770753 read(25, "\0\0\0b", 4)  = 4
21:38:50.770838 read(25, "\377SMB2\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5d\0h\32"..., 98) = 98
21:38:50.770922 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
21:38:50.770994 getcwd("/root", 4096)   = 6
21:38:50.771050 lstat("/root/shared_file", {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
21:38:50.771115 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=3, ...}) = 0
21:38:50.771187 fcntl(13, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:50.771247 fcntl(13, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:38:50.771316 writev(25, [{"\0\0\0\240\377SMB2\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0\32\5"..., 164}], 1) = 164
21:38:50.771461 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 50469) = 1 ([{fd=25, revents=POLLIN}])
---------- strace of normal case ----------

---------- strace of problem case ----------
21:40:36.663289 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4638) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.663349 read(25, "\0\0\0b", 4)  = 4
21:40:36.663399 read(25, "\377SMB2\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\227\34"..., 98) = 98
21:40:36.663463 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=447, ...}) = 0
21:40:36.663528 getcwd("/root", 4096)   = 6
21:40:36.663578 lstat("/root/shared_file", {st_mode=S_IFREG|0644, st_size=447, ...}) = 0
21:40:36.663642 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=447, ...}) = 0
21:40:36.663708 fcntl(13, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:40:36.663761 fcntl(13, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:40:36.663867 writev(25, [{"\0\0\0\240\377SMB2\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 164}], 1) = 164
21:40:36.664038 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4637) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.664102 read(25, "\0\0\0;", 4)  = 4
21:40:36.664154 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\230\34"..., 59) = 59
21:40:36.664215 fstat(29, {st_mode=S_IFREG|0644, st_size=448, ...}) = 0
21:40:36.664277 pread(29, ".", 1, 447)  = 1
21:40:36.664330 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 64}], 1) = 64
21:40:36.664438 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4637) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.664498 read(25, "\0\0\0;", 4)  = 4
21:40:36.664548 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\231\34"..., 59) = 59
21:40:36.664608 fstat(29, {st_mode=S_IFREG|0644, st_size=449, ...}) = 0
21:40:36.664669 pread(29, ".", 1, 448)  = 1
21:40:36.664719 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 64}], 1) = 64
21:40:36.664824 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4636) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.664883 read(25, "\0\0\0;", 4)  = 4
21:40:36.664933 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\232\34"..., 59) = 59
21:40:36.664992 fstat(29, {st_mode=S_IFREG|0644, st_size=450, ...}) = 0
21:40:36.665053 pread(29, ".", 1, 449)  = 1
21:40:36.665102 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 64}], 1) = 64
21:40:36.665206 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4636) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.665265 read(25, "\0\0\0;", 4)  = 4
21:40:36.665315 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\233\34"..., 59) = 59
21:40:36.665374 fstat(29, {st_mode=S_IFREG|0644, st_size=451, ...}) = 0
21:40:36.665435 pread(29, ".", 1, 450)  = 1
21:40:36.665484 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 64}], 1) = 64
21:40:36.665587 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4636) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.665646 read(25, "\0\0\0;", 4)  = 4
21:40:36.665696 read(25, "\377SMB.\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\234\34"..., 59) = 59
21:40:36.665755 fstat(29, {st_mode=S_IFREG|0644, st_size=452, ...}) = 0
21:40:36.666556 pread(29, ".", 1, 451)  = 1
21:40:36.666613 writev(25, [{"\0\0\0<\377SMB.\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 64}], 1) = 64
21:40:36.666712 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4634) = 1 ([{fd=25, revents=POLLIN}])
21:40:36.666772 read(25, "\0\0\0b", 4)  = 4
21:40:36.666823 read(25, "\377SMB2\0\0\0\0\0\1\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5d\0\235\34"..., 98) = 98
21:40:36.666887 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=452, ...}) = 0
21:40:36.666951 getcwd("/root", 4096)   = 6
21:40:36.667000 lstat("/root/shared_file", {st_mode=S_IFREG|0644, st_size=452, ...}) = 0
21:40:36.667064 lstat("shared_file", {st_mode=S_IFREG|0644, st_size=452, ...}) = 0
21:40:36.667129 fcntl(13, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:40:36.667182 fcntl(13, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=13024, len=1}) = 0
21:40:36.667242 writev(25, [{"\0\0\0\240\377SMB2\0\0\0\0\200\3\300\0\0\0\0\0\0\0\0\0\0\0\0\1\0,\5"..., 164}], 1) = 164
21:40:36.667344 poll([{fd=6, events=POLLIN|POLLHUP}, {fd=25, events=POLLIN|POLLHUP}, {fd=8, events=POLLIN|POLLHUP}], 3, 4634) = 1 ([{fd=25, revents=POLLIN}])
---------- strace of problem case ----------

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]                         ` <20140326060305.3226a08c-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
@ 2014-03-26 14:05                           ` Tetsuo Handa
       [not found]                             ` <201403262305.GBG17639.MOJLVOOSFHFtQF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Tetsuo Handa @ 2014-03-26 14:05 UTC (permalink / raw)
  To: jlayton-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	samba-technical-w/Ol4Ecudpl8XjKLYN78aQ

Jeffrey Layton wrote:
> > ...or that samba is omitting it and is sending cached info instead of
> > doing the lstat() call? I'm not sure if it does that, but I don't
> > think you should draw too many conclusions about the behavior of
> > cifs.ko from stracing smbd.
> > 
> > What may make more sense is to get network captures and analyze the
> > behavior from that perspective.
> > 
> 
> ...or maybe it *is* cifs.ko. From cifs_inode_needs_reval:
> 
> 
>         if (!time_in_range(jiffies, cifs_i->time,
>                                 cifs_i->time + cifs_sb->actimeo))
>                 return true;
> 
> 
> ...I think though that if cifs_i->time == jiffies and actimeo=0, then
> that condition will be false. As a quick check, it might be good to add
> something like this before that if statement and then rerun your test:
> 
> 	if (!cifs_sb->actimeo)
> 		return true;
> 
> That should get rid of that particular corner case.

Yes, I confirmed that below change fixes this problem. Thank you.

--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1737,6 +1737,8 @@ cifs_inode_needs_reval(struct inode *inode)
 	if (cifs_i->time == 0)
 		return true;
 
+	if (!cifs_sb->actimeo)
+		return true;
 	if (!time_in_range(jiffies, cifs_i->time,
 				cifs_i->time + cifs_sb->actimeo))
 		return true;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found]                             ` <201403262305.GBG17639.MOJLVOOSFHFtQF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
@ 2014-03-26 14:12                               ` Jeffrey Layton
  0 siblings, 0 replies; 14+ messages in thread
From: Jeffrey Layton @ 2014-03-26 14:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	samba-technical-w/Ol4Ecudpl8XjKLYN78aQ

On Wed, 26 Mar 2014 23:05:20 +0900
Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:

> Jeffrey Layton wrote:
> > > ...or that samba is omitting it and is sending cached info
> > > instead of doing the lstat() call? I'm not sure if it does that,
> > > but I don't think you should draw too many conclusions about the
> > > behavior of cifs.ko from stracing smbd.
> > > 
> > > What may make more sense is to get network captures and analyze
> > > the behavior from that perspective.
> > > 
> > 
> > ...or maybe it *is* cifs.ko. From cifs_inode_needs_reval:
> > 
> > 
> >         if (!time_in_range(jiffies, cifs_i->time,
> >                                 cifs_i->time + cifs_sb->actimeo))
> >                 return true;
> > 
> > 
> > ...I think though that if cifs_i->time == jiffies and actimeo=0,
> > then that condition will be false. As a quick check, it might be
> > good to add something like this before that if statement and then
> > rerun your test:
> > 
> > 	if (!cifs_sb->actimeo)
> > 		return true;
> > 
> > That should get rid of that particular corner case.
> 
> Yes, I confirmed that below change fixes this problem. Thank you.
> 
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -1737,6 +1737,8 @@ cifs_inode_needs_reval(struct inode *inode)
>  	if (cifs_i->time == 0)
>  		return true;
>  
> +	if (!cifs_sb->actimeo)
> +		return true;
>  	if (!time_in_range(jiffies, cifs_i->time,
>  				cifs_i->time + cifs_sb->actimeo))
>  		return true;

Great, thanks for testing it. I'll send Steve a patch for 3.15 and cc
you.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Question regarding CIFS cache=loose behavior.
       [not found] ` <201403202103.EIE87056.QVtLHSFMFOOFOJ-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
  2014-03-20 20:19   ` Jeff Layton
@ 2014-03-31 17:48   ` Jeff Layton
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2014-03-31 17:48 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA

On Thu, 20 Mar 2014 21:03:03 +0900
Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org> wrote:

> Is the behavior described below ( "read() returns 0 unless stat() detects the
> growth of file size" and "read() returns non-0 when stat() detects the growth
> of file size but the data actually read is '\0'" ) a feature of cache=loose
> option?
> 
> Steps to reproduce:
> 
> ---------- Start of writer1.c ----------
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (write(1, ".", 1) == 1)
>         sleep(1);
>     return 0;
> }
> ---------- End of writer1.c ----------
> 
> ---------- Start of reader1.c ----------
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (1) {
>         char c;
>         if (read(0, &c, 1) == 1) {
>             if (c)
>                 write(1, &c, 1);
>             else
>                 write(1, "!", 1);
>         } else {
>             sleep(1);
>         }
>     }
>     return 0;
> }
> ---------- End of reader1.c ----------
> 
> ---------- Start of reader2.c ----------
> #include <sys/stat.h>
> #include <unistd.h>
> int main(int argc, char *argv[])
> {
>     while (1) {
>         struct stat buf;
>         char c;
>         fstat(0, &buf);
>         if (read(0, &c, 1) == 1) {
>             if (c)
>                 write(1, &c, 1);
>             else
>                 write(1, "!", 1);
>         } else {
>             sleep(1);
>         }
>     }
>     return 0;
> }
> ---------- End of reader2.c ----------
> 
> Run the following commands on the Samba server side.
> 
>   $ cc -Wall -O3 -o writer1 writer1.c
>   $ ./writer1 > shared_file
> 
> Run the following commands on the Samba client side.
> 
>   $ cc -Wall -O3 -o reader1 reader1.c
>   $ strace -tt -o /tmp/log1 ./reader1 < shared_file
>   ..........     (<= Makes no progress by default)
> 
> The shared_file shown above is a regular file shared via CIFS mounted with
> cache=loose option. (e.g. "mount -t cifs -o cache=loose //server/path /mnt" )
> 
> I expect that the reader1 process reads one byte whenever the writer1
> process writes one byte. But we can observe via /tmp/log1 that the read()
> continues returning 0 until the growth of file size of shared_file is
> recognized by stat() using programs like /bin/ls and /usr/bin/stat .
> 
> Then, terminate the reader1 process using Ctrl-C and run the following
> commands on the Samba client side.
> 
>   $ cc -Wall -O3 -o reader2 reader2.c
>   $ strace -tt -o /tmp/log2 ./reader2 < shared_file
>   ......................!!!!!!     (<= Makes progress with wrong data)
> 
> We can observe via /tmp/log2 that the read() returns 1 because fstat()
> detects the growth of file size of shared_file . However, the data
> returned by read() is not '.' but '\0'.


I had a chance to poke at this today. I think the main issue is likely
due to a server-side problem.

smbd.conf defaults to "kernel oplocks = no" now. That means that samba
is handing out oplocks even when other processes on the box have the
file open for read or write. Setting that to "yes" should help things
work better.

It seems like that used to be different, but has apparently changed in
recent versions of samba. Without that, we get some "interesting"
effects when the size changes, but we still hold an oplock.

I also have a patchset forthcoming to fix some races in this area too.
After setting kernel oplocks = yes and applying the patchset, this
works more like you'd expect.

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-03-31 17:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-20 12:03 Question regarding CIFS cache=loose behavior Tetsuo Handa
     [not found] ` <201403202103.EIE87056.QVtLHSFMFOOFOJ-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-03-20 20:19   ` Jeff Layton
     [not found]     ` <20140320161943.60890dd4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2014-03-20 20:56       ` Tetsuo Handa
     [not found]         ` <201403210556.FEI21354.FLJOQMHtSVOFFO-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-03-20 21:04           ` Steve French
2014-03-20 23:10           ` Jeff Layton
     [not found]             ` <20140320191013.110276c0-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2014-03-21  8:56               ` Pavel Shilovsky
     [not found]                 ` <CAKywueTNDHSGia58AcKwuDtZA2zc_yOQR-iFTPq3XTrYm6bNLQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-03-21 11:40                   ` Jeff Layton
2014-03-26 12:04               ` Tetsuo Handa
     [not found]                 ` <201403262104.AGG86982.StVMOJOHFOQFLF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-03-26 12:21                   ` Jeffrey Layton
     [not found]                     ` <20140326052119.32405279-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
2014-03-26 13:03                       ` Jeffrey Layton
2014-03-26 13:06                         ` Tetsuo Handa
     [not found]                         ` <20140326060305.3226a08c-uvzPfv+vNdB0Ogp0/tUwVOTW4wlIGRCZ@public.gmane.org>
2014-03-26 14:05                           ` Tetsuo Handa
     [not found]                             ` <201403262305.GBG17639.MOJLVOOSFHFtQF-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2014-03-26 14:12                               ` Jeffrey Layton
2014-03-31 17:48   ` Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.