* getxattr() on cifs sometimes hangs since kernel 5.14
@ 2022-05-17 20:48 Forest
2022-05-18 3:18 ` ronnie sahlberg
[not found] ` <CAH2r5muJYFQ7FutNP_WWCHPE+dDSi6=_x27P81+FN7QGQKyzFA@mail.gmail.com>
0 siblings, 2 replies; 4+ messages in thread
From: Forest @ 2022-05-17 20:48 UTC
To: linux-cifs
When running on recent kernel versions, this system call on a cifs-mounted
file sometimes takes an unusually long time:
getxattr("/cifsmount/dir/image.jpg", "user.baloo.rating", NULL, 0)
The call normally returns in under 10 milliseconds, but on kernel 5.14+, it
sometimes takes over 30 seconds with no significant client or server load.
Discovered while using gwenview to browse 100+ 1.5 MiB images on a samba share
mounted via /etc/fstab. While quickly flipping through the images, the problem
often occurs within 20 seconds. Gwenview freezes until the call completes.
Client:
  kernel versions 5.14 and later
  mount.cifs 6.11
  Gwenview 20.12.3
  Debian Bullseye
  4-core amd64
Server:
  Samba 4.13.13-Debian
  Debian Bullseye
  6-core arm64
A git bisect identified kernel commit 9e992755be8f as the problematic change.
The problem does not occur when any of the following are true:
- Client is running a kernel from before that commit.
- The nouser_xattr mount option is used on the cifs share.
- Gwenview accesses the files via smb:// URL instead of a cifs mount.
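When testing the nouser_xattr workaround, it can help to confirm that the option is actually in effect on the mount. The sketch below reads a mount table with the glibc getmntent() family. It assumes the cifs client lists the option under the name nouser_xattr in the table; the function names are illustrative only.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the mount entry is a cifs mount carrying nouser_xattr. */
int cifs_entry_has_nouser_xattr(const struct mntent *entry)
{
    return strcmp(entry->mnt_type, "cifs") == 0
        && hasmntopt(entry, "nouser_xattr") != NULL;
}

/* Scan a mount table (for example "/proc/mounts") for the given mount
   point. Returns 1 if it is cifs with nouser_xattr, 0 if it is listed
   without it, and -1 if the table cannot be opened or the mount point is
   not listed. The last matching entry wins, since later mounts shadow
   earlier ones. */
int mount_has_nouser_xattr(const char *table, const char *mountpoint)
{
    FILE *mounts = setmntent(table, "r");
    struct mntent *entry;
    int result = -1;

    if (mounts == NULL)
        return -1;
    while ((entry = getmntent(mounts)) != NULL)
        if (strcmp(entry->mnt_dir, mountpoint) == 0)
            result = cifs_entry_has_nouser_xattr(entry);
    endmntent(mounts);
    return result;
}
```

For the setup described here, the call would be mount_has_nouser_xattr("/proc/mounts", "/cifsmount").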
I don't know Gwenview's internals, but using its strace output as a guide, I
have written a potential reproducer. It succeeds at triggering slow getxattr()
calls, though not nearly as slow as those triggered by Gwenview. I can post it
if that would be helpful.
* Re: getxattr() on cifs sometimes hangs since kernel 5.14
From: ronnie sahlberg @ 2022-05-18 3:18 UTC
To: Forest; +Cc: linux-cifs
On Wed, 18 May 2022 at 13:15, Forest <forestix@sonic.net> wrote:
>
> When running on recent kernel versions, this system call on a cifs-mounted
> file sometimes takes an unusually long time:
>
> getxattr("/cifsmount/dir/image.jpg", "user.baloo.rating", NULL, 0)
>
> The call normally returns in under 10 milliseconds, but on kernel 5.14+, it
> sometimes takes over 30 seconds with no significant client or server load.
>
> Discovered while using gwenview to browse 100+ 1.5 MiB images on a samba share
> mounted via /etc/fstab. While quickly flipping through the images, the problem
> often occurs within 20 seconds. Gwenview freezes until the call completes.
>
> Client:
> kernel versions 5.14 and later
> mount.cifs 6.11
> Gwenview 20.12.3
> Debian Bullseye
> 4-core amd64
> Server:
> Samba 4.13.13-Debian
> Debian Bullseye
> 6-core arm64
>
> A git bisect identified kernel commit 9e992755be8f as the problematic change.
> The problem does not occur when any of the following are true:
> - Client is running a kernel from before that commit.
> - The nouser_xattr mount option is used on the cifs share.
> - Gwenview accesses the files via smb:// URL instead of a cifs mount.
>
> I don't know Gwenview's internals, but using its strace output as a guide, I
> have written a potential reproducer. It succeeds at triggering slow getxattr()
> calls, though not nearly as slow as those triggered by Gwenview. I can post it
> if that would be helpful.
Please post the reproducer. It will be useful for testing as well as
verifying a potential fix.
If the reproducer is simple enough we might add it to our buildbot.
* Re: getxattr() on cifs sometimes hangs since kernel 5.14
From: Forest @ 2022-05-18 3:56 UTC
To: Steve French; +Cc: Paulo Alcantara, ronnie sahlberg, linux-cifs
/*
Attempt to reproduce a cifs xattr problem from kernel commit 9e992755be8f.
When running on recent kernel versions, this system call on a cifs-mounted
file sometimes takes an unusually long time:
getxattr("/cifsmount/dir/image.jpg", "user.baloo.rating", NULL, 0)
The call normally returns in under 10 milliseconds, but on kernel 5.14+, it
sometimes takes over 30 seconds with no significant client or server load.
Discovered while using gwenview to browse 100+ 1.5 MiB images on a samba share
mounted via /etc/fstab. While quickly flipping through the images, the problem
often occurs within 20 seconds. Gwenview freezes until the call completes.
Client:
kernel versions 5.14 and later
mount.cifs 6.11
Gwenview 20.12.3
Debian Bullseye
4-core amd64
Server:
Samba 4.13.13-Debian
Debian Bullseye
6-core arm64
A git bisect identified kernel commit 9e992755be8f as the problematic change.
The problem does not occur when any of the following are true:
- Client is running a kernel from before that commit.
- The nouser_xattr mount option is used on the cifs share.
- Gwenview accesses the files via smb:// URL instead of a cifs mount.
This program tries to reproduce the problem by making system calls seen in
strace output from a stuck gwenview instance. It expects its arguments to be
file paths on a cifs mount. It will loop over the named files, applying the
system calls to each one in sequence. The -i option is available to run
several iterations of the loop. For example, with -i 2 and 10 files, the system
calls will be made 20 times. This normally completes quickly.
The -t option runs the same loop in multiple threads, which seems to trigger
the problem: getxattr() takes over 100 times as long when more than one thread
is running.
Curiously, the call never seems to be as slow in this reproducer (~1 second) as
it sometimes is in gwenview (30+ seconds), so perhaps this code does not model
gwenview's triggering behavior well. Nevertheless, it reproduces a significant
delay under the same conditions, so it might still help track down the problem.
Build with:
gcc -pthread
*/
#include <alloca.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/xattr.h>
#include <unistd.h>
int test_file(char *path)
{
    int fd;
    fd = openat(AT_FDCWD, path, O_RDONLY);
    if (fd == -1)
    {
        perror("openat");
        return -1;
    }
    close(fd);
    getxattr(path, "user.baloo.rating", NULL, 0); /* sometimes slow */
    return 0;
}
int test_files(char **paths)
{
    for (; *paths; paths++)
        if (test_file(*paths))
            return -1;
    return 0;
}
int test_files_repeatedly(char **paths, int itercount)
{
    while (itercount--)
        if (test_files(paths))
            return -1;
    return 0;
}
struct thread_params
{
    char **paths;
    int itercount;
};
void *thread_main(void *thread_arg)
{
    struct thread_params params = *(struct thread_params *)thread_arg;
    while (params.itercount--)
        if (test_files(params.paths))
            return "failure in test thread";
    return 0;
}
int test_files_threaded(char **paths, int itercount, int threadcount)
{
    struct thread_params params = {paths, itercount};
    pthread_t *threadids;
    int i;
    threadcount--; /* the main thread will do one thread's work */
    threadids = alloca(sizeof(*threadids) * threadcount);
    for (i = 0; i < threadcount; i++)
        if (pthread_create(&threadids[i], NULL, thread_main, &params))
        {
            printf("pthread_create failed\n");
            return -1;
        }
    /* do one thread's work in the main thread */
    if (test_files_repeatedly(paths, itercount))
    {
        printf("failure in main thread\n");
        return -1;
    }
    for (i = 0; i < threadcount; i++)
    {
        void *thread_result;
        if (pthread_join(threadids[i], &thread_result))
        {
            printf("pthread_join failed\n");
            return -1;
        }
        if (thread_result)
        {
            printf("%s\n", (char *)thread_result);
            return -1;
        }
    }
    return 0;
}
void usage(const char *cmd)
{
    printf("usage: %s [-i iterations] [-t threads] <files>\n", cmd);
}
int main(int argc, char *argv[])
{
    int itercount = 1, threadcount = 1, opt;
    char **paths;
    while ((opt = getopt(argc, argv, "i:t:h")) != -1)
    {
        switch (opt)
        {
        case 'i':
            itercount = atoi(optarg);
            break;
        case 't':
            threadcount = atoi(optarg);
            break;
        default:
            usage(argv[0]);
            return 2;
        }
    }
    /* require at least one file, one iteration, and one thread; a thread
       count below 1 would otherwise produce a negative alloca() size */
    if (optind == argc || itercount < 1 || threadcount < 1)
    {
        usage(argv[0]);
        return 2;
    }
    paths = &argv[optind];
    return test_files_threaded(paths, itercount, threadcount);
}
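The reproducer reads its -i and -t values with atoi(), which cannot tell "0" apart from non-numeric input and does not report overflow. A stricter alternative based on strtol() might look like the sketch below. The helper name parse_positive_int is hypothetical and not part of the posted program.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a strictly positive decimal int. Returns 0 on success and stores
   the value in *out; returns -1 on empty input, trailing garbage,
   overflow, or a value below 1. */
int parse_positive_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;
    if (errno == ERANGE || value < 1 || value > INT_MAX)
        return -1;
    *out = (int)value;
    return 0;
}
```

In the option loop, a failed parse_positive_int(optarg, &threadcount) could then print the usage message and exit with status 2, the same way an unknown option does.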
* Re: getxattr() on cifs sometimes hangs since kernel 5.14
From: Forest @ 2022-07-15 21:29 UTC
To: linux-cifs; +Cc: ronnie sahlberg, Steve French, Paulo Alcantara
On Wed, 18 May 2022 13:18:02 +1000, ronnie sahlberg wrote:
>Please post the reproducer. It will be useful for testing as well as
>verifying a potential fix.
I sent the reproducer to you guys back in May, but forgot to cc: the list.
There is now a report in bugzilla, with the reproducer attached:
https://bugzilla.samba.org/show_bug.cgi?id=15123
I'm dropping off the mailing list, but updates to the bug report should
still reach me.