All of lore.kernel.org
 help / color / mirror / Atom feed
* Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
       [not found]     ` <4F95897B.2040103@msgid.tls.msk.ru>
@ 2012-04-23 18:07       ` Michael Tokarev
  2012-04-24  1:25         ` Ian Kent
  2012-04-24 15:08         ` Linus Torvalds
  0 siblings, 2 replies; 12+ messages in thread
From: Michael Tokarev @ 2012-04-23 18:07 UTC (permalink / raw)
  To: Ian Kent; +Cc: stable, autofs, Linux-kernel

On 23.04.2012 20:55, Michael Tokarev wrote:
> On 23.04.2012 20:29, Michael Tokarev wrote:
>> On 23.04.2012 13:19, Ian Kent wrote:
>>> On Sun, 2012-04-22 at 19:22 +0400, Michael Tokarev wrote:
>>>> This is JFYI for now, I only diagnozed the issue but had
>>>> no time to see what's going on.
>>>>
>>>> The short story is that with 3.0.24 and later kernels,
>>>> 32bit autofs stopped working on 64bit kernels, while
>>>> it worked before just fine.
>>>>
>>>> The userspace is of version 5.0.4.  Platform is x86.
>>>>
>>>> The sympthoms of the issue is that automount daemon,
>>>> on every access to a (not yet mounted) autofs mount
>>>> point, stalls forever, with the process trying to
>>>> access it stalling forever too -- this is why the
>>>> system didn't reboot, -- it stayed on an attempt to
>>>> access one of automount subdirs, somewhere in the
>>>> middle.
>>
>> This is what happens with 3.0.24+:
>>
>> [pid 15207] read(4, "\5\0\0\0\3\0\0\0\2\0\0\0\35\0\0\0@\17\246\3\0\0\0\0\350\3\0\0\350\3\0\0"..., 304) = 300
>> [pid 15207] read(4,  <unfinished ...> -- forever.
>>
>> It wants to read 304 bytes, but kernel only supplies 300
>> bytes.
>>
>> $ file /usr/sbin/automount
>> /usr/sbin/automount: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
>> $ uname -a
>> Linux gandalf 3.0.0-amd64 #3.0.27 SMP Tue Apr 3 16:23:45 MSK 2012 x86_64 GNU/Linux
>>
>> This is a32744d4abae24572eff7269bc17895c41bd0085
>> "autofs: work around unhappy compat problem on x86-64",
>> included in 3.0.24:
>>
>> +static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
>> +{
>> +       size_t pktsz = sizeof(struct autofs_v5_packet);
>> +#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
>> +       if (sbi->compat_daemon > 0)
>> +               pktsz -= 4;
>> +#endif
>> +       return pktsz;
>> +}
>> +
>>
>> "End result: the "packed" size of the structure is 300 bytes: 4-byte, but
>> not 8-byte aligned.
>>
>> As a result, despite all the fields being in the same place on all
>> architectures, sizeof() will round up that size to 304 bytes on
>> architectures that have 8-byte alignment for u64."
>>
>> So I'm not sure how it worked before this change (from the
>> description of the patch it looks like it should not work).
>> But it definitely worked for me, and this patch broke it
>> for me.
> 
> It worked because the userspace apparently does the "rigtht"
> think already compensating for the kernel bitness mismatch.
> This is a bit funky too, but it should work:
> 
> static size_t get_kpkt_len(void)
> {
>         size_t pkt_len = sizeof(struct autofs_v5_packet);
>         struct utsname un;
> 
>         uname(&un);
> 
>         if (pkt_len % 8) {
>                 if (strcmp(un.machine, "alpha") == 0 ||
>                     strcmp(un.machine, "ia64") == 0 ||
>                     strcmp(un.machine, "x86_64") == 0 ||
>                     strcmp(un.machine, "ppc64") == 0)
>                         pkt_len += 4;
> 
>         }
> 
>         return pkt_len;
> }
> 
> So it looks like the kernel/user interface had an error,
> userspace adopted to the kernel by doing something, so
> it started working.  And next kernel adopted to the un-
> patched userspace, thus breaking patched userspace.
> 
> That's quite messy... :(  And I'm not sure what to do
> about this, -- the only solution - in my opinion anyway -
> is to revert this kernel patch and maybe switch to a
> new protocol version.

Okay, even if messy, there are quite easy pure userspace
solutions to all this.

But first of all, I want to question this change in the
first place, hence CC'ing to LKML too.

Kernel has been shipping with this brokeness for quite
some time, namely, since introduction of autofs4, dated
Mon Mar 27 01:14:55 2006 -0800 (commit 5c0a32fc2cd0).

This is 6 whole years by now.

Main userspace user of this interface adopted long ago
too, and has been in wide use for years as well.

This kernel change, even if right in theory, actually
breaks all existing userspace which was working fine
for years.

It looks like this is very much against main linux
principle of not breaking stuff even if the change is
"obviously right".

So it pretty much looks like this very fix has to be
reverted in mainline and in all stable kernels.  At least
that's what I'd do if you ask me.


Now, back to "pure userspace" solution.

The kernel sends data to userspace, and the unit of
exchange is pretty small - this structure of either
300 or 304 bytes.  Can the read from userspace EVER
be partial, so that a fullread() function is necessary?
I'm not sure, but to me it looks like this fullread()
loop isn't needed in the first place, and just single
read() (maybe repeating it in case of EINTR) should
do the job already.  This way, we either handle the
data we've read, if we think the amount we read
is sufficient, and the packet looks sane, or just
error out.  The filename is at the end of the struct,
so actually there's no need to send whole struct from
kernel, it is sufficient for the kernel to write up
to the trailing zero, and userspace _may_ be able to
understand (and no, this is not what userspace expects
currently, so changing kernel to do so will break things --
I'm just saying what can be done in _userspace_).

Now, we know that this sizeof(v5_packet) is designed
wrongly.  But it is actually trivial for the userspace
to expect EITHER 300 bytes OR 304 bytes from the kernel --
again, all in one go.  For it to work, extend the struct
with a padding at the end so it will be real 304 bytes,
and request reading of these 304 bytes, but handle the
read of 300 bytes too as successful.

This way, it will work for BOTH "bad" and "good" kernel.

Ofcourse it all breaks again if, in case of several
requests going from the kernel, kernel will batch them
into single larger read, so we'll lose packet boundary.
If that's the case, my "solution" does not work obviously.

Still, the userspace may adapt again, by doing a probe,
forcing the kernel to send us a request before making
the mountpoint visible in the final location, checking
the size of the data kernel sends, and setting that
pkt_len variable to this value.  Ugly, but it will let
the application to work in either case.

But again, to me the current situation with outright
BREAKING of existing userspace is much uglier since
it affects many users instead of a single userspace
component...

Thanks,

/mjt

>> Should 3.3 work?  As far as I can see, it includes the
>> same change (which has been backported!), so it may have
>> the same _issue_...


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-23 18:07       ` Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment Michael Tokarev
@ 2012-04-24  1:25         ` Ian Kent
  2012-04-24  3:40           ` Ian Kent
  2012-04-24 15:08         ` Linus Torvalds
  1 sibling, 1 reply; 12+ messages in thread
From: Ian Kent @ 2012-04-24  1:25 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: stable, autofs, Linux-kernel

On Mon, 2012-04-23 at 22:07 +0400, Michael Tokarev wrote:
> On 23.04.2012 20:55, Michael Tokarev wrote:
> > On 23.04.2012 20:29, Michael Tokarev wrote:
> >> On 23.04.2012 13:19, Ian Kent wrote:
> >>> On Sun, 2012-04-22 at 19:22 +0400, Michael Tokarev wrote:
> >>>> This is JFYI for now, I only diagnozed the issue but had
> >>>> no time to see what's going on.
> >>>>
> >>>> The short story is that with 3.0.24 and later kernels,
> >>>> 32bit autofs stopped working on 64bit kernels, while
> >>>> it worked before just fine.
> >>>>
> >>>> The userspace is of version 5.0.4.  Platform is x86.
> >>>>
> >>>> The sympthoms of the issue is that automount daemon,
> >>>> on every access to a (not yet mounted) autofs mount
> >>>> point, stalls forever, with the process trying to
> >>>> access it stalling forever too -- this is why the
> >>>> system didn't reboot, -- it stayed on an attempt to
> >>>> access one of automount subdirs, somewhere in the
> >>>> middle.
> >>
> >> This is what happens with 3.0.24+:
> >>
> >> [pid 15207] read(4, "\5\0\0\0\3\0\0\0\2\0\0\0\35\0\0\0@\17\246\3\0\0\0\0\350\3\0\0\350\3\0\0"..., 304) = 300
> >> [pid 15207] read(4,  <unfinished ...> -- forever.
> >>
> >> It wants to read 304 bytes, but kernel only supplies 300
> >> bytes.
> >>
> >> $ file /usr/sbin/automount
> >> /usr/sbin/automount: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
> >> $ uname -a
> >> Linux gandalf 3.0.0-amd64 #3.0.27 SMP Tue Apr 3 16:23:45 MSK 2012 x86_64 GNU/Linux
> >>
> >> This is a32744d4abae24572eff7269bc17895c41bd0085
> >> "autofs: work around unhappy compat problem on x86-64",
> >> included in 3.0.24:
> >>
> >> +static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
> >> +{
> >> +       size_t pktsz = sizeof(struct autofs_v5_packet);
> >> +#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
> >> +       if (sbi->compat_daemon > 0)
> >> +               pktsz -= 4;
> >> +#endif
> >> +       return pktsz;
> >> +}
> >> +
> >>
> >> "End result: the "packed" size of the structure is 300 bytes: 4-byte, but
> >> not 8-byte aligned.
> >>
> >> As a result, despite all the fields being in the same place on all
> >> architectures, sizeof() will round up that size to 304 bytes on
> >> architectures that have 8-byte alignment for u64."
> >>
> >> So I'm not sure how it worked before this change (from the
> >> description of the patch it looks like it should not work).
> >> But it definitely worked for me, and this patch broke it
> >> for me.
> > 
> > It worked because the userspace apparently does the "rigtht"
> > think already compensating for the kernel bitness mismatch.
> > This is a bit funky too, but it should work:
> > 
> > static size_t get_kpkt_len(void)
> > {
> >         size_t pkt_len = sizeof(struct autofs_v5_packet);
> >         struct utsname un;
> > 
> >         uname(&un);
> > 
> >         if (pkt_len % 8) {
> >                 if (strcmp(un.machine, "alpha") == 0 ||
> >                     strcmp(un.machine, "ia64") == 0 ||
> >                     strcmp(un.machine, "x86_64") == 0 ||
> >                     strcmp(un.machine, "ppc64") == 0)
> >                         pkt_len += 4;
> > 
> >         }
> > 
> >         return pkt_len;
> > }
> > 
> > So it looks like the kernel/user interface had an error,
> > userspace adopted to the kernel by doing something, so
> > it started working.  And next kernel adopted to the un-
> > patched userspace, thus breaking patched userspace.
> > 
> > That's quite messy... :(  And I'm not sure what to do
> > about this, -- the only solution - in my opinion anyway -
> > is to revert this kernel patch and maybe switch to a
> > new protocol version.
> 
> Okay, even if messy, there are quite easy pure userspace
> solutions to all this.
> 
> But first of all, I want to question this change in the
> first place, hence CC'ing to LKML too.

Back porting the 3.3 change to current stable kernels was not a good
idea IMO and I probably should have NAKed the one that I actually saw,
which was devoid of version btw.

The change came about because there was a complaint about me not wanting
to fix this in the kernel, but wanting to continue using the user space
workaround. But it was insisted that it be fixed in the kernel, and so
it was done.

Consequently people will encounter this gotcha sooner or later and will
need to apply a patch to resolve it. The back port to stable kernels
just means it's sooner rather than later.

I have already told you that I will need to provide a (updated, since
there is already a commit in git that covers 3.3) patch for autofs. I
haven't done that yet because I've been a little ill.

A quick web search finds the post that I made to the autofs list
explaining this.
http://comments.gmane.org/gmane.linux.file-systems/61832

> 
> Kernel has been shipping with this brokeness for quite
> some time, namely, since introduction of autofs4, dated
> Mon Mar 27 01:14:55 2006 -0800 (commit 5c0a32fc2cd0).
> 
> This is 6 whole years by now.
> 
> Main userspace user of this interface adopted long ago
> too, and has been in wide use for years as well.
> 
> This kernel change, even if right in theory, actually
> breaks all existing userspace which was working fine
> for years.
> 
> It looks like this is very much against main linux
> principle of not breaking stuff even if the change is
> "obviously right".
> 
> So it pretty much looks like this very fix has to be
> reverted in mainline and in all stable kernels.  At least
> that's what I'd do if you ask me.
> 
> 
> Now, back to "pure userspace" solution.
> 
> The kernel sends data to userspace, and the unit of
> exchange is pretty small - this structure of either
> 300 or 304 bytes.  Can the read from userspace EVER
> be partial, so that a fullread() function is necessary?
> I'm not sure, but to me it looks like this fullread()
> loop isn't needed in the first place, and just single
> read() (maybe repeating it in case of EINTR) should
> do the job already.  This way, we either handle the
> data we've read, if we think the amount we read
> is sufficient, and the packet looks sane, or just
> error out.  The filename is at the end of the struct,
> so actually there's no need to send whole struct from
> kernel, it is sufficient for the kernel to write up
> to the trailing zero, and userspace _may_ be able to
> understand (and no, this is not what userspace expects
> currently, so changing kernel to do so will break things --
> I'm just saying what can be done in _userspace_).
> 
> Now, we know that this sizeof(v5_packet) is designed
> wrongly.  But it is actually trivial for the userspace
> to expect EITHER 300 bytes OR 304 bytes from the kernel --
> again, all in one go.  For it to work, extend the struct
> with a padding at the end so it will be real 304 bytes,
> and request reading of these 304 bytes, but handle the
> read of 300 bytes too as successful.
> 
> This way, it will work for BOTH "bad" and "good" kernel.
> 
> Ofcourse it all breaks again if, in case of several
> requests going from the kernel, kernel will batch them
> into single larger read, so we'll lose packet boundary.
> If that's the case, my "solution" does not work obviously.
> 
> Still, the userspace may adapt again, by doing a probe,
> forcing the kernel to send us a request before making
> the mountpoint visible in the final location, checking
> the size of the data kernel sends, and setting that
> pkt_len variable to this value.  Ugly, but it will let
> the application to work in either case.
> 
> But again, to me the current situation with outright
> BREAKING of existing userspace is much uglier since
> it affects many users instead of a single userspace
> component...
> 
> Thanks,
> 
> /mjt
> 
> >> Should 3.3 work?  As far as I can see, it includes the
> >> same change (which has been backported!), so it may have
> >> the same _issue_...
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24  1:25         ` Ian Kent
@ 2012-04-24  3:40           ` Ian Kent
  2012-04-24  4:06             ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Kent @ 2012-04-24  3:40 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: stable, autofs, Linux-kernel

On Tue, 2012-04-24 at 09:25 +0800, Ian Kent wrote:
> On Mon, 2012-04-23 at 22:07 +0400, Michael Tokarev wrote:
> > On 23.04.2012 20:55, Michael Tokarev wrote:
> > > On 23.04.2012 20:29, Michael Tokarev wrote:
> > >> On 23.04.2012 13:19, Ian Kent wrote:
> > >>> On Sun, 2012-04-22 at 19:22 +0400, Michael Tokarev wrote:
> > >>>> This is JFYI for now, I only diagnozed the issue but had
> > >>>> no time to see what's going on.
> > >>>>
> > >>>> The short story is that with 3.0.24 and later kernels,
> > >>>> 32bit autofs stopped working on 64bit kernels, while
> > >>>> it worked before just fine.
> > >>>>
> > >>>> The userspace is of version 5.0.4.  Platform is x86.
> > >>>>
> > >>>> The sympthoms of the issue is that automount daemon,
> > >>>> on every access to a (not yet mounted) autofs mount
> > >>>> point, stalls forever, with the process trying to
> > >>>> access it stalling forever too -- this is why the
> > >>>> system didn't reboot, -- it stayed on an attempt to
> > >>>> access one of automount subdirs, somewhere in the
> > >>>> middle.
> > >>
> > >> This is what happens with 3.0.24+:
> > >>
> > >> [pid 15207] read(4, "\5\0\0\0\3\0\0\0\2\0\0\0\35\0\0\0@\17\246\3\0\0\0\0\350\3\0\0\350\3\0\0"..., 304) = 300
> > >> [pid 15207] read(4,  <unfinished ...> -- forever.
> > >>
> > >> It wants to read 304 bytes, but kernel only supplies 300
> > >> bytes.
> > >>
> > >> $ file /usr/sbin/automount
> > >> /usr/sbin/automount: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
> > >> $ uname -a
> > >> Linux gandalf 3.0.0-amd64 #3.0.27 SMP Tue Apr 3 16:23:45 MSK 2012 x86_64 GNU/Linux
> > >>
> > >> This is a32744d4abae24572eff7269bc17895c41bd0085
> > >> "autofs: work around unhappy compat problem on x86-64",
> > >> included in 3.0.24:
> > >>
> > >> +static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
> > >> +{
> > >> +       size_t pktsz = sizeof(struct autofs_v5_packet);
> > >> +#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
> > >> +       if (sbi->compat_daemon > 0)
> > >> +               pktsz -= 4;
> > >> +#endif
> > >> +       return pktsz;
> > >> +}
> > >> +
> > >>
> > >> "End result: the "packed" size of the structure is 300 bytes: 4-byte, but
> > >> not 8-byte aligned.
> > >>
> > >> As a result, despite all the fields being in the same place on all
> > >> architectures, sizeof() will round up that size to 304 bytes on
> > >> architectures that have 8-byte alignment for u64."
> > >>
> > >> So I'm not sure how it worked before this change (from the
> > >> description of the patch it looks like it should not work).
> > >> But it definitely worked for me, and this patch broke it
> > >> for me.
> > > 
> > > It worked because the userspace apparently does the "rigtht"
> > > think already compensating for the kernel bitness mismatch.
> > > This is a bit funky too, but it should work:
> > > 
> > > static size_t get_kpkt_len(void)
> > > {
> > >         size_t pkt_len = sizeof(struct autofs_v5_packet);
> > >         struct utsname un;
> > > 
> > >         uname(&un);
> > > 
> > >         if (pkt_len % 8) {
> > >                 if (strcmp(un.machine, "alpha") == 0 ||
> > >                     strcmp(un.machine, "ia64") == 0 ||
> > >                     strcmp(un.machine, "x86_64") == 0 ||
> > >                     strcmp(un.machine, "ppc64") == 0)
> > >                         pkt_len += 4;
> > > 
> > >         }
> > > 
> > >         return pkt_len;
> > > }
> > > 
> > > So it looks like the kernel/user interface had an error,
> > > userspace adopted to the kernel by doing something, so
> > > it started working.  And next kernel adopted to the un-
> > > patched userspace, thus breaking patched userspace.
> > > 
> > > That's quite messy... :(  And I'm not sure what to do
> > > about this, -- the only solution - in my opinion anyway -
> > > is to revert this kernel patch and maybe switch to a
> > > new protocol version.
> > 
> > Okay, even if messy, there are quite easy pure userspace
> > solutions to all this.
> > 
> > But first of all, I want to question this change in the
> > first place, hence CC'ing to LKML too.
> 
> Back porting the 3.3 change to current stable kernels was not a good
> idea IMO and I probably should have NAKed the one that I actually saw,
> which was devoid of version btw.
> 
> The change came about because there was a complaint about me not wanting
> to fix this in the kernel, but wanting to continue using the user space
> workaround. But it was insisted that it be fixed in the kernel, and so
> it was done.
> 
> Consequently people will encounter this gotcha sooner or later and will
> need to apply a patch to resolve it. The back port to stable kernels
> just means it's sooner rather than later.
> 
> I have already told you that I will need to provide a (updated, since
> there is already a commit in git that covers 3.3) patch for autofs. I
> haven't done that yet because I've been a little ill.

AFAICT this patch is what's needed.
Can you check please.

autofs-5.0.4 - allow for kernel packet size change

From: Ian Kent <raven@themaw.net>

Kernel 3.3.0 has a patch to allow for the original missdesign of
the autofs v5 kernel packet. That change is being back ported to
earlier stable kernel releases which is causing autofs to hang in
mixed 32-bit user space/64-bit kernel

The problem is that while all the structure fields are alligned
correctly structure allignment on x86-64 causes the packet size
to be 4 bytes larger than on x86. So when running an x86 binary
on an x86-64 install the packet size did not match causing user
space pipe reads to hang.
---

 daemon/automount.c |    8 ++++++++
 include/mounts.h   |   17 +++++++++++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)


diff --git a/daemon/automount.c b/daemon/automount.c
index 863aac5..92236b1 100644
--- a/daemon/automount.c
+++ b/daemon/automount.c
@@ -38,6 +38,7 @@
 #include <dirent.h>
 #include <sys/vfs.h>
 #include <sys/utsname.h>
+#include "mounts.h"
 
 #include "automount.h"
 #ifdef LIBXML2_WORKAROUND
@@ -599,6 +600,13 @@ static size_t get_kpkt_len(void)
 {
 	size_t pkt_len = sizeof(struct autofs_v5_packet);
 	struct utsname un;
+	int kern_vers;
+
+	kern_vers = linux_version_code();
+	if (kern_vers > KERNEL_VERSION(3, 2, 9) ||
+	   (kern_vers > KERNEL_VERSION(3, 0, 23) &&
+            kern_vers < KERNEL_VERSION(3, 1, 0)))
+		return pkt_len;
 
 	uname(&un);
 
diff --git a/include/mounts.h b/include/mounts.h
index 4d932ca..3947d83 100644
--- a/include/mounts.h
+++ b/include/mounts.h
@@ -16,6 +16,9 @@
 #ifndef MOUNTS_H
 #define MOUNTS_H
 
+#include <linux/version.h>
+#include <sys/utsname.h>
+
 #ifndef AUTOFS_TYPE_ANY
 #define AUTOFS_TYPE_ANY		0x0000
 #endif
@@ -72,6 +75,20 @@ struct mnt_list {
 	struct list_head ordered;
 };
 
+static inline unsigned int linux_version_code(void)
+{
+        struct utsname my_utsname;
+        unsigned int p, q, r;
+
+        if (uname(&my_utsname))
+                return 0;
+
+        p = (unsigned int)atoi(strtok(my_utsname.release, "."));
+        q = (unsigned int)atoi(strtok(NULL, "."));
+        r = (unsigned int)atoi(strtok(NULL, "."));
+        return KERNEL_VERSION(p, q, r);
+}
+
 unsigned int query_kproto_ver(void);
 unsigned int get_kver_major(void);
 unsigned int get_kver_minor(void);



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24  3:40           ` Ian Kent
@ 2012-04-24  4:06             ` Michael Tokarev
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2012-04-24  4:06 UTC (permalink / raw)
  To: Ian Kent; +Cc: stable, autofs, Linux-kernel, Linus Torvalds

On 24.04.2012 07:40, Ian Kent wrote:
> On Tue, 2012-04-24 at 09:25 +0800, Ian Kent wrote:
>> On Mon, 2012-04-23 at 22:07 +0400, Michael Tokarev wrote:
[]
>>>> So it looks like the kernel/user interface had an error,

for 6 years,

>>>> userspace adopted to the kernel by doing something, so
>>>> it started working.  And next kernel adopted to the un-
>>>> patched userspace, thus breaking patched userspace.

(The commit in question is 499f286dc02cde6b668e2d757dfe100cb0c43445
from Feb 2012, released in 3.3, original bug went into kernel in
2006, as 5c0a32fc2cd0).

>>>>
>>>> That's quite messy... :(  And I'm not sure what to do
>>>> about this, -- the only solution - in my opinion anyway -
>>>> is to revert this kernel patch and maybe switch to a
>>>> new protocol version.
>>>
>>> Okay, even if messy, there are quite easy pure userspace
>>> solutions to all this.
>>>
>>> But first of all, I want to question this change in the
>>> first place, hence CC'ing to LKML too.
>>
>> Back porting the 3.3 change to current stable kernels was not a good
>> idea IMO and I probably should have NAKed the one that I actually saw,
>> which was devoid of version btw.
>>
>> The change came about because there was a complaint about me not wanting
>> to fix this in the kernel, but wanting to continue using the user space
>> workaround. But it was insisted that it be fixed in the kernel, and so
>> it was done.

Who insisted on this?

Note my question is not about stable series anymore, it is about the
original "fix" which went into 3.3 kernel, which, having in mind that
interface has been this way for 6 years and userspace adopted, should
not have been there in the first place.

>> Consequently people will encounter this gotcha sooner or later and will
>> need to apply a patch to resolve it. The back port to stable kernels
>> just means it's sooner rather than later.
>>
>> I have already told you that I will need to provide a (updated, since
>> there is already a commit in git that covers 3.3) patch for autofs. I
>> haven't done that yet because I've been a little ill.
> 
> AFAICT this patch is what's needed.
> Can you check please.
> 
> autofs-5.0.4 - allow for kernel packet size change

Please stop creating even more mess.  Just revert this patch from
kernel -- from BOTH stable and current mainline series.  Right
NOW, as an urgent fix, before 3.4 will be released (Cc'ing Linus
for this).

[]
> @@ -599,6 +600,13 @@ static size_t get_kpkt_len(void)
>  {
>  	size_t pkt_len = sizeof(struct autofs_v5_packet);
>  	struct utsname un;
> +	int kern_vers;
> +
> +	kern_vers = linux_version_code();
> +	if (kern_vers > KERNEL_VERSION(3, 2, 9) ||
> +	   (kern_vers > KERNEL_VERSION(3, 0, 23) &&
> +            kern_vers < KERNEL_VERSION(3, 1, 0)))
> +		return pkt_len;

Yeah, mess like this is absolutely disgusting... :(

[]
 +static inline unsigned int linux_version_code(void)
> +{
> +        struct utsname my_utsname;
> +        unsigned int p, q, r;
> +
> +        if (uname(&my_utsname))
> +                return 0;
> +
> +        p = (unsigned int)atoi(strtok(my_utsname.release, "."));
> +        q = (unsigned int)atoi(strtok(NULL, "."));
> +        r = (unsigned int)atoi(strtok(NULL, "."));
> +        return KERNEL_VERSION(p, q, r);

And here there's the same bug repeated again: there's no
guarantee that 3rd strtok() will return non-NULL.  But
this is irrelevant, just do not do that.  Pretty please,
it is not too late still.


Thank you!

/mjt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-23 18:07       ` Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment Michael Tokarev
  2012-04-24  1:25         ` Ian Kent
@ 2012-04-24 15:08         ` Linus Torvalds
  2012-04-24 16:12           ` Michael Tokarev
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-04-24 15:08 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Ian Kent, stable, autofs, Linux-kernel

On Mon, Apr 23, 2012 at 11:07 AM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> Okay, even if messy, there are quite easy pure userspace
> solutions to all this.

Yes.

> But first of all, I want to question this change in the
> first place, hence CC'ing to LKML too.

So the reason for the change was to make it much easier for people to
upgrade their 32-bit environment to a 64-bit kernel, and then you can
- maybe even just one package at a time: you may only care about a few
things - upgrade your binaries.

And the 64-bit compat layer really *should* make that "just work". A
32-bit binary really should work exactly the same on a 32-bit kernel
as it does on a 64-bit kernel. The optimal situation is that the
binary couldn't really even tell the difference, with the possible
exception of just doing an uname (and quite frankly, even *that* I
think is debatable: if you run a 32-bit binary, I seriously think we
should have just returned "i386", but people argued against that).

> Kernel has been shipping with this brokeness for quite
> some time, namely, since introduction of autofs4, dated
> Mon Mar 27 01:14:55 2006 -0800 (commit 5c0a32fc2cd0).

Absolutely. That said, we I did have several reports of it making it
possible to use a 64-bit kernel with user land from a couple of
regular distributions. So it's a real fix, and I'd like to keep it
around some way too.

> Main userspace user of this interface adopted long ago
> too, and has been in wide use for years as well.

Actually, that's just not true. The *main* users of the interface seem
to have never fixed anything. As far as I know, neither the upstream
autofs tools nor several of the big distributions ever had patches to
make 32-bit autofs work with the old broken 64-bit compat layer.

So this is a regression, but it does seem to be the case that the
workarounds for the old broken kernel behavior were fairly limited in
distribution too. Which makes me wonder if

 (a) we could make it a kernel boot command line option (which would
be better than a hardcoded compile-time config option)

 (b) which distros did the work-around for the old broken 32-bit user
space compat behavior, and how widely spread is that (which would
likely imply which way the *default* behavior should be)

But yes, we'll need to fix it somehow in the kernel, even if it might
be just a horrible hack.

Sadly, the autofs interface is *disgusting*. It just uses a pipe, so
the kernel side of autofs doesn't even *see* how big the read will be.
It just sees a random pipe, never seeing the read itself. So we can't
just say "oh, he's doing a 300-byte read, so he must be the old broken
interface". But if somebody figures out how to detect that
automatically in the kernel, that would be really good too.

Anybody?

                  Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 15:08         ` Linus Torvalds
@ 2012-04-24 16:12           ` Michael Tokarev
  2012-04-24 16:16             ` Michael Tokarev
  2012-04-24 19:53             ` Linus Torvalds
  0 siblings, 2 replies; 12+ messages in thread
From: Michael Tokarev @ 2012-04-24 16:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ian Kent, stable, autofs, Linux-kernel

On 24.04.2012 19:08, Linus Torvalds wrote:
> On Mon, Apr 23, 2012 at 11:07 AM, Michael Tokarev <mjt@tls.msk.ru> wrote:
[]
>> Kernel has been shipping with this brokeness for quite
>> some time, namely, since introduction of autofs4, dated
>> Mon Mar 27 01:14:55 2006 -0800 (commit 5c0a32fc2cd0).
> 
> Absolutely. That said, we I did have several reports of it making it
> possible to use a 64-bit kernel with user land from a couple of
> regular distributions. So it's a real fix, and I'd like to keep it
> around some way too.

I run 64bit kernel and 32bit userland for several years.  Initially
we had a bunch of machines (servers) not supporting 64bits at all,
and I wanted to keep userspace across all servers the same (so I
can roll eg updates to all them at once).  Nowadays it more historic,
but still works, -- I just updated the kernel (to support more
memory etc), and a few applications which actually _use_ that memory,
the rest of the userspace is 32bits still.

The same 32/64 environment is running on my workstations too.

>> Main userspace user of this interface adopted long ago
>> too, and has been in wide use for years as well.
> 
> Actually, that's just not true. The *main* users of the interface seem
> to have never fixed anything. As far as I know, neither the upstream
> autofs tools nor several of the big distributions ever had patches to
> make 32-bit autofs work with the old broken 64-bit compat layer.

And this is actually not the case -- that's what started this thread,
when I tried to upgrade a 64bit kernel on one of our production servers
to 3.0.28, 32bit autofs, which worked just fine before, stopped working.
Hence this bugreport/discussion.

The userspace is running debian stable (squeeze).  Debian has autofs
package based on upstream 5.0.4 version.  That (upstream) version has
the "bug-compat" code in it, in daemon/automount.c:

static size_t get_kpkt_len(void)
{
        size_t pkt_len = sizeof(struct autofs_v5_packet);
        struct utsname un;

        uname(&un);

        if (pkt_len % 8) {
                if (strcmp(un.machine, "alpha") == 0 ||
                    strcmp(un.machine, "ia64") == 0 ||
                    strcmp(un.machine, "x86_64") == 0 ||
                    strcmp(un.machine, "ppc64") == 0)
                        pkt_len += 4;

        }

        return pkt_len;
}

This is exactly the workaround for this very bug in kernel.

I don't know how old this code is, but it definitely is in the
5.0.1 upstream tarball, and the file there is dated
Feb-20, 2007 - at least 3 years after the initial bug in kernel.

> So this is a regression, but it does seem to be the case that the
> workarounds for the old broken kernel behavior were fairly limited in
> distribution too. Which makes me wonder if

No, this is unfortunately not the case: the whole issue here is that
the _only_ userspace of this interface has the fix for broken kernel
for at least 5 years already, or maybe more (with kernel bug being
6 years old).

>  (a) we could make it a kernel boot command line option (which would
> be better than a hardcoded compile-time config option)
> 
>  (b) which distros did the work-around for the old broken 32-bit user
> space compat behavior, and how widely spread is that (which would
> likely imply which way the *default* behavior should be)

That'd be all distros shipping 5.0.1 or later version, which, I think,
is all current and previous distros.  5.0.1-pre1 did not have this
workaround.

And previos, v4 version, did not know this interface.

> But yes, we'll need to fix it somehow in the kernel, even if it might
> be just a horrible hack.

I'm not sure it really needs fixing, since the only user of this interface
has a workarond for this alomst since the time the bug has been introduced.

The only real solution to this - in my humble opinion anyway - is to make
the _next_ version of the interface right, but keep current one broken but
compatible.

> Sadly, the autofs interface is *disgusting*. It just uses a pipe, so
> the kernel side of autofs doesn't even *see* how big the read will be.
> It just sees a random pipe, never seeing the read itself. So we can't
> just say "oh, he's doing a 300-byte read, so he must be the old broken
> interface". But if somebody figures out how to detect that
> automatically in the kernel, that would be really good too.

No, just keep it the way it is now.  And design new iface right :)

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 16:12           ` Michael Tokarev
@ 2012-04-24 16:16             ` Michael Tokarev
  2012-04-24 19:53             ` Linus Torvalds
  1 sibling, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2012-04-24 16:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ian Kent, stable, autofs, Linux-kernel

On 24.04.2012 20:12, Michael Tokarev wrote:
> On 24.04.2012 19:08, Linus Torvalds wrote:

>>> Kernel has been shipping with this brokeness for quite
>>> some time, namely, since introduction of autofs4, dated
>>> Mon Mar 27 01:14:55 2006 -0800 (commit 5c0a32fc2cd0).
>>

>> Actually, that's just not true. The *main* users of the interface seem
>> to have never fixed anything. As far as I know, neither the upstream
>> autofs tools nor several of the big distributions ever had patches to
>> make 32-bit autofs work with the old broken 64-bit compat layer.
[]
> I don't know how old this code is, but it definitely is in the
> 5.0.1 upstream tarball, and the file there is dated
> Feb-20, 2007 - at least 3 years after the initial bug in kernel.

I meant to say "about a year after the initial bug in kernel".

And I think it was me who found the original issue, -- i mean,
discovered that it does not work, not found the real bug - but
I don't remember anymore.  This stuff looks very familiar for
some reason however... ;)

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 16:12           ` Michael Tokarev
  2012-04-24 16:16             ` Michael Tokarev
@ 2012-04-24 19:53             ` Linus Torvalds
  2012-04-24 20:42               ` Thomas Meyer
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-04-24 19:53 UTC (permalink / raw)
  To: Michael Tokarev, Thomas Meyer; +Cc: Ian Kent, stable, autofs, Linux-kernel

On Tue, Apr 24, 2012 at 9:12 AM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> The userspace is running debian stable (squeeze).  Debian has autofs
> package based on upstream 5.0.4 version.  That (upstream) version has
> the "bug-compat" code in it, in daemon/automount.c:
>
> static size_t get_kpkt_len(void)
> {
>        size_t pkt_len = sizeof(struct autofs_v5_packet);
>        struct utsname un;
>
>        uname(&un);
>
>        if (pkt_len % 8) {
>                if (strcmp(un.machine, "alpha") == 0 ||
>                    strcmp(un.machine, "ia64") == 0 ||
>                    strcmp(un.machine, "x86_64") == 0 ||
>                    strcmp(un.machine, "ppc64") == 0)
>                        pkt_len += 4;

Ugh.

So how did this break for Thomas Mayer (there was another person
involved too, I forget the details)? Is this code somehow
config-dependent so that it doesn't even get compiled for some cases?

                    Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 19:53             ` Linus Torvalds
@ 2012-04-24 20:42               ` Thomas Meyer
  2012-04-24 21:01                 ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Meyer @ 2012-04-24 20:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael Tokarev, Ian Kent, stable, autofs, Linux-kernel

Am Dienstag, den 24.04.2012, 12:53 -0700 schrieb Linus Torvalds:
> On Tue, Apr 24, 2012 at 9:12 AM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> >
> > The userspace is running debian stable (squeeze).  Debian has autofs
> > package based on upstream 5.0.4 version.  That (upstream) version has
> > the "bug-compat" code in it, in daemon/automount.c:
> >
> > static size_t get_kpkt_len(void)
> > {
> >        size_t pkt_len = sizeof(struct autofs_v5_packet);
> >        struct utsname un;
> >
> >        uname(&un);
> >
> >        if (pkt_len % 8) {
> >                if (strcmp(un.machine, "alpha") == 0 ||
> >                    strcmp(un.machine, "ia64") == 0 ||
> >                    strcmp(un.machine, "x86_64") == 0 ||
> >                    strcmp(un.machine, "ppc64") == 0)
> >                        pkt_len += 4;
> 
> Ugh.
> 
> So how did this break for Thomas Mayer (there was another person
> involved too, I forget the details)? Is this code somehow
> config-dependent so that it doesn't even get compiled for some cases?

It broke the autofs feature of systemd (32bit) on a 64bit kernel, which
resulted in an endless wait in the boot process. I don't use the autofs
tools. I guess the fix in the autofs package to compensate the kernel
bug, which has now been fixed, breaks the autofs package.
The fix in the kernel makes systemd work correctly.

thomas



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 20:42               ` Thomas Meyer
@ 2012-04-24 21:01                 ` Linus Torvalds
  2012-04-25  4:05                   ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-04-24 21:01 UTC (permalink / raw)
  To: Thomas Meyer; +Cc: Michael Tokarev, Ian Kent, stable, autofs, Linux-kernel

On Tue, Apr 24, 2012 at 1:42 PM, Thomas Meyer <thomas@m3y3r.de> wrote:
>
> It broke the autofs feature of systemd (32bit) on a 64bit kernel, which
> resulted in an endless wait in the boot process. I don't use the autofs
> tools. I guess the fix in the autofs package to compensate the kernel
> bug, which has now been fixed, breaks the autofs package.
> The fix in the kernel makes systemd work correctly.

Oh damn, I'd forgotten the details. Ok.

That does mean that we probably just have to do a kernel command line
option for this. And then the autofs4 code that currently does

    sbi->compat_daemon = is_compat_task();

needs to change that to take the command line option into account too.

We could make it a mount option, but since the whole point is that we
want to be compatible with existing binaries that don't *use* special
mount options, I suspect the only reasonable point is at the kernel
command line.

That is, unless somebody can figure out a way to auto-detect how big
the user space read is. The problem with that really is that the
autofs read() system call isn't done directly to some "real autofs"
file descriptor, autofs literally is using the standard pipe code.

So the code in fs/pipe.c does actually see the size of the read. But
the code in fs/autofs4/ does not. And we can't change the interface,
because the whole pipe is opened in user space iirc.

Ugly.

Anybody willing to write the patch?

                 Linus

Damn.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-24 21:01                 ` Linus Torvalds
@ 2012-04-25  4:05                   ` Michael Tokarev
  2012-04-25  4:08                     ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2012-04-25  4:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Meyer, Ian Kent, stable, autofs, Linux-kernel

On 25.04.2012 01:01, Linus Torvalds wrote:
> On Tue, Apr 24, 2012 at 1:42 PM, Thomas Meyer <thomas@m3y3r.de> wrote:
>>
>> It broke the autofs feature of systemd (32bit) on a 64bit kernel, which
>> resulted in an endless wait in the boot process. I don't use the autofs
>> tools. I guess the fix in the autofs package to compensate the kernel
>> bug, which has now been fixed, breaks the autofs package.
>> The fix in the kernel makes systemd work correctly.
> 
> Oh damn, I'd forgotten the details. Ok.
> 
> That does mean that we probably just have to do a kernel command line
> option for this. And then the autofs4 code that currently does
> 
>     sbi->compat_daemon = is_compat_task();
> 
> needs to change that to take the command line option into account too.

How about something like

     sbi->compat_daemon = is_compat_task() && strcmp(task->comm, "automount") != 0

instead?  (This task->comm is a pseudocode, just to show the idea, I don't
know where this info can be found in kernel)

Or maybe similar workaround (as in automount) can be placed into systemd
too.  I'd really prefer this one -- historically things has been that way
for quite some time (at least 5 years).

Or yet another solution: let the kernel to compensate, but add some code
in the _negotiation_ phase, when user<=>kernel space initializes things
and exchanges their capabilities - IF there is one.

> We could make it a mount option, but since the whole point is that we
> want to be compatible with existing binaries that don't *use* special
> mount options, I suspect the only reasonable point is at the kernel
> command line.

Well, the last my suggestion -- it will have to use some special option
or a flag or whatever.  But let just make it a part of the interface,
like we just introduced an autofs v6, so existing and currently in WIDE
use (I mean automount from autofs5 which broke) will be fine, and all
NEW binaries will work too.

Kernel command line is - in my opinion - too much for this very issue.
The same way a user can just replace his automount with automount64
(for example) - if something has to be done by the user anyway.

But a mount option which will be specified by all new binaries as a
part of standard interface should work - provided this is the
"negotiation" phase.

This way, we're breaking systemd _again_, but only till the next
version.  The difference between systemd and automount here is that
the latter is much older and is more likely to be used in a mixed
environment and has more "chances" to be seen in the future this
way too, when someone tries to run some old distribution code in
a chroot or a container.  Systemd here is much less used in mixed
environment now and is less widely used in general, so this way
we break less - IMHO anyway.

And/or systemd can adopt in the similar way as automount did --
like, treat this bug as an interface "feature", provide a "large
enough" buffer and just expect a read of up to the trailing zero
in filename as okay too -- this way it will work regardless of
anything else (this is what I proposed initially).

> That is, unless somebody can figure out a way to auto-detect how big
> the user space read is. The problem with that really is that the
> autofs read() system call isn't done directly to some "real autofs"
> file descriptor, autofs literally is using the standard pipe code.
> 
> So the code in fs/pipe.c does actually see the size of the read. But
> the code in fs/autofs4/ does not. And we can't change the interface,
> because the whole pipe is opened in user space iirc.

Yes, that'd be more sane solution - an ability to detect read size.

So, to sum it all up, we've the following options:

1) somehow detect the read size and do a magic in kernel so everything
   Just Work.

2) sbi->compat_daemon = is_compat_task() && strcmp(task->comm, "automount") != 0
   Everything _current_ works.

3) mount option or other change of the interface, reverting the
   patch.  Old automount will continue to work as it did last 5+
   years, systemd will break again.  Can be combined with 2), ie,
   do 2) as an "urgent fix" and implement 3) later after some
   thinking.

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment
  2012-04-25  4:05                   ` Michael Tokarev
@ 2012-04-25  4:08                     ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2012-04-25  4:08 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Thomas Meyer, Ian Kent, stable, autofs, Linux-kernel

On Tue, Apr 24, 2012 at 9:05 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> How about something like
>
>     sbi->compat_daemon = is_compat_task() && strcmp(task->comm, "automount") != 0
>
> instead?  (This task->comm is a pseudocode, just to show the idea, I don't
> know where this info can be found in kernel)

Thomas, does that work for you?

                Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-04-25  4:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4F94222C.6080608@msgid.tls.msk.ru>
     [not found] ` <1335172741.2226.10.camel@perseus.themaw.net>
     [not found]   ` <4F958352.7050106@msgid.tls.msk.ru>
     [not found]     ` <4F95897B.2040103@msgid.tls.msk.ru>
2012-04-23 18:07       ` Breaking userspace? Re: 3.0.24 broke aufofs on mixed 32/64bit environment Michael Tokarev
2012-04-24  1:25         ` Ian Kent
2012-04-24  3:40           ` Ian Kent
2012-04-24  4:06             ` Michael Tokarev
2012-04-24 15:08         ` Linus Torvalds
2012-04-24 16:12           ` Michael Tokarev
2012-04-24 16:16             ` Michael Tokarev
2012-04-24 19:53             ` Linus Torvalds
2012-04-24 20:42               ` Thomas Meyer
2012-04-24 21:01                 ` Linus Torvalds
2012-04-25  4:05                   ` Michael Tokarev
2012-04-25  4:08                     ` Linus Torvalds

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.