linux-man.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Rich Felker <dalias@libc.org>
Cc: Arnd Bergmann <arnd@arndb.de>, linux-man <linux-man@vger.kernel.org>
Subject: Re: Access to CMSG_DATA
Date: Wed, 5 Feb 2020 01:30:12 +0100	[thread overview]
Message-ID: <CAKgNAkhJ3j=v255UPJBeYs4erDOWpinxo0T1dqx_Fdh1MC=7-A@mail.gmail.com> (raw)
In-Reply-To: <20191217204751.GI1666@brightrain.aerifal.cx>

Hello Rich,

My apologies for the delayed reply.

On Tue, 17 Dec 2019 at 21:47, Rich Felker <dalias@libc.org> wrote:
>
> On Tue, Dec 17, 2019 at 09:00:08PM +0100, Arnd Bergmann wrote:
> > On Tue, Dec 17, 2019 at 3:36 PM Rich Felker <dalias@libc.org> wrote:
> > >
> > > It came to my attention while reviewing possible breakage with move to
> > > 64-bit time_t that some applications are dereferencing data in socket
> > > control messages (particularly SCM_TIMESTAMP*) in-place as the message
> > > type, rather than memcpy'ing it to appropriate storage. This
> > > necessarily does not work and is not supportable if the message
> > > contains data with greater alignment requirement than the header. In
> > > particular, on 32-bit archs, cmsghdr has size 12 and alignment 4, but
> > > struct timeval and timespec may have alignment requirement 8.
> > >
> > > I found at least ptpd, socat, and ssmping doing this via Debian Code
> > > Search:
> > >
> > > https://sources.debian.org/src/ptpd/2.3.1-debian1-4/src/dep/net.c/?hl=1578#L1578
> > > https://sources.debian.org/src/socat/1.7.3.3-2/xio-socket.c/?hl=1839#L1839
> > > https://sources.debian.org/src/ssmping/0.9.1-3/ssmpngcl.c/?hl=307#L307
> > >
> > > and I suspect there are a good deal more out there. On most archs they
> > > won't break, or will visibly break with SIGBUS, but in theory it's
> > > possible that they silently read wrong data and this might happen on
> > > some older and more tiny-embedded-oriented archs.
> >
> > Good find. I suppose this is going to be particularly annoying for
> > architectures that are affected because all systems that are in
> > widespread use are not affected:
> >
> > - x86, riscv, ppc and s390 always allow unaligned loads
> > - ARMv6+ mostly allows unaligned loads. Some instructions such as ldrd
> >   require alignment of four bytes, which is ok, and ARMv5 requires natural
> >   alignment up to 32 bits, so this is also ok
>
> Seems correct.
> x
> > - On MIPS I think that o32 is fine since there are no 64-bit loads, but
> >   n64  would likely be affected, if there are still users remaining (musl
> >   supports it, so I assume there are some users).
>
> I think you mean n32. n64 is the full LP64 ABI. Indeed it seems like
> n32 is likely affected unless the kernel traps and fixes up misaligned
> accesses.
>
> > - m68k only requires 16-bit alignment
> > - For the other 32-bit architectures that musl supports (microblaze, sh,
>                                        ^^^^^^^^^^^^^^^^^^
>
> FWIW this isn't specific to musl; glibc is also affected, and uclibc
> would be too if they ever implement time64.
>
> >   openrisc), none advertise unaligned-access capability  to the kernel,
> >   but I also don't think any of them have a native 64-bit load instruction.
> >   armv5, microblaze, sh and nds32 fix up unaligned accesses in an
> >   exception handler; openrisc and csky require aligned accesses in user
> >   space.
>
> This sounds correct. Presently J2 (open source SH2 ISA implementation)
> has no unaligned trap; it just loads/stores the wrong value. But there
> are no 64-bit load/store insns anyway and 32-bit alignment is met.
>
> > > I think it's clear to someone who understands alignment and who's
> > > thought about it that applications just can't do this, but it doesn't
> > > seem to be documented, and an example in cmsg(3) even shows access to
> > > int payload via *(int *)CMSG_DATA(cmsg) (of course int is safe because
> > > its alignment is <= header alignment, but this is not mentioned).
> > >
> > > Could we add text, and perhaps change the example, to indicate that in
> > > general memcpy needs to be used to copy the payload to/from a suitable
> > > object?
> >
> > Yes, I think that would be a good idea.
>
> How about adding to:
>
>        *  CMSG_DATA() returns a pointer to the data portion of a cmsghdr.
>
> "The pointer returned cannot be assumed to be suitably aligned for
> accessing arbitrary payload data types. Applications should not cast
> it to a pointer type matching the payload, but should use memcpy to
> copy data to or from a suitably declared object."
>
> and doing this in the examples? Are there other places it should be
> mentioned to to make sure readers see it?

Thanks for this report! And the nicely worded text that you propose to
add to the page.

I've applied the patch below, which is almost exactly your text, plus
a suitable change in the code example. Seem okay?

I can't spot any other place in the manual page where this point
should be mentioned.

Thanks,

Michael

diff --git a/man3/cmsg.3 b/man3/cmsg.3
index 83bb633cc..9dd4c9c10 100644
--- a/man3/cmsg.3
+++ b/man3/cmsg.3
@@ -106,6 +106,12 @@ This is a constant expression.
 .BR CMSG_DATA ()
 returns a pointer to the data portion of a
 .IR cmsghdr .
+The pointer returned cannot be assumed to be suitably aligned for
+accessing arbitrary payload data types.
+Applications should not cast it to a pointer type matching the payload,
+but should instead use
+.BR memcpy (3)
+to copy data to or from a suitably declared object.
 .IP *
 .BR CMSG_LEN ()
 returns the value to store in the
@@ -178,7 +184,6 @@ option in a received ancillary buffer:
 .EX
 struct msghdr msgh;
 struct cmsghdr *cmsg;
-int *ttlptr;
 int received_ttl;

 /* Receive auxiliary data in msgh */
@@ -187,8 +192,7 @@ for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
     if (cmsg\->cmsg_level == IPPROTO_IP
             && cmsg\->cmsg_type == IP_TTL) {
-        ttlptr = (int *) CMSG_DATA(cmsg);
-        received_ttl = *ttlptr;
+       memcpy(&receive_ttl, CMSG_DATA(cmsg), sizeof(int));
         break;
     }
 }

  reply	other threads:[~2020-02-05  0:30 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-17 14:36 Access to CMSG_DATA Rich Felker
2019-12-17 20:00 ` Arnd Bergmann
2019-12-17 20:47   ` Rich Felker
2020-02-05  0:30     ` Michael Kerrisk (man-pages) [this message]
2020-02-05  0:40       ` Rich Felker
2020-02-05  8:08       ` [PATCH] cmsg.3: ffix Dmitry V. Levin
2020-02-07 15:17         ` Michael Kerrisk (man-pages)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKgNAkhJ3j=v255UPJBeYs4erDOWpinxo0T1dqx_Fdh1MC=7-A@mail.gmail.com' \
    --to=mtk.manpages@gmail.com \
    --cc=arnd@arndb.de \
    --cc=dalias@libc.org \
    --cc=linux-man@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).