Netdev Archive on lore.kernel.org
From: Thorsten Glaser <t.glaser@tarent.de>
To: Ben Hutchings <ben@decadent.org.uk>
Cc: 966459@bugs.debian.org, netdev <netdev@vger.kernel.org>
Subject: Re: Bug#966459: linux: traffic class socket options (both IPv4/IPv6) inconsistent with docs/standards
Date: Sun, 2 Aug 2020 19:29:50 +0000 (UTC)
Message-ID: <Pine.BSM.4.64L.2008021919500.2148@herc.mirbsd.org>
In-Reply-To: <e67190b7de22fff20fb4c5c084307e0b76001248.camel@decadent.org.uk>

Ben Hutchings dixit:

>ip(7) also doesn't document IP_PKTOPTIONS.

Hmm, I don’t use IP_PKTOPTIONS though. I’m not exactly sure I found
the correct place in the kernel for what I do.

On the sending side, I use setsockopt with either
IPPROTO_IP,IP_TOS or IPPROTO_IPV6,IPV6_TCLASS to
set the default traffic class on outgoing packets.
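
A minimal sketch of the sending side described above. The helper
name set_default_tclass is hypothetical (it is not taken from ecn.c),
and passing the value as an int for both address families is an
assumption about what the kernel accepts:

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: set the default traffic class octet for
 * packets sent on fd. af is AF_INET or AF_INET6.
 * Returns 0 on success, -1 on error (errno as set by setsockopt).
 */
int set_default_tclass(int fd, int af, unsigned char tc)
{
	/* Assumption: an int-sized value is accepted for both options. */
	int v = tc;

	if (af == AF_INET6)
		return setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &v, sizeof(v));
	return setsockopt(fd, IPPROTO_IP, IP_TOS, &v, sizeof(v));
}
```

For example, set_default_tclass(fd, AF_INET, 0x02) would request
ECT(0) in the two ECN bits on a UDP socket.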

On the receiving side I use setsockopt with either
IPPROTO_IP,IP_RECVTOS or IPPROTO_IPV6,IPV6_RECVTCLASS
to set up the socket then recvmsg to get a cmsg(3) of
IPPROTO_IP,IP_TOS/IPPROTO_IPV6,IPV6_TCLASS from which
I read the traffic class octet.
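
The receiving side could be sketched roughly as follows. Both helper
names are hypothetical, and because the payload size of the control
message is exactly what is in question here, the sketch decides
between a one-byte and an int-sized payload by looking at cmsg_len
rather than assuming either:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for the traffic class of each received
 * datagram to be delivered as ancillary data. */
int enable_recv_tclass(int fd, int af)
{
	int on = 1;

	if (af == AF_INET6)
		return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS,
		    &on, sizeof(on));
	return setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));
}

/* Hypothetical helper: scan the control messages filled in by
 * recvmsg() and return the traffic class octet (0..255), or -1
 * if no matching control message is present. */
int tclass_from_msg(struct msghdr *msg)
{
	struct cmsghdr *c;

	for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
		int is4 = c->cmsg_level == IPPROTO_IP &&
		    c->cmsg_type == IP_TOS;
		int is6 = c->cmsg_level == IPPROTO_IPV6 &&
		    c->cmsg_type == IPV6_TCLASS;

		if (!is4 && !is6)
			continue;
		/* Assumption: payload is either one byte or an int. */
		if (c->cmsg_len == CMSG_LEN(1))
			return *(unsigned char *)CMSG_DATA(c);
		if (c->cmsg_len >= CMSG_LEN(sizeof(int))) {
			int v;

			memcpy(&v, CMSG_DATA(c), sizeof(v));
			return v & 0xFF;
		}
	}
	return -1;
}
```

Checking the length keeps the reader independent of byte order,
whichever of the two layouts the kernel actually produces.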

This is where I believe I found inconsistencies
between code and documentation.

>Those are two different APIs though: recvmsg() for datagram sockets, vs
>getsockopt(... IP_PKTOPTIONS ...) for stream sockets.  They obviously
>ought to be consistent, but mistakes happen.

OK, I’m currently looking at the datagram case only.
This may change later if there’s enough time.

>I see no point in changing the IPv6 behaviour: it seems to be
>consistent with itself and with the standard

Not really: if the kernel writes an int and userspace reads
its first byte, it only works by accident on little endian,
but not elsewhere.
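
The byte-order point can be illustrated with a small sketch (the
function names are hypothetical): a reader that takes only the first
byte of a stored int sees the value on a little-endian host and the
most significant byte, i.e. zero for any traffic class, on a
big-endian one.

```c
#include <string.h>

/* What a byte-reading consumer sees when the producer stored an int. */
unsigned char first_byte_of_int(int v)
{
	unsigned char b;

	memcpy(&b, &v, 1);
	return b;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
	return first_byte_of_int(1) == 1;
}
```

On a little-endian host first_byte_of_int(0x02) yields 0x02; on a
big-endian host it yields 0x00.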

>so only risks breaking user-space that works today.

Hrm. It risks breaking userspace that reads an int. But the
RFC clearly says it should read the first byte, not an int.

>But you should know that the highest priority for Linux API
>compatibility is to avoid breaking currently working user-space.  That
>means that ugly and inconsistent APIs won't get fixed if it causes a
>regression for the programs people actually use.  If the API never
>worked like it was supposed to on some architectures, that's not a
>regression, and is lower priority.

This is why I just put this up for discussion instead of
requesting a specific change.

That being said, given that the IPv6 API is *only* documented
in the RFC and *not* documented in the Linux manpages…

(Perhaps codesearching for IPV6_TCLASS might also help.
It’s unclear how many users this has…)



In the end, what I really want is clear documentation for
how I should implement the following file so that it works on
Linux, and ideally also on other systems implementing the RFC
API (FreeBSD supposedly does, but that needs testing):

https://github.com/tarent/ECN-Bits/blob/master/linux-c/lib/ecn.c

Given that there’s no documentation, having to read tea
leaves in the kernel source, and finding that it doesn’t
even match the RFC (which, again, doesn’t match what itojun
proposed, for some reason), does not inspire confidence in the
things I *think* I’ve found.

bye,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg
