All of lore.kernel.org
 help / color / mirror / Atom feed
* /proc/net/dev regression
@ 2015-01-10 23:25 Carlos R. Mafra
  2015-01-11  0:27 ` Al Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos R. Mafra @ 2015-01-10 23:25 UTC (permalink / raw)
  To: LKML; +Cc: Hauke Mehrtens, John W. Linville, netdev

I use a dockapp called 'wmnet' [1] to monitor the speed of
my internet connection and after the kernel v3.18 it does no
longer work properly (it still doesn't work in v3.19-rc3)

I bisected the problem and the culprit is this commit:

commit 6e094bd805a9b6ad2f5421125db8f604a166616c
Author: Rafał Miłecki <zajec5@gmail.com>
Date:   Fri Sep 5 00:18:48 2014 +0200

    bcma: move code for core registration into separate function
    
    This cleans code a bit and will us to register cores in other places as
    well. The only difference with this patch is using "core_index" for
    setting device name.
    
    
The problem is caused by the different name that my wireless connection
receives after this patch:

wlp3s0b1 (after the patch)
wlp3s0   (before the patch)

because it affects the display of information in the file /proc/net/dev.

Before the patch the fields are all aligned:

[mafra@linux-g29b:wmnet]$ cat net_dev_good.txt
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   35916     332    0    0    0     0          0         0    35916     332    0    0    0     0       0          0
wlp3s0: 6406428    5794    0    0    0     0          0         0   426813    3778    0    0    0     0       0          0

but after the patch the fields are misaligned:

[mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0

And for some reason this change confuses 'wmnet'. Reading the source code
of 'wmnet' I found that it reads the packets as follows,

	totalpackets_in = strtoul(&buffer[15], NULL, 10);

I am not sure if 'wmnet' could do this better (any suggestions?),
but the fact is that it was working before and now it is not.

Any help is greatly appreciated.

[1] wmnet can be found in http://repo.or.cz/w/dockapps.git

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-10 23:25 /proc/net/dev regression Carlos R. Mafra
@ 2015-01-11  0:27 ` Al Viro
  2015-01-11  0:58   ` Carlos R. Mafra
  2015-01-11  1:00   ` Al Viro
  0 siblings, 2 replies; 8+ messages in thread
From: Al Viro @ 2015-01-11  0:27 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> Inter-|   Receive                                                |  Transmit
>  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
>     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> 
> And for some reason this change confuses 'wmnet'. Reading the source code
> of 'wmnet' I found that it reads the packets as follows,
> 
> 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> 
> I am not sure if 'wmnet' could do this better (any suggestions?),

*snort*

well, yes - it's called scanf().  And if one is really, really nervous
about the overhead of <gasp> parsing a bunch of integers (as if fopen/
fgets/fclose alone won't cost enough to make constantly calling that
sucker a bad idea), just use ptr + <something> - 6 instead of
&buffer[<something>] in there.  That thing has just found where the
colon was (and replaced it with NUL), so dealing with "the first field
turned out to be too long and shifted everything past it" isn't hard.

> but the fact is that it was working before and now it is not.

True.  Mind you, the real issue is that this code expects the interface
names to be never longer than 6 characters, but then /proc/net/dev layout
strongly suggests that.  Hell knows; it is a regression and it does
break real-world userland code.  The only way to avoid that, AFAICS, is
to prohibit interface names longer than 6 chars ;-/

Lovely combination of crappy ABI (procfs file layout), crappy userland
code relying on details of said ABI out of sheer laziness and triggering
kernel change producing bloody long interface names...

Incidentally, sufficiently long interface name will produce other fun issues
for a docked app - it simply won't fit into 64x64 square on screen ;-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-11  0:27 ` Al Viro
@ 2015-01-11  0:58   ` Carlos R. Mafra
  2015-01-11  1:00   ` Al Viro
  1 sibling, 0 replies; 8+ messages in thread
From: Carlos R. Mafra @ 2015-01-11  0:58 UTC (permalink / raw)
  To: Al Viro; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sun, 11 Jan 2015 at  0:27:06 +0000, Al Viro wrote:
> On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > Inter-|   Receive                                                |  Transmit
> >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > 
> > And for some reason this change confuses 'wmnet'. Reading the source code
> > of 'wmnet' I found that it reads the packets as follows,
> > 
> > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > 
> > I am not sure if 'wmnet' could do this better (any suggestions?),
> 
> *snort*
> 
> well, yes - it's called scanf().  And if one is really, really nervous
> about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> fgets/fclose alone won't cost enough to make constantly calling that
> sucker a bad idea), just use ptr + <something> - 6 instead of
> &buffer[<something>] in there.  That thing has just found where the
> colon was (and replaced it with NUL), so dealing with "the first field
> turned out to be too long and shifted everything past it" isn't hard.

Alright! Thank you.

> > but the fact is that it was working before and now it is not.
> 
> True.  Mind you, the real issue is that this code expects the interface
> names to be never longer than 6 characters, but then /proc/net/dev layout
> strongly suggests that.  Hell knows; it is a regression and it does
> break real-world userland code.  The only way to avoid that, AFAICS, is
> to prohibit interface names longer than 6 chars ;-/
> 
> Lovely combination of crappy ABI (procfs file layout), crappy userland
> code relying on details of said ABI out of sheer laziness and triggering
> kernel change producing bloody long interface names...

I won't mind just changing the crappy and fragile wmnet code and moving on.
I have already lost the 2 hours bisecting this anyway.

Thank you very much for your advice.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-11  0:27 ` Al Viro
  2015-01-11  0:58   ` Carlos R. Mafra
@ 2015-01-11  1:00   ` Al Viro
  2015-01-11  1:33     ` Carlos R. Mafra
  1 sibling, 1 reply; 8+ messages in thread
From: Al Viro @ 2015-01-11  1:00 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sun, Jan 11, 2015 at 12:27:06AM +0000, Al Viro wrote:
> On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > Inter-|   Receive                                                |  Transmit
> >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > 
> > And for some reason this change confuses 'wmnet'. Reading the source code
> > of 'wmnet' I found that it reads the packets as follows,
> > 
> > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > 
> > I am not sure if 'wmnet' could do this better (any suggestions?),
> 
> *snort*
> 
> well, yes - it's called scanf().  And if one is really, really nervous
> about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> fgets/fclose alone won't cost enough to make constantly calling that
> sucker a bad idea), just use ptr + <something> - 6 instead of
> &buffer[<something>] in there.  That thing has just found where the
> colon was (and replaced it with NUL), so dealing with "the first field
> turned out to be too long and shifted everything past it" isn't hard.
> 
> > but the fact is that it was working before and now it is not.
> 
> True.  Mind you, the real issue is that this code expects the interface
> names to be never longer than 6 characters, but then /proc/net/dev layout
> strongly suggests that.  Hell knows; it is a regression and it does
> break real-world userland code.  The only way to avoid that, AFAICS, is
> to prohibit interface names longer than 6 chars ;-/
> 
> Lovely combination of crappy ABI (procfs file layout), crappy userland
> code relying on details of said ABI out of sheer laziness and triggering
> kernel change producing bloody long interface names...
> 
> Incidentally, sufficiently long interface name will produce other fun issues
> for a docked app - it simply won't fit into 64x64 square on screen ;-)

Mind you, assuming that columns will align is obviously broken - the producing
side of that thing is
        seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
                   "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
                   dev->name, stats->rx_bytes, stats->rx_packets,
                   stats->rx_errors,
                   stats->rx_dropped + stats->rx_missed_errors,
                   stats->rx_fifo_errors,
                   stats->rx_length_errors + stats->rx_over_errors +
                    stats->rx_crc_errors + stats->rx_frame_errors,
                   stats->rx_compressed, stats->multicast,
                   stats->tx_bytes, stats->tx_packets,
                   stats->tx_errors, stats->tx_dropped,
                   stats->tx_fifo_errors, stats->collisions,
                   stats->tx_carrier_errors +
                    stats->tx_aborted_errors +
                    stats->tx_window_errors +
                    stats->tx_heartbeat_errors,
                   stats->tx_compressed);
To start with, expecting the ->rx_bytes to remain a 7-digit number is somewhat,
er, odd.  Long interace names be damned, the columns will not stay aligned,
no matter what.  Unless your interface vanishes as soon as it has sent
or received 10 megabytes, that is...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-11  1:00   ` Al Viro
@ 2015-01-11  1:33     ` Carlos R. Mafra
  2015-01-11  1:39       ` Al Viro
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos R. Mafra @ 2015-01-11  1:33 UTC (permalink / raw)
  To: Al Viro; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sun, 11 Jan 2015 at  1:00:36 +0000, Al Viro wrote:
> On Sun, Jan 11, 2015 at 12:27:06AM +0000, Al Viro wrote:
> > On Sat, Jan 10, 2015 at 11:25:18PM +0000, Carlos R. Mafra wrote:
> > > [mafra@linux-g29b:wmnet]$ cat net_dev_bad.txt
> > > Inter-|   Receive                                                |  Transmit
> > >  face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
> > >     lo:     600       8    0    0    0     0          0         0      600       8    0    0    0     0       0          0
> > > wlp3s0b1: 9266848    7298    0    0    0     0          0         0   372229    4030    0    0    0     0       0          0
> > > 
> > > And for some reason this change confuses 'wmnet'. Reading the source code
> > > of 'wmnet' I found that it reads the packets as follows,
> > > 
> > > 	totalpackets_in = strtoul(&buffer[15], NULL, 10);
> > > 
> > > I am not sure if 'wmnet' could do this better (any suggestions?),
> > 
> > *snort*
> > 
> > well, yes - it's called scanf().  And if one is really, really nervous
> > about the overhead of <gasp> parsing a bunch of integers (as if fopen/
> > fgets/fclose alone won't cost enough to make constantly calling that
> > sucker a bad idea), just use ptr + <something> - 6 instead of
> > &buffer[<something>] in there.  That thing has just found where the
> > colon was (and replaced it with NUL), so dealing with "the first field
> > turned out to be too long and shifted everything past it" isn't hard.
> > 
> > > but the fact is that it was working before and now it is not.
> > 
> > True.  Mind you, the real issue is that this code expects the interface
> > names to be never longer than 6 characters, but then /proc/net/dev layout
> > strongly suggests that.  Hell knows; it is a regression and it does
> > break real-world userland code.  The only way to avoid that, AFAICS, is
> > to prohibit interface names longer than 6 chars ;-/
> > 
> > Lovely combination of crappy ABI (procfs file layout), crappy userland
> > code relying on details of said ABI out of sheer laziness and triggering
> > kernel change producing bloody long interface names...
> > 
> > Incidentally, sufficiently long interface name will produce other fun issues
> > for a docked app - it simply won't fit into 64x64 square on screen ;-)
> 
> Mind you, assuming that columns will align is obviously broken - the producing
> side of that thing is
>         seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
>                    "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
>                    dev->name, stats->rx_bytes, stats->rx_packets,
>                    stats->rx_errors,
>                    stats->rx_dropped + stats->rx_missed_errors,
>                    stats->rx_fifo_errors,
>                    stats->rx_length_errors + stats->rx_over_errors +
>                     stats->rx_crc_errors + stats->rx_frame_errors,
>                    stats->rx_compressed, stats->multicast,
>                    stats->tx_bytes, stats->tx_packets,
>                    stats->tx_errors, stats->tx_dropped,
>                    stats->tx_fifo_errors, stats->collisions,
>                    stats->tx_carrier_errors +
>                     stats->tx_aborted_errors +
>                     stats->tx_window_errors +
>                     stats->tx_heartbeat_errors,
>                    stats->tx_compressed);
> To start with, expecting the ->rx_bytes to remain a 7-digit number is somewhat,
> er, odd.  Long interace names be damned, the columns will not stay aligned,
> no matter what.  Unless your interface vanishes as soon as it has sent
> or received 10 megabytes, that is...

I think the problem with wmnet is not that it was expecting the fields
to be aligned because it never had problems before (when definitely more
than 10 megabytes were received, wmnet is crappy but not _that_ crappy).

I think the problem really was here,

	totalbytes_in = strtoul(&buffer[7], NULL, 10);

After the patch the device name is 8 characters long and &buffer[7]
overlaps with the name instead of reading the bytes. Before the
patch is was fine because the call to strtoul() seems correct in the
sense that it would read everything until the NULL. So more than 10
megabytes was still ok.

So I guess I was wrong when suggesting that the problem was the
alignment.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-11  1:33     ` Carlos R. Mafra
@ 2015-01-11  1:39       ` Al Viro
  2015-01-11 13:40         ` Carlos R. Mafra
  2015-01-12 11:47         ` David Laight
  0 siblings, 2 replies; 8+ messages in thread
From: Al Viro @ 2015-01-11  1:39 UTC (permalink / raw)
  To: Carlos R. Mafra; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sun, Jan 11, 2015 at 01:33:35AM +0000, Carlos R. Mafra wrote:

> I think the problem with wmnet is not that it was expecting the fields
> to be aligned because it never had problems before (when definitely more
> than 10 megabytes were received, wmnet is crappy but not _that_ crappy).
> 
> I think the problem really was here,
> 
> 	totalbytes_in = strtoul(&buffer[7], NULL, 10);
> 
> After the patch the device name is 8 characters long and &buffer[7]
> overlaps with the name instead of reading the bytes. Before the
> patch is was fine because the call to strtoul() seems correct in the
> sense that it would read everything until the NULL. So more than 10
> megabytes was still ok.
> 
> So I guess I was wrong when suggesting that the problem was the
> alignment.

Several lines below there's this:
                        totalpackets_out = strtoul(&buffer[74], NULL, 10);
                        if (totalpackets_out != lastpackets_out) {
                                totalbytes_out = strtoul(&buffer[66], NULL, 10);
                                diffpackets_out += totalpackets_out - lastpackets_out;
                                diffbytes_out += totalbytes_out - lastbytes_out;
                                lastpackets_out = totalpackets_out;
                                lastbytes_out = totalbytes_out;
                                tx = True;
                        }

So I'm afraid it *is* that crappy.  This function really should use scanf();
note that updateStats_ipchains() in the same file does just that (well,
fgets()+sscanf() for fsck knows what reason).  And I'd be careful with all
those %d, actually - it's not _that_ hard to get more than 4Gb sent.
scanf formats really ought to match the kernel-side (seq_)printf one...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: /proc/net/dev regression
  2015-01-11  1:39       ` Al Viro
@ 2015-01-11 13:40         ` Carlos R. Mafra
  2015-01-12 11:47         ` David Laight
  1 sibling, 0 replies; 8+ messages in thread
From: Carlos R. Mafra @ 2015-01-11 13:40 UTC (permalink / raw)
  To: Al Viro; +Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

On Sun, 11 Jan 2015 at  1:39:13 +0000, Al Viro wrote:
> On Sun, Jan 11, 2015 at 01:33:35AM +0000, Carlos R. Mafra wrote:
> 
> > I think the problem with wmnet is not that it was expecting the fields
> > to be aligned because it never had problems before (when definitely more
> > than 10 megabytes were received, wmnet is crappy but not _that_ crappy).
> > 
> > I think the problem really was here,
> > 
> > 	totalbytes_in = strtoul(&buffer[7], NULL, 10);
> > 
> > After the patch the device name is 8 characters long and &buffer[7]
> > overlaps with the name instead of reading the bytes. Before the
> > patch is was fine because the call to strtoul() seems correct in the
> > sense that it would read everything until the NULL. So more than 10
> > megabytes was still ok.
> > 
> > So I guess I was wrong when suggesting that the problem was the
> > alignment.
> 
> Several lines below there's this:
>                         totalpackets_out = strtoul(&buffer[74], NULL, 10);
>                         if (totalpackets_out != lastpackets_out) {
>                                 totalbytes_out = strtoul(&buffer[66], NULL, 10);
>                                 diffpackets_out += totalpackets_out - lastpackets_out;
>                                 diffbytes_out += totalbytes_out - lastbytes_out;
>                                 lastpackets_out = totalpackets_out;
>                                 lastbytes_out = totalbytes_out;
>                                 tx = True;
>                         }
> 
> So I'm afraid it *is* that crappy.  This function really should use scanf();
> note that updateStats_ipchains() in the same file does just that (well,
> fgets()+sscanf() for fsck knows what reason).  And I'd be careful with all
> those %d, actually - it's not _that_ hard to get more than 4Gb sent.
> scanf formats really ought to match the kernel-side (seq_)printf one...

Ok, I fixed wmnet using Al's suggestion.

As far as I'm concerned, my regression complaint can be dismissed. It's
all working fine again.

Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: /proc/net/dev regression
  2015-01-11  1:39       ` Al Viro
  2015-01-11 13:40         ` Carlos R. Mafra
@ 2015-01-12 11:47         ` David Laight
  1 sibling, 0 replies; 8+ messages in thread
From: David Laight @ 2015-01-12 11:47 UTC (permalink / raw)
  To: 'Al Viro', Carlos R. Mafra
  Cc: LKML, Hauke Mehrtens, John W. Linville, netdev

From: Al Viro
> > I think the problem with wmnet is not that it was expecting the fields
> > to be aligned because it never had problems before (when definitely more
> > than 10 megabytes were received, wmnet is crappy but not _that_ crappy).
> >
> > I think the problem really was here,
> >
> > 	totalbytes_in = strtoul(&buffer[7], NULL, 10);
> >
> > After the patch the device name is 8 characters long and &buffer[7]
> > overlaps with the name instead of reading the bytes. Before the
> > patch is was fine because the call to strtoul() seems correct in the
> > sense that it would read everything until the NULL. So more than 10
> > megabytes was still ok.
> >
> > So I guess I was wrong when suggesting that the problem was the
> > alignment.
> 
> Several lines below there's this:
>                         totalpackets_out = strtoul(&buffer[74], NULL, 10);
>                         if (totalpackets_out != lastpackets_out) {
>                                 totalbytes_out = strtoul(&buffer[66], NULL, 10);
>                                 diffpackets_out += totalpackets_out - lastpackets_out;
>                                 diffbytes_out += totalbytes_out - lastbytes_out;
>                                 lastpackets_out = totalpackets_out;
>                                 lastbytes_out = totalbytes_out;
>                                 tx = True;
>                         }
> 
> So I'm afraid it *is* that crappy.  This function really should use scanf();
> note that updateStats_ipchains() in the same file does just that (well,
> fgets()+sscanf() for fsck knows what reason).  And I'd be careful with all
> those %d, actually - it's not _that_ hard to get more than 4Gb sent.
> scanf formats really ought to match the kernel-side (seq_)printf one...

IMHO it is safer to use strchr(p, ' '); to skip the interface name and then
use repeated calls to strtoull() to read the numbers.
Correctly/safely using scanf() is really too hard.

	David


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-01-12 11:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-10 23:25 /proc/net/dev regression Carlos R. Mafra
2015-01-11  0:27 ` Al Viro
2015-01-11  0:58   ` Carlos R. Mafra
2015-01-11  1:00   ` Al Viro
2015-01-11  1:33     ` Carlos R. Mafra
2015-01-11  1:39       ` Al Viro
2015-01-11 13:40         ` Carlos R. Mafra
2015-01-12 11:47         ` David Laight

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.