linux-kernel.vger.kernel.org archive mirror
* Annoying /proc/net/dev rollovers.
@ 2003-02-16 22:16 Mark J Roberts
  2003-02-17  1:41 ` Chris Wedgwood
  0 siblings, 1 reply; 8+ messages in thread
From: Mark J Roberts @ 2003-02-16 22:16 UTC (permalink / raw)
  To: linux-kernel

The rolling-over of /proc/net/dev fields annoys me.

I read a couple threads about the issue and saw a lot of whimpering
about how locking would be such a pain to implement in lieu of
32-bit atomicity.

Alan Cox pointed out in one of them that accurate info could be
collected through "the firewalling facilities", which I take to mean
the ipt_counters structure. The caveat is that it only provides
packet and byte counts.

One alternative to throwing locks around everything accessing those
fields is to update a 64-bit counter asynchronously. Has this been
considered? It would entail atomically executing

	total_rx_bytes += rx_bytes;
	rx_bytes = 0;

and merely ensuring that rx_bytes does not roll over between calls.
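
A minimal sketch of the two statements above, assuming a hypothetical
`struct rx_stats` (not the kernel's actual statistics layout). It shows only
the fold step; how the two statements are made atomic with respect to the
receive path is left to the caller.

```c
#include <stdint.h>

/* Hypothetical layout for illustration only. */
struct rx_stats {
	uint32_t rx_bytes;       /* fast-path counter, bumped per packet */
	uint64_t total_rx_bytes; /* slow-path 64-bit accumulator */
};

/*
 * Fold the 32-bit counter into the 64-bit total and reset it.
 * The caller must guarantee that no packet is accounted between the
 * two statements; bytes added in that window would otherwise be lost.
 */
static void fold_rx_bytes(struct rx_stats *s)
{
	s->total_rx_bytes += s->rx_bytes;
	s->rx_bytes = 0;
}
```

The fold has to run often enough that rx_bytes cannot wrap between two calls.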


* Re: Annoying /proc/net/dev rollovers.
  2003-02-16 22:16 Annoying /proc/net/dev rollovers Mark J Roberts
@ 2003-02-17  1:41 ` Chris Wedgwood
  2003-02-17  2:46   ` Mark J Roberts
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wedgwood @ 2003-02-17  1:41 UTC (permalink / raw)
  To: Mark J Roberts; +Cc: linux-kernel

On Sun, Feb 16, 2003 at 04:16:16PM -0600, Mark J Roberts wrote:

> The rolling-over of /proc/net/dev fields annoys me.

Why?

How often does it happen?

> total_rx_bytes += rx_bytes;

if lval is 64-bit, then this cannot be done reliably on all
architectures



  --cw


* Re: Annoying /proc/net/dev rollovers.
  2003-02-17  1:41 ` Chris Wedgwood
@ 2003-02-17  2:46   ` Mark J Roberts
  2003-02-17  3:24     ` Jeff Garzik
  2003-02-17  4:21     ` Chris Wedgwood
  0 siblings, 2 replies; 8+ messages in thread
From: Mark J Roberts @ 2003-02-17  2:46 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: linux-kernel

Chris Wedgwood:
> How often does it happen?

When the windows box behind my NAT is using all of my 640kbit/sec
downstream to download movies, it takes a little over 14 hours to
download four gigabytes and roll over the byte counter. This means
ifconfig is mostly useless for getting an idea of how much I've
downloaded, which is something very useful to me.
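
The arithmetic behind that figure can be sketched as follows, assuming
1 kbit = 1000 bits and a byte counter that wraps at 2^32. The function name is
made up for illustration.

```c
/* Seconds until a 32-bit byte counter wraps at a sustained line rate. */
static double seconds_to_wrap32(double bits_per_sec)
{
	double bytes_per_sec = bits_per_sec / 8.0;

	return 4294967296.0 / bytes_per_sec; /* 2^32 bytes */
}
```

With 640000 bits/sec this gives about 53687 seconds, or roughly 14.9 hours.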

> > total_rx_bytes += rx_bytes;
> 
> if lval is 64-bit, then this cannot be done reliably on all
> architectures

I'm not sure why. I realize that x86 can't do atomic 64-bit
operations, but what I propose is to leave the 32-bit rx_bytes code
the way it is, and just have some heuristic for updating the 64-bit
value every so often, which can be done under a lock, so there would
be no opportunity for races to corrupt the counter. (This is also an
optimization since there needn't be any locks in the actual packet
handling code.)

But I admit I'm no expert programmer, and I might be suggesting
nonsense. In any case, the bug is real, the ifconfig output is
misleading, and I think it should be fixed one way or another.


* Re: Annoying /proc/net/dev rollovers.
  2003-02-17  2:46   ` Mark J Roberts
@ 2003-02-17  3:24     ` Jeff Garzik
  2003-02-17  4:21     ` Chris Wedgwood
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Garzik @ 2003-02-17  3:24 UTC (permalink / raw)
  To: Mark J Roberts; +Cc: Chris Wedgwood, linux-kernel

Mark J Roberts wrote:
> Chris Wedgwood:
>>>total_rx_bytes += rx_bytes;
>>
>>if lval is 64-bit, then this cannot be done reliably on all
>>architectures
> 
> 
> I'm not sure why. I realize that x86 can't do atomic 64-bit
> operations, but what I propose is to leave the 32-bit rx_bytes code
> the way it is, and just have some heuristic for updating the 64-bit
> value every so often, which can be done under a lock, so there would
> be no opportunity for races to corrupt the counter. (This is also an
> optimization since there needn't be any locks in the actual packet
> handling code.)


I was one of the ones who was interested in making the statistics 
64-bit, and adding locking to do it right.  The solution finally 
appeared, many months ago:

The counters don't need to be 64-bit, because it is trivially possible 
for userspace to track the statistics, and to simply use the difference 
between two samples as the increment used in calculating whatever 
numbers you wish -- 64-bit SNMP MIB statistics were what I was 
interested in.  Wrapping is trivially handled by standard unsigned int 
arithmetic, among other methods.
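
A sketch of the unsigned-arithmetic approach, assuming the kernel counter is
32 bits wide and that it wraps at most once between two samples. The struct
and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical userspace tracker for one 32-bit kernel counter. */
struct counter64 {
	uint32_t last;  /* previous raw sample */
	uint64_t total; /* accumulated 64-bit value since init */
};

static void counter64_init(struct counter64 *c, uint32_t first_sample)
{
	c->last = first_sample;
	c->total = 0;
}

/* Feed a new raw sample; returns the running 64-bit total. */
static uint64_t counter64_update(struct counter64 *c, uint32_t sample)
{
	/* Subtraction is modulo 2^32, so a single wrap needs no special case. */
	uint32_t delta = sample - c->last;

	c->last = sample;
	c->total += delta;
	return c->total;
}
```

For example, a previous sample of 4294967000 followed by a sample of 100
yields a delta of 396, because the counter passed through 2^32 in between.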

If you really want the raw data, then use ethtool's NIC-specific stats 
facility, to retrieve raw statistics directly from the NIC.  [this of 
course requires driver modifications, but they are easy on modern NICs]

	Jeff




* Re: Annoying /proc/net/dev rollovers.
  2003-02-17  2:46   ` Mark J Roberts
  2003-02-17  3:24     ` Jeff Garzik
@ 2003-02-17  4:21     ` Chris Wedgwood
  2003-02-17 10:35       ` Matti Aarnio
  1 sibling, 1 reply; 8+ messages in thread
From: Chris Wedgwood @ 2003-02-17  4:21 UTC (permalink / raw)
  To: Mark J Roberts; +Cc: linux-kernel

On Sun, Feb 16, 2003 at 08:46:05PM -0600, Mark J Roberts wrote:

> When the windows box behind my NAT is using all of my 640kbit/sec
> downstream to download movies, it takes a little over 14 hours to
> download four gigabytes and roll over the byte counter.

Therefore userspace needs to check the counters more often... say every
30s or so and detect rollover.  Most of this could be simply
encapsulated in a library and made transparent to the upper layers.
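
One piece of such a library might look like the sketch below: it pulls the
receive byte count out of a single /proc/net/dev interface line, where the
first number after the colon is the received-bytes field. The function name is
hypothetical, and the polling loop and rollover handling are left out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one interface line of /proc/net/dev.
 * Returns 0 and stores the receive byte count if the line belongs to
 * ifname, or -1 if it does not match or cannot be parsed.
 */
static int parse_rx_bytes(const char *line, const char *ifname,
			  unsigned long long *rx_bytes)
{
	const char *colon = strchr(line, ':');
	const char *name = line;
	char *end;
	size_t len;

	if (!colon)
		return -1;
	while (*name == ' ')	/* interface names are right-aligned */
		name++;
	len = (size_t)(colon - name);
	if (strlen(ifname) != len || strncmp(name, ifname, len) != 0)
		return -1;

	/* strtoull skips any blanks between the colon and the number. */
	*rx_bytes = strtoull(colon + 1, &end, 10);
	return end == colon + 1 ? -1 : 0;
}
```

A poller would call this every 30 seconds or so on each line read from the
file and hand the value to whatever code accumulates the differences.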



  --cw



* Re: Annoying /proc/net/dev rollovers.
  2003-02-17  4:21     ` Chris Wedgwood
@ 2003-02-17 10:35       ` Matti Aarnio
  2003-02-18  4:58         ` David Lang
  0 siblings, 1 reply; 8+ messages in thread
From: Matti Aarnio @ 2003-02-17 10:35 UTC (permalink / raw)
  To: Mark J Roberts, linux-kernel

On Sun, Feb 16, 2003 at 08:21:56PM -0800, Chris Wedgwood wrote:
> On Sun, Feb 16, 2003 at 08:46:05PM -0600, Mark J Roberts wrote:
> > When the windows box behind my NAT is using all of my 640kbit/sec
> > downstream to download movies, it takes a little over 14 hours to
> > download four gigabytes and roll over the byte counter.
> 
> Therefore userspace needs to check the counters more often... say every
> 30s or so and detect rollover.  Most of this could be simply
> encapsulated in a library and made transparent to the upper layers.

  Some of my colleagues complained once that at full tilt
  the fiber-channel fabric overflowed its SNMP bitcounters
  every 2 seconds.

  "we need to do polling more rapidly than the poller can do"

  The SNMP pollers do gracefully handle 32-bit unsigned overflow,
  they just need to get snapshots in increments a bit under 2G...
  (Hmm.. perhaps I remember that wrong, a bit under 4G should be ok.)

>   --cw

/Matti Aarnio


* Re: Annoying /proc/net/dev rollovers.
  2003-02-17 10:35       ` Matti Aarnio
@ 2003-02-18  4:58         ` David Lang
  2003-02-18 13:28           ` Matti Aarnio
  0 siblings, 1 reply; 8+ messages in thread
From: David Lang @ 2003-02-18  4:58 UTC (permalink / raw)
  To: Matti Aarnio; +Cc: Mark J Roberts, linux-kernel

Don't forget that 10G ethernet is starting to leak out of the labs into
the real world. I don't know of any Linux support yet, but it will come
and then you will be able to overflow 32-bit bitcounters multiple times per
second.

David Lang




* Re: Annoying /proc/net/dev rollovers.
  2003-02-18  4:58         ` David Lang
@ 2003-02-18 13:28           ` Matti Aarnio
  0 siblings, 0 replies; 8+ messages in thread
From: Matti Aarnio @ 2003-02-18 13:28 UTC (permalink / raw)
  To: David Lang; +Cc: Matti Aarnio, Mark J Roberts, linux-kernel

On Mon, Feb 17, 2003 at 08:58:40PM -0800, David Lang wrote:
> don't forget that 10G ethernet is starting to leak out of the labs into
> the real world. I don't know of any linux support yet, but it will come
> and then you will be able to overflow 32bit bitcounters multiple times per
> second.

A machine capable of supporting the full data rate of 10G ether needs
around 1.3 GB/sec of I/O bandwidth each way for the card; 64-bit
PCI-X 533 offers at most 4.3 GB/sec.  In reality one can't
quite get the theoretical maximum out of the hardware.
One full-speed full-duplex 10G ether is barely doable with that new
version of PCI-X.

A giga-ether interface (or two) can be done in current generation
hardware, and even some useful things can be done to fill the pipe.

I do suppose that by that time we will also be using 64-bit processors,
in which incrementing 64-bit counter variables uninterruptibly is
trivial.

I leave it as a thought exercise why a non-irq-blocking spinlock
is not a good idea for ensuring data update monotonicity.


There are algorithmic ways to handle the interruptible two-fetch
consistency problem on current 32-bit hardware.  None of those
are being used, as far as I know:

irq-context:
   add to less-significant-long
   add carry to more-significant-long

reader context:
   read less-significant-long into ax
   read more-significant-long into bx
   compare less-significant-long with ax
    if they differ, start from the beginning
   compare more-significant-long with bx
    if they differ, start from the beginning
   return ax,bx

That way the reader need not worry about being interrupted,
but the implementation is -- likely -- assembly.


No spinlocks, no irq-blocking...
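
The scheme above can also be sketched in C rather than assembly. This is an
illustration only, with made-up names: it assumes a single CPU where the
writer runs in interrupt context and so always completes before the
interrupted reader resumes. It uses volatile accesses and no memory barriers,
so it does not cover SMP.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter kept as two 32-bit halves. */
struct split_counter {
	volatile uint32_t lo; /* less-significant long */
	volatile uint32_t hi; /* more-significant long */
};

/* Writer (irq context): add to the low half, carry into the high half. */
static void split_add(struct split_counter *c, uint32_t n)
{
	uint32_t old_lo = c->lo;
	uint32_t new_lo = old_lo + n; /* wraps modulo 2^32 */

	c->lo = new_lo;
	if (new_lo < old_lo)          /* unsigned wrap means a carry */
		c->hi = c->hi + 1;
}

/* Reader: fetch both halves, retry if either changed in the meantime. */
static uint64_t split_read(const struct split_counter *c)
{
	uint32_t lo, hi;

	do {
		lo = c->lo;
		hi = c->hi;
	} while (lo != c->lo || hi != c->hi);

	return ((uint64_t)hi << 32) | lo;
}
```

The retry loop only spins again when an update landed between the two
fetches, so under the stated assumption the reader never blocks the writer.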


> David Lang

  /Matti Aarnio



