linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ATM] second pass at fixing atm spinlock
@ 2003-03-27 22:28 chas williams
  2003-03-28 10:07 ` Duncan Sands
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: chas williams @ 2003-03-27 22:28 UTC (permalink / raw)
  To: linux-atm-general, linux-kernel

ok, here is a further attempt at fixing the smp problems in the linux-atm
stack.  things have been moved around in net/atm:

. atm sk's are reference counted.  this makes the vcc_release (renamed
  from atm_release_vcc for consistency) and svc_release easier to write
  since you dont need to explicity free the sk
. vcc's are stored as a list of sockets, like other protocols
. atm drivers should use vcc_hold/vcc_put to modify reference counts.  the
  driver should read_lock(vcc_sklist_lock) before doing a vcc_hold() if
  they are in their 'bottom half'.  i converted eni, idt77252 (untested --
  no hardware), fore200e (untest -- lazy) and nicstar.  i also included the
  he driver.  the eni driver is special, since the bottom half is a tasklet
  you can simply stop the bottom half and be sure the bottom half
  isnt holding a reference.  the he just grabs the vcc_sklist_lock and 
  processes all the vcc's needing serviced.  a vcc cant disappear while 
  the lock is held (since the vcc_release needs to do a write_lock to unhash
  the socket/vcc before dropping the refcnt to 0).
. the nodev list has been removed and bind_vcc/unlink have been changed to
  just insert/remove vcc list operations.  this makes things a bit clearer.
. driver's check_ci() routine have been removed in favor of the one in
  atm_misc.c.  very few were actually any faster, and this functionality is
  rarely used.
. there is a new function, vcc_lookup(), used mostly by the fore200e and he
  driver.  vcc's returned by this function will need a vcc_put() or the
  ref count will be wrong.  in the future, most drivers should probably 
  use this function to find the vcc for their newly arrived data.  there
  possibly could be a routine atmif_rx(atmdev, skb, vpi, vci) which 
  just does the right thing.  ideally, atm would use netif_rx() but one
  would need to dummy up a ethernet header on each of the pdu's.  comments
  on that idea?
. all the stuff from the previous version: atmdev ref counting, atm and 
  clip can now build as modules.

ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch

[i will try to get a 2.5.65 version available this weekend]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ATM] second pass at fixing atm spinlock
  2003-03-27 22:28 [ATM] second pass at fixing atm spinlock chas williams
@ 2003-03-28 10:07 ` Duncan Sands
  2003-03-28 13:08   ` chas williams
  2003-04-01 16:28 ` chas williams
  2003-04-03 18:46 ` [ATM] a few more changes for people to test chas williams
  2 siblings, 1 reply; 10+ messages in thread
From: Duncan Sands @ 2003-03-28 10:07 UTC (permalink / raw)
  To: chas williams, linux-atm-general, linux-kernel

Hi Chas, good work.  A few comments below

On Thursday 27 March 2003 23:28, chas williams wrote:
> ok, here is a further attempt at fixing the smp problems in the linux-atm
> stack.  things have been moved around in net/atm:
>
> . atm sk's are reference counted.  this makes the vcc_release (renamed
>   from atm_release_vcc for consistency) and svc_release easier to write
>   since you dont need to explicity free the sk
> . vcc's are stored as a list of sockets, like other protocols
> . atm drivers should use vcc_hold/vcc_put to modify reference counts.  the
>   driver should read_lock(vcc_sklist_lock) before doing a vcc_hold() if
>   they are in their 'bottom half'.  i converted eni, idt77252 (untested --
>   no hardware), fore200e (untest -- lazy) and nicstar.  i also included the
>   he driver.  the eni driver is special, since the bottom half is a tasklet
>   you can simply stop the bottom half and be sure the bottom half
>   isnt holding a reference.  the he just grabs the vcc_sklist_lock and
>   processes all the vcc's needing serviced.  a vcc cant disappear while
>   the lock is held (since the vcc_release needs to do a write_lock to
> unhash the socket/vcc before dropping the refcnt to 0).

I'm kind of confused about this.  It seems to me that you should only
need to read_lock(vcc_sklist_lock) if you are going to traverse (or
otherwise examine the structure of) the list.  There should be no need
to do it if you are just taking a reference.  After all, you should only take a
reference to a vcc (ok, to the socket) if the vcc you are going to use/copy
already has a reference taken [Example: if the ATM layer calls you and
passes you a vcc, it must have taken a reference to the vcc before calling
you, otherwise the vcc might vanish under you; then you take a reference
to this vcc -> ref count = 2].  So no-one is going to delete the vcc under
you, so why take a lock?  As I said, I'm kind of confused here.

> . the nodev list has been removed and bind_vcc/unlink have been changed to
>   just insert/remove vcc list operations.  this makes things a bit clearer.
> . driver's check_ci() routine have been removed in favor of the one in
>   atm_misc.c.  very few were actually any faster, and this functionality is
>   rarely used.

Why does this exist at all?  I mean, if someone has already opened a vcc for
a given vpi/vci pair, the ATM layer could detect this itself and return an error,
without ever calling the driver's open method.  Is it sometimes useful to open
more than one vcc for the same vpi/vci pair?  Also, calling check_ci then
claiming the vpi/vci pair is racy.  If you want to have something like this,
shouldn't there be a routine "claim_ci", which either returns an error if that
vpi/vci pair was already in use, or marks it as in use if not (done atomically
of course).

> . there is a new function, vcc_lookup(), used mostly by the fore200e and he
>   driver.  vcc's returned by this function will need a vcc_put() or the
>   ref count will be wrong.  in the future, most drivers should probably
>   use this function to find the vcc for their newly arrived data.  there
>   possibly could be a routine atmif_rx(atmdev, skb, vpi, vci) which
>   just does the right thing.  ideally, atm would use netif_rx() but one
>   would need to dummy up a ethernet header on each of the pdu's.  comments
>   on that idea?
> . all the stuff from the previous version: atmdev ref counting, atm and
>   clip can now build as modules.
>
> ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch
>
> [i will try to get a 2.5.65 version available this weekend]

Good stuff!

Ciao,

Duncan.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ATM] second pass at fixing atm spinlock
  2003-03-28 10:07 ` Duncan Sands
@ 2003-03-28 13:08   ` chas williams
  0 siblings, 0 replies; 10+ messages in thread
From: chas williams @ 2003-03-28 13:08 UTC (permalink / raw)
  To: Duncan Sands; +Cc: linux-atm-general, linux-kernel

In message <200303281107.07586.baldrick@wanadoo.fr>,Duncan Sands writes:
>I'm kind of confused about this.  It seems to me that you should only
>need to read_lock(vcc_sklist_lock) if you are going to traverse (or
>otherwise examine the structure of) the list.  There should be no need

to prevent trouble with walking a list that might be changing while
you are walking it and to synchronize the release and "bottom half"
operation of the atm drivers.  i came up with this as the worst
possible case (of course its one of those lookup via big index table
drivers):

driver->open()		BH()		vcc_release()		sk.refcnt

ENTER
...
vcc_hold(vcc)								2
rx_vcc->vcc = vcc
...
EXIT
		ENTER
		read_lock(sklist)
		vcc = rx_vcc->vcc
		...			
					ENTER
					driver->close()
						...
						rx_vcc->vcc = NULL
						barrier()
						vcc_put(vcc)		1
					...
					write_lock(sklist) [MUST WAIT]
						
		vcc_hold(vcc)						2
		read_unlock(sklist)
		...
					...
					vcc_remove_socket()
					write_unlock(sklist)
					sock_put(vcc);			1
					EXIT

		...
		vcc->push(skb)
		vcc_put(vcc)						0
		EXIT

if you didnt read_lock(sklist) then vcc_release() could (it is somewhat
unlikely) unhash the socket and vcc_put() (its really just sock_put) before
the BH has a chance to vcc_hold(vcc).  the { vcc = rx_vcc->vcc; vcc_hold(vcc); }
isnt an atomic operation and that is were you run into trouble.  you
could remove the hold/put for for the table references and this still
works but putting the vcc in the table is a reference.  you should count it.
however, the nicstar/idt77252 probably still have a race since 
the vc && vc->vcc in the BH isnt protected.

>Why does this exist at all?  I mean, if someone has already opened a vcc for
>a given vpi/vci pair, the ATM layer could detect this itself and return an error,
>without ever calling the driver's open method.  Is it sometimes useful to open

mitch and i have had this discussion.  it seems like its probably a good
idea to move this to the upper layer and possibly remove the ATM_ANY_VPI/VCI
functionality.  it probably should be claim_ci, and insert the entry into
the list during operation.  as it stands there is still a race beween
find_ci() and vcc_insert_socket().

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ATM] second pass at fixing atm spinlock
  2003-03-27 22:28 [ATM] second pass at fixing atm spinlock chas williams
  2003-03-28 10:07 ` Duncan Sands
@ 2003-04-01 16:28 ` chas williams
  2003-04-02  0:45   ` [Linux-ATM-General] " Till Immanuel Patzschke
  2003-04-02  7:18   ` Christoph Hellwig
  2003-04-03 18:46 ` [ATM] a few more changes for people to test chas williams
  2 siblings, 2 replies; 10+ messages in thread
From: chas williams @ 2003-04-01 16:28 UTC (permalink / raw)
  To: linux-atm-general, linux-kernel

>ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch

i have made an equivalent version of these patches for 2.4.20 in hopes
of getting more feedback.

ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_4_20_atm_dev_lock.patch

(only the nicstar, fore200e, eni and he (included) driver support the
new smp 'safe' locking)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Linux-ATM-General] Re: [ATM] second pass at fixing atm spinlock
  2003-04-01 16:28 ` chas williams
@ 2003-04-02  0:45   ` Till Immanuel Patzschke
  2003-04-02  4:00     ` chas williams
  2003-04-02  7:18   ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Till Immanuel Patzschke @ 2003-04-02  0:45 UTC (permalink / raw)
  To: chas williams; +Cc: linux-atm-general, linux-kernel

Hi Chas,

thanks for having taken the job of cleaning this up!!!
I've merged your 2.4 patch w/ my changes and check w/ a couple of thousand PPPoA
sessions created and destroyed over night (which always triggered the
locking/unlocking vcc problems on my SMP box).
I'll let you know how it goes by tomorrow.

One comment on the patch - the change in the net/atm/Makefile (i.e. changing
O_TARGET into atmdev.o) didn't work in my environment (main Makefile still links
atm.o), but I haven't had time to check it out any further - I am using the old
Makefile for now.

Thanks again - and yes, I am using the he155 card :-)

Immanuel
chas williams wrote:

> >ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch
>
> i have made an equivalent version of these patches for 2.4.20 in hopes
> of getting more feedback.
>
> ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_4_20_atm_dev_lock.patch
>
> (only the nicstar, fore200e, eni and he (included) driver support the
> new smp 'safe' locking)
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Linux-ATM-General] Re: [ATM] second pass at fixing atm spinlock
  2003-04-02  0:45   ` [Linux-ATM-General] " Till Immanuel Patzschke
@ 2003-04-02  4:00     ` chas williams
  0 siblings, 0 replies; 10+ messages in thread
From: chas williams @ 2003-04-02  4:00 UTC (permalink / raw)
  To: Till Immanuel Patzschke; +Cc: linux-atm-general, linux-kernel

In message <3E8A3291.F8397F43@inw.de>,Till Immanuel Patzschke writes:
>I've merged your 2.4 patch w/ my changes and check w/ a couple of thousand PPPoA
>sessions created and destroyed over night (which always triggered the
>locking/unlocking vcc problems on my SMP box).

you might still see problem.  i didnt fix the race when opening to
prevent vpi/vci collisions.  however, i should have a patch tomorrow
to address this.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ATM] second pass at fixing atm spinlock
  2003-04-01 16:28 ` chas williams
  2003-04-02  0:45   ` [Linux-ATM-General] " Till Immanuel Patzschke
@ 2003-04-02  7:18   ` Christoph Hellwig
  2003-04-02 12:11     ` chas williams
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2003-04-02  7:18 UTC (permalink / raw)
  To: chas williams; +Cc: linux-atm-general, linux-kernel

On Tue, Apr 01, 2003 at 11:28:33AM -0500, chas williams wrote:
> >ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch
> 
> i have made an equivalent version of these patches for 2.4.20 in hopes
> of getting more feedback.
> 
> ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_4_20_atm_dev_lock.patch
> 
> (only the nicstar, fore200e, eni and he (included) driver support the
> new smp 'safe' locking)

You always include the he driver in your patches, what about cleaning it up
and submitting it first?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [ATM] second pass at fixing atm spinlock
  2003-04-02  7:18   ` Christoph Hellwig
@ 2003-04-02 12:11     ` chas williams
  0 siblings, 0 replies; 10+ messages in thread
From: chas williams @ 2003-04-02 12:11 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-atm-general, linux-kernel

In message <20030402081847.A22335@infradead.org>,Christoph Hellwig writes:
>You always include the he driver in your patches, what about cleaning it up
>and submitting it first?

it still has one serious (atleast in my opnion) bug.  the very first pdu
sent on any particular vpi/vci pair has a corrupted crc.  its during 
initialization somewhere that i dont get the crc 'counters' cleared.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ATM] a few more changes for people to test
  2003-03-27 22:28 [ATM] second pass at fixing atm spinlock chas williams
  2003-03-28 10:07 ` Duncan Sands
  2003-04-01 16:28 ` chas williams
@ 2003-04-03 18:46 ` chas williams
  2003-04-15 18:58   ` [ATM] more changes available for testing chas williams
  2 siblings, 1 reply; 10+ messages in thread
From: chas williams @ 2003-04-03 18:46 UTC (permalink / raw)
  To: linux-atm-general, linux-kernel

besides the previous changes, this release:

. should no longer race when opening vcc's -- atm_find_ci is obselete
  the upper layer always allocates the vpi/vci with the proper locking
  held
. fixes a race for device numbering in atm_dev_register()
. make the list of vcc's global.  this simplifies the code quite a bit,
  and for a vast majority (single atm card) its essentially the same.
. converts vcc->itf to vcc->sk->bound_dev_if
. lets proc/fore200e/eni walk the vcc list safely now

i think locking in the bottom halves of the nicstar/idt77252 is now correct.
since i dont have any idt77252 cards, i only tested the nicstar but the 
code is essentially the same.  the lanai and iphase drivers need similar
work done.  anyone willing to test some changes?  (again, i dont have these
cards).

ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_66_diffs
ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_4_20_diffs

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [ATM] more changes available for testing...
  2003-04-03 18:46 ` [ATM] a few more changes for people to test chas williams
@ 2003-04-15 18:58   ` chas williams
  0 siblings, 0 replies; 10+ messages in thread
From: chas williams @ 2003-04-15 18:58 UTC (permalink / raw)
  To: linux-atm-general, linux-kernel

ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_66_diffs
ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_4_20_diffs

new changes with this release --

. the vcc socket list is now global instead of per interface
. more drivers have been tested with the smp locking (iphase, fore200e)
  other have the changes, but are untested due to no hardware (lanai, idt77252)
. alloc_tx, free_rx_skb are no longer part of struct atmdev
. iovcnt has been removed -- linux-atm should use skb.nr_frags in the
  future
. ATM_PDUOVHD has been removed (it was 0 anyway)
. some (very unlikely) races in net/atm/svc.c were addressed
. listenq was dropped in favor just using the receive_queue of the 
  listening 'socket'
. 2.4 makefile issues corrected (or so i think)
. SO_ATMQOS should be more 'stack' friendly
. skb->cb is memzero'ed before calling netif_rx()

previous changes included --

. should no longer race when opening vcc's -- atm_find_ci is obselete
  the upper layer always allocates the vpi/vci with the proper locking
  held
. fixes a race for device numbering in atm_dev_register()
. make the list of vcc's global.  this simplifies the code quite a bit,
  and for a vast majority (single atm card) its essentially the same.
. converts vcc->itf to vcc->sk->bound_dev_if
. lets proc/fore200e/eni walk the vcc list safely now

>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2003-04-15 18:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-27 22:28 [ATM] second pass at fixing atm spinlock chas williams
2003-03-28 10:07 ` Duncan Sands
2003-03-28 13:08   ` chas williams
2003-04-01 16:28 ` chas williams
2003-04-02  0:45   ` [Linux-ATM-General] " Till Immanuel Patzschke
2003-04-02  4:00     ` chas williams
2003-04-02  7:18   ` Christoph Hellwig
2003-04-02 12:11     ` chas williams
2003-04-03 18:46 ` [ATM] a few more changes for people to test chas williams
2003-04-15 18:58   ` [ATM] more changes available for testing chas williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).