* [ATM] first pass at fixing atm spinlock [not found] <OFAFD4106C.6053931B-ON85256CE0.004F1DD6@gdeb.com> @ 2003-03-17 22:24 ` chas williams 2003-03-17 23:01 ` Mitchell Blank Jr 0 siblings, 1 reply; 8+ messages in thread From: chas williams @ 2003-03-17 22:24 UTC (permalink / raw) To: linux-atm-general; +Cc: linux-kernel ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch this should apply cleanly to the stock 2-5-64 sources. its an attempt to make the atm spinlock a little less obnoxious. atm_dev_lock is now a read/write lock that now protects the list of atm devices. an individual spinlock protect the various members of atm_dev, like the vcc list and addr lists. atm_dev's are refcounted, so there is no longer any need to hold a lock across the ioctl. atm_find_dev is gone, in favor of atm_dev_lookup, atm_dev_release, and atm_dev_hold. atm_dev_lookup() and atm_dev_hold() increment the ref count. atm_dev_release() drops the ref count. this patch also makes atm and clip modular. the clip arp 'interface' in ipv4/arp.c isn't very pretty. it seems to me that there should be a better way. anyway, instead of clip_tbl, its now clip_tbl_hook which is exported for the clip module to set as it desires (the atm proc interface also uses this instead of exporting clip_tbl via atm_clip_ops) things are not set in stone yet, so i would like some advice -- the per device vcc list will probably be traversed (read-only) in the bottom half of the kernel during receive for some adapters. is it simply enough to use list_for_each_safe() when traversing the list? otherwise the per device spinlock will need to shared between the top and bottom half of the kernel. the vcc's shouldnt need to be ref counted since they are rather tightly coupled to a struct sock which is. i dont like the 'shutdown on last reference' in atm_dev_release(). you can only safely call release when you know you can sleep. is shutdown_atm_dev() really that useful? i would like a few brave souls test and comment on this code. i could possibly provide a 2.4.20 patch in a couple days. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-17 22:24 ` [ATM] first pass at fixing atm spinlock chas williams @ 2003-03-17 23:01 ` Mitchell Blank Jr 2003-03-17 23:25 ` chas williams 0 siblings, 1 reply; 8+ messages in thread From: Mitchell Blank Jr @ 2003-03-17 23:01 UTC (permalink / raw) To: chas williams; +Cc: linux-atm-general, linux-kernel chas williams wrote: > ftp://ftp.cmf.nrl.navy.mil/pub/chas/linux-atm/2_5_64_atm_dev_lock.patch Note that this patch also has lots of other stuff (he driver, ipv6 stuff) I'll try to look at the patch in depth later, but I'll make a few initial comments. First of all, thanks for taking this up - this is work I've wanted to do for years and never found the time for. > atm_dev_lock is now a > read/write lock that now protects the list of atm devices. Are you sure you're never sleeping while holding this lock while iterating device lists? I haven't checked but we do a fair number of things there (like the proc stuff) so we really need to track those down. > things are not set in stone yet, so i would like some advice -- the per > device vcc list will probably be traversed (read-only) in the bottom > half of the kernel during receive for some adapters. Yes, absolutely. > is it simply > enough to use list_for_each_safe() when traversing the list? otherwise > the per device spinlock will need to shared between the top and bottom half > of the kernel. the vcc's shouldnt need to be ref counted since they > are rather tightly coupled to a struct sock which is. Then you should implement functions to grab/release the "struct sock"'s refernce count based on the atm_vcc. Then you'd have atm_vcc_find(dev, vpi, vci) atm_vcc_hold(vcc) atm_vcc_release(vcc) Just like that atm_dev_*() functions. Otherwise there's no reason that the sock/vcc combo can't go away while a processor's bh still is using a reference to the vcc. I think this has been the result of many of the reported SMP crashes (it's probably not that hard to trigger; just close an ATM socket that's receiving a flood of traffic) > i dont like > the 'shutdown on last reference' in atm_dev_release(). you can only > safely call release when you know you can sleep. is shutdown_atm_dev() > really that useful? Yes, it's needed (especially if you want to support more complicated interfaces like ADSL cards via the ATM stack in the future) Really the dev release stuff should ONLY be called from the close() path. You really need something like atm_dev_release_last() that waits for the reference count to hit "1" (via a completion or something) and then does the release stuff. -Mitch ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-17 23:01 ` Mitchell Blank Jr @ 2003-03-17 23:25 ` chas williams 2003-03-18 1:13 ` Till Immanuel Patzschke ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: chas williams @ 2003-03-17 23:25 UTC (permalink / raw) To: Mitchell Blank Jr; +Cc: linux-atm-general, linux-kernel In message <20030317150144.F92155@sfgoth.com>,Mitchell Blank Jr writes: >Note that this patch also has lots of other stuff (he driver, ipv6 stuff) that ipv6 stuff crept in there (due to a bad merge on my part). i believe i have that right now. >> atm_dev_lock is now a >> read/write lock that now protects the list of atm devices. > >Are you sure you're never sleeping while holding this lock while iterating >device lists? I haven't checked but we do a fair number of things there >(like the proc stuff) so we really need to track those down. fairly certain. the only dangerous thing proc.c is snprintf(). i didnt want to spend a lot of time on proc.c doing it the 'right' way (using atm_dev_hold/release) since it will be change to a seq interface. >sock/vcc combo can't go away while a processor's bh still is using a reference >to the vcc. I think this has been the result of many of the reported SMP >crashes (it's probably not that hard to trigger; just close an ATM socket >that's receiving a flood of traffic) i dont know. i believe all of the adapters do a synchronous close. after the close finishes, no more traffic should arrive for a particular vcc. the bottom halfs (halves?) should not have references to vcc after a close() for it finishes. >You really need something like atm_dev_release_last() that waits for the >reference count to hit "1" (via a completion or something) and then does >the release stuff. atm_dev_deregister() does that. if a driver need to remove the atm device (i suppose its being unplugged) it could close the vcc's and call atm_dev_register() which will wait the ref count to drop to 0. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-17 23:25 ` chas williams @ 2003-03-18 1:13 ` Till Immanuel Patzschke 2003-03-18 16:11 ` chas williams 2003-03-19 10:57 ` Mitchell Blank Jr 2003-03-19 17:11 ` [Linux-ATM-General] " Duncan Sands 2 siblings, 1 reply; 8+ messages in thread From: Till Immanuel Patzschke @ 2003-03-18 1:13 UTC (permalink / raw) To: chas williams; +Cc: Mitchell Blank Jr, linux-atm-general, linux-kernel Hi Chas, good job, cleaning the ATM stuff up -- on the note below: I've been having lots of problems w/ the close and "dangeling" vccs, since in my scenarios traffic was coming in AFTER the close. I tried to defer the destruction and it got rid of the bad crashes. The ones left are more subtle - getting sporadic messages "device already in use" since close worked, but open fails, since the VCC list is still linked... (Are you sure - and sorry for asking) there is no place this structure sleeps and therefor "jumps" the sequence? Anyway, how about re-doing the VCC stuff mor in the way you did it for the HE device driver - having an array, indexed by a pvc hash -- and one could easily move that code into the drivers since it is unlikely to have a receiving PVC on one card and the sending part (of the same PVC) on a different one... There is only one ATM card - as far as I know the PCE/A200-E - which allows arbitrary PVCs. All others start at 0.0 counting up, or splitting bits like the HE... Besides, I am happy to give your changes a try on our SMP system (2.4.20) once you have a version for 2.4. ready. I am using PPPoA trying to utilise as many PVCs as possible (up to 4096). BTW: Have you addressed the difficulty that, depending on the ATM board, some packet receives are handled while still in the IRQ and others (the newer/better) ones always go through a tasklet? Another note on the PPPoA driver -- it creates/destroys the tasklet for each PVC (on open/close) potentially resulting in lots of allocated tasklet structures if you do heavy logon/logoff w/ PPPoA. I have a version having a single tasklet for doing all the work, but since you've changed some of the locks, I'd like to try it first before I'll submit. Keep continuing the you great work! Thanks Immanuel chas williams wrote: > >sock/vcc combo can't go away while a processor's bh still is using a reference > >to the vcc. I think this has been the result of many of the reported SMP > >crashes (it's probably not that hard to trigger; just close an ATM socket > >that's receiving a flood of traffic) > > i dont know. i believe all of the adapters do a synchronous close. after > the close finishes, no more traffic should arrive for a particular vcc. > the bottom halfs (halves?) should not have references to vcc after a close() > for it finishes. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-18 1:13 ` Till Immanuel Patzschke @ 2003-03-18 16:11 ` chas williams 0 siblings, 0 replies; 8+ messages in thread From: chas williams @ 2003-03-18 16:11 UTC (permalink / raw) To: Till Immanuel Patzschke Cc: Mitchell Blank Jr, linux-atm-general, linux-kernel >good job, cleaning the ATM stuff up -- on the note below: I've been having lot >problems w/ the close and "dangeling" vccs, since in my scenarios traffic was >coming in AFTER the close. I tried to defer the destruction and it got rid of what driver are you using? this seem like wrong behavior imho. the first part of close should stop the card from delivering anymore traffic on the particular vpi/vci. close should return after it is certain that no more traffic will arrive on said vpi/vci pair. >Anyway, how about re-doing the VCC stuff mor in the way you did it for the >HE device driver - having an array, indexed by a pvc hash -- and one could eas almost all the drivers walk the vcc list trying to find a vpi/vci pair. enough that there should be a real lookup that is hashed. however, some drivers (like the nicstar) keep an indexed array of vpi/vcc pointing to the approriate vcc. >BTW: Have you addressed the difficulty that, depending on the ATM board, some >packet receives are handled while still in the IRQ and others (the newer/bette >ones always go through a tasklet? that's really up the driver authors. convince the author to change to a tasklet. change it yourself. it shouldnt be too difficult in most cases. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-17 23:25 ` chas williams 2003-03-18 1:13 ` Till Immanuel Patzschke @ 2003-03-19 10:57 ` Mitchell Blank Jr 2003-03-19 15:13 ` chas williams 2003-03-19 17:11 ` [Linux-ATM-General] " Duncan Sands 2 siblings, 1 reply; 8+ messages in thread From: Mitchell Blank Jr @ 2003-03-19 10:57 UTC (permalink / raw) To: chas williams; +Cc: linux-atm-general, linux-kernel chas williams wrote: > fairly certain. the only dangerous thing proc.c is snprintf(). i didnt > want to spend a lot of time on proc.c doing it the 'right' way (using > atm_dev_hold/release) since it will be change to a seq interface. Great! > >sock/vcc combo can't go away while a processor's bh still is using a reference > >to the vcc. I think this has been the result of many of the reported SMP > >crashes (it's probably not that hard to trigger; just close an ATM socket > >that's receiving a flood of traffic) > > i dont know. i believe all of the adapters do a synchronous close. I'm really not sure it's that safe. At the very least the drivers all need to make sure that their ->close() excludes their interrupt/bh work from happening. That would probably be possible but it seems that it would be better to just build a robust refcount system at the lower layer. This should make it quite a bit easier to ensure that the drivers are safe. Also remember that some drivers have a common area that incoming PDUs get put in (especially for AAL0 traffic) so there might be stuff in there even after a close. So I think you're right that it's possible to get by without vcc refcounting I think the best method would be to implement an atm_vcc_{find,hold,release} that uses the sock's refcount so we can cleanly and 100% safely take a reference to a vcc from a bh. > >You really need something like atm_dev_release_last() that waits for the > >reference count to hit "1" (via a completion or something) and then does > >the release stuff. > > atm_dev_deregister() does that. if a driver need to remove the atm > device (i suppose its being unplugged) it could close the vcc's and > call atm_dev_register() which will wait the ref count to drop to 0. Great - now we just have to do the same thing for vcc's :-) -Mitch ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [ATM] first pass at fixing atm spinlock 2003-03-19 10:57 ` Mitchell Blank Jr @ 2003-03-19 15:13 ` chas williams 0 siblings, 0 replies; 8+ messages in thread From: chas williams @ 2003-03-19 15:13 UTC (permalink / raw) To: Mitchell Blank Jr; +Cc: linux-atm-general, linux-kernel In message <20030319025734.C35024@sfgoth.com>,Mitchell Blank Jr writes: >> i dont know. i believe all of the adapters do a synchronous close. >I'm really not sure it's that safe. At the very least the drivers all >need to make sure that their ->close() excludes their interrupt/bh work i suppose i could be he-centric. when the rx and tx close complete you know you wont be getting anymore traffic in the queues regarding that vpi/vcc. i suppose other cards might now have this feature. some drivers just keep a pointer to vcc in a table. so it they dont use atm_vcc_lookup() each time they need a receieve then i guess the vcc's will need some sort of ref counting. i am generally against this scheme, but it could be made to work. >Great - now we just have to do the same thing for vcc's :-) actually i think the vcc related code is 'broken'. since the vcc is really attached to a struct sock, the vcc code should be more sock-centric. sock provides ref counting, locks, linking, hash support. at the time the atm code was initially written, struct sock might not have been generic enough. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Linux-ATM-General] Re: [ATM] first pass at fixing atm spinlock 2003-03-17 23:25 ` chas williams 2003-03-18 1:13 ` Till Immanuel Patzschke 2003-03-19 10:57 ` Mitchell Blank Jr @ 2003-03-19 17:11 ` Duncan Sands 2 siblings, 0 replies; 8+ messages in thread From: Duncan Sands @ 2003-03-19 17:11 UTC (permalink / raw) To: chas williams, Mitchell Blank Jr; +Cc: linux-atm-general, linux-kernel >> atm_dev_lock is now a >> read/write lock that now protects the list of atm devices. > >Are you sure you're never sleeping while holding this lock while iterating >device lists? I haven't checked but we do a fair number of things there >(like the proc stuff) so we really need to track those down. [I hope the following comments are not misplaced: I picked up this thread half way through, and I didn't read the code...] Presumably the only reason you want a spin lock is because drivers may be walking the list in interrupt context when the time comes to add/remove a member. My first comment is: what are drivers doing walking this list anyway? Shouldn't it be private to the ATM layer? Drivers can maintain their own list if they need one. My second comment is that the ATM layer should never need to take the spinlock itself when walking the list (I'm guessing it always walks the list in process context). You could have the following setup: list protected by a semaphore and a spinlock. Deleting a member does the following: acquire the semaphore ... take the spinlock delete the member release the spinlock ... release the semaphore Adding a member does the following: acquire the semaphore ... take the spinlock add the member release the spinlock ... release the semaphore Walking the list in process context does: acquire the semaphore walk the list doing stuff release the semaphore Walking the list in interrupt context does: acquire the spinlock walk the list doing stuff that can't sleep release the spinlock Of course the semaphore could be used as a "big device lock", not just a list lock. Duncan. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2003-03-19 17:00 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <OFAFD4106C.6053931B-ON85256CE0.004F1DD6@gdeb.com> 2003-03-17 22:24 ` [ATM] first pass at fixing atm spinlock chas williams 2003-03-17 23:01 ` Mitchell Blank Jr 2003-03-17 23:25 ` chas williams 2003-03-18 1:13 ` Till Immanuel Patzschke 2003-03-18 16:11 ` chas williams 2003-03-19 10:57 ` Mitchell Blank Jr 2003-03-19 15:13 ` chas williams 2003-03-19 17:11 ` [Linux-ATM-General] " Duncan Sands
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).