On Thu, Oct 15, 2020 at 6:12 PM Apollon Oikonomopoulos wrote:
>
> Yuchung Cheng writes:
>
> > On Thu, Oct 15, 2020 at 1:22 PM Neal Cardwell wrote:
> >>
> >> On Thu, Oct 15, 2020 at 2:31 PM Apollon Oikonomopoulos wrote:
> >> >
> >> > Hi,
> >> >
> >> > I'm trying to debug a (possible) TCP issue we have been encountering
> >> > sporadically during the past couple of years. Currently we're running
> >> > 4.9.144, but we've been observing this since at least 3.16.
> >> >
> >> > Tl;DR: I believe we are seeing a case where snd_wl1 fails to be properly
> >> > updated, leading to inability to recover from a TCP persist state and
> >> > would appreciate some help debugging this.
> >>
> >> Thanks for the detailed report and diagnosis. I think we may need a
> >> fix something like the following patch below.
>
> That was fast, thank you!
>
> >>
> >> Eric/Yuchung/Soheil, what do you think?
> > wow hard to believe how old this bug can be. The patch looks good but
> > can Apollon verify this patch fix the issue?
>
> Sure, I can give it a try and let the systems do their thing for a couple of
> days, which should be enough to see if it's fixed.

Great, thanks!

> Neal, would it be possible to re-send the patch as an attachment? The
> inlined version does not apply cleanly due to linewrapping and
> whitespace changes and, although I can re-type it, I would prefer to test
> the exact same thing that would be merged.

Sure, I have attached the "git format-patch" format of the commit. It
does seem to apply cleanly to the v4.9.144 kernel you mentioned you
are using.

Thanks for testing this!

best,
neal