All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: netback commit history
@ 2011-09-20 12:10 Jan Beulich
  2011-09-20 12:38 ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2011-09-20 12:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: DongxiaoXu, xen-devel

>>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
>> One thing I wonder about in this context is whether the
>> netif_stop_queue() call from xenvif_close() shouldn't happen before
>> xenvif_down() (not the least for reasons of symmetry with
>> xenvif_open()).
> 
> I seem to recall looking at that too, it was the same in the old kernels
> too and I didn't know why so I avoided touching it (I was doing too much
> other cleanup at the time to risk it).

After looking further I don't think that would help (though I still think
it would be more correct), as netif_stop_queue() basically evaluates
to a set_bit() without any other locks taken. So it's completely
asynchronous wrt dev_hard_start_xmit() (and its callers, which are
the ones looking at the bit with HARD_TX_LOCK() carried out).

Jan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 12:10 netback commit history Jan Beulich
@ 2011-09-20 12:38 ` Jan Beulich
  2011-09-22  9:07   ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2011-09-20 12:38 UTC (permalink / raw)
  To: Ian Campbell; +Cc: DongxiaoXu, xen-devel

>>> On 20.09.11 at 14:10, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
>> On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
>>> One thing I wonder about in this context is whether the
>>> netif_stop_queue() call from xenvif_close() shouldn't happen before
>>> xenvif_down() (not the least for reasons of symmetry with
>>> xenvif_open()).
>> 
>> I seem to recall looking at that too, it was the same in the old kernels
>> too and I didn't know why so I avoided touching it (I was doing too much
>> other cleanup at the time to risk it).
> 
> After looking further I don't think that would help (though I still think
> it would be more correct), as netif_stop_queue() basically evaluates
> to a set_bit() without any other locks taken. So it's completely
> asynchronous wrt dev_hard_start_xmit() (and its callers, which are
> the ones looking at the bit with HARD_TX_LOCK() carried out).

Which in turn suggests that the upstream driver isn't safe either:
There's nothing preventing vif->netbk from becoming NULL between
the early check in xenvif_start_xmit() and its actual use in
xen_netbk_queue_tx_skb(). I think vif->netbk needs to be latched
into a local variable, checked against NULL, and passed instead of
vif to xen_netbk_queue_tx_skb().

Jan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 12:38 ` Jan Beulich
@ 2011-09-22  9:07   ` Ian Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2011-09-22  9:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: DongxiaoXu, xen-devel

On Tue, 2011-09-20 at 13:38 +0100, Jan Beulich wrote:
> >>> On 20.09.11 at 14:10, "Jan Beulich" <JBeulich@suse.com> wrote:
> >>>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> >> On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
> >>> One thing I wonder about in this context is whether the
> >>> netif_stop_queue() call from xenvif_close() shouldn't happen before
> >>> xenvif_down() (not the least for reasons of symmetry with
> >>> xenvif_open()).
> >> 
> >> I seem to recall looking at that too, it was the same in the old kernels
> >> too and I didn't know why so I avoided touching it (I was doing too much
> >> other cleanup at the time to risk it).
> > 
> > After looking further I don't think that would help (though I still think
> > it would be more correct), as netif_stop_queue() basically evaluates
> > to a set_bit() without any other locks taken. So it's completely
> > asynchronous wrt dev_hard_start_xmit() (and its callers, which are
> > the ones looking at the bit with HARD_TX_LOCK() carried out).
> 
> Which in turn suggests that the upstream driver isn't safe either:
> There's nothing preventing vif->netbk from becoming NULL between
> the early check in xenvif_start_xmit() and its actual use in
> xen_netbk_queue_tx_skb(). I think vif->netbk needs to be latched
> into a local variable, checked against NULL, and passed instead of
> vif to xen_netbk_queue_tx_skb().

The xmit function is called with txq->_xmit_lock held.

I think we should be calling netif_tx_disable at some point before
making vif->netbk == NULL. Need to figure out where (or if we are
already doing it indirectly)

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 12:40     ` Ian Campbell
@ 2011-09-20 13:03       ` Jan Beulich
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2011-09-20 13:03 UTC (permalink / raw)
  To: Ian Campbell; +Cc: DongxiaoXu, xen-devel

>>> On 20.09.11 at 14:40, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Tue, 2011-09-20 at 12:46 +0100, Jan Beulich wrote:
>> >>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
>> > On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
>> >> with the upstream netback introduction consisting of a single big commit
>> >> I wonder whether you could point me to where the full history of it is.
>> > 
>> > Yeah, that was a bit annoying, luckily I had the foresight to post where
>> > the history was. http://www.gossamer-threads.com/lists/xen/devel/202139 
>> > says it is:
>> >         git://xenbits.xen.org/people/ianc/linux-2.6.git 
>> > upstream/dom0/backend/netback-history 
>> 
>> Does this have a http:// representation somewhere (it doesn't show up
>> under http://xenbits.xen.org/people/ianc/)? 
> 
> http://xenbits.xen.org/gitweb/?p=people/ianc/linux-2.6.git;a=summary 
> 
>> 
>> >> I'm asking particularly in the context of us being asked to add a safety
>> >> check similar to the vif->netbk != NULL one at the beginning of
>> >> xenvif_start_xmit() (also in xenvif_interrupt(), but there we have a
>> >> similar check in place), which hadn't been in the legacy tree (obviously)
>> >> nor in the original multiple-tasklets patch that I retained a copy of.
>> > 
>> > Seems to have come from bc05ada1283eb583c9789c27429af36b034c4a74 in that
>> > history tree and was a conversion from a check for group == -1. That
>> > commit changes from storing a group index to a group pointer so I think
>> > it's roughly equivalent from a validity point of view.
>> > 
>> > The original group == -1 check appears to be in the 2.6.32.x pvops
>> > kernels at least, I expect it is also in the multiple tasklet patch
>> > which you have as well?
>> 
>> That's the point - we don't.
> 
> Wierd, it is in 020ba9067e121b720a3335521698ea9cf31f6166 in Jeremy's
> xen/2.6.32-stable branch which is the original "xen/netback: Multiple
> tasklets support." commit. Did you pick up an earlier posting?
> 
> I found the original postings
> v2: http://lists.xensource.com/archives/html/xen-devel/2010-04/msg01578.html 
> v3: http://www.gossamer-threads.com/lists/xen/devel/170582 
> v4: http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00140.html 
> 
> From the looks things the check arrived in v3 but there is no comment on
> the patch saying why, nor does any review I found of v2 give a hint. I
> had a bit of a trawl through various list archives of the other postings
> of the series but didn't spot anything.

Seems like my original is actually v1, and I may not have picked up
that later change precisely because it wasn't mentioned in the
description and I didn't look closely enough at the changes plus I
had done quite a bit of other cleanup on v1 already and hence
didn't want to start over. v3 is also where the similar check in
netif_be_int() appears, and I know for sure we had to add this
due to another bug report, not due to the change there.

In any case, thanks a lot for helping with this!

Jan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 11:46   ` Jan Beulich
@ 2011-09-20 12:40     ` Ian Campbell
  2011-09-20 13:03       ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2011-09-20 12:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: DongxiaoXu, xen-devel

On Tue, 2011-09-20 at 12:46 +0100, Jan Beulich wrote:
> >>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
> >> with the upstream netback introduction consisting of a single big commit
> >> I wonder whether you could point me to where the full history of it is.
> > 
> > Yeah, that was a bit annoying, luckily I had the foresight to post where
> > the history was. http://www.gossamer-threads.com/lists/xen/devel/202139 
> > says it is:
> >         git://xenbits.xen.org/people/ianc/linux-2.6.git 
> > upstream/dom0/backend/netback-history 
> 
> Does this have a http:// representation somewhere (it doesn't show up
> under http://xenbits.xen.org/people/ianc/)?

http://xenbits.xen.org/gitweb/?p=people/ianc/linux-2.6.git;a=summary

> 
> >> I'm asking particularly in the context of us being asked to add a safety
> >> check similar to the vif->netbk != NULL one at the beginning of
> >> xenvif_start_xmit() (also in xenvif_interrupt(), but there we have a
> >> similar check in place), which hadn't been in the legacy tree (obviously)
> >> nor in the original multiple-tasklets patch that I retained a copy of.
> > 
> > Seems to have come from bc05ada1283eb583c9789c27429af36b034c4a74 in that
> > history tree and was a conversion from a check for group == -1. That
> > commit changes from storing a group index to a group pointer so I think
> > it's roughly equivalent from a validity point of view.
> > 
> > The original group == -1 check appears to be in the 2.6.32.x pvops
> > kernels at least, I expect it is also in the multiple tasklet patch
> > which you have as well?
> 
> That's the point - we don't.

Wierd, it is in 020ba9067e121b720a3335521698ea9cf31f6166 in Jeremy's
xen/2.6.32-stable branch which is the original "xen/netback: Multiple
tasklets support." commit. Did you pick up an earlier posting?

I found the original postings
v2: http://lists.xensource.com/archives/html/xen-devel/2010-04/msg01578.html
v3: http://www.gossamer-threads.com/lists/xen/devel/170582
v4: http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00140.html

>From the looks things the check arrived in v3 but there is no comment on
the patch saying why, nor does any review I found of v2 give a hint. I
had a bit of a trawl through various list archives of the other postings
of the series but didn't spot anything.

> >> One thing I wonder about in this context is whether the
> >> netif_stop_queue() call from xenvif_close() shouldn't happen before
> >> xenvif_down() (not the least for reasons of symmetry with
> >> xenvif_open()).
> > 
> > I seem to recall looking at that too, it was the same in the old kernels
> > too and I didn't know why so I avoided touching it (I was doing too much
> > other cleanup at the time to risk it).
> 
> Understandable.
> 
> Thanks for the really quick response,
> Jan
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 11:26 ` Ian Campbell
@ 2011-09-20 11:46   ` Jan Beulich
  2011-09-20 12:40     ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2011-09-20 11:46 UTC (permalink / raw)
  To: Ian Campbell; +Cc: DongxiaoXu, xen-devel

>>> On 20.09.11 at 13:26, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
>> with the upstream netback introduction consisting of a single big commit
>> I wonder whether you could point me to where the full history of it is.
> 
> Yeah, that was a bit annoying, luckily I had the foresight to post where
> the history was. http://www.gossamer-threads.com/lists/xen/devel/202139 
> says it is:
>         git://xenbits.xen.org/people/ianc/linux-2.6.git 
> upstream/dom0/backend/netback-history 

Does this have a http:// representation somewhere (it doesn't show up
under http://xenbits.xen.org/people/ianc/)?

>> I'm asking particularly in the context of us being asked to add a safety
>> check similar to the vif->netbk != NULL one at the beginning of
>> xenvif_start_xmit() (also in xenvif_interrupt(), but there we have a
>> similar check in place), which hadn't been in the legacy tree (obviously)
>> nor in the original multiple-tasklets patch that I retained a copy of.
> 
> Seems to have come from bc05ada1283eb583c9789c27429af36b034c4a74 in that
> history tree and was a conversion from a check for group == -1. That
> commit changes from storing a group index to a group pointer so I think
> it's roughly equivalent from a validity point of view.
> 
> The original group == -1 check appears to be in the 2.6.32.x pvops
> kernels at least, I expect it is also in the multiple tasklet patch
> which you have as well?

That's the point - we don't.

>> One thing I wonder about in this context is whether the
>> netif_stop_queue() call from xenvif_close() shouldn't happen before
>> xenvif_down() (not the least for reasons of symmetry with
>> xenvif_open()).
> 
> I seem to recall looking at that too, it was the same in the old kernels
> too and I didn't know why so I avoided touching it (I was doing too much
> other cleanup at the time to risk it).

Understandable.

Thanks for the really quick response,
Jan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: netback commit history
  2011-09-20 11:04 Jan Beulich
@ 2011-09-20 11:26 ` Ian Campbell
  2011-09-20 11:46   ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2011-09-20 11:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xu, xen-devel, Dongxiao

On Tue, 2011-09-20 at 12:04 +0100, Jan Beulich wrote:
> Hi Ian,
> 
> with the upstream netback introduction consisting of a single big commit
> I wonder whether you could point me to where the full history of it is.

Yeah, that was a bit annoying, luckily I had the foresight to post where
the history was. http://www.gossamer-threads.com/lists/xen/devel/202139
says it is:
        git://xenbits.xen.org/people/ianc/linux-2.6.git upstream/dom0/backend/netback-history 

> I'm asking particularly in the context of us being asked to add a safety
> check similar to the vif->netbk != NULL one at the beginning of
> xenvif_start_xmit() (also in xenvif_interrupt(), but there we have a
> similar check in place), which hadn't been in the legacy tree (obviously)
> nor in the original multiple-tasklets patch that I retained a copy of.

Seems to have come from bc05ada1283eb583c9789c27429af36b034c4a74 in that
history tree and was a conversion from a check for group == -1. That
commit changes from storing a group index to a group pointer so I think
it's roughly equivalent from a validity point of view.

The original group == -1 check appears to be in the 2.6.32.x pvops
kernels at least, I expect it is also in the multiple tasklet patch
which you have as well?

>  That
> is, I'd like to understand the reasons for this check as it seems wrong
> to me to have to do it there - I'd rather think that if an interface got
> disabled, execution shouldn't even reach that function anymore.

Yes, I'd think that too but I don't really know. Maybe Dongxiao
remembers why he added it?

> One thing I wonder about in this context is whether the
> netif_stop_queue() call from xenvif_close() shouldn't happen before
> xenvif_down() (not the least for reasons of symmetry with
> xenvif_open()).

I seem to recall looking at that too, it was the same in the old kernels
too and I didn't know why so I avoided touching it (I was doing too much
other cleanup at the time to risk it).

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* netback commit history
@ 2011-09-20 11:04 Jan Beulich
  2011-09-20 11:26 ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2011-09-20 11:04 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel

Hi Ian,

with the upstream netback introduction consisting of a single big commit
I wonder whether you could point me to where the full history of it is.

I'm asking particularly in the context of us being asked to add a safety
check similar to the vif->netbk != NULL one at the beginning of
xenvif_start_xmit() (also in xenvif_interrupt(), but there we have a
similar check in place), which hadn't been in the legacy tree (obviously)
nor in the original multiple-tasklets patch that I retained a copy of. That
is, I'd like to understand the reasons for this check as it seems wrong
to me to have to do it there - I'd rather think that if an interface got
disabled, execution shouldn't even reach that function anymore.

One thing I wonder about in this context is whether the
netif_stop_queue() call from xenvif_close() shouldn't happen before
xenvif_down() (not the least for reasons of symmetry with
xenvif_open()).

Thanks, Jan

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-09-22  9:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-20 12:10 netback commit history Jan Beulich
2011-09-20 12:38 ` Jan Beulich
2011-09-22  9:07   ` Ian Campbell
  -- strict thread matches above, loose matches on Subject: below --
2011-09-20 11:04 Jan Beulich
2011-09-20 11:26 ` Ian Campbell
2011-09-20 11:46   ` Jan Beulich
2011-09-20 12:40     ` Ian Campbell
2011-09-20 13:03       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.