From: Farhan Ali <alifm@linux.ibm.com>
To: Eric Farman <farman@linux.ibm.com>, cohuck@redhat.com
Cc: pasic@linux.ibm.com, linux-s390@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [RFC v1 1/1] vfio-ccw: Don't call cp_free if we are processing a channel program
Date: Fri, 21 Jun 2019 10:17:09 -0400
Message-ID: <581d756d-7418-cd67-e0e8-f9e4fe10b22d@linux.ibm.com>
In-Reply-To: <638804dc-53c0-ff2f-d123-13c257ad593f@linux.ibm.com>



On 06/20/2019 04:27 PM, Eric Farman wrote:
> 
> 
> On 6/20/19 3:40 PM, Farhan Ali wrote:
>> There is a small window where it's possible that an interrupt can
>> arrive and can call cp_free, while we are still processing a channel
>> program (i.e allocating memory, pinnging pages, translating
> 
> s/pinnging/pinning/
> 
>> addresses etc). This can lead to allocating and freeing at the same
>> time and can cause memory corruption.
>>
>> Let's not call cp_free if we are currently processing a channel program.
> 
> The check around this cp_free() call is for a solicited interrupt, so
> it's presumably in response to a SSCH we issued.  But if we're still
> processing a CP, then we hadn't issued the SSCH to the hardware yet.  So
> what is this interrupt for?  Do the contents of irb.cpa provide any
> clues, perhaps if it's in the current cp or for someone else?
> 

I don't think the interrupt is in response to an ssch, but rather to
a csch/hsch.

>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>
>> I have been running my test overnight with this patch and I haven't
>> seen the stack traces that I mentioned about earlier. I would like
>> to get some reviews on this and also if this is the right thing to
>> do?
>>
>> Thanks
>> Farhan
>>
>>   drivers/s390/cio/vfio_ccw_drv.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
>> index 66a66ac..61ece3f 100644
>> --- a/drivers/s390/cio/vfio_ccw_drv.c
>> +++ b/drivers/s390/cio/vfio_ccw_drv.c
>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>   		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>>   	if (scsw_is_solicited(&irb->scsw)) {
>>   		cp_update_scsw(&private->cp, &irb->scsw);
> 
> As I alluded earlier, do we know this irb is for this cp?  If no, what
> does this function end up putting in the scsw?
> 
>> -		if (is_final)
>> +		if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING)
> 
> In looking at how we set this state, and how we exit it, I see we do:
> 
> if SSCH got CC0, CP_PROCESSING -> CP_PENDING
> if SSCH got !CC0, CP_PROCESSING -> IDLE
> 
> While the first scenario happens immediately after the SSCH instruction,
> I guess it could be just tiny enough, like the io_trigger FSM patch I
> sent a few weeks ago.
> 
> Meanwhile, the latter happens way after we return from the jump table.
> So that scenario leaves considerable time for such an interrupt to
> occur, though I don't understand why it would if we got a CC(1-3) on the
> SSCH.
> 
> And anyway, the return from fsm_io_helper() in that case will also call
> cp_free().  So why does the cp->initialized check provide protection
> from a double-free in that direction, but not here?  I'm confused.

I have a theory: I think it's possible to have 2 different threads
executing cp_free at the same time.

Suppose we start with private->state == IDLE and the guest issues a
clear/halt followed by an ssch:

- the clear/halt is issued to the hardware, and if it succeeds we return
cc=0 to the guest

- the guest can then issue the ssch

- we get an interrupt for the csch/hsch and queue it on the
workqueue

- we start processing the ssch while, at the same time, another cpu
could be working on the interrupt


Thread 1                                        Thread 2
--------                                        --------

fsm_io_request                                  vfio_ccw_sch_io_todo 

     cp_init                                         cp_free
     cp_prefetch
     fsm_io_helper
         cp_free



The test I am running has a guest running an fio workload while
at the same time stressing the error recovery path in the guest. So
there are a lot of sschs and a lot of cschs.

Of course I don't think my patch completely solves the problem; I think
it just makes the window narrower. I just wanted to get a discussion
started :)


Now that I am thinking more about it, I think we might have to protect
cp with its own mutex.
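
As a rough illustration of that idea, here is a userspace sketch (plain
pthreads, not kernel code). Everything named model_* is hypothetical and
only stands in for the cp and cp_init/cp_free; the per-cp lock is an
assumption and does not exist in the driver today. It only shows that
the "initialized" check and the free become atomic once both paths take
the same lock, so two concurrent callers cannot both free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the channel program (cp). */
struct model_cp {
	pthread_mutex_t lock;	/* assumed per-cp lock, not in the driver */
	bool initialized;
	void *buf;		/* stands in for the translated CCW chain */
	int frees;		/* counts real frees, for the demo only */
};

static void model_cp_setup(struct model_cp *cp)
{
	pthread_mutex_init(&cp->lock, NULL);
	cp->initialized = false;
	cp->buf = NULL;
	cp->frees = 0;
}

/* Models cp_init + cp_prefetch: allocation happens under the lock. */
static int model_cp_init(struct model_cp *cp)
{
	int ret = 0;

	pthread_mutex_lock(&cp->lock);
	if (cp->initialized) {
		ret = -1;	/* a cp is already in flight */
	} else {
		cp->buf = malloc(64);
		if (cp->buf)
			cp->initialized = true;
		else
			ret = -1;
	}
	pthread_mutex_unlock(&cp->lock);
	return ret;
}

/* Models cp_free: check and free are done as one step under the lock. */
static void model_cp_free(struct model_cp *cp)
{
	pthread_mutex_lock(&cp->lock);
	if (cp->initialized) {
		free(cp->buf);
		cp->buf = NULL;
		cp->initialized = false;
		cp->frees++;
	}
	pthread_mutex_unlock(&cp->lock);
}

static void *model_free_thread(void *arg)
{
	model_cp_free(arg);
	return NULL;
}

/*
 * Two threads race to free one initialized cp, like the io request
 * error path and the interrupt work item. Returns the number of real
 * frees that happened, or -1 if setup failed.
 */
static int model_race_once(void)
{
	struct model_cp cp;
	pthread_t t1, t2;
	int frees;

	model_cp_setup(&cp);
	if (model_cp_init(&cp))
		return -1;
	if (pthread_create(&t1, NULL, model_free_thread, &cp)) {
		model_cp_free(&cp);
		return -1;
	}
	if (pthread_create(&t2, NULL, model_free_thread, &cp)) {
		pthread_join(t1, NULL);
		model_cp_free(&cp);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	assert(cp.buf == NULL);	/* freed exactly once, then cleared */
	frees = cp.frees;
	pthread_mutex_destroy(&cp.lock);
	return frees;
}
```

This builds with -pthread. Note that the lock only serialises the two
paths; it does not decide whether the interrupt belongs to the cp that
is currently being built, so a stale csch/hsch interrupt could still
free a freshly initialized cp. That part would need a separate check.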

Thanks
Farhan


> 
>>   			cp_free(&private->cp);
>>   	}
>>   	mutex_lock(&private->io_mutex);
>>
> 

