kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: Eric Farman <farman@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>,
	"Jason J . Herne" <jjherne@linux.ibm.com>,
	Jared Rossi <jrossi@linux.ibm.com>,
	linux-s390@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v1 1/1] vfio-ccw: Don't free channel programs for unrelated interrupts
Date: Fri, 24 Jan 2020 16:33:05 +0100	[thread overview]
Message-ID: <20200124163305.3d6f0d47.cohuck@redhat.com> (raw)
In-Reply-To: <20200124145455.51181-2-farman@linux.ibm.com>

On Fri, 24 Jan 2020 15:54:55 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> With the addition of asynchronous channel programs (HALT or CLEAR
> SUBCHANNEL instructions), the hardware can receive interrupts that
> are not related to any channel program currently active.  An attempt
> is made to ensure that only associated resources are freed, but the
> host can fail in unpleasant ways:
> 
> [ 1051.330289] Unable to handle kernel pointer dereference in virtual kernel address space
> [ 1051.330360] Failing address: c880003d16572000 TEID: c880003d16572803
> [ 1051.330365] Fault in home space mode while using kernel ASCE.
> [ 1051.330372] AS:00000000fde9c007 R3:0000000000000024
> ...snip...
> [ 1051.330539]  [<00000000fccbd33e>] __kmalloc+0xd6/0x3d8
> [ 1051.330543] ([<00000000fccbd514>] __kmalloc+0x2ac/0x3d8)
> [ 1051.330550]  [<000003ff801452b4>] cp_prefetch+0xc4/0x3b8 [vfio_ccw]
> [ 1051.330554]  [<000003ff801471e4>] fsm_io_request+0x2d4/0x7b8 [vfio_ccw]
> [ 1051.330559]  [<000003ff80145d9c>] vfio_ccw_mdev_write+0x17c/0x300 [vfio_ccw]
> [ 1051.330564]  [<00000000fccf0d20>] vfs_write+0xb0/0x1b8
> [ 1051.330568]  [<00000000fccf1236>] ksys_pwrite64+0x7e/0xb8
> [ 1051.330574]  [<00000000fd4524c0>] system_call+0xdc/0x2d8
> 
> The problem is corruption of the dma-kmalloc-8 slab [1], if an interrupt
> occurs for a CLEAR or HALT that is not obviously associated with the
> current channel program.  If the channel_program struct is freed AND
> another interrupt for that I/O occurs, then this may occur:
> 
> 583.612967 00          cp_prefetch  NEW SSCH
> 583.613180 03 vfio_ccw_sch_io_todo  orb.cpa=03012690 irb.cpa=03012698
>                                     ccw=2704004203015600 *cda=1955d8fb8
>                                     irb: fctl=4 actl=0 stctl=7
> 587.039292 04          cp_prefetch  NEW SSCH
> 587.039296 01 vfio_ccw_sch_io_todo  orb.cpa=7fe209f0 irb.cpa=03012698
>                                     ccw=3424000c030a4068 *cda=1999e9cf0
>                                     irb: fctl=2 actl=0 stctl=1
> 587.039505 01 vfio_ccw_sch_io_todo  orb.cpa=7fe209f0 irb.cpa=7fe209f8
>                                     ccw=3424000c030a4068 *cda=0030a4070
>                                     irb: fctl=4 actl=0 stctl=7
> 
> Note how the last vfio_ccw_sch_io_todo() call has a ccw.cda that is
> right next to its supposed IDAW, compared to the previous one?  That
> is the result of the previous one freeing the cp (and its IDAL), and
> kfree writing the next available address at the beginning of the
> newly released memory.  When the channel goes to store data, it
> believes the IDAW is valid and overwrites that pointer and causes
> kmalloc to fail some time later.
> 
> Since the vfio-ccw interrupt handler walks the list of ccwchain structs
> to determine if the guest SCSW needs to be updated, it can be changed
> to indicate whether the interrupt points within the channel_program.
> If yes, then the channel_program is valid and its resources can be freed.
> It not, then another interrupt is expected to do that later.
> 
> [1] It could be other dma-kmalloc-xxx slabs; this just happens to be the
> one driven most frequently in my testing.
> 
> Fixes: f4c9939433bd ("vfio-ccw: Don't call cp_free if we are processing a channel program")
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/s390/cio/vfio_ccw_cp.c  | 11 +++++++++--
>  drivers/s390/cio/vfio_ccw_cp.h  |  2 +-
>  drivers/s390/cio/vfio_ccw_drv.c |  4 ++--
>  3 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> index 3645d1720c4b..2d942433baf9 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.c
> +++ b/drivers/s390/cio/vfio_ccw_cp.c
> @@ -803,15 +803,19 @@ union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm)
>   *
>   * This function updates @scsw->cpa to its coressponding guest physical
>   * address.
> + *
> + * Returns true if the channel program address in the irb was found
> + * within the chain of CCWs for this channel program.
>   */
> -void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
> +bool cp_update_scsw(struct channel_program *cp, union scsw *scsw)
>  {
>  	struct ccwchain *chain;
>  	u32 cpa = scsw->cmd.cpa;
>  	u32 ccw_head;
> +	bool within_chain = false;
>  
>  	if (!cp->initialized)
> -		return;
> +		return false;
>  
>  	/*
>  	 * LATER:
> @@ -833,11 +837,14 @@ void cp_update_scsw(struct channel_program *cp, union scsw *scsw)
>  			 * head gets us the guest cpa.
>  			 */
>  			cpa = chain->ch_iova + (cpa - ccw_head);
> +			within_chain = true;
>  			break;
>  		}
>  	}
>  
>  	scsw->cmd.cpa = cpa;

Looking at this, I'm wondering why we would want to update the cpa if
!within_chain. But I'm probably too tired to review this properly
today...

> +
> +	return within_chain;
>  }
>  
>  /**
> diff --git a/drivers/s390/cio/vfio_ccw_cp.h b/drivers/s390/cio/vfio_ccw_cp.h
> index ba31240ce965..a4cb6527bd4e 100644
> --- a/drivers/s390/cio/vfio_ccw_cp.h
> +++ b/drivers/s390/cio/vfio_ccw_cp.h
> @@ -47,7 +47,7 @@ extern int cp_init(struct channel_program *cp, struct device *mdev,
>  extern void cp_free(struct channel_program *cp);
>  extern int cp_prefetch(struct channel_program *cp);
>  extern union orb *cp_get_orb(struct channel_program *cp, u32 intparm, u8 lpm);
> -extern void cp_update_scsw(struct channel_program *cp, union scsw *scsw);
> +extern bool cp_update_scsw(struct channel_program *cp, union scsw *scsw);
>  extern bool cp_iova_pinned(struct channel_program *cp, u64 iova);
>  
>  #endif
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index e401a3d0aa57..a8ab256a217b 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -90,8 +90,8 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  	is_final = !(scsw_actl(&irb->scsw) &
>  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>  	if (scsw_is_solicited(&irb->scsw)) {
> -		cp_update_scsw(&private->cp, &irb->scsw);
> -		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
> +		if (cp_update_scsw(&private->cp, &irb->scsw) &&
> +		    is_final && private->state == VFIO_CCW_STATE_CP_PENDING)

...but I still wonder why is_final is not catching non-ssch related
interrupts, as I thought it would. We might want to adapt that check,
instead. (Or was the scsw_is_solicited() check supposed to catch that?
As said, too tired right now...)

>  			cp_free(&private->cp);
>  	}
>  	mutex_lock(&private->io_mutex);


  reply	other threads:[~2020-01-24 15:33 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-24 14:54 [PATCH v1 0/1] vfio-ccw: Fix interrupt handling for HALT/CLEAR Eric Farman
2020-01-24 14:54 ` [PATCH v1 1/1] vfio-ccw: Don't free channel programs for unrelated interrupts Eric Farman
2020-01-24 15:33   ` Cornelia Huck [this message]
2020-01-24 16:08     ` Eric Farman
2020-01-27 12:52       ` Cornelia Huck
2020-01-27 21:28         ` Eric Farman
2020-01-28  9:58           ` Cornelia Huck
2020-01-28 14:42             ` Eric Farman
2020-01-29  4:13               ` Eric Farman
2020-01-29 12:00                 ` Cornelia Huck
2020-01-29 16:48                   ` Eric Farman
2020-01-24 15:28 ` [PATCH v1 0/1] vfio-ccw: Fix interrupt handling for HALT/CLEAR Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200124163305.3d6f0d47.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=jjherne@linux.ibm.com \
    --cc=jrossi@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=pasic@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).