[3/6] macvtap: zerocopy: validate vector length before pinning user pages
diff mbox series

Message ID 20120416060807.14140.96229.stgit@intel-e5620-16-2.englab.nay.redhat.com
State New, archived
Headers show
Series
  • [1/6] macvtap: zerocopy: fix offset calculation when building skb
Related show

Commit Message

Jason Wang April 16, 2012, 6:08 a.m. UTC
Currently we do not validate the vector length before calling
get_user_pages_fast(), host stack would be easily overflowed by
malicious guest driver who give us a descriptor with length greater
than MAX_SKB_FRAGS. Solve this problem by checking the free entries
before trying to pin user pages.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/macvtap.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Eric Dumazet April 16, 2012, 6:53 a.m. UTC | #1
On Mon, 2012-04-16 at 14:08 +0800, Jason Wang wrote:
> Currently we do not validate the vector length before calling
> get_user_pages_fast(), host stack would be easily overflowed by
> malicious guest driver who give us a descriptor with length greater
> than MAX_SKB_FRAGS. Solve this problem by checking the free entries
> before trying to pin user pages.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/macvtap.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index 7cb2684..d197a78 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -529,6 +529,8 @@ static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
>  		}
>  		base = (unsigned long)from->iov_base + offset;
>  		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
> +		if (i + size >= MAX_SKB_FRAGS)
> +			return -EFAULT;
>  		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
>  		if ((num_pages != size) ||
>  		    (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> 

Hi Jason

Why is -EFAULT the right error code ?

Also, why not removing the "(num_pages > MAX_SKB_FRAGS -
skb_shinfo(skb)->nr_frags)" test done few lines after ?

Also, if get_user_pages_fast() returns truncated result (size - 1 for
example), we apparently dont free the references on pages.

The comment applies only on pages that were added in skb frags.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Michael S. Tsirkin April 16, 2012, 7:58 a.m. UTC | #2
On Mon, Apr 16, 2012 at 02:08:07PM +0800, Jason Wang wrote:
> Currently we do not validate the vector length before calling
> get_user_pages_fast(), host stack would be easily overflowed by
> malicious guest driver who give us a descriptor with length greater
> than MAX_SKB_FRAGS. Solve this problem by checking the free entries
> before trying to pin user pages.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

We could handle this by copying and linearising some fragments.
That would be better.

> ---
>  drivers/net/macvtap.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index 7cb2684..d197a78 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -529,6 +529,8 @@ static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
>  		}
>  		base = (unsigned long)from->iov_base + offset;
>  		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
> +		if (i + size >= MAX_SKB_FRAGS)
> +			return -EFAULT;
>  		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
>  		if ((num_pages != size) ||
>  		    (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Jason Wang April 16, 2012, 8:21 a.m. UTC | #3
Hi:
On 04/16/2012 02:53 PM, Eric Dumazet wrote:
> On Mon, 2012-04-16 at 14:08 +0800, Jason Wang wrote:
>> Currently we do not validate the vector length before calling
>> get_user_pages_fast(), host stack would be easily overflowed by
>> malicious guest driver who give us a descriptor with length greater
>> than MAX_SKB_FRAGS. Solve this problem by checking the free entries
>> before trying to pin user pages.
>>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
>> ---
>>   drivers/net/macvtap.c |    2 ++
>>   1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
>> index 7cb2684..d197a78 100644
>> --- a/drivers/net/macvtap.c
>> +++ b/drivers/net/macvtap.c
>> @@ -529,6 +529,8 @@ static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
>>   		}
>>   		base = (unsigned long)from->iov_base + offset;
>>   		size = ((base&  ~PAGE_MASK) + len + ~PAGE_MASK)>>  PAGE_SHIFT;
>> +		if (i + size>= MAX_SKB_FRAGS)
>> +			return -EFAULT;
>>   		num_pages = get_user_pages_fast(base, size, 0,&page[i]);
>>   		if ((num_pages != size) ||
>>   		    (num_pages>  MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
>>
> Hi Jason
>
> Why is -EFAULT the right error code ?

E2BIG or is there any error code you prefer?
>
> Also, why not removing the "(num_pages>  MAX_SKB_FRAGS -
> skb_shinfo(skb)->nr_frags)" test done few lines after ?

Yes, it can be removed.
>
> Also, if get_user_pages_fast() returns truncated result (size - 1 for
> example), we apparently dont free the references on pages.

It's a bug of this patch, thanks.
> The comment applies only on pages that were added in skb frags.
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Eric Dumazet April 17, 2012, 5:33 a.m. UTC | #4
On Mon, 2012-04-16 at 16:21 +0800, Jason Wang wrote:
> Hi:
> On 04/16/2012 02:53 PM, Eric Dumazet wrote:
> 	if ((num_pages != size) ||
> >>   		    (num_pages>  MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> >>
> > Hi Jason
> >
> > Why is -EFAULT the right error code ?
> 
> E2BIG or is there any error code you prefer?

Might be good yes.

However it sounds strange user cant write any size he wants (and kernel
needs to build several skbs to fulfill user request)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Michael S. Tsirkin April 17, 2012, 5:43 a.m. UTC | #5
On Tue, Apr 17, 2012 at 07:33:28AM +0200, Eric Dumazet wrote:
> On Mon, 2012-04-16 at 16:21 +0800, Jason Wang wrote:
> > Hi:
> > On 04/16/2012 02:53 PM, Eric Dumazet wrote:
> > 	if ((num_pages != size) ||
> > >>   		    (num_pages>  MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> > >>
> > > Hi Jason
> > >
> > > Why is -EFAULT the right error code ?
> > 
> > E2BIG or is there any error code you prefer?
> 
> Might be good yes.
> 
> However it sounds strange user cant write any size he wants (and kernel
> needs to build several skbs to fulfill user request)

We never supported arbitrary length writes:
macvtap is exactly like packet sockets in this regard.
Michael S. Tsirkin April 17, 2012, 6:19 a.m. UTC | #6
On Mon, Apr 16, 2012 at 08:53:03AM +0200, Eric Dumazet wrote:
> On Mon, 2012-04-16 at 14:08 +0800, Jason Wang wrote:
> > Currently we do not validate the vector length before calling
> > get_user_pages_fast(), host stack would be easily overflowed by
> > malicious guest driver who give us a descriptor with length greater
> > than MAX_SKB_FRAGS. Solve this problem by checking the free entries
> > before trying to pin user pages.
> > 
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/net/macvtap.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> > index 7cb2684..d197a78 100644
> > --- a/drivers/net/macvtap.c
> > +++ b/drivers/net/macvtap.c
> > @@ -529,6 +529,8 @@ static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
> >  		}
> >  		base = (unsigned long)from->iov_base + offset;
> >  		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
> > +		if (i + size >= MAX_SKB_FRAGS)
> > +			return -EFAULT;
> >  		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
> >  		if ((num_pages != size) ||
> >  		    (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
> > 
> 
> Hi Jason
> 
> Why is -EFAULT the right error code ?


skb_copy_datagram_from_iovec return -EFAULT on iovs
that are too large, too, so this is at least consistent.

Patch
diff mbox series

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 7cb2684..d197a78 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -529,6 +529,8 @@  static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
 		}
 		base = (unsigned long)from->iov_base + offset;
 		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
+		if (i + size >= MAX_SKB_FRAGS)
+			return -EFAULT;
 		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
 		if ((num_pages != size) ||
 		    (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))