From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757315AbbEEJwG (ORCPT ); Tue, 5 May 2015 05:52:06 -0400 Received: from mail-la0-f53.google.com ([209.85.215.53]:34651 "EHLO mail-la0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757298AbbEEJv6 (ORCPT ); Tue, 5 May 2015 05:51:58 -0400 From: Rasmus Villemoes To: Alexey Dobriyan Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 04/10] sscanf: fix overflow Organization: D03 References: <20150502004714.GA21655@p183.telecom.by> <20150502005116.GD21655@p183.telecom.by> X-Hashcash: 1:20:150505:akpm@linux-foundation.org::OfNzYc3ktFrYLC20:0000000000000000000000000000000000000O5L X-Hashcash: 1:20:150505:linux-kernel@vger.kernel.org::7RoBaTfVyoSPDZdQ:00000000000000000000000000000000039IQ X-Hashcash: 1:20:150505:adobriyan@gmail.com::UNgxC+6i8m+6sCc5:0000000000000000000000000000000000000000003bAp Date: Tue, 05 May 2015 11:51:55 +0200 In-Reply-To: <20150502005116.GD21655@p183.telecom.by> (Alexey Dobriyan's message of "Sat, 2 May 2015 03:51:16 +0300") Message-ID: <87egmvfj2c.fsf@rasmusvillemoes.dk> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, May 02 2015, Alexey Dobriyan wrote: > Fun fact: > > uint8_t val; > sscanf("256", "%hhu", &val); > > will return 1 (as it should), and make val=0 (as it should not). > What do you base these "should" and "should not" on? Both C99 and POSIX say that the behaviour is undefined - the kernel can obviously define its own semantics for scanf, but what do you think they should be? If we want to correctly handle overflow, the only sane way is to make sscanf return 0 in the above case (no conversions done). This also seems to be what your patch does, but then I'm confused by your first "as it should". > Apart from correctness, patch allows to remove checks and switch > to proper types in several (most?) cases: > > grep -e 'scanf.*%[0-9]\+[dioux]' -n -r . > > Such checks can be incorrect too, checking for 3 digits with %3u > for parsing uint8_t is not enough. Yeah, and it may be too much; sscanf("0042", "%hhu", &val") should give 42, not 4. I agree that one should be able to rely on scanf doing range checking as part of matching. Actually, I think one should go through all the callers of sscanf which use a field width with an integer conversion and see if we can get rid of it, and then rip it away from the sscanf implementation. Otherwise there's another bug which would need fixing, namely int x; char rest[50]; sscanf("12345678901234567890", "%3d%s", &x, rest) should successfully return 2 (storing 123 in x), but it can't when the strategy is to convert as much as possible (which may then give an early failure due to overflow), then divide by 10 until we haven't consumed more than we're allowed. Rasmus