From: Paul Eggert
To: Hugh Dickins
Cc: Jim Meyering, Zheng Liu, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/3] tmpfs: revert SEEK_DATA and SEEK_HOLE
Date: Tue, 14 Aug 2012 10:03:23 -0700
Message-ID: <502A84DB.5090607@cs.ucla.edu>
References: <877gtkxatx.fsf@rho.meyering.net>

On 08/07/2012 07:08 PM, Hugh Dickins wrote:
> wouldn't the developer's common case (object files amidst source
> in the tree) usually be handled by that check on the first 32k?

Yes, but grep should also handle the less-common case where the first
32K is text and there's a large hole later.  The particular case I'm
worried about is a denial-of-service attack, so it's irrelevant that
such files are uncommon in practice.

> shouldn't grep instead just be
> checking for over-long lines instead of running out of memory?

GNU programs should not have arbitrary limits.  If we put an arbitrary
limit, say 100,000 bytes, on line length, grep would fail on some
valid inputs.

This is not to say that grep couldn't function better on files with
lots of nulls -- it can, and that's on our list of things to do -- but
SEEK_HOLE is a big and obvious win in this area.

We also need SEEK_HOLE and SEEK_DATA for GNU 'tar', for the same
reason (denial-of-service attacks, mostly).
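
For concreteness, here is a minimal sketch (not the actual GNU grep or
tar code) of the kind of scan loop these interfaces make possible.  It
assumes a kernel and filesystem where lseek supports SEEK_DATA and
SEEK_HOLE (Linux 3.1+); where they are unsupported, lseek fails with
EINVAL and a real program would fall back to a plain sequential read.
The scan_region function is a hypothetical stand-in for the real
per-region work.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the real work done on each data region,
   e.g. grep's matcher or tar's archiving of a file chunk.  */
static void
scan_region (int fd, off_t start, off_t end)
{
  printf ("data: %lld..%lld\n", (long long) start, (long long) end);
}

int
main (int argc, char **argv)
{
  if (argc != 2)
    {
      fprintf (stderr, "usage: %s FILE\n", argv[0]);
      return 2;
    }
  int fd = open (argv[1], O_RDONLY);
  if (fd < 0)
    {
      perror ("open");
      return 2;
    }

  for (off_t offset = 0; ; )
    {
      /* Find the start of the next region containing data.
         ENXIO means there is no data at or after OFFSET: we're done.
         EINVAL means this kernel/filesystem lacks SEEK_DATA; a real
         program would fall back to reading sequentially.  */
      off_t data = lseek (fd, offset, SEEK_DATA);
      if (data < 0)
        {
          if (errno == ENXIO)
            break;
          perror ("lseek (SEEK_DATA)");
          return 2;
        }

      /* Find where that data region ends; end-of-file counts as a
         hole, so this cannot overshoot the file.  */
      off_t hole = lseek (fd, data, SEEK_HOLE);
      if (hole < 0)
        {
          perror ("lseek (SEEK_HOLE)");
          return 2;
        }

      scan_region (fd, data, hole);
      offset = hole;  /* Skip the hole instead of reading its zeros.  */
    }

  close (fd);
  return 0;
}

On a file that is a short text prefix followed by one huge hole -- the
denial-of-service case above -- this loop touches only the data
regions, so the cost is proportional to the data actually present, not
to the apparent file size.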