Message-ID: <1490206653.2041.17.camel@perches.com>
Subject: Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
From: Joe Perches
To: Darren Hart
Cc: "John 'Warthog9' Hawley (VMware)", linux-kernel@vger.kernel.org, Andy Whitcroft
Date: Wed, 22 Mar 2017 11:17:33 -0700
In-Reply-To: <20170322152539.GA11892@localhost.localdomain>
References: <1490113805-9295-1-git-send-email-warthog9@eaglescrag.net> <1490121068.2041.13.camel@perches.com> <20170322152539.GA11892@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote:
> On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote:
> > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote:
> > > SpamAssassin sticks a long (~79 character) string after a
> > > line that has a single space in it.  The line with space causes
> > > checkpatch to erroneously think that it's in the content body, as
> > > opposed to headers and thus flag a mail header as an unwrapped long
> > > comment line.
> >
> > If the SpamAssassin header is like
> >
> > email-header-n: foo
> > email-header-m: bar
> >
> > X-Spam-Report: bar
>
> The specific content of the X-Spam-Report that triggers this for me,
> from this patch for example, is:
>
> === 8< ===
> X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary:
>  Content analysis details:   (-1.9 points, 5.0 required)
>
>   pts rule name              description
>  ---- ---------------------- --------------------------------------------------
>  -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
>                              [score: 0.0000]
> X-TUID: alGBIuPZmqOj
>
> === >8 ===
>
> The long ---- ----... line is over 75 characters and triggers the test
> for long commit_log lines.
>
> >
> > Does that form follow RFC 5322?
>
> By my reading, this is governed by the long header fields defined by
> 2.2.3, with whitespace folding defined as "a CRLF may be inserted before
> any WSP."
>
> >
> > If it does then any email header could have that
> > form and the header wrapping test should be
>
> Yes, agreed.
>
> So the logic we want is:
>
> If we are in headers and we detect a CRLF and the next line starts with a WSP,
> then we are still in headers (and therefore not in the commit log).
> The CRLF information does not appear to be available, as it is replaced
> with just \n.
>
> > updated from
> >
> > 		if ($in_header_lines && $realfile =~ /^$/ &&
> > 		    !($rawline =~ /^\s+\S/ ||
> > 		      $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
> > 			$in_header_lines = 0;
> > 			$in_commit_log = 1;
> > 			$has_commit_log = 1;
> > 		}
> >
> > to something like
> >
> > 		if ($in_header_lines && $realfile =~ /^$/ &&
> > 		    !($rawline =~ /^ (?:\s*\S|$)/ ||
>
> Hrm... lines that start with maybe a space followed by a : ... Why did you
> introduce that part of the check?

The regex doesn't care about colons.
It's a Perl non-capturing group.

https://perldoc.perl.org/perlretut.html#Non-capturing-groupings

> Looking at this more closely, I was also not clear why the original test looked
> for several spaces followed by non-space. What case is this for?

Not several spaces: one or more spaces, then a non-space.

The only change here is allowing an initial space followed by either:

1: optional spaces, then non-space.
2: EOL

I suppose you could argue that case 2 should also
allow optional spaces before EOL and the test should be

		if ($in_header_lines && $realfile =~ /^$/ &&
		    !($rawline =~ /^\s+(?:\S|$)/ ||
...
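
For illustration, the three continuation-line patterns above can be compared on sample lines like the ones in this thread. This is only a minimal sketch, written in Python rather than Perl so it runs standalone. The patterns are copied unchanged, on the assumption that Python's re module treats these particular constructs the same way Perl does. still_in_headers() is a hypothetical helper, not anything in checkpatch, and the $realfile part of the condition is left out.

```python
import re

# Continuation-line alternatives quoted in the thread (regex text unchanged).
ORIGINAL = re.compile(r"^\s+\S")         # one or more whitespace, then a non-space
PROPOSED = re.compile(r"^ (?:\s*\S|$)")  # a space, then (optional whitespace + non-space) or EOL
RELAXED = re.compile(r"^\s+(?:\S|$)")    # one or more whitespace, then a non-space or EOL

# Header-like start of line: the second alternative of the quoted test.
HEADER = re.compile(r"^(commit\b|from\b|[\w-]+:).*$", re.IGNORECASE)


def still_in_headers(rawline: str, continuation: re.Pattern) -> bool:
    """Hypothetical helper: True if rawline would leave $in_header_lines set."""
    return bool(continuation.search(rawline) or HEADER.search(rawline))


# Sample lines modelled on the X-Spam-Report excerpt (already chomped).
samples = [
    "X-Spam-Report: SpamAssassin version 3.4.1",
    " Content analysis details:   (-1.9 points, 5.0 required)",
    " ",    # the single-space line SpamAssassin emits
    "   ",  # whitespace-only line with more than one space
    " ---- ----------------------",
    "",     # a truly empty line ends the headers
]

for line in samples:
    verdicts = [still_in_headers(line, rx) for rx in (ORIGINAL, PROPOSED, RELAXED)]
    print(f"{line!r:60} original/proposed/relaxed = {verdicts}")
```

On these samples the patterns disagree only on the whitespace-only lines: the original test drops out of header mode on both, the /^ (?:\s*\S|$)/ form keeps the single-space line, and the /^\s+(?:\S|$)/ form keeps both. An empty line ends the headers under all three.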