linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] checkpatch: Check for quoted strings broken across lines
@ 2012-03-20 21:06 Josh Triplett
  2012-03-21  1:24 ` Joe Perches
  0 siblings, 1 reply; 22+ messages in thread
From: Josh Triplett @ 2012-03-20 21:06 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: Joe Perches, Paul E. McKenney, torvalds, linux-kernel

checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string.  Rather than just permitting this, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
	struct { unsigned magic; const char *strdata; } foo[] = {
		{ 42, "these strings"
		      "do not produce warnings" },
		{ 256, "though perhaps"
		       "they should" },
	};
	pr_err("this string"
	       " should produce a warning\n");
	pr_err("this multi-line string\n"
	       "should not produce a warning\n");
	asm ("this asm\n\t"
	     "should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
+	       " should produce a warning\n");

total: 0 errors, 1 warnings, 15 lines checked

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---

Resending now that the merge window has opened.

 scripts/checkpatch.pl |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..01a5e4b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,21 @@ sub process {
 			     "line over 80 characters\n" . $herecurr);
 		}
 
+# Check for user-visible strings broken across lines, which breaks the ability
+# to grep for the string.  Limited to strings used as parameters (those
+# following an open parenthesis), which almost completely eliminates false
+# positives, as well as warning only once per parameter rather than once per
+# line of the string.  Make an exception when the previous string ends in a
+# newline (multiple lines in one string constant) or \n\t (common in inline
+# assembly to indent the instruction on the following line).
+		if ($line =~ /^\+\s*"/ &&
+		    $prevline =~ /"\s*$/ &&
+		    $prevline =~ /\(/ &&
+		    $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+			WARN("SPLIT_STRING",
+			     "quoted string split across lines\n" . $hereprev);
+		}
+
 # check for spaces before a quoted newline
 		if ($rawline =~ /^.*\".*\s\\n/) {
 			WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
-- 
1.7.9


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-03-20 21:06 [PATCH] checkpatch: Check for quoted strings broken across lines Josh Triplett
@ 2012-03-21  1:24 ` Joe Perches
  2012-03-21  4:28   ` Josh Triplett
  0 siblings, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-03-21  1:24 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Andy Whitcroft, Paul E. McKenney, torvalds, linux-kernel

On Tue, 2012-03-20 at 14:06 -0700, Josh Triplett wrote:
> checkpatch already makes an exception to the 80-column rule for quoted
> strings, and Documentation/CodingStyle recommends not splitting quoted
> strings across lines, because it breaks the ability to grep for the
> string.  Rather than just permitting this, actively warn about quoted
> strings split across lines.

Hi Josh.

I don't recall seeing your patch before, but I submitted
a similar one that I believe Andrew has in his tree.

https://lkml.org/lkml/2012/3/2/24


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-03-21  1:24 ` Joe Perches
@ 2012-03-21  4:28   ` Josh Triplett
  2012-03-21  4:35     ` Joe Perches
  0 siblings, 1 reply; 22+ messages in thread
From: Josh Triplett @ 2012-03-21  4:28 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andy Whitcroft, Andrew Morton, Paul E. McKenney, torvalds, linux-kernel

On Tue, Mar 20, 2012 at 06:24:16PM -0700, Joe Perches wrote:
> On Tue, 2012-03-20 at 14:06 -0700, Josh Triplett wrote:
> > checkpatch already makes an exception to the 80-column rule for quoted
> > strings, and Documentation/CodingStyle recommends not splitting quoted
> > strings across lines, because it breaks the ability to grep for the
> > string.  Rather than just permitting this, actively warn about quoted
> > strings split across lines.
> 
> Hi Josh.
> 
> I don't recall seeing your patch before, but I submitted
> a similar one that I believe Andrew has in his tree.

You reviewed my patch at the time and provided feedback, and I'd
produced a revised version based on that feedback.

> https://lkml.org/lkml/2012/3/2/24

The heuristics in the patch I submitted almost completely eliminate
false positives, which makes my patch suitable for use without --strict.
Having a --strict version which flags *all* wrapped strings seems
potentially reasonable as well (though you'll still want the heuristic
that ignores strings ending in \n or \n\t), but I'd still like to get my
version included with the heuristics that make it suitable as a default.

Among other things, the version you submitted to Andrew will flag
multi-instruction __asm__ directives, multi-line strings printed with a
single printk, and arrays of arbitrary data expressed as strings.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-03-21  4:28   ` Josh Triplett
@ 2012-03-21  4:35     ` Joe Perches
  2012-03-21  6:05       ` Josh Triplett
  0 siblings, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-03-21  4:35 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Andy Whitcroft, Andrew Morton, Paul E. McKenney, torvalds, linux-kernel

On Tue, 2012-03-20 at 21:28 -0700, Josh Triplett wrote:
> On Tue, Mar 20, 2012 at 06:24:16PM -0700, Joe Perches wrote:
> > I don't recall seeing your patch before, but I submitted
> > a similar one that I believe Andrew has in his tree.
> You reviewed my patch at the time and provided feedback, and I'd
> produced a revised version based on that feedback.

duh.

I heard short term memory fades as you get age,
but I might not remember that in awhile.

> The heuristics in the patch I submitted almost completely eliminate
> false positives, which makes my patch suitable for use without --strict.

Yup, this patch is better than the one I submitted.

One improvement might be to check for a line continuation \
on $prevline and still produce a warning in that case.

cheers, Joe


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-03-21  4:35     ` Joe Perches
@ 2012-03-21  6:05       ` Josh Triplett
  2012-03-21 12:11         ` Joe Perches
  0 siblings, 1 reply; 22+ messages in thread
From: Josh Triplett @ 2012-03-21  6:05 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andy Whitcroft, Andrew Morton, Paul E. McKenney, torvalds, linux-kernel

On Tue, Mar 20, 2012 at 09:35:58PM -0700, Joe Perches wrote:
> On Tue, 2012-03-20 at 21:28 -0700, Josh Triplett wrote:
> > The heuristics in the patch I submitted almost completely eliminate
> > false positives, which makes my patch suitable for use without --strict.
> 
> Yup, this patch is better than the one I submitted.

OK.  Andrew, could you substitute my patch for Joe's in your tree?

> One improvement might be to check for a line continuation \
> on $prevline and still produce a warning in that case.

I'd suggest doing that one as a separate check, "unnecessary line
continuation", which should flag any use of a line continuation other
than with a preprocessor directive.  I don't think combining that with
the wrapped-string check makes sense.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-03-21  6:05       ` Josh Triplett
@ 2012-03-21 12:11         ` Joe Perches
  0 siblings, 0 replies; 22+ messages in thread
From: Joe Perches @ 2012-03-21 12:11 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Andy Whitcroft, Andrew Morton, Paul E. McKenney, torvalds, linux-kernel

On Tue, 2012-03-20 at 23:05 -0700, Josh Triplett wrote:
> On Tue, Mar 20, 2012 at 09:35:58PM -0700, Joe Perches wrote:
> > One improvement might be to check for a line continuation \
> > on $prevline and still produce a warning in that case.
> 
> I'd suggest doing that one as a separate check, "unnecessary line
> continuation", which should flag any use of a line continuation other
> than with a preprocessor directive.  I don't think combining that with
> the wrapped-string check makes sense.

What I suggested is for a case in a #define:

#define subsystem_printk(subsystem, level, format, ...)			\
	netdev_printk(level, (subsystem)->dev, "Some partial format "	\
		      "continued on another line" format,		\
		      ##__VA_ARGS__)

where it's necessary to use line continuations but rewrapping like:

#define subsystem_printk(subsystem, level, format, ...)			\
	netdev_printk(level, (subsystem)->dev, 				\
		      "A coalesced format on a single line" format,	\
 		      ##__VA_ARGS__)

might improve it.

Anyway:

Acked-by: Joe Perches <joe@perches.com>



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-03  5:38 ` Joe Perches
@ 2012-02-03  8:55   ` Josh Triplett
  0 siblings, 0 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-03  8:55 UTC (permalink / raw)
  To: Joe Perches; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney, Nick Bowler

On Thu, Feb 02, 2012 at 09:38:35PM -0800, Joe Perches wrote:
> On Thu, 2012-02-02 at 21:27 -0800, Josh Triplett wrote:
> > checkpatch already makes an exception to the 80-column rule for quoted
> > strings, and Documentation/CodingStyle recommends not splitting quoted
> > strings across lines, because it breaks the ability to grep for the
> > string.  Rather than just permitting this, actively warn about quoted
> > strings split across lines.
> []
> > +			WARN("SPLIT_STRING",
> > +			     "quoted string split across lines\n" . $herecurr);
> 
> I think the output would be better as:
> 
> 			WARN("SPLIT_STRING",
> 			     "quoted string split across lines\n" . $hereprev);

Awesome; I'd thought that it would look better to show the previous
line, but I didn't know about $hereprev.  PATCHv3 momentarily.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-03  5:27 Josh Triplett
  2012-02-03  5:38 ` Joe Perches
@ 2012-02-03  6:34 ` Josh Triplett
  1 sibling, 0 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-03  6:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andy Whitcroft, Joe Perches, Paul E. McKenney, Nick Bowler

On Thu, Feb 02, 2012 at 09:27:27PM -0800, Josh Triplett wrote:
> checkpatch already makes an exception to the 80-column rule for quoted
> strings, and Documentation/CodingStyle recommends not splitting quoted
> strings across lines, because it breaks the ability to grep for the
> string.  Rather than just permitting this, actively warn about quoted
> strings split across lines.
[...]
> v2: Add the "after open parenthesis" heuristic which almost completely
> eliminates non-user-visible strings, leaving only 12 in the entire
> kernel, versus thousands of correct matches.  Change "breaks
> greppability" to "breaks the ability to grep for the string".

Sorry, forgot to change "PATCH" to "PATCHv2" in the subject.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-03  5:27 Josh Triplett
@ 2012-02-03  5:38 ` Joe Perches
  2012-02-03  8:55   ` Josh Triplett
  2012-02-03  6:34 ` Josh Triplett
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-02-03  5:38 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney, Nick Bowler

On Thu, 2012-02-02 at 21:27 -0800, Josh Triplett wrote:
> checkpatch already makes an exception to the 80-column rule for quoted
> strings, and Documentation/CodingStyle recommends not splitting quoted
> strings across lines, because it breaks the ability to grep for the
> string.  Rather than just permitting this, actively warn about quoted
> strings split across lines.
[]
> +			WARN("SPLIT_STRING",
> +			     "quoted string split across lines\n" . $herecurr);

I think the output would be better as:

			WARN("SPLIT_STRING",
			     "quoted string split across lines\n" . $hereprev);



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] checkpatch: Check for quoted strings broken across lines
@ 2012-02-03  5:27 Josh Triplett
  2012-02-03  5:38 ` Joe Perches
  2012-02-03  6:34 ` Josh Triplett
  0 siblings, 2 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-03  5:27 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andy Whitcroft, Joe Perches, Paul E. McKenney, Nick Bowler

checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string.  Rather than just permitting this, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
	struct { unsigned magic; const char *strdata; } foo[] = {
		{ 42, "these strings"
		      "do not produce warnings" },
		{ 256, "though perhaps"
		       "they should" },
	};
	pr_err("this string"
	       " should produce a warning\n");
	pr_err("this multi-line string\n"
	       "should not produce a warning\n");
	asm ("this asm\n\t"
	     "should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
#10: FILE: test.c:10:
+	       " should produce a warning\n");

total: 0 errors, 1 warnings, 15 lines checked

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---

v2: Add the "after open parenthesis" heuristic which almost completely
eliminates non-user-visible strings, leaving only 12 in the entire
kernel, versus thousands of correct matches.  Change "breaks
greppability" to "breaks the ability to grep for the string".

 scripts/checkpatch.pl |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..b898df4 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,21 @@ sub process {
 			     "line over 80 characters\n" . $herecurr);
 		}
 
+# Check for user-visible strings broken across lines, which breaks the ability
+# to grep for the string.  Limited to strings used as parameters (those
+# following an open parenthesis), which almost completely eliminates false
+# positives, as well as warning only once per parameter rather than once per
+# line of the string.  Make an exception when the previous string ends in a
+# newline (multiple lines in one string constant) or \n\t (common in inline
+# assembly to indent the instruction on the following line).
+		if ($line =~ /^\+\s*"/ &&
+		    $prevline =~ /"\s*$/ &&
+		    $prevline =~ /\(/ &&
+		    $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+			WARN("SPLIT_STRING",
+			     "quoted string split across lines\n" . $herecurr);
+		}
+
 # check for spaces before a quoted newline
 		if ($rawline =~ /^.*\".*\s\\n/) {
 			WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
-- 
1.7.9


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 22:08     ` Nick Bowler
@ 2012-02-03  1:31       ` Josh Triplett
  0 siblings, 0 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-03  1:31 UTC (permalink / raw)
  To: Nick Bowler; +Cc: linux-kernel, Andy Whitcroft, Joe Perches, Paul E. McKenney

On Thu, Feb 02, 2012 at 05:08:53PM -0500, Nick Bowler wrote:
> On 2012-02-02 13:16 -0800, Josh Triplett wrote:
> > On Thu, Feb 02, 2012 at 03:22:07PM -0500, Nick Bowler wrote:
> > > On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> > > > Documentation/CodingStyle recommends not splitting quoted strings across
> > > > lines, because it breaks the ability to grep for the string.  checkpatch
> > > > already makes an exception to the 80-column rule for quoted strings to
> > > > allow this.  Rather than just allowing it, actively warn about quoted
> > > > strings split across lines.
> > > [...]
> > > > +# Check for strings broken across lines (breaks greppability).  Make an
> > > > +# exception when the previous string ends in a newline (multiple lines in one
> > > > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > > > +# on the following line).
> > > 
> > > There are tons of strings in the kernel that this makes checkpatch warn
> > > about where it probably shouldn't.  For example, this one (from
> > > kernel/auditsc.c:1476):
> > > 
> > >   		audit_log_format(ab,
> > >   			"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
> > >   			"mq_msgsize=%ld mq_curmsgs=%ld",
> > > 
> > >   WARNING: quoted string split across lines
> > >   #1478: FILE: auditsc.c:1478:
> > >   +			"mq_msgsize=%ld mq_curmsgs=%ld",
> > > 
> > > Breaking "greppability" of this string is a non-issue, because this sort
> > > of string is not really greppable to begin with (and would certainly not
> > > be any easier to grep for if it were all on one line).
> > 
> > While I agree that in that particular case (heavy on the %formats and
> > light on the text) you can't easily grep to begin with, the guideline
> > from CodingStyle still applies,
> 
> The guideline from CodingStyle talks about "user-visible strings".  I
> don't think this string counts, because it contains code that is not
> displayed to the user (hence, the string is not user-visible).  That is
> the reason why it's not greppable in the first place.
> 
> There are hundreds, if not thousands of similar strings in the kernel.

OK, with a combination of arcane grep/sed commands and hand checking, I
just scanned the *entire* kernel for quoted strings split across lines,
and filtered out all user-visible strings, leaving only the split
non-user-visible strings.

An excerpt from that pile of ugly, suggesting just how many ways kernel
code has to spell "user-visible":

\(print[a-z_]*\|pr_[a-z]*\|pr_[a-z]*_ratelimited\|pr_debug[0-9]\|pr_info_once\|dev_[a-z]*\|msg\|warn\|warn_once\|dbg\|debug\|debugp\|err\|error\|errdev\|errx\|warning\|DBG[A-Z0-9_]*\|DEBUG[A-Z_]*\|trace[a-z0-9_]*\|LOG_KMS\|exception\|FAIL\|fatal\|assert\|panic_vm\|trace\|panic\|puts\|MODULE_[A-Z_]*\|asm\|__asm__\|volatile\|__volatile__\|message\|log\|info\|DAC960_[A-Za-z_]*\|notify\|ufsd\|crit\|debugf[0-9]\|DP\|die\|fyi\|notice\|dout\|snoop\|throtl_log_tg\|ERROR_INODE\|RFALSE\|DMESGE\|SAY\|JOM\|SAM\|JOT\|DBF_EVENT\|DE_ACT\|s390_handle_damage\|CIO_MSG_EVENT\|CIO_CRW_EVENT\|DBF_EVENT_DEVID\|fat_fs_error_ratelimit\|mark_tsc_unstable\|ata_ehi_push_desc\|DEB[0-9]\|IUCV_DBF_TEXT_\|LOG_PARSE\|esp_log_[a-z]*\|check_warn_return\|D_ISR\|D_CALIB\|D_RATE\|D_TX_REPLY\|D_TXPOWER\|D_EEPROM\|D_POWER\|D_SCAN\|D_ASSOC\|D_HC\|D_HC_DUMP\|IP_VS_ERR_BUF\|DMWARN_LIMIT\|ide_dump_status\|ppfinit\|mca_recovered\|die_if_kernel\|die_if_no_fixup\|gdbstub_proto\|ite_pr\|nvt_pr\|deb_i2c\|deb_irq\|deb_ts\|deb_rc\|mxl_i2c\|MLX4_EN_PARM_INT\|DI\|d_fnstart\|d_fnend\|lbs_deb_[a-z_]*\|LPFC_[A-Z_]*\|stop\|mconsole_reply_v0\|CH_ALERT\|D1\|ADD\|lbtf_deb_usbd\|IWL_DEBUG_MAC80211\|WL_CONN\|CX18_INFO_DEV\|CX18_ERR_DEV\|OPT_BOOLEAN\|asc_prt_line\|gdbstub_msg_write\|DCCP_BUG\|fappend\)

I found that a single change to my checkpatch test (only checking
strings passed as function arguments) limits the false positives in the
entire kernel to a grand total of 12:

arch/powerpc/platforms/iseries/mf.c:    memcpy(src, "\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00"
arch/um/os-Linux/process.c:     if (sscanf(buf, "%*d " COMM_SCANF " %*c %*d %*d %*d %*d %*d %*d %*d "
arch/x86/platform/olpc/olpc_dt.c:       olpc_dt_interpret("\" /battery@0\" find-device"
arch/x86/platform/olpc/olpc_dt.c:               olpc_dt_interpret("\" /pci/display@1\" find-device"
arch/x86/platform/olpc/olpc_dt.c:               olpc_dt_interpret("\" /pci/display@1,1\" find-device"
drivers/block/rbd.c:               "%" __stringify(RBD_MAX_OBJ_NAME_LEN) "s"
drivers/pcmcia/ds.c:    if (add_uevent_var(env, "MODALIAS=pcmcia:m%04Xc%04Xf%02Xfn%02Xpfn%02X"
drivers/tty/tty_audit.c:                audit_log_format(ab, "%s pid=%u uid=%u auid=%u ses=%u "
fs/ecryptfs/crypto.c:static unsigned char *portable_filename_chars = ("-.0123456789ABCD"
kernel/auditsc.c:               audit_log_format(ab, " inode=%lu"
kernel/auditsc.c:                       audit_log_format(ab, "login pid=%d uid=%u "
security/selinux/ss/services.c: audit_log_format(ab, "op=security_compute_av reason=%s "

Not "hundreds, if not thousands", but 12, including 4 calls to
audit_log_format. :)

Meanwhile, this checkpatch test correctly flags several thousand
instances of user-visible strings that shouldn't get split.  That seems
like a pretty reasonable ratio to me; what do you think?

I'll send PATCHv2, which limits the check to function parameters,
momentarily.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 21:32     ` Joe Perches
@ 2012-02-02 22:36       ` Josh Triplett
  0 siblings, 0 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-02 22:36 UTC (permalink / raw)
  To: Joe Perches; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney

On Thu, Feb 02, 2012 at 01:32:07PM -0800, Joe Perches wrote:
> On Thu, 2012-02-02 at 13:28 -0800, Josh Triplett wrote:
> > On Thu, Feb 02, 2012 at 12:36:29PM -0800, Joe Perches wrote:
> > > On Thu, 2012-02-02 at 12:06 -0800, Josh Triplett wrote:
> > > > Documentation/CodingStyle recommends not splitting quoted strings across
> > > > lines, because it breaks the ability to grep for the string.  checkpatch
> > > > already makes an exception to the 80-column rule for quoted strings to
> > > > allow this.  Rather than just allowing it, actively warn about quoted
> > > > strings split across lines.
> []
> > > Seems sensible but there are also asm uses like:
> > > arch/x86/include/asm/pvclock.h:         "xor  %5,%5    ; "
> > >                 "add  %4,%%eax ; "
> > I did see those, yes.  However, a quick grep through the kernel shows
> > that those occur quite rarely compared to \n or \n\t; the latter looks
> > like the standard style.  How about I provide a patch for
> > Documentation/CodingStyle adding a chapter on inline assembly, with that
> > and other style guidelines? :)
> 
> Sounds fine to me.

Done, with message-id <20120202223304.GA13069@leaf> and subject
"[PATCH] Documentation/CodingStyle: Add guidelines for inline assembly".

- Josh triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 21:16   ` Josh Triplett
@ 2012-02-02 22:08     ` Nick Bowler
  2012-02-03  1:31       ` Josh Triplett
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2012-02-02 22:08 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-kernel, Andy Whitcroft, Joe Perches, Paul E. McKenney

On 2012-02-02 13:16 -0800, Josh Triplett wrote:
> On Thu, Feb 02, 2012 at 03:22:07PM -0500, Nick Bowler wrote:
> > On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> > > Documentation/CodingStyle recommends not splitting quoted strings across
> > > lines, because it breaks the ability to grep for the string.  checkpatch
> > > already makes an exception to the 80-column rule for quoted strings to
> > > allow this.  Rather than just allowing it, actively warn about quoted
> > > strings split across lines.
> > [...]
> > > +# Check for strings broken across lines (breaks greppability).  Make an
> > > +# exception when the previous string ends in a newline (multiple lines in one
> > > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > > +# on the following line).
> > 
> > There are tons of strings in the kernel that this makes checkpatch warn
> > about where it probably shouldn't.  For example, this one (from
> > kernel/auditsc.c:1476):
> > 
> >   		audit_log_format(ab,
> >   			"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
> >   			"mq_msgsize=%ld mq_curmsgs=%ld",
> > 
> >   WARNING: quoted string split across lines
> >   #1478: FILE: auditsc.c:1478:
> >   +			"mq_msgsize=%ld mq_curmsgs=%ld",
> > 
> > Breaking "greppability" of this string is a non-issue, because this sort
> > of string is not really greppable to begin with (and would certainly not
> > be any easier to grep for if it were all on one line).
> 
> While I agree that in that particular case (heavy on the %formats and
> light on the text) you can't easily grep to begin with, the guideline
> from CodingStyle still applies,

The guideline from CodingStyle talks about "user-visible strings".  I
don't think this string counts, because it contains code that is not
displayed to the user (hence, the string is not user-visible).  That is
the reason why it's not greppable in the first place.

There are hundreds, if not thousands of similar strings in the kernel.

> as does the standard guideline about checkpatch (namely "don't go
> globally fixing everything it says, but fix it in new or changed
> code").
> 
> I certainly don't think joining those lines would *hurt*.

I disagree.  Joining those lines will push the conversion specifiers way
off to the right (possibly off the screen), and these specifiers are
vital to understanding the rest of the function parameters.  In other
words, the specifiers should be treated the same as actual *code*.

> Making that change blindly across the entire kernel doesn't seem worth
> it, but changing it on new code seems like a good idea.

For reasons that I've already outlined, I don't think it's a good idea
to change it in new code, either.

> In theory checkpatch could try to heuristically ignore cases where the
> split in the string occurs immediately before or after a %format, but I
> don't fancy writing a regex to match valid printf format specifiers. :)

It doesn't hurt to be conservative, erring on the side of not warning
(the status quo) rather than warning.  Regardless, writing a regex for
printf conversion specifiers should be straightforward, because the
format is so rigid.

Here's a perl regex which should match all the valid possibilities
according to ISO C99 (untested!).  Catching the Linux extensions is
left as an exercise to the reader.

 %[-+ #0]*(hh|ll|[ljztL])?[*[:digit:]]*(\.[*[:digit:]]*)?[diouxXfFeEgGaAcspn%]

> I still think this patch provides a net win, and I don't think the
> example you mentioned represents a false positive.

Cheers,
-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 21:34     ` Joe Perches
@ 2012-02-02 22:02       ` Josh Triplett
  0 siblings, 0 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-02 22:02 UTC (permalink / raw)
  To: Joe Perches
  Cc: Jesper Juhl, Nick Bowler, linux-kernel, Andy Whitcroft, Paul E. McKenney

On Thu, Feb 02, 2012 at 01:34:16PM -0800, Joe Perches wrote:
> On Thu, 2012-02-02 at 22:29 +0100, Jesper Juhl wrote:
> > But generally a good idea to warn about "easily grepable strings broken 
> > into multiple lines" if possible, IMHO...
> 
> Maybe count the %'s?  Ignore if more than 2?

Numerous greppable strings include several formats, but you can
generally look at a message and know which non-parameterized bits to
grep for.  Even in the case of strings with numerous parameters, it
often works better to grep for the whole string and replace the
parameters with '.*', and make the string incrementally less specific
until I find it.  For example, from dmesg on my system:

[    0.000000] last_pfn = 0xdb000 max_arch_pfn = 0x400000000

Grepping for "last_pfn" or even "last_pfn = " produces a pile of
results, but grepping for 'last_pfn = .* max_arch_pfn' produces exactly
one result:

arch/x86/kernel/e820.c: printk(KERN_INFO "last_pfn = %#lx max_arch_pfn = %#lx\n",

That wouldn't work if the string broke across lines.  So, I still think
it seems reasonable for checkpatch to encourage keeping such strings
un-split.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 21:29   ` Jesper Juhl
@ 2012-02-02 21:34     ` Joe Perches
  2012-02-02 22:02       ` Josh Triplett
  0 siblings, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-02-02 21:34 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: Nick Bowler, Josh Triplett, linux-kernel, Andy Whitcroft,
	Paul E. McKenney

On Thu, 2012-02-02 at 22:29 +0100, Jesper Juhl wrote:
> But generally a good idea to warn about "easily grepable strings broken 
> into multiple lines" if possible, IMHO...

Maybe count the %'s?  Ignore if more than 2?



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 21:28   ` Josh Triplett
@ 2012-02-02 21:32     ` Joe Perches
  2012-02-02 22:36       ` Josh Triplett
  0 siblings, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-02-02 21:32 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney

On Thu, 2012-02-02 at 13:28 -0800, Josh Triplett wrote:
> On Thu, Feb 02, 2012 at 12:36:29PM -0800, Joe Perches wrote:
> > On Thu, 2012-02-02 at 12:06 -0800, Josh Triplett wrote:
> > > Documentation/CodingStyle recommends not splitting quoted strings across
> > > lines, because it breaks the ability to grep for the string.  checkpatch
> > > already makes an exception to the 80-column rule for quoted strings to
> > > allow this.  Rather than just allowing it, actively warn about quoted
> > > strings split across lines.
[]
> > Seems sensible but there are also asm uses like:
> > arch/x86/include/asm/pvclock.h:         "xor  %5,%5    ; "
> >                 "add  %4,%%eax ; "
> I did see those, yes.  However, a quick grep through the kernel shows
> that those occur quite rarely compared to \n or \n\t; the latter looks
> like the standard style.  How about I provide a patch for
> Documentation/CodingStyle adding a chapter on inline assembly, with that
> and other style guidelines? :)

Sounds fine to me.

Another option might be to ignore all
strings in any asm block.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 20:22 ` Nick Bowler
  2012-02-02 21:16   ` Josh Triplett
@ 2012-02-02 21:29   ` Jesper Juhl
  2012-02-02 21:34     ` Joe Perches
  1 sibling, 1 reply; 22+ messages in thread
From: Jesper Juhl @ 2012-02-02 21:29 UTC (permalink / raw)
  To: Nick Bowler
  Cc: Josh Triplett, linux-kernel, Andy Whitcroft, Joe Perches,
	Paul E. McKenney

On Thu, 2 Feb 2012, Nick Bowler wrote:

> On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> > Documentation/CodingStyle recommends not splitting quoted strings across
> > lines, because it breaks the ability to grep for the string.  checkpatch
> > already makes an exception to the 80-column rule for quoted strings to
> > allow this.  Rather than just allowing it, actively warn about quoted
> > strings split across lines.
> [...]
> > +# Check for strings broken across lines (breaks greppability).  Make an

s/greppability/grepability/ ?


> > +# exception when the previous string ends in a newline (multiple lines in one
> > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > +# on the following line).
> 
> There are tons of strings in the kernel that this makes checkpatch warn
> about where it probably shouldn't.  For example, this one (from
> kernel/auditsc.c:1476):
> 
>   		audit_log_format(ab,
>   			"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
>   			"mq_msgsize=%ld mq_curmsgs=%ld",
> 
>   WARNING: quoted string split across lines
>   #1478: FILE: auditsc.c:1478:
>   +			"mq_msgsize=%ld mq_curmsgs=%ld",
> 
> Breaking "greppability" of this string is a non-issue, because this sort
> of string is not really greppable to begin with (and would certainly not
> be any easier to grep for if it were all on one line).
> 

Agreed. checkpatch needs to not warn about strings like this.
But generally a good idea to warn about "easily grepable strings broken 
into multiple lines" if possible, IMHO...

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 20:36 ` Joe Perches
@ 2012-02-02 21:28   ` Josh Triplett
  2012-02-02 21:32     ` Joe Perches
  0 siblings, 1 reply; 22+ messages in thread
From: Josh Triplett @ 2012-02-02 21:28 UTC (permalink / raw)
  To: Joe Perches; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney

On Thu, Feb 02, 2012 at 12:36:29PM -0800, Joe Perches wrote:
> On Thu, 2012-02-02 at 12:06 -0800, Josh Triplett wrote:
> > Documentation/CodingStyle recommends not splitting quoted strings across
> > lines, because it breaks the ability to grep for the string.  checkpatch
> > already makes an exception to the 80-column rule for quoted strings to
> > allow this.  Rather than just allowing it, actively warn about quoted
> > strings split across lines.
> []
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
> > @@ -1737,6 +1737,17 @@ sub process {
> []
> > +# Check for strings broken across lines (breaks greppability).  Make an
> > +# exception when the previous string ends in a newline (multiple lines in one
> > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > +# on the following line).
> > +		if ($line =~ /^\+\s*"/ &&
> > +		    $prevline =~ /"\s*$/ &&
> > +		    $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
> > +			WARN("SPLIT_STRING",
> > +			     "quoted string split across lines\n" . $herecurr);
> 
> Seems sensible but there are also asm uses like:
> 
> arch/x86/include/asm/pvclock.h:         "xor  %5,%5    ; "
>                 "add  %4,%%eax ; "

I did see those, yes.  However, a quick grep through the kernel shows
that those occur quite rarely compared to \n or \n\t; the latter looks
like the standard style.  How about I provide a patch for
Documentation/CodingStyle adding a chapter on inline assembly, with that
and other style guidelines? :)

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 20:22 ` Nick Bowler
@ 2012-02-02 21:16   ` Josh Triplett
  2012-02-02 22:08     ` Nick Bowler
  2012-02-02 21:29   ` Jesper Juhl
  1 sibling, 1 reply; 22+ messages in thread
From: Josh Triplett @ 2012-02-02 21:16 UTC (permalink / raw)
  To: Nick Bowler; +Cc: linux-kernel, Andy Whitcroft, Joe Perches, Paul E. McKenney

On Thu, Feb 02, 2012 at 03:22:07PM -0500, Nick Bowler wrote:
> On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> > Documentation/CodingStyle recommends not splitting quoted strings across
> > lines, because it breaks the ability to grep for the string.  checkpatch
> > already makes an exception to the 80-column rule for quoted strings to
> > allow this.  Rather than just allowing it, actively warn about quoted
> > strings split across lines.
> [...]
> > +# Check for strings broken across lines (breaks greppability).  Make an
> > +# exception when the previous string ends in a newline (multiple lines in one
> > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > +# on the following line).
> 
> There are tons of strings in the kernel that this makes checkpatch warn
> about where it probably shouldn't.  For example, this one (from
> kernel/auditsc.c:1476):
> 
>   		audit_log_format(ab,
>   			"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
>   			"mq_msgsize=%ld mq_curmsgs=%ld",
> 
>   WARNING: quoted string split across lines
>   #1478: FILE: auditsc.c:1478:
>   +			"mq_msgsize=%ld mq_curmsgs=%ld",
> 
> Breaking "greppability" of this string is a non-issue, because this sort
> of string is not really greppable to begin with (and would certainly not
> be any easier to grep for if it were all on one line).

While I agree that in that particular case (heavy on the %formats and
light on the text) you can't easily grep to begin with, the guideline
from CodingStyle still applies, as does the standard guideline about
checkpatch (namely "don't go globally fixing everything it says, but fix
it in new or changed code").

I certainly don't think joining those lines would *hurt*.  Making that
change blindly across the entire kernel doesn't seem worth it, but
changing it on new code seems like a good idea.

In theory checkpatch could try to heuristically ignore cases where the
split in the string occurs immediately before or after a %format, but I
don't fancy writing a regex to match valid printf format specifiers. :)

I still think this patch provides a net win, and I don't think the
example you mentioned represents a false positive.

- Josh Triplett

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 20:06 Josh Triplett
  2012-02-02 20:22 ` Nick Bowler
@ 2012-02-02 20:36 ` Joe Perches
  2012-02-02 21:28   ` Josh Triplett
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Perches @ 2012-02-02 20:36 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-kernel, Andy Whitcroft, Paul E. McKenney

On Thu, 2012-02-02 at 12:06 -0800, Josh Triplett wrote:
> Documentation/CodingStyle recommends not splitting quoted strings across
> lines, because it breaks the ability to grep for the string.  checkpatch
> already makes an exception to the 80-column rule for quoted strings to
> allow this.  Rather than just allowing it, actively warn about quoted
> strings split across lines.
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -1737,6 +1737,17 @@ sub process {
[]
> +# Check for strings broken across lines (breaks greppability).  Make an
> +# exception when the previous string ends in a newline (multiple lines in one
> +# string constant) or \n\t (common in inline assembly to indent the instruction
> +# on the following line).
> +		if ($line =~ /^\+\s*"/ &&
> +		    $prevline =~ /"\s*$/ &&
> +		    $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
> +			WARN("SPLIT_STRING",
> +			     "quoted string split across lines\n" . $herecurr);

Seems sensible but there are also asm uses like:

arch/x86/include/asm/pvclock.h:         "xor  %5,%5    ; "
                "add  %4,%%eax ; "




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] checkpatch: Check for quoted strings broken across lines
  2012-02-02 20:06 Josh Triplett
@ 2012-02-02 20:22 ` Nick Bowler
  2012-02-02 21:16   ` Josh Triplett
  2012-02-02 21:29   ` Jesper Juhl
  2012-02-02 20:36 ` Joe Perches
  1 sibling, 2 replies; 22+ messages in thread
From: Nick Bowler @ 2012-02-02 20:22 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-kernel, Andy Whitcroft, Joe Perches, Paul E. McKenney

On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> Documentation/CodingStyle recommends not splitting quoted strings across
> lines, because it breaks the ability to grep for the string.  checkpatch
> already makes an exception to the 80-column rule for quoted strings to
> allow this.  Rather than just allowing it, actively warn about quoted
> strings split across lines.
[...]
> +# Check for strings broken across lines (breaks greppability).  Make an
> +# exception when the previous string ends in a newline (multiple lines in one
> +# string constant) or \n\t (common in inline assembly to indent the instruction
> +# on the following line).

There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't.  For example, this one (from
kernel/auditsc.c:1476):

  		audit_log_format(ab,
  			"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
  			"mq_msgsize=%ld mq_curmsgs=%ld",

  WARNING: quoted string split across lines
  #1478: FILE: auditsc.c:1478:
  +			"mq_msgsize=%ld mq_curmsgs=%ld",

Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).

Cheers,
-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] checkpatch: Check for quoted strings broken across lines
@ 2012-02-02 20:06 Josh Triplett
  2012-02-02 20:22 ` Nick Bowler
  2012-02-02 20:36 ` Joe Perches
  0 siblings, 2 replies; 22+ messages in thread
From: Josh Triplett @ 2012-02-02 20:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andy Whitcroft, Joe Perches, Paul E. McKenney

Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string.  checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this.  Rather than just allowing it, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
        pr_err("this string"
               " should produce a warning\n");
        pr_err("this multi-line string\n"
               "should not produce a warning\n");
        asm ("this asm\n\t"
             "should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
+	       " should produce a warning\n");

total: 0 errors, 1 warnings, 9 lines checked

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---
 scripts/checkpatch.pl |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..ce4d740 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,17 @@ sub process {
 			     "line over 80 characters\n" . $herecurr);
 		}
 
+# Check for strings broken across lines (breaks greppability).  Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
+		if ($line =~ /^\+\s*"/ &&
+		    $prevline =~ /"\s*$/ &&
+		    $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+			WARN("SPLIT_STRING",
+			     "quoted string split across lines\n" . $herecurr);
+		}
+
 # check for spaces before a quoted newline
 		if ($rawline =~ /^.*\".*\s\\n/) {
 			WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
-- 
1.7.9


^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2012-03-21 12:11 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-20 21:06 [PATCH] checkpatch: Check for quoted strings broken across lines Josh Triplett
2012-03-21  1:24 ` Joe Perches
2012-03-21  4:28   ` Josh Triplett
2012-03-21  4:35     ` Joe Perches
2012-03-21  6:05       ` Josh Triplett
2012-03-21 12:11         ` Joe Perches
  -- strict thread matches above, loose matches on Subject: below --
2012-02-03  5:27 Josh Triplett
2012-02-03  5:38 ` Joe Perches
2012-02-03  8:55   ` Josh Triplett
2012-02-03  6:34 ` Josh Triplett
2012-02-02 20:06 Josh Triplett
2012-02-02 20:22 ` Nick Bowler
2012-02-02 21:16   ` Josh Triplett
2012-02-02 22:08     ` Nick Bowler
2012-02-03  1:31       ` Josh Triplett
2012-02-02 21:29   ` Jesper Juhl
2012-02-02 21:34     ` Joe Perches
2012-02-02 22:02       ` Josh Triplett
2012-02-02 20:36 ` Joe Perches
2012-02-02 21:28   ` Josh Triplett
2012-02-02 21:32     ` Joe Perches
2012-02-02 22:36       ` Josh Triplett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).