From: Junio C Hamano
Subject: Re: [PATCH v3 00/15] ref-filter: use parsing functions
Date: Thu, 07 Jan 2016 10:43:41 -0800
References: <1451980994-26865-1-git-send-email-Karthik.188@gmail.com>
In-Reply-To: (Karthik Nayak's message of "Thu, 7 Jan 2016 19:55:41 +0530")
To: Karthik Nayak
Cc: Eric Sunshine, Git List
X-Mailing-List: git@vger.kernel.org

Karthik Nayak writes:

>> I don't understand the difficulty. It should be easy to manually skip
>> the 'deref' for this one particular case:
>>
>>     const char *name = atom->name;
>>     if (*name == '*')
>>         name++;
>>
>> Which would allow this unnecessarily complicated code from patch 14/15:
>>
>>     if (match_atom_name(atom->name, "subject", &buf) && !buf) {
>>         ...
>>         return;
>>     } else if (match_atom_name(atom->name, "body", &buf) && !buf) {
>>         ...
>>         return;
>>     } if (!match_atom_name(atom->name, "contents", &buf))
>>         die("BUG: parsing non-'contents'");
>>
>> to be simplified to the more easily understood form suggested during
>> review[1] of v2:
>>
>>     if (!strcmp(name, "subject")) {
>>         ...
>>         return;
>>     } else if (!strcmp(name, "body")) {
>>         ...
>>         return;
>>     } else if (!match_atom_name(name,"contents", &buf))
>>         die("BUG: expected 'contents' or 'contents:'");
>>
>> You could also just use (!strcmp("body") || !strcmp("*body")) rather
>> than skipping "*" manually, but the repetition makes that a bit
>> noisier and uglier.
>>
>> [1]: http://article.gmane.org/gmane.comp.version-control.git/282645
>
> Definitely not a difficulty per se. Just that it seems like something
> match_atom_name() seems to be fit for. As the function name suggests
> that we're matching the atom name and the check for '!buf' indicates
> that no options are to be included for that particular atom.
>
> Also after Junio's suggestion[1], I think It looks better now[2]. But
> either ways, I'm not strongly against what you're saying, so my opinion
> on this matter is quite flexible.
>
> [1]: http://article.gmane.org/gmane.comp.version-control.git/283404
> [2]: http://article.gmane.org/gmane.comp.version-control.git/283449

Sorry, but I suspect I was looking at a leaf function without
thinking about the larger picture.

I suspect that the interface to the customized parsing functions
called by parse_ref_filter_atom() is misdesigned.

I understand that the overall parsing that starts at
verify_ref_format() goes like this:

 * Iterate over the string and find a matching "%(",")" pair.

   - For each such pair found, use parse_ref_filter_atom() on what
     is inside that matching pair.

   - parse_ref_filter_atom() iterates over the table of known
     atoms, and finds the entry in that table. Note that at this
     point, it knows that "%(" is followed by 'contents' or
     'contents:' when it picked the "contents" atom from the table,
     for example.

   - If the entry we found in that table for the atom being parsed
     has a custom parse function, that function is called, but the
     calling convention does not pass the fact that we already know
     that what we are seeing inside the "%(",")" pair is 'contents',
     for example, and that we know what argument it is given, if
     any.

So it appears to me that match_atom_name() is a misguided helper
function that you shouldn't have to use too often.

If the signature of parse() functions is changed to take not just
the atom but the pointer to its argument (could be NULL, if we are
seeing "%(contents)", for example) that is already available as
"formatp" in the function, then contents_atom_parser() could become
more like:

    contents_atom_parser(struct used_atom *atom, const char *arg)
    {
        if (!arg)
            atom->u.contents.option = C_BARE;
        else if (!strcmp(arg, "body"))
            atom->u.contents.option = C_BODY;
        ...
    }

and there is no reason for this caller to even look at atom->name
or worry that it might have the dereferencing asterisk in front.

If we really want to avoid having separate subject_atom_parser() and
body_atom_parser(), they can be folded into the same function and it
becomes necessary to switch on atom->name like you did in the code
being discussed in the quoted part above. For that, as Eric said,
skipping '*' manually would not be a big deal, as that should not
happen so often in the code _anyway_.

It is not a good idea to switch on atom->name inside
contents_atom_parser() like you did. You are better off having
separate {subject,body}_atom_parser() functions.

For one thing, you are not reusing or sharing any code by squishing
these three functions into one.

A conceptually larger problem is that you are adding two extra
!strcmp() calls to figure out what the caller _already_ knows
(notice I said this is "conceptual", this is not about performance).
parse_ref_filter_atom() knows that it is a "%(subject)" or
"%(subject:...)" atom, but because you throw away that information
and call contents_atom_parser() by saying that it is one of the
contents, subject or body, the called function has to redo strcmp
in order to figure it out itself.
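
To make the calling convention proposed above concrete, here is a
minimal, self-contained sketch. It is not the actual ref-filter code:
the struct layout, the atom_table[] array, parse_atom() (a stand-in
for parse_ref_filter_atom()), and the C_SUB and C_UNKNOWN values are
all assumptions made up for illustration. The only point it tries to
show is that the table lookup already knows which atom matched and
where its argument starts, so it can hand "arg" to a per-atom parser
that never has to look at atom->name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the types in this thread. */
enum contents_option { C_BARE, C_BODY, C_SUB, C_UNKNOWN };

struct used_atom {
	const char *name;            /* text inside "%(" ")", may start with '*' */
	enum contents_option option; /* the thread uses atom->u.contents.option */
};

typedef void (*atom_parser_fn)(struct used_atom *atom, const char *arg);

/* arg is NULL for "contents" and points past the ':' for "contents:body". */
static void contents_atom_parser(struct used_atom *atom, const char *arg)
{
	if (!arg)
		atom->option = C_BARE;
	else if (!strcmp(arg, "body"))
		atom->option = C_BODY;
	else
		atom->option = C_UNKNOWN; /* assumption: record it instead of dying */
}

/* Separate parsers: neither needs strcmp() to learn which atom it is. */
static void subject_atom_parser(struct used_atom *atom, const char *arg)
{
	(void)arg; /* this sketch ignores any argument to "subject" */
	atom->option = C_SUB;
}

static void body_atom_parser(struct used_atom *atom, const char *arg)
{
	(void)arg; /* this sketch ignores any argument to "body" */
	atom->option = C_BODY;
}

/* Hypothetical table of known atoms and their parse functions. */
static const struct {
	const char *name;
	atom_parser_fn parser;
} atom_table[] = {
	{ "subject", subject_atom_parser },
	{ "body", body_atom_parser },
	{ "contents", contents_atom_parser },
};

/* Look up the atom once, then pass the already-known argument along.
 * Returns 0 on success and -1 when the atom is not in the table. */
static int parse_atom(struct used_atom *atom, const char *spec)
{
	const char *sp = spec;
	const char *colon;
	size_t len, i;

	assert(atom && spec);
	atom->name = spec;
	if (*sp == '*') /* the deref marker is skipped here, in one place */
		sp++;
	colon = strchr(sp, ':');
	len = colon ? (size_t)(colon - sp) : strlen(sp);

	for (i = 0; i < sizeof(atom_table) / sizeof(atom_table[0]); i++) {
		if (strlen(atom_table[i].name) == len &&
		    !memcmp(atom_table[i].name, sp, len)) {
			atom_table[i].parser(atom, colon ? colon + 1 : NULL);
			return 0;
		}
	}
	return -1;
}
```

With this shape, parse_atom(&atom, "*subject") reaches
subject_atom_parser() without any parser comparing names again, and
parse_atom(&atom, "contents:body") hands the string "body" to
contents_atom_parser() as its argument.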