

ISA parser: Match /* */ and // style comments.

Review Request #841 - Created Sept. 5, 2011 and submitted

Information
Submitter: Gabe Black
Repository: gem5
Reviewers
Groups: Default
People: ali, gblack, nate, stever
ISA parser: Match /* */ and // style comments.

Comments should not be scanned for operands, and we should look for both /* */
style and // style. The regular expression is still not quite right because it
doesn't handle comments in strings, but it's closer.
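
To illustrate the "comments in strings" caveat, here is a minimal sketch. The pattern is a hypothetical naive one written for this example, not the regex from the diff, and the snippet text is made up. It shows how a comment regex that knows nothing about string literals truncates a string containing `//`.

```python
import re

# Hypothetical naive pattern (NOT the one from the diff): it matches
# /* */ and // comments but has no notion of string literals.
naive_comment_re = re.compile(r'/\*.*?\*/|//[^\n]*', re.DOTALL)

# Made-up C snippet: one real // comment, one string containing "//".
snippet = 'Rd = Rs1 + Rs2; // add\nurl = "http://example.com"; /* done */\n'

stripped = naive_comment_re.sub('', snippet)
print(repr(stripped))
```

The second line comes out as `url = "http:`, because the `//` inside the string is treated as the start of a comment and the rest of the line is dropped.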

   
Posted (Sept. 6, 2011, 3:07 a.m.)
I did some searching and these look promising, although I haven't tested them:
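import re

# Option 1: one regex matching a /* */ or // comment plus the horizontal
# whitespace around it; groups 1 and 3 record whether the match touches
# the start or end of a line.
comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)

or

# Option 2: match comments and string/char literals in one pass, and
# delete only the comments.
def comment_remover(text):
    def replacer(match):
        s = match.group(0)
        if s.startswith('/'):
            # A comment: drop it.
            return ""
        else:
            # A quoted literal: keep it unchanged.
            return s
    pattern = re.compile(
        r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
        re.DOTALL | re.MULTILINE
    )
    return re.sub(pattern, replacer, text)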
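
A quick, hedged check of the second snippet, with the pattern copied so the example runs on its own. The input text is made up for illustration. It is only a sketch of the expected behavior, not a test of the actual ISA parser code.

```python
import re

# Same pattern as the second suggestion, copied so this runs standalone.
_pattern = re.compile(
    r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
    re.DOTALL | re.MULTILINE
)

def comment_remover(text):
    # Matches starting with '/' are comments and are dropped; matches
    # starting with a quote are literals and are returned unchanged.
    return _pattern.sub(
        lambda m: "" if m.group(0).startswith('/') else m.group(0), text)

# Made-up input: a string that looks like a comment, plus two real comments.
snippet = 'msg = "// not a comment"; // real comment\nRd = Rs1 /* tmp */ + Rs2;\n'
result = comment_remover(snippet)
print(repr(result))
```

The string literal survives intact because the quoted-string alternatives consume it before the `//` alternative gets a chance to match inside it. Note that comments are replaced with an empty string, so whitespace on either side of a removed comment is left as is.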
Posted (Sept. 6, 2011, 4:40 a.m.)
Out of curiosity, are comments not being handled by PLY?  If that's the case, why?  Check out slicc/parser.py to see how to deal with comments.
  1. We're talking about comments in the C snippets that the ISA parser handles via regexes, not the ISA DSL itself.
Posted (Sept. 7, 2011, 5:09 a.m.)



  
src/arch/isa_parser.py (Diff revision 1)
 
 
re.MULTILINE only affects how '^' and '$' are interpreted, not '.'.  You need DOTALL to make '.' match newlines (which in turn breaks '//.*\n', since the greedy '.*' can then run past the end of the line).  Based on this, the regex that Ali posted looks more reasonable.
  1. I should say "more likely to be correct", not "more reasonable"... by some definitions that regex is rather unreasonable... :-)
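
The flag behavior described above can be sketched as follows. The `'//.*\n'` piece is the one quoted in the comment. The `/\*.*?\*/` alternative and the sample text are assumptions added for illustration, since the full regex from the diff is not shown here.

```python
import re

# Made-up input: a two-line /* */ comment, then a // comment, then code.
text = "a /* one\ntwo */ b // tail\nc\n"

# MULTILINE alone: '.' still stops at '\n', so the two-line /* */
# comment is never matched.
multiline_only = re.compile(r'/\*.*?\*/|//.*\n', re.MULTILINE)

# DOTALL: '.' now matches '\n', so the greedy '//.*\n' runs all the way
# to the last newline and swallows the code after the comment.
with_dotall = re.compile(r'/\*.*?\*/|//.*\n', re.DOTALL)

out_multiline = multiline_only.sub('', text)
out_dotall = with_dotall.sub('', text)
print(repr(out_multiline))
print(repr(out_dotall))
```

With MULTILINE only, the block comment is left in place. With DOTALL, the block comment is removed but the `//` alternative also deletes the following line `c`. Using `[^\n]*` or a lazy `.*?$` for the `//` case, as in the regexes posted earlier in the thread, avoids that.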