Another COBOL Source Code Comparison Issue

  • kengrubb
    Enthusiast
    • Jan 2006
    • 46

    Another COBOL Source Code Comparison Issue

    I'm using BC3 and the factory default for COBOL Source. I've discovered an Unimportant Difference, and I don't know why it's Unimportant. I inserted some spaces into a paragraph name, which is very definitely NOT Unimportant.

    ;^>

    Before
    Code:
    000000             CLOSE-ACCTL-EXIT.
    After
    Code:
    000000             CLOSE-     ACCTL-EXIT.
  • Aaron
    Team Scooter
    • Oct 2007
    • 16000

    #2
    Hello Ken,

    Our default COBOL file format would detect that whitespace as whitespace text. By default, whitespace is not important.

    Should that whitespace be swallowed by one of our other grammar element definitions? Or should whitespace be important? You can easily enable the latter in the Text Compare's Session menu -> Session Settings, Importance tab, by checking (checked items are Important) the default text: embedded whitespace, and optionally leading and trailing whitespace.
    Aaron P Scooter Software


    • kengrubb
      Enthusiast
      • Jan 2006
      • 46

      #3
      Hmmm. This is a subtle nuance.

      A difference in whitespace (1 space versus 2 or 20 or whatever) would be Unimportant.

      However, a change in whitespace (meaning the addition of whitespace where there was none, or the removal of whitespace where it existed) would be Important.

      I'm not sure how to represent that in the grammar.
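The distinction drawn above can be sketched outside of any grammar. This is a minimal Python illustration, not how Beyond Compare works internally; the function name `classify_whitespace_change` and the labels it returns are made up for this example. It treats a change as unimportant when both lines split into the same whitespace-separated tokens, and as important when whitespace was inserted or removed inside a token.

```python
def classify_whitespace_change(before: str, after: str) -> str:
    """Classify the difference between two lines (hypothetical helper)."""
    if before == after:
        return "same"
    tokens_before = before.split()
    tokens_after = after.split()
    if tokens_before == tokens_after:
        # Same tokens in the same order: only the amount of whitespace
        # between (or around) tokens changed, e.g. 1 space versus 20.
        return "unimportant"
    if "".join(tokens_before) == "".join(tokens_after):
        # Same characters, but token boundaries moved: whitespace was
        # added where there was none, or removed where it existed.
        return "important"
    # Non-whitespace text differs as well.
    return "text-changed"


print(classify_whitespace_change("MOVE A TO B.", "MOVE    A  TO B."))
print(classify_whitespace_change("CLOSE-ACCTL-EXIT.", "CLOSE-     ACCTL-EXIT."))
```

Note that this sketch lumps leading and trailing whitespace in with the unimportant case, which the Importance tab handles as separate settings.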


      • Michael Bulgrien
        Carpal Tunnel
        • Oct 2007
        • 1772

        #4
        Personally I agree that a difference in embedded whitespace and the insertion of whitespace where none exists on the other side are two different scenarios and should be separately configurable. I agree that the whitespace in your example is significant. I also agree that you should not have to make embedded whitespace important to get it to show up.

        In my opinion, the importance tab should have an additional category called "Inserted whitespace" that is important by default.
        BC v4.0.7 build 19761
        ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯


        • Aaron
          Team Scooter
          • Oct 2007
          • 16000

          #5
          Hello Ken,

          So the above change would be unimportant if the whitespace was increased or decreased, but is important going from none to some, or some to none?
          Aaron P Scooter Software


          • kengrubb
            Enthusiast
            • Jan 2006
            • 46

            #6
            Hi Aaron,

            Perfectly stated.


            • Aaron
              Team Scooter
              • Oct 2007
              • 16000

              #7
              Hello Ken,

              Would it be possible to get a specific example file emailed to [email protected] ? Please also include a link back to this forum thread for our reference.

              Would "CLOSE-ACCTL-EXIT" be an identifier? One solution would be to add it as an Identifier grammar.
              Aaron P Scooter Software


              • kengrubb
                Enthusiast
                • Jan 2006
                • 46

                #8
                Well, that probably would not work. In this example, CLOSE-ACCTL-EXIT is a paragraph name, and in the non-COBOL world that's approximately equivalent to a function name or subroutine name. It could be any combination of A-Z, a-z, 0-9, dash (-), underscore (_) in some COBOL variants, and perhaps other characters as well.


                • Aaron
                  Team Scooter
                  • Oct 2007
                  • 16000

                  #9
                  Hello Ken,

                  What if you define a new grammar element, Identifier, and match the regular expression:
                  [_A-Za-z][_A-Za-z0-9-]*

                  If marked as important, the line should then be detected as having an important difference (though no particular text will be red), and it will show as blue if both lines have whitespace (of variable amount) in the same location.

                  This would assume a digit or dash cannot be the first character. Would that match with COBOL standards? If you have any reference material, we would appreciate it.
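To see why an Identifier element makes the inserted spaces show up, the pattern can be tried on the two sample lines from the first post. This is only a sketch using Python's `re` module, which is an assumption on my part: Beyond Compare's own regex engine may differ, although the pattern uses nothing beyond character classes and `*`.

```python
import re

# The Identifier pattern suggested above.
IDENTIFIER = re.compile(r"[_A-Za-z][_A-Za-z0-9-]*")

before = "000000             CLOSE-ACCTL-EXIT."
after = "000000             CLOSE-     ACCTL-EXIT."

# With no whitespace, the paragraph name is a single identifier token.
print(IDENTIFIER.findall(before))  # ['CLOSE-ACCTL-EXIT']

# With inserted whitespace, it splits into two different tokens, so the
# difference is no longer just whitespace between identical tokens.
print(IDENTIFIER.findall(after))   # ['CLOSE-', 'ACCTL-EXIT']
```

The leading `000000` is skipped because the pattern does not allow a digit as the first character.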
                  Aaron P Scooter Software


                  • kengrubb
                    Enthusiast
                    • Jan 2006
                    • 46

                    #10
                    That does appear to work. I used it without the underscore for my variant of COBOL (Unisys 2200).

                    [A-Za-z][A-Za-z0-9-]*

                    As for COBOL reference material, it will vary from one platform to the next. There are a couple of major differences and several minor differences between the various dialects of COBOL.

                    One of the major differences, for HP NonStop COBOL and OpenCOBOL, would be a Comment pattern of ^\*.* and no Line Numbers. I suspect COBOL.NET would also fall under this standard.

                    Also, COBOL 74 and COBOL 85 standards have different keyword lists. The BC3 COBOL keyword list appears to be COBOL 85, but with some of the hardware specific nuances included.


                    • Aaron
                      Team Scooter
                      • Oct 2007
                      • 16000

                      #11
                      Hello,

                      Several aspects of our COBOL rule have been tweaked by customer suggestions and examples. Do you think some of your comments could be applied to our general COBOL rule without causing issues for users using the other variants of COBOL? If not, that's ok; I just like to try and improve our rules when and where we can.
                      Aaron P Scooter Software


                      • kengrubb
                        Enthusiast
                        • Jan 2006
                        • 46

                        #12
                        I've been tweaking at the rule again.

                        First Line Number is now Text matching ^.{6}
                        Second Line Number is now Column from 73 to end of line
                        Identifier is Text matching [A-Za-z][A-Za-z0-9-]*

                        Perhaps you'd want to add an Open COBOL Rule (for OpenCOBOL, HP NonStop, .NET, and other more modern applications of COBOL)

                        For that rule:
                        Remove Line Numbers
                        Identifier is Text matching [_A-Za-z][_A-Za-z0-9-]*
                        Comment is Text matching ^\*.*
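For readers who have not worked with fixed-format COBOL, the two Line Number definitions above carve up a line roughly as follows. This is a hedged Python sketch; `split_fixed_format` is a hypothetical helper, and the sample line with `ACCT0001` in columns 73 to 80 is invented for illustration.

```python
def split_fixed_format(line: str):
    """Split a fixed-format COBOL line into its three areas (sketch)."""
    line = line.rstrip("\n")
    sequence = line[:6]   # columns 1-6, what ^.{6} matches
    body = line[6:72]     # columns 7-72, the text that is actually compared
    ident = line[72:]     # column 73 to end of line
    return sequence, body, ident


# Build an 80-column sample: sequence number, code, identification area.
sample = ("000000" + " " * 13 + "CLOSE-ACCTL-EXIT.").ljust(72) + "ACCT0001"
sequence, body, ident = split_fixed_format(sample)
print(repr(sequence))      # '000000'
print(repr(body.strip()))  # 'CLOSE-ACCTL-EXIT.'
print(repr(ident))         # 'ACCT0001'
```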


                        • Aaron
                          Team Scooter
                          • Oct 2007
                          • 16000

                          #13
                          And leave String, Keyword, Number, and Operator as is, correct?

                          Is there any other form of comment, or are they only a * at the beginning of the line?

                          Would Open COBOL have the same extensions as traditional COBOL? *.cbl;*.cob;*.cpy*

                          UPDATE: Here's an update for the identifier
                          [a-z0-9](-?[a-z0-9])*
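A quick way to see what the updated pattern accepts is to run it against a few candidate names. The sketch below uses Python's `re` and assumes case-insensitive matching, since the pattern lists only lowercase letters; whether the rule itself is configured that way is not stated in this thread.

```python
import re

# Updated Identifier pattern; IGNORECASE is an assumption for this sketch.
IDENTIFIER = re.compile(r"[a-z0-9](-?[a-z0-9])*", re.IGNORECASE)

for name in ["CLOSE-ACCTL-EXIT", "100-INIT", "CLOSE-", "-CLOSE", "A--B"]:
    ok = IDENTIFIER.fullmatch(name) is not None
    print(f"{name!r}: {'identifier' if ok else 'not an identifier'}")

# On the edited line, the dangling hyphen is left outside both tokens.
tokens = [m.group(0) for m in IDENTIFIER.finditer("CLOSE-     ACCTL-EXIT.")]
print(tokens)  # ['CLOSE', 'ACCTL-EXIT']
```

Compared with the earlier pattern, this one allows a leading digit but rejects a leading hyphen, a trailing hyphen, and two hyphens in a row. It also matches an all-digit string such as `123`, so how it interacts with the Number element may be worth checking.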
                          Last edited by Aaron; 30-Nov-2011, 04:33 PM. Reason: UPDATE
                          Aaron P Scooter Software


                          • kengrubb
                            Enthusiast
                            • Jan 2006
                            • 46

                            #14
                            All looks good.

                            On the Unisys 2200, and perhaps on some of the other mainframes, a forward slash (/) is a comment that I believe produces a page advance (in greenbar printers).

                            The letter D can also serve as a Debug comment. If the program is compiled with the Debug option turned on, then those Debug commented lines of code are uncommented for that compilation.

                            Open COBOL also uses *> to denote a comment from that point to the end of the line. Similarly, Open COBOL uses >>D to denote a Debug comment from that point to the end of the line.
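The comment forms mentioned so far can be summarised in a small classifier. This is a naive Python sketch with a hypothetical function name, `comment_kind`. For fixed-format source it assumes the `*`, `/`, and `D` indicators sit in column 7, as in standard fixed-format COBOL; the posts above do not say which column. For free-format source it does not check whether `*>` falls inside a string literal.

```python
def comment_kind(line: str, fixed_format: bool = True):
    """Return a label for a comment line, or None (naive sketch)."""
    if fixed_format:
        indicator = line[6:7]  # column 7 (assumed indicator position)
        if indicator == "*":
            return "comment"
        if indicator == "/":
            return "page-advance comment"
        if indicator in ("D", "d"):
            return "debug"
        return None
    # Free-format source: no sequence numbers.
    if line.startswith("*"):
        return "comment"         # the ^\*.* pattern
    if line.lstrip().upper().startswith(">>D "):
        return "debug"           # >>D debug line
    if "*>" in line:
        return "inline comment"  # *> to end of line
    return None


print(comment_kind("000100* remark"))
print(comment_kind("000300D    DISPLAY 'X'."))
print(comment_kind("MOVE A TO B. *> note", fixed_format=False))
```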

                            Open COBOL would very likely use the same extensions, but COBOL programmers can be very tricky cats to herd. I have seen .cob, .cbl, .c85 (for COBOL 85), .c74 (for COBOL 74), and .cpy (approximately equivalent to #include).


                            • kengrubb
                              Enthusiast
                              • Jan 2006
                              • 46

                              #15
                              One more tweak.

                              Line Number=Text matching ^.{1,6}
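The practical difference between `^.{6}` and `^.{1,6}` shows up on lines shorter than six characters. A small check with Python's `re` (an assumption, since Beyond Compare's engine may differ in details):

```python
import re

STRICT = re.compile(r"^.{6}")     # earlier Line Number pattern
RELAXED = re.compile(r"^.{1,6}")  # tweaked Line Number pattern

short_line = "0001"
full_line = "000000             CLOSE-ACCTL-EXIT."

# A line with fewer than six characters is matched only by the relaxed form.
print(STRICT.match(short_line))            # None
print(RELAXED.match(short_line).group(0))  # 0001

# On a full-length line both patterns pick up the same six characters,
# because the {1,6} quantifier is greedy.
print(STRICT.match(full_line).group(0))    # 000000
print(RELAXED.match(full_line).group(0))   # 000000
```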
