Help with alignment

  • Sharlikran
    Enthusiast
    • Jan 2014
    • 20

    Help with alignment

    I am comparing two text files. One file just uses a custom file format. It's a text file basically. The other uses YAML 1.2, with custom condition string syntax modeled after Python's condition expressions.

    On one side I will have a file name like

    Essential Whiterun horses.esp

    The other side will have

    - name: 'Essential Whiterun horses.esp'

    They don't always line up.

    On one side I will have a string of

    BEGINGROUP: Official ESMs

    The other side will have

    # BEGINGROUP: Official ESMs

    They don't always line up.

    I added 'BEGINGROUP:' and 'ENDGROUP:' as strings in the Rules under Importance. I also tried setting one side as XML and the other as RTF, then adding '# BEGINGROUP:' to the XML grammar and 'BEGINGROUP:' to the RTF grammar. However, no matter what I try, those lines don't match up. One text file has over 80,000 lines and the other has over 50,000. Matching lines can be more than 6000 lines apart.

    Is there a way to help them line up better?
  • Aaron
    Team Scooter
    • Oct 2007
    • 15997

    #2
    Hello,

    The key for file name alignment would be to determine how the file names are different. File names only align if they have the exact same name. Sometimes, whitespace or an odd character can cause them to not align, so I recommend selecting a file, right-click, and use Copy Filename, then paste into a Text Compare to compare the text of the two file names you expect to align.

    For the text within a file, this depends on several factors including the defined grammar elements, line weights, and the Alignment (Session Setting) options. Here you can alter the skew to increase it, or use the Alternate method. You can also use the right-click Align With option to manually align selected text on one side with another.

    How do these tips help improve your current Folder Compare alignment, or the Text Compare text alignment?
    Aaron P Scooter Software

    Comment

    • Sharlikran
      Enthusiast
      • Jan 2014
      • 20

      #3
      Originally posted by Aaron
      Hello,
      For the text within a file, this depends on several factors including the defined grammar elements, line weights, and the Alignment (Session Setting) options. Here you can alter the skew to increase it, or use the Alternate method. You can also use the right-click Align With option to manually align selected text on one side with another.

      ...Snip, I am comparing within a text file..., or the Text Compare text alignment?
      I clicked on the icon that looks like a baseball referee (Rules), and then I added 'BEGINGROUP:' and 'ENDGROUP:' as strings in the Rules under Importance -> Grammar. On the Alignment tab I increased the skew from 2000 to 4000 and tried the alternate method, but it didn't force the alignment of the 'BEGINGROUP:' and 'ENDGROUP:' tags.

      I can search the document and use the Align With option, but since 'BEGINGROUP:' and 'ENDGROUP:' are distinctive I was hoping that at least they would line up if I added them to the grammar section.

      Should those be Strings, or should I choose Environment Variable, Comment, or Label? Would a choice other than String help?

      Comment

      • Aaron
        Team Scooter
        • Oct 2007
        • 15997

        #4
        Hello,

        Defining the grammar only helps the algorithm. How far apart are the lines you are trying to match on? You may need to increase the Skew further (you can type a value in). Have you defined a Line Weight (in the File Format) or tried the Alternate Alignment Method in the Alignment tab of the Session Settings (also as the Rules/referee button on the toolbar)?
        Aaron P Scooter Software

        Comment

        • Sharlikran
          Enthusiast
          • Jan 2014
          • 20

          #5
          Originally posted by Aaron
          Hello,

          Defining the grammar only helps the algorithm. How far apart are the lines you are trying to match on?
          6000-7000

          Originally posted by Aaron
          You may need to increase the Skew further (you can type a value in).
          I'll try that.

          Originally posted by Aaron
          or tried the Alternate Alignment Method in the Alignment tab of the Session Settings (also as the Rules/referee button on the toolbar)?
          Yes.

          Originally posted by Aaron
          Have you defined a Line Weight (in the File Format)
          No. Not sure where to go to do that.

          Comment

          • Aaron
            Team Scooter
            • Oct 2007
            • 15997

            #6
            Line Weights are defined in the Tools menu -> File Formats, select your format, and in the Grammar section, below the list of the grammar elements is the list of the Line Weights.

            The Skew Tolerance should probably also then be 7000+ to increase the search range to be sufficiently large.
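A rough way to check how far apart the tags really are before picking a Skew Tolerance, sketched in Python outside Beyond Compare. It assumes the skew corresponds roughly to the difference in line numbers between matching lines, and that the tags look like the BEGINGROUP:/ENDGROUP: lines quoted earlier in the thread, with an optional "# " in front. The `ANCHOR` pattern and both function names are made up for this illustration.

```python
import re

# Hypothetical anchor pattern: an optional "# " comment prefix, then the group tag.
ANCHOR = re.compile(r"^\s*(?:#\s*)?((?:BEGIN|END)GROUP:.*?)\s*$")


def anchor_positions(lines):
    """Map each tag's text (without any '# ' prefix) to its first 1-based line number."""
    positions = {}
    for number, line in enumerate(lines, start=1):
        match = ANCHOR.match(line)
        if match:
            positions.setdefault(match.group(1), number)
    return positions


def max_anchor_distance(left_lines, right_lines):
    """Largest line-number gap between tags that appear on both sides."""
    left = anchor_positions(left_lines)
    right = anchor_positions(right_lines)
    shared = left.keys() & right.keys()
    return max((abs(left[tag] - right[tag]) for tag in shared), default=0)
```

Reading both files with `open(path).readlines()` and passing the two lists to `max_anchor_distance` gives a number to compare against the skew setting; if it comes out near 7000, a skew below that is unlikely to be enough.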
            Aaron P Scooter Software

            Comment

            • Sharlikran
              Enthusiast
              • Jan 2014
              • 20

              #7
              Originally posted by Aaron
              Line Weights are defined in the Tools menu -> File Formats, select your format, and in the Grammar section, below the list of the grammar elements is the list of the Line Weights.

              The Skew Tolerance should probably also then be 7000+ to increase the search range to be sufficiently large.
              I'll try that and see if it helps.

              Thanks for the assistance so far.

              Comment

              • Sharlikran
                Enthusiast
                • Jan 2014
                • 20

                #8
                Originally posted by Sharlikran
                I'll try that and see if it helps.

                Thanks for the assistance so far.
                Nope, it's not really doing the trick either. If I change the file and switch between files, once the view changes it has to recalculate the whole file again, so each time it does that it loses the manual alignment I assigned. I'm asking a lot of the program to line up text more than 6000 lines apart, but BEGINGROUP: and ENDGROUP: appear fewer than 200 times in 80,000 lines. They look like 'BEGINGROUP: Base Q', so there is text after the tag.

                Each file has exactly the same number of tags. I meticulously put them in each file to try this weighted line alignment, but I don't see a way to set weights that will make BC align 'BEGINGROUP: Base Q' exactly, no exceptions, and then use fuzzy logic for the rest of the file.

                Oh is there a way to say, match text at the beginning of the line? Because the labels are at the beginning of a line always.

                For now with all the weights, and adding the file as a file format, and the labels as priority five, the program matches this line "ENDGROUP: Items K" with this line " - name: Ket_WEAPONIZER.esp" which isn't even remotely the same. I am expecting it to match "ENDGROUP: Items K" with "ENDGROUP: Items K".

                I feel like adding the weights is messing up the logic of the program, or I have to do something like scripting or something complex to make the program say match up lines with this text, and this text 100%, and then for the rest just do your best to find a close match.
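To make the idea concrete, here is a minimal Python sketch of "pair the tags first, then do your best in between". It is not how Beyond Compare works internally; it only illustrates the behaviour being asked for. It assumes both files carry the same tags in the same order, that the YAML side differs only by a leading "# " or "- name: " and optional quotes, and it uses an ordinary exact-line diff (`difflib.SequenceMatcher`) between tags rather than true fuzzy matching. The names `PREFIX`, `ANCHOR`, `normalize` and `align` are invented for the example.

```python
import difflib
import re

# Assumed prefixes that only one format carries: "# " comments and "- name: " keys.
PREFIX = re.compile(r"^\s*(?:#\s*|-\s*name:\s*)?")
ANCHOR = re.compile(r"^(?:BEGIN|END)GROUP:")


def normalize(line):
    """Drop format-specific prefixes and quotes so equivalent lines compare equal."""
    return PREFIX.sub("", line.strip(), count=1).strip("'\"").strip()


def align(left_lines, right_lines):
    """Return (left_index, right_index) pairs: tags first, then a diff per segment."""
    left = [normalize(line) for line in left_lines]
    right = [normalize(line) for line in right_lines]
    left_anchors = [i for i, text in enumerate(left) if ANCHOR.match(text)]
    right_anchors = [i for i, text in enumerate(right) if ANCHOR.match(text)]
    if [left[i] for i in left_anchors] != [right[i] for i in right_anchors]:
        raise ValueError("tag lines differ between the two files")

    pairs = []
    left_start = right_start = 0
    # A sentinel pair past the end makes the loop handle the final segment too.
    boundaries = list(zip(left_anchors, right_anchors)) + [(len(left), len(right))]
    for left_end, right_end in boundaries:
        matcher = difflib.SequenceMatcher(
            None, left[left_start:left_end], right[right_start:right_end], autojunk=False
        )
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                pairs.append((left_start + block.a + offset, right_start + block.b + offset))
        if left_end < len(left):
            pairs.append((left_end, right_end))  # the tag pair itself
        left_start, right_start = left_end + 1, right_end + 1
    return pairs
```

Because each diff only ever sees the lines between two paired tags, a tag can never be matched to a plugin line, no matter how far apart the two sides have drifted.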
                Last edited by Sharlikran; 15-Jan-2014, 12:25 AM.

                Comment

                • Zoë
                  Team Scooter
                  • Oct 2007
                  • 2666

                  #9
                  Can you email your files to [email protected]? There may be something obvious that we'll be able to catch by seeing them in person.

                  Line weights only apply to lines that match exactly (possibly with unimportant differences), and only using the standard alignment. To force an alignment of BEGINGROUP/ENDGROUP, set up your file formats so that everything else on those lines is unimportant, then set the line weights for those strings to the maximum value you can (5000?).
                  Zoë P Scooter Software

                  Comment

                  • Sharlikran
                    Enthusiast
                    • Jan 2014
                    • 20

                    #10
                    Finally, I figured it out. I'm too excited at the moment to post. I want to actually work on the documents. I'll post tomorrow exactly what I did. Granted, I did use your suggestions, but maybe I wasn't being clear enough about my expectations.

                    Comment

                    • Sharlikran
                      Enthusiast
                      • Jan 2014
                      • 20

                      #11
                      Never mind, it didn't work after all.

                      I still can't get this line " - name: CNDragonbornSort.esp" to match up with this line "CNDragonbornSort.esp"

                      Nor will it match "# ENDGROUP: Overrides B" with "ENDGROUP: Overrides B"

                      To me, when I added things to the grammar box it seemed to only change colors in the document to highlight keywords. It was implied that it would affect the alignment. Will it?

                      The Line Weights box does not have the settings that the grammar box has. I wish it did. I would love to set up a line weight matching from [beginofline]" - name: " to ".esp" and add a weight to it. Then, in the other file I am comparing it to, I would add one from [beginofline] to ".esp", so that any line with the same text lines up without manually aligning things as I go.

                      The problem is that the files can't be worked on by anyone else. If changes take place, BC tells you the file changed and asks whether you want to reload, and once you do, the alignment is all forgotten.
                      Last edited by Sharlikran; 15-Jan-2014, 09:54 AM.

                      Comment

                      • Sharlikran
                        Enthusiast
                        • Jan 2014
                        • 20

                        #12
                        Here is what I set up so far.

                        For Text Compare - Session Settings, Keywords is checked. As I am writing this I can see that I might be able to add things to that section and then check them. I will have to try that also.

                        Also in Text Compare - Session Settings Alignment is set to 3000, use closeness matching.

                        Under Tools File Formats I have a format named TXT.

                        Line Weights:

                        Match Case, RegEx, Priority 5, and the text is [^BEGINGROUP.*] without the brackets.
                        Match Case, RegEx, Priority 5, and the text is [^ENDGROUP.*] without the brackets.
                        Match Case, RegEx, Priority 2, and the text is [^.*\.esp] without the brackets.
                        Match Case, RegEx, Priority 2, and the text is [^.*\.esm] without the brackets.

                        In the Grammar section I have

                        Keyword in list: BEGINGROUP, ENDGROUP
                        Comment Text from // to end of line
                        Comment Text from /* to */

                        Under Tools File Formats I have a format named YAML.

                        Line Weights:

                        Match Case, RegEx, Priority 5, and the text is [^BEGINGROUP.*] without the brackets.
                        Match Case, RegEx, Priority 5, and the text is [^ENDGROUP.*] without the brackets.
                        Match Case, RegEx, Priority 2, and the text is [^.*\.esp] without the brackets.
                        Match Case, RegEx, Priority 2, and the text is [^.*\.esm] without the brackets.

                        In the Grammar section I have

                        Keyword in list: BEGINGROUP, ENDGROUP
                        Comment Text from # to end of line

                        ---------------------------------------------------

                        I want it to match BEGINGROUP and ENDGROUP up no matter what, like, 100% of the time. Then the file names 2nd, then anything else.
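One thing worth checking in the setup above is the "^" at the start of each pattern. The small Python sketch below uses Python's `re` module, which is an assumption: Beyond Compare's regex engine may differ in details, though "^" meaning start of line is standard. The sample lines are the ones quoted earlier in this thread; `GROUP_WITH_PREFIX` and `matched_text` are hypothetical names added for the illustration.

```python
import re

# Patterns as quoted in the post above.
GROUP_AS_POSTED = re.compile(r"^BEGINGROUP.*")
PLUGIN_AS_POSTED = re.compile(r"^.*\.esp")

# Hypothetical variant that tolerates an optional leading "# " comment marker.
GROUP_WITH_PREFIX = re.compile(r"^(?:#\s*)?BEGINGROUP.*")


def matched_text(pattern, line):
    """Return the text the pattern matches in the line, or None if it does not match."""
    found = pattern.search(line)
    return found.group(0) if found else None


text_group = "BEGINGROUP: Official ESMs"
yaml_group = "# BEGINGROUP: Official ESMs"
text_plugin = "Essential Whiterun horses.esp"
yaml_plugin = "- name: 'Essential Whiterun horses.esp'"

print(matched_text(GROUP_AS_POSTED, text_group))    # matches the whole line
print(matched_text(GROUP_AS_POSTED, yaml_group))    # None: "^" does not allow the "# " prefix
print(matched_text(GROUP_WITH_PREFIX, yaml_group))  # matches once the prefix is optional
# The plugin pattern matches on both sides, but the matched text is not the same:
print(matched_text(PLUGIN_AS_POSTED, text_plugin) == matched_text(PLUGIN_AS_POSTED, yaml_plugin))
```

So in the YAML format the "^BEGINGROUP.*" weight would never apply to a "# BEGINGROUP: ..." line, and even where a pattern matches on both sides, the two lines still differ in their prefix and quotes.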
                        Last edited by Sharlikran; 15-Jan-2014, 10:33 AM.

                        Comment

                        • Aaron
                          Team Scooter
                          • Oct 2007
                          • 15997

                          #13
                          Hello,

                          If the lines are more than 3000 lines apart, you should increase your Skew (max is up to 30,000). Would it be possible to get a pair of your files emailed to [email protected]? Please also include a link to this forum thread, and include your Help menu -> Support; Export to generate a BCSupport.zip.
                          Aaron P Scooter Software

                          Comment

                          • Sharlikran
                            Enthusiast
                            • Jan 2014
                            • 20

                            #14
                            I am trying to discuss this, but not complain that it doesn't match up. If that makes sense. Like when you ask a question I am just respectfully giving you feedback and trying to answer the questions best I can. However, I have tried it at 3000, 7000, and 8000 and it doesn't matter.

                            The issue I seem to have is that it does not actually care what the priority is. I don't think you built this kind of workflow into the program because it probably doesn't have any useful application except for what I am doing. At least I want to be humble and think that.

                            However, to me this kind of comparison would be useful for comparing two different styles of documents. For now I am comparing the text I have posted above. What if I were comparing C++, Delphi, Python and Java source code?

                            Similar to what I am doing now, I need to specify that one side of the document has a different beginning of the line, and possibly a different ending, but the middle of the string is a perfect match. Example:

                            Delphi Code:

                            wbRecord(ANIO, 'Animated Object', [
                              wbEDID,
                              wbMODL,
                              wbString(BNAM, 'Type')
                            ]);

                            Python Code:

                            class MreAnio(MelRecord):
                                """Anio record (Animated Object)"""
                                classType = 'ANIO'
                                melSet = MelSet(
                                    MelString('EDID','eid'),
                                    MelModel(),
                                    MelString('BNAM','unloadEvent'),
                                )
                                __slots__ = MelRecord.__slots__ + melSet.getSlotsUsed()

                            I would need a way to define what has to match somewhere.

                            In both my hypothetical situation above with the Delphi and Python code, and the situation I am discussing with the Text file and the yaml file each side has the key elements in the same exact order.


                            In other words

                            Left side

                            # BEGINGROUP: Base A
                            # ENDGROUP: Base A
                            # BEGINGROUP: Base B
                            # ENDGROUP: Base B
                            # BEGINGROUP: Base C
                            # ENDGROUP: Base C

                            Right side

                            BEGINGROUP: Base A
                            ENDGROUP: Base A
                            BEGINGROUP: Base B
                            ENDGROUP: Base B
                            BEGINGROUP: Base C
                            ENDGROUP: Base C

                            The text is just too distinctive to miss, in my opinion.
                            Last edited by Sharlikran; 15-Jan-2014, 01:14 PM.

                            Comment

                            • Sharlikran
                              Enthusiast
                              • Jan 2014
                              • 20

                              #15
                              Originally posted by Aaron
                              Hello,

                              If the lines are more than 3000 lines apart, you should increase your Skew (max is up to 30,000). Would it be possible to get a pair of your files emailed to [email protected]? Please also include a link to this forum thread, and include your Help menu -> Support; Export to generate a BCSupport.zip.
                              Done

                              Comment
