Basic File Compare

  • gregx59
    Visitor
    • Dec 2010
    • 6

    Basic File Compare

    I'm using version 1.9f and having issues on minor compares. I compare mostly flat files: text format, fixed position, usually 1500 lines or less. Row length can be up to 2500 characters. Row length isn't a problem; it performs beautifully. But it can get lost on a single line mismatch, flagging large sections before it finds its way again. I have to manually insert lines in one file or the other to help the tool do its job, and then it gets lost again on the next line mismatch. I need it to show a blank line on one side or the other when there's a line mismatch. Sometimes it does if it's only one or two lines, sometimes it doesn't.

    There aren't many settings I can adjust in my version, and I've tried different settings for most of them.

    I hope there are answers other than "upgrade to the newest version". I had to jump through some major hoops a year ago just to get my company to install this version.
  • Tim
    Team Scooter
    • Oct 2007
    • 786

    #2
    If you haven't tried this already:

    1) In the File Comparison Rules, enable Never Align Mismatches

    2) Make sure Skew Tolerance is large enough to reach over legitimate blocks of differences.

    3) If there is data on each line that acts as a unique key, use the features on the Define Minor tab to define everything else as a minor difference.

    If you can send us sample files, we'd be happy to give more specific suggestions.
    Tim T Scooter Software
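
To illustrate why a unique key helps alignment (point 3), here is a minimal Python sketch. It is not Beyond Compare's algorithm. It assumes, hypothetically, that the key occupies the first 9 characters of every line, and it uses the standard library's `difflib.SequenceMatcher` on the keys alone, so a change elsewhere on a line cannot throw off the row matching.

```python
from difflib import SequenceMatcher

KEY_WIDTH = 9  # assumption: the unique key is the first 9 characters of a line


def align_by_key(left, right, key_width=KEY_WIDTH):
    """Pair lines whose keys match; unmatched lines are paired with None."""
    left_keys = [line[:key_width] for line in left]
    right_keys = [line[:key_width] for line in right]
    matcher = SequenceMatcher(None, left_keys, right_keys, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Keys agree, so these lines belong on the same row.
            rows.extend(zip(left[i1:i2], right[j1:j2]))
        else:
            # Keys disagree: show each side against a blank line.
            rows.extend((line, None) for line in left[i1:i2])
            rows.extend((None, line) for line in right[j1:j2])
    return rows


def report(rows):
    """List removed, added, and changed lines from aligned rows."""
    out = []
    for old, new in rows:
        if old is None:
            out.append(("added", new))
        elif new is None:
            out.append(("removed", old))
        elif old != new:
            out.append(("changed", old, new))
    return out
```

With this approach a line that exists on only one side shows up as a single "removed" or "added" row, and the rows after it stay aligned because their keys still match.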

    • gregx59
      Visitor
      • Dec 2010
      • 6

      #3
      Tim - I'm not aligning mismatches, my skew tolerance is 1000, and initial match requirement is 50. I'm not suppressing recursion, and I'm using text-based comparison (although I've tested with all of the above settings changed). I ignore minor differences, and define those as case, leading whitespace, and trailing whitespace. This is a fixed position file, so I don't want to ignore embedded whitespace. We aren't dealing with code, so code comments, variables, literals, and so forth aren't a factor, except I need the ability to flag on characters like quotes, slashes, and asterisks.

      Unfortunately, I can't send examples due to privacy restrictions. Even defaulting certain fields with "junk" will flag at our mailserver.

      I also just tried to do a compare on a comma-delimited file, and had the same problem. One file had over 700 lines, the other had over 600, line length 30 to 60 characters. On a visual inspection the first 2 lines had minor differences, but it couldn't even match those.

      I don't want to give the impression I don't like the product - I do. It fits my needs (which are very basic) about 75% of the time. But it's the times it doesn't work that are incredibly time-consuming.

      • Aaron
        Team Scooter
        • Oct 2007
        • 15995

        #4
        Hello Greg,

        Are you able to "define minor" for any blocks of unimportant text? That may help when aligning your two files.

        Also, our BC3 trial is fully featured, and you can install it without altering or removing your BC1 install. Does the BC3 alignment algorithm work better for you? You can also configure some of the options in the Text Compare's Session menu -> Session Settings -> Alignment tab.

        http://www.scootersoftware.com/download.php
        Aaron P Scooter Software

        • gregx59
          Visitor
          • Dec 2010
          • 6

          #5
          Aaron - We don't have any unimportant text or blocks - no character(s) I'd be willing to sacrifice. You wouldn't think it would be a difficult process - name, ssn, dob, and other demographic info is right at the beginning - classic flat file stuff. Files change very little from one to the next. But if the client puts an errant character somewhere, I want to detect it. If a date beginning in position 1605 changes from 2008 to 2009 I want to know it. And usually I do, provided the program hasn't gotten lost matching rows. The tool usually loses track when a line is added to or removed from a file. Once again, not always, but maybe 25% of the time.

          My use of the tool can't be unusual - comparing iterations of the same flat file from one week to the next, yet I don't see others on this forum having issues. I was mainly hoping someone could tell me it was a simple configuration thing, but there really isn't that much to configure, and I've tested pretty much everything. In a higher security business like ours, downloading a demo is not an option, nor is bringing it in on a USB drive. I'll try to request a newer version.

          • snidely.too
            Expert
            • Jul 2008
            • 80

            #6
            Originally posted by Aaron
            Also, our BC3 trial is fully featured, and you can install it without altering or removing your BC1 install. Does the BC3 alignment algorithm work better for you?
            Greg may not be able to try the BC3 trial, if the systems he is using are locked down by his IT to prevent un-authorized programs from being used.

            I'm curious, though, as to when BC1 was being actively maintained? I am not sure I've met anyone else using BC1.

            /dps

            • Zoë
              Team Scooter
              • Oct 2007
              • 2666

              #7
              Beyond Compare 1.9f was the last 1.x version, and was released in August 2001. It's still linked off our download page if you want to try it out.
              Zoë P Scooter Software

              • gregx59
                Visitor
                • Dec 2010
                • 6

                #8
                Thanks all, I've submitted a request for an upgrade. Per the request form they're still installing 1.9F. If I'm able to upgrade, I'll post an update after I've used it.

                • Tim
                  Team Scooter
                  • Oct 2007
                  • 786

                  #9
                  Originally posted by gregx59
                  Aaron - We don't have any unimportant text or blocks - no character(s) I'd be willing to sacrifice. You wouldn't think it would be a difficult process - name, ssn, dob, and other demographic info is right at the beginning - classic flat file stuff.
                  You don't need to sacrifice any data. If you define everything other than ssn as a minor difference (for example), BC will have an easier time aligning the correct items. Differences in other fields will still show as differences (as long as you don't select Just show major differences).

                  If you can save your data as a csv file with ssn as the first field, use a comma in the Text Beginning With option to define subsequent fields as minor.
                  Last edited by Tim; 03-Dec-2010, 09:43 AM.
                  Tim T Scooter Software
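
One way to produce such a CSV file from a fixed-position file is sketched below in Python. The field names and offsets in `LAYOUT` are hypothetical placeholders, not the real record layout, which the thread does not give. Field contents are kept exactly as they are, with no stripping, so no character is lost.

```python
import csv
import io

# Assumption: (field name, start, end) with 0-based, end-exclusive offsets.
# These positions are made up for illustration only.
LAYOUT = [("name", 0, 20), ("ssn", 20, 29), ("dob", 29, 37)]


def fixed_to_csv(lines, layout=LAYOUT, key="ssn"):
    """Convert fixed-position lines to CSV text with the key field first."""
    # Stable sort: the key field moves to the front, the rest keep their order.
    ordered = sorted(layout, key=lambda field: field[0] != key)
    tail_start = max(end for _, _, end in layout)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in lines:
        body = line.rstrip("\r\n")
        row = [body[start:end] for _, start, end in ordered]
        row.append(body[tail_start:])  # everything past the known fields
        writer.writerow(row)
    return buf.getvalue()
```

Note that the `csv` module quotes any field that itself contains a comma or a quote character, so a key containing either would start with a quote rather than the key text. For a numeric key such as an ssn this should not arise.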

                  • gregx59
                    Visitor
                    • Dec 2010
                    • 6

                    #10
                    Tim - to define everything other than a range of positions as minor, would I have to modify my files to surround everything other than those positions with a defined character such as a quote (or vice versa)? That's probably doable, but I'm not sure our text editors are capable of making that edit. I'll have to experiment with that. Thanks for your response.

                    • Tim
                      Team Scooter
                      • Oct 2007
                      • 786

                      #11
                      Yes, although you don't necessarily need to surround the data. The Text Beginning With option will match on a specific character (or string of characters) and flag everything to the end of the record as "minor". The trick is to somehow get something specific to match on, such as the first comma in a csv file.
                      Tim T Scooter Software
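
If a text editor can't make that edit, a short script can. The following Python sketch inserts a marker character at a fixed column of each line so that there is something specific to match on. The function name, the `*` marker, and the column are illustrative assumptions. It refuses to proceed when the marker already occurs in the key region, because a match on that earlier occurrence would start the "minor" region too soon.

```python
def insert_marker(line, column, marker="*"):
    """Insert marker after the first `column` characters of a line.

    Lines shorter than `column` get the marker appended at the end.
    The original line ending, if any, is preserved.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    if marker in body[:column]:
        # The marker must not appear before the insertion point.
        raise ValueError(f"marker {marker!r} already occurs in the key region")
    return body[:column] + marker + body[column:] + ending
```

Applying it to a whole file is a loop over the lines, writing each result to a new file so the original stays untouched.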

                      • gregx59
                        Visitor
                        • Dec 2010
                        • 6

                        #12
                        I am not experiencing this behavior; it is missing obvious differences now. To be a little more specific, I am inserting an asterisk after the demographic information in each file, at position 98. I'm doing this on all files. It is matching lines perfectly, but I just noticed differences are not being picked up after position 98. So I need to redo a significant amount of work from the last week. Here's how I'm set up:

                        Minor differences checked:
                        Case
                        Leading Whitespace
                        Trailing Whitespace
                        Text Beginning With: *
                        Nothing else is checked on that tab.

                        On the Text Sync tab I have checked "Ignore Minor Differences".

                        So, unless I misunderstood your earlier post, it DOESN'T compare anything after the "Text Beginning With" character when you ignore minor differences.

                        • Tim
                          Team Scooter
                          • Oct 2007
                          • 786

                          #13
                          You should uncheck "Ignore Minor Differences".

                          Basically, Beyond Compare supports two kinds of differences, major and minor. In your case you want to define key info as major in order to align things properly. But you still need to see the minor differences as well.
                          Tim T Scooter Software
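
The interaction between the two kinds of differences can be modelled with a toy Python sketch. This is an assumption-laden model of the behavior described in this thread, not Beyond Compare's implementation: text before the marker counts as major, text from the marker onward counts as minor, and the case and whitespace rules are left out.

```python
def classify(left, right, marker="*"):
    """Return 'same', 'minor', or 'major' for one aligned pair of lines."""
    left_key, left_sep, left_rest = left.partition(marker)
    right_key, right_sep, right_rest = right.partition(marker)
    if left_key != right_key:
        return "major"  # the text before the marker differs
    if (left_sep, left_rest) != (right_sep, right_rest):
        return "minor"  # only the text from the marker onward differs
    return "same"


def visible_differences(pairs, ignore_minor):
    """Return (line number, kind) for each difference that would be shown."""
    shown = []
    for number, (left, right) in enumerate(pairs, start=1):
        kind = classify(left, right)
        if kind == "major" or (kind == "minor" and not ignore_minor):
            shown.append((number, kind))
    return shown
```

In this model, a changed date after the marker is reported only when `ignore_minor` is false, which mirrors the symptom in post #12.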
