Windows - compare large XML files - showing differences where there are none

  • Amadeus
    New User
    • Feb 2017
    • 2

    Hi,

    I have some large XML files (320,000 lines; ~25 MB) that I want to compare on a 64-bit Windows machine.
    When I load them both, the bar on the left-hand side is almost all red. Scrolling through the files shows that's not accurate: they're mainly white, with the odd line in red.
    I also see areas where Beyond Compare shows a difference (i.e. it shows a chunk of XML on one side but not the other) even though there is no difference in that area.

    If I remove the surrounding XML (in BC), it then shows that the elements I'm looking at are much more similar, and there is no red in the left-hand column.

    My document is of the form:

    <root>
    <child1>
    ..various child nodes..
    </child1>

    <product>
    ..various child nodes..
    </product>
    <product>
    ..various child nodes..
    </product>
    <product>
    ..various child nodes..
    </product>
    <product>
    ..various child nodes..
    </product>
    </root>

    So if I'm interested in the second <product> element, I delete the first, third and fourth, and then BC shows just a few characters different.
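
The manual isolation described above can also be done outside Beyond Compare as a cross-check. What follows is only a sketch, assuming Python's standard library is available, that the `<product>` elements are direct children of `<root>`, and that both files list them in the same order. The function name `differing_products` is made up for this example.

```python
import difflib
import xml.etree.ElementTree as ET


def differing_products(left_xml, right_xml, tag="product"):
    """Compare same-position <product> children of two XML documents.

    Returns a list of (position, diff_lines) for each pair that differs.
    Positions are 1-based. Assumes both documents hold the elements in
    the same order; extra elements on one side are ignored by zip().
    """
    left = ET.fromstring(left_xml).findall(tag)
    right = ET.fromstring(right_xml).findall(tag)
    results = []
    for position, (a, b) in enumerate(zip(left, right), start=1):
        # Serialize each element on its own so one pair cannot affect the next.
        a_text = ET.tostring(a, encoding="unicode").strip()
        b_text = ET.tostring(b, encoding="unicode").strip()
        if a_text != b_text:
            diff = list(difflib.unified_diff(
                a_text.splitlines(), b_text.splitlines(), lineterm=""))
            results.append((position, diff))
    return results
```

To run it on files rather than strings, read each file's text first and pass the two strings in. If this reports only a handful of small differences while the full-file comparison is mostly red, that suggests the issue is in how the lines are being aligned rather than in the content.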

    Is this a known issue? I'm using the latest version.

    thanks
  • Chris
    Team Scooter
    • Oct 2007
    • 5538

    #2
    The error might be due to an unterminated comment.

    Text is only matched for comparison if it has the same grammar element type. This means default text on the left will not show as a match with text inside a comment on the right. If a line contains a <!-- that begins a comment on one side but the comment isn't terminated with a -->, a huge block of text can be treated as a comment and show as different if the corresponding text on the other side isn't a comment.

    The grammar element type at the current cursor position is displayed in the status bar at the bottom of the Text Compare window. It might help to place the cursor in the unexpected differences in your files and check whether the grammar element type is the same in each file.

    If the grammar element type is different due to an unterminated comment, you can work around the issue by editing the XML file format. Open "Tools > File Formats". Select the XML file format. Go to the Grammar tab. Select Comment. Click the Edit button (gear icon). Check "Stop at end of line", then click OK, then Save.

    When "Stop at end of line" is checked, an unterminated comment can no longer affect multiple lines.
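
To locate candidate lines before changing the file format, a small script can list every line where a comment opens but does not close on that same line. This is a rough sketch in Python, not part of Beyond Compare; the function name `find_unclosed_comment_lines` is hypothetical, and legitimate multi-line comments will be reported too.

```python
def find_unclosed_comment_lines(lines):
    """Return 1-based numbers of lines where '<!--' opens a comment
    that is not closed by '-->' later on the same line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Look at the last opener on the line; earlier ones may be closed.
        start = line.rfind("<!--")
        if start != -1 and line.find("-->", start + 4) == -1:
            hits.append(number)
    return hits
```

Passing an open file object (or `text.splitlines()`) as `lines` works, since the function only iterates over the lines. An empty result means no comment spans more than one line, which would point away from the unterminated-comment explanation.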
    Chris K Scooter Software

    • Amadeus
      New User
      • Feb 2017
      • 2

      #3
      Hi Chris,

      Thanks for the quick reply.

      I am as sure as I can be that the XML is well-formed, with no missing close tags etc. I've parsed it in Java as well as opening it in IE, which (possibly surprisingly) is quite good at identifying issues with badly formed XML.
      I've also searched the document for comments; there are none.
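
For anyone wanting to repeat these two checks quickly, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser rather than the Java parser or IE mentioned above. The function name `check_xml` is made up for illustration, and the comment count is a plain substring count, so a `<!--` inside a CDATA section would be counted as well.

```python
import xml.etree.ElementTree as ET


def check_xml(xml_text):
    """Return (error_position, comment_openers) for an XML string.

    error_position is None when the text parses as well-formed XML,
    otherwise the (line, column) reported by the parser.
    comment_openers is the number of '<!--' sequences in the text.
    """
    comment_openers = xml_text.count("<!--")
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position, comment_openers
    return None, comment_openers
```

A result of `(None, 0)` for both files would match what is described here: well-formed XML with no comments.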

      The grammar types seem to be good.

      I then got on to your suggestion about the XML file format. My files didn't have a .xml extension, so I thought that was a lightbulb moment. I added a .xml extension and reloaded, but still had the same problem.
      I then made the changes to the file format and reloaded, but again had the same problem.

      Do you have any further suggestions of what I can try?

      thanks

      • Chris
        Team Scooter
        • Oct 2007
        • 5538

        #4
        We might need a pair of example files or a screen shot to diagnose the issue.

        Please send example files or a screen shot and your settings to [email protected] with a link referencing this forum thread and we'll continue investigating. To save your settings to a file, open Help | Support, then click the Export button.
        Chris K Scooter Software
