How to ignore line breaks or line numbers?

  • How to ignore line breaks or line numbers?


    I've searched the forums but can't find an answer to my question. I'm a novice with BC. I have two text files; here are examples:

    File 1:

    Hey, diddle, diddle,
    The cat and the fiddle,

    File 2:

    Hey, diddle, diddle, The cat and the fiddle,

    The text is identical; the only difference is that when I turn on "Line Numbers", File 1 has two line numbers:

    1 Hey, diddle, diddle,
    2 The cat and the fiddle,

    while File 2 has one line number:

    1 Hey, diddle, diddle, The cat and the fiddle,

    However, I would like the above to be treated as identical rather than as differences. The reason is that I may have a single "line number" that is hundreds of words long (because word wrap was enabled in the originating word processor, so a person just kept typing and their paragraph shows up as a "single" line number). The file being compared has identical text, except that its word processor broke the paragraph into separate lines whenever a line reached the right edge of the screen.

    I only want to compare essential differences, such as missing or added characters, and I don't need to worry about line numbers. (Right now I get large red "differences" in BC, but I can't tell whether any of them are real.) Is there any way to ignore line numbers?


  • #2

    Thanks for the detailed description. BC4 always treats line breaks as important, but we have external conversions that can normalize the whitespace in files so that the text in both ends up with the same line structure. These Tidy variants (HTML, XML, etc.) can be found here:

    Please note that if you edit and save in the reformatted form, the file will be saved with that line-break structure. To prevent this, you can edit the Tidy variant format and set it to Editing Disabled (Tools menu -> File Formats, Conversion tab, edit Format).

    We also have a guide for plugging in any command line tool that can perform the normalization, here:
    Aaron P Scooter Software
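
    As a rough illustration of the kind of normalization described above, here is a minimal Python sketch of a command-line helper that joins each paragraph into a single line, so the two nursery-rhyme files from the first post come out identical. The script name `unwrap.py`, the function name `unwrap_paragraphs`, the SOURCE/TARGET argument order, and the rule that a blank line ends a paragraph are all assumptions made for this example; none of it comes from Scooter Software, so check the conversion guide for how BC actually passes file paths to an external tool.

```python
import re
import sys


def unwrap_paragraphs(text: str) -> str:
    """Join the lines of each paragraph into one line.

    Assumption: paragraphs are separated by one or more blank lines.
    Runs of spaces, tabs and line breaks inside a paragraph are
    collapsed to a single space.
    """
    # Treat Windows and old Mac line endings the same as "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text)
    # str.split() with no arguments splits on any whitespace run.
    unwrapped = [" ".join(p.split()) for p in paragraphs]
    # Drop empty paragraphs and separate the rest with one blank line.
    return "\n\n".join(p for p in unwrapped if p) + "\n"


def main(source_path: str, target_path: str) -> None:
    # Hypothetical calling convention: read SOURCE, write TARGET.
    with open(source_path, encoding="utf-8", newline="") as src:
        text = src.read()
    with open(target_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.write(unwrap_paragraphs(text))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

    Run as `python unwrap.py SOURCE TARGET`. The output is a converted copy, so, as noted above, saving from the converted view would write the files back with the joined-line structure.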


    • #3
      Looks like "newline-agnostic" text comparison is something people have been commonly requesting for the better part of a decade: