  1. #1
    Join Date
    Jan 2008
    Posts
    28

    How to compare text files with lines broken at different points?

    I have two text files with different line wrapping (the newlines fall at different points; this is not about line-ending encodings). When I compare them, BC thinks virtually the entire document is different.

    I tried removing all line endings from the files and then comparing, but then I get one huge line and BC won't wrap the view.

    Any ideas how to compare two text files?

  2. #2
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,830

    Hello,

    Line Endings are important for the Text Compare, but you can run your files through an external conversion which normalizes the whitespace and line breaks of the two files.

    We have some additional downloads for specific formats such as XML Tidied and HTML Tidied:
    http://www.scootersoftware.com/downl..._moreformatsv4

    And you can also use a custom conversion, using an RESX format as the example:
    http://www.scootersoftware.com/suppo...rnalconversion
    Aaron P Scooter Software

  3. #3
    Join Date
    Jan 2008
    Posts
    28

    Thanks for your answer.

    The sort of thing I'm comparing is markdown (.md) where line breaks in paragraphs are ignored and re-wrapped when rendered.

    I'd like to be able to compare those paragraphs.

    Following your suggestion, I wrote a utility to normalise two .md files for comparison. After that, BC compared them fine.

    Is there any documentation on how to make a package for new file formats (e.g. for .md files)? I could make my utility available.

    thanks.

  4. #4
    Join Date
    Sep 2008
    Posts
    12

    I also often want to compare documents that have been hard word-wrapped. Having this functionality readily available within Beyond Compare would be a much appreciated feature.

    It has been requested in the past and not implemented, so I assume the restriction is deeply embedded within Beyond Compare. Maybe a standard filter to convert text to sentences would be more achievable. See the Word Wrap thread.
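
    To illustrate the "convert text to sentences" idea, here is a rough Python sketch, not an existing Beyond Compare filter. The function name one_sentence_per_line is made up for this example, and the sentence detection is deliberately naive (it splits after '.', '!' or '?' followed by whitespace, so abbreviations such as "e.g." will be split wrongly).

```python
import re

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(text: str) -> str:
    """Put each sentence on its own line, keeping blank lines between paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for paragraph in paragraphs:
        # Collapse the hard-wrapped lines of the paragraph into one line first.
        flat = " ".join(paragraph.split())
        out.append("\n".join(_SENTENCE_END.split(flat)))
    return "\n\n".join(out) + "\n"
```

    With this layout a reworded sentence shows up as one changed line, wherever the original line breaks were.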
    Last edited by patch; 24-Sep-2016 at 07:02 PM.

  5. #5
    Join Date
    Jan 2008
    Posts
    28

    thanks.

    I posted my paragraph fold util here, https://gitlab.com/jkj/foldp
    I don't know how to make BC run it automatically.

    It only works on plain text files made up of paragraphs separated by blank lines. The sort of things I need to compare are software licences and contracts. What I do is re-wrap them to, say, 100 characters per line, then use BC to compare.
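
    For anyone who wants to see the idea without downloading anything, here is a minimal Python sketch of paragraph re-folding. It is not the foldp source, just an illustration of the same approach using the standard textwrap module; the function name refold is my own.

```python
import textwrap


def refold(text: str, width: int = 100) -> str:
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    wrapped = (
        textwrap.fill(p, width=width, break_long_words=False, break_on_hyphens=False)
        for p in paragraphs
    )
    return "\n\n".join(wrapped) + "\n"
```

    Two files that differ only in where their lines were broken give identical output, so only real wording changes are left for BC to show.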

  6. #6
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,830

    Hello,

    Thanks!

    For automation, open the Tools menu -> File Formats dialog and click +/New -> Text Format. Add *.md as the extension, and in the Conversion tab add the path to your utility. I would recommend storing it in %AppData%\Scooter Software\Beyond Compare 4\Helpers\foldp. Going by your GitLab documentation, the Conversion path could be:
    Helpers\foldp "%s" "%t"

    Where %s is the source file and %t is the temp target that will be generated and displayed. This format, when used, should auto-convert your *.md files. Does this work for your files?
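
    As a sketch of the contract described above, a helper only has to read the file passed in place of %s and write its converted text to the path passed in place of %t. The Python script below is a hypothetical stand-in, not foldp itself: the names convert and strip_trailing_spaces are invented, and the transform is a placeholder for whatever normalisation you actually want.

```python
import sys


def strip_trailing_spaces(text: str) -> str:
    """Placeholder transform: drop trailing whitespace from every line."""
    return "\n".join(line.rstrip() for line in text.splitlines()) + "\n"


def convert(source: str, target: str, transform=strip_trailing_spaces) -> None:
    """Read `source` (the %s path) and write the converted text to `target` (the %t path)."""
    with open(source, encoding="utf-8") as f:
        text = f.read()
    # newline="\n" keeps the output line endings the same on every platform.
    with open(target, "w", encoding="utf-8", newline="\n") as f:
        f.write(transform(text))


# Only act as a converter when called with exactly two paths (source, target).
if __name__ == "__main__" and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

    Saved under a name of your choosing, it would be called with the same two quoted arguments, "%s" "%t", as in the Conversion path above.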
    Aaron P Scooter Software

  7. #7
    Join Date
    Jan 2008
    Posts
    28

    It worked!

    foldp is not very clever. It doesn't really understand markdown (.md). It works by refolding paragraphs of text, where a paragraph is any run of plain text delimited by blank lines.

    However, it works well enough for some of my simple markdown files, and also for some of the plain-text software licence agreements I was comparing.

    One thing to note: if you make changes to the compared files and then save, you'll be saving the wrapped version, not the original.

    If I run into problems, I'll update it. If anyone else has a problem, contact me at contact@voidware.com

  8. #8
    Join Date
    Dec 2007
    Location
    Taipei
    Posts
    15

    Hi, I've tried your util, but it does not deal with UTF-8 encoding correctly.

    So I tried pandoc from http://pandoc.org/installing.html
    With the following steps the conversion succeeded, and it works perfectly for me.

    1. Make a copy of the 'HTML' entry in the 'File Formats' settings and name it 'MD'.
    2. Edit the 'Mask' in the 'General' tab to '*.md'.
    3. In the 'Conversion' tab:
    3a. Choose 'External Program (Unicode filenames)'.
    3b. Set 'Loading' to "C:\Utils\Console\pandoc" "%s" -f markdown_github -t html -s -o "%t"
    (I put pandoc.exe in C:\Utils\Console)
    3c. Tick 'Disable editing'.
    3d. Change 'Encoding' to 'UTF-8'.

    Note: there are many variants of markdown. I'm using GitBook, so I chose '-f markdown_github' for the conversion.
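
    To try the same conversion outside BC first, the 'Loading' command can be reproduced from a small Python script. This is only a sketch: pandoc_command and run_pandoc are names made up here, and it assumes pandoc is on your PATH (otherwise pass the full path, e.g. the C:\Utils\Console location from step 3b). As far as I know, newer pandoc releases deprecate 'markdown_github' in favour of 'gfm', so the input format is left as a parameter.

```python
import subprocess


def pandoc_command(source, target, input_format="markdown_github", pandoc="pandoc"):
    """Build the argument list used in step 3b: markdown in, standalone HTML out."""
    return [pandoc, source, "-f", input_format, "-t", "html", "-s", "-o", target]


def run_pandoc(source, target, **kwargs):
    """Run pandoc; raises CalledProcessError if pandoc reports a failure."""
    subprocess.run(pandoc_command(source, target, **kwargs), check=True)
```

    For example, run_pandoc("README.md", "README.html", input_format="gfm") would convert one file using the newer format name.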
