spcl1-CHMdecompile <!--\d+-->
I've seen some other questions that seem similar but can't quite wrap my head around how to do this.
Sorry for the long post below, but I was trying to keep notes as I tried various things.
Summary of what I think I found:
(1) MAJOR: It's impossible to have "rules" specified for individual file comparisons be used at the folder (parent) level.
(2) MINOR bug - changing order of Grammar elements as specified in [Edit grammar...] does not change order of these elements in the Text Compare - Session Settings - Importance - Grammar elements list.
(3) CONFUSING: Text Compare - Session Settings - Replacement does nothing ... or at least nothing that I can identify
==================
I want to define a rule used for all text files (regardless of extension) that will cause differences matching the rule to be ignored when deciding whether the two versions of the file are identical.
To make sure we are talking about the same thing:
I have two versions of a CHM help file and want to identify the differences between them. I found a program (with a 30-day trial version) that will decompile a CHM file into multiple files in a folder: http://www.zipghost.com/chmdecompiler.html
The above program inserts the following two lines in each "text" file:
(1) font-size: 11px; text-decoration: none;">The CHM file was converted to HTM by Trial version of <b>ChmD<!--62-->ecompiler</b>.</a>
(2) font-size: 11px; text-decoration: none;">Download <b>ChmDec<!--62-->ompiler</b> at: http://www.zipghost.com</a>
where "62" is a one- to three-digit number that varies for each file the decompiler creates in each run.
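To make the pattern concrete, here is a minimal Python sketch (outside Beyond Compare, purely for illustration) of the regular expression I have in mind. The helper name `strip_markers` and the second sample line with "154" are my own illustrative choices; the pattern itself is the one I use later in the grammar rule, narrowed to one to three digits.

```python
import re

# An HTML comment containing only digits, e.g. <!--62--> or <!--154-->.
# {1,3} reflects the "one to three digits" observation; \d+ would also match.
MARKER = re.compile(r"<!--\d{1,3}-->")

def strip_markers(text):
    """Remove the decompiler's numbered comment markers from text."""
    return MARKER.sub("", text)

# The same inserted line as produced by two different decompiler runs.
run_a = ('font-size: 11px; text-decoration: none;">Download '
         '<b>ChmDec<!--62-->ompiler</b> at: http://www.zipghost.com</a>')
run_b = ('font-size: 11px; text-decoration: none;">Download '
         '<b>ChmDec<!--154-->ompiler</b> at: http://www.zipghost.com</a>')

print(run_a == run_b)                                # raw lines differ
print(strip_markers(run_a) == strip_markers(run_b))  # equal once markers are gone
```

If the only difference between two files is this marker, stripping it should leave them identical, which is the behaviour I am trying to get from the comparison rules.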
(1) MAJOR:
As a result, when I use BC3 to compare the two folders, select all and ask for [Compare contents] (=?), all files show differences, not just the ones that have differences I consider significant :-(
Picking one of the HTML files for testing purposes, clicking [Referee] says it's HTML and allows me to define Grammar items for HTML. So I define:
spcl1-CHMdecompile=Text matching <!--\d+-->
ticking:
Match character case=OFF
Regular expression=ON
This element is case insensitive
This adds spcl1-CHMdecompile to the HTML list of Grammar elements. If I uncheck all offered Grammar elements (Keyword, String, Comment, Operator and my special one: spcl1-CHMdecompile) then the two versions DO compare as equal ... BUT ... there seems to be no way to get these "rules" honored in the Folder compare. When I click back to the Folder level, each file that I have manually checked now shows the "squiggly equal" icon [Ignore unimportant differences].
i.e. I've set [Folder Compare - Session Settings] - [v] Compare contents - (o) Rules-based comparison.
Note: AFAICT the example lines I showed above should match both as HTML contents and under my "special rule", CORRECT? However, if I mark ANY of the elements Keyword, String, Comment, Operator and spcl1-CHMdecompile as IMPORTANT then the two lines above get flagged as differences.
(2) MINOR BUG? - In Grammar elements for HTML I moved my special rule above the Comments rule; however, in the Text Compare - Importance list the order of the Grammar elements remains unchanged.
**LATER** It appears that the actual rule that is being (IMO) incorrectly triggered is the Strings rule. If it is set to "Important" then the added stuff gets flagged as different.
I think it's because the actual text in which this appears is (for example):
===
<a href="http://www.etextwizard.com/download/cd/cdsetup.exe" target="_blank" style="font-family: Tahoma, Verdana;
font-size: 11px; text-decoration: none;">Download <b>ChmDec<!--154-->ompiler</b> at: http://www.zipghost.com</a>
====
Note that the first line above has an opening quote just before >>font-family<< and its closing quote is just after >>none;<< on the second line. Am I correct that your parsing engine only looks at single lines, so that it thinks that:
>>||">Download <b>ChmDec<!--154-->ompiler</b> at: http://www.zipghost.com</a>||<<
is a "string", albeit with an unmatched leading quote?
i.e. for this particular difference to get recognized as either an HTML comment or my special grammar rule, I would have to make strings unimportant, thus making it impossible to recognize *real* differences in strings in two versions of an HTML file?
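To illustrate what I suspect is happening, here is a toy Python scanner that resets its state at every line. This is only a model of my guess; I have no knowledge of how Beyond Compare's parser actually works, and the function name `string_spans_per_line` is made up for this sketch.

```python
def string_spans_per_line(line):
    """Toy scanner: text between double quotes is a 'string'.

    State is reset for each line (an assumption, not BC3's real parser),
    and an unterminated string runs to the end of the line.
    """
    spans = []
    start = None
    for i, ch in enumerate(line):
        if ch == '"':
            if start is None:
                start = i                          # treat as an opening quote
            else:
                spans.append(line[start:i + 1])    # closing quote found
                start = None
    if start is not None:
        spans.append(line[start:])                 # unterminated: to end of line
    return spans

# Second line of the example: its only quote really CLOSES the style="..."
# attribute that was opened on the previous line.
second_line = ('font-size: 11px; text-decoration: none;">Download '
               '<b>ChmDec<!--154-->ompiler</b> at: http://www.zipghost.com</a>')

spans = string_spans_per_line(second_line)
print(spans)
```

Under this model the closing quote is taken as an opening quote, so everything after it, including the numbered comment, lands inside a single "string" span. That would explain why the String element, rather than Comment or my special rule, decides whether the difference is important.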
(3) CONFUSING: In a last attempt to get this to work I made a copy of the [HTML] file format as [HTML-test], moved it above [HTML] in the list so it got recognized, then removed my special rule and instead tried to do the same thing in the Text Compare - Session Settings - Replacement tab.
Help for this says:
>Replacements identify repetitive changes
>that should be considered unimportant. You
>can specify the text to match on one side
>and the text that replaces it on the other
>side.
I tried LS=<!--\d+--> RS=NULL
and then added LS=NULL RS=<!--\d+-->
I can't see any effect in the File compare when using either one or both of the above replacements. What I was hoping for is that the occurrence of *ANY* string matching the regular expression <!--\d+--> in either the left-side or right-side file would be replaced by the string NULL before the comparison is done, hence making both sides compare as equal.
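For clarity, the behaviour I was hoping for can be sketched outside Beyond Compare in a few lines of Python: remove every match on both sides first, then compare what is left. This is not a claim about how the Replacement tab works; the function names, the `*.htm*` file pattern and the UTF-8 assumption are all my own choices for illustration.

```python
import re
from pathlib import Path

MARKER = re.compile(r"<!--\d+-->")

def normalized(path):
    """Read a file and drop the numbered comment markers (UTF-8 assumed)."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return MARKER.sub("", text)

def changed_files(left_dir, right_dir, pattern="*.htm*"):
    """Return relative paths that differ after markers are removed.

    Only files found under left_dir are checked; a file missing on the
    right counts as changed. Files that exist only on the right are not
    reported by this sketch.
    """
    left_dir, right_dir = Path(left_dir), Path(right_dir)
    changed = []
    for left in sorted(left_dir.rglob(pattern)):
        if not left.is_file():
            continue
        rel = left.relative_to(left_dir)
        right = right_dir / rel
        if not right.is_file() or normalized(left) != normalized(right):
            changed.append(rel.as_posix())
    return changed
```

Calling `changed_files("old_chm", "new_chm")` on the two decompiled folders (folder names are placeholders) would list only the files with differences other than the marker, which is the result I want to see in the Folder compare.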