Comparing HTML with all elements unimportant


  • Comparing HTML with all elements unimportant

    Is this possible? I've looked under Importance but find no rule that covers the HTML tags. "Keyword" covers just the tag names i.e. not attributes.

    Thanks.

  • #2
    We have an HTML rule that displays just the text of the HTML file, HTML to Text:
    http://www.scootersoftware.com/downl...oreformats_alt
    This would remove the tags from view entirely.

    If you need them present, but unimportant, you may need to define new Grammar items that encapsulate the text you wish to define as unimportant. This could be a delimited grammar from "<" to ">", or something more complex.
    http://www.scootersoftware.com/suppo..._unimportantv3

    Let us know if you have any questions. Please include any sample files and your current settings. You can email us at support@scootersoftware.com, and please include the link back to this forum post.
    Aaron P Scooter Software


    • #3
      > If you need them present, but unimportant

      I do.

      > you may need to define new Grammar items that encapsulate the text
      > you wish to define as unimportant. This could be a delimited grammar from
      > "<" to ">", or something more complex.
      > http://www.scootersoftware.com/suppo..._unimportantv3

      Thanks. I've followed that but it doesn't work, even if I uncheck the importance of all the preexisting elements:


      Any ideas?


      • #4
        The Keyword definition is probably swallowing the Tag definition. You may need to delete your definition for Keywords. I would suggest making a copy of your current HTML rule and making your edits there. Then place the default file format lower in the list. This way, you can revert to the default behavior if needed.
        Aaron P Scooter Software


        • #5
          Originally posted by Aaron:
          > The Keyword definition is probably swallowing the Tag definition. You may need to delete your definition for Keywords.

          Thanks - deleting the first Keyword definition solved it, though of course leaving me unable to use the Keyword definition in this file format.

          Can you please explain this swallowing problem? Since I tried my Tag element both above and below the Keyword element, I am surprised that interference occurred.


          • #6
            Originally posted by chrisjj:
            > I've followed that but it doesn't work, even if I uncheck the importance of all the preexisting elements

            In my experience, unchecking the importance of all preexisting elements is not enough. Move your Tag grammar definition to the top of the list so that it is evaluated before the Keyword grammar definition.

            Edit: I see that a new post appeared while I was posting this one. If you already tried putting the Tag grammar definition first then I, too, am surprised that "interference occurred". There must be some "undocumented" override for some of the built-in grammar types (e.g. comments processed before keywords, keywords processed before other grammar types, etc.).
            Last edited by Michael Bulgrien; 12-Apr-2011, 07:05 PM.
            BC v4.0.7 build 19761

            • #7
              > If you already tried putting the Tag grammar definition first

              Here it is:



              > then I, too, am surprised that "interference occurred". There must be some
              > "undocumented" override for some of the built-in grammar types

              I wait to hear. Thanks.


              • #8
                Originally posted by chrisjj:
                > Thanks - deleting the first Keyword definition solved it, though of course leaving me unable to use the Keyword definition in this file format.

                Each character in the file can only be classified as a single element type. Therefore, if you define "Tag" to match all characters between "<" and ">", the "Keyword" definition that matches parts of tags is completely useless.
                Erik Scooter Software


                • #9
                  Originally posted by Erik:
                  > ...if you define "Tag" to match all characters between "<" and ">", the "Keyword" definition that matches parts of tags is completely useless.

                  It remains useful for enabling when required in the Important list. What mystifies me is that even when disabled, it somehow overrides the Tag element, including when the Tag element has priority in the list.


                  • #10
                    You can't "disable" a grammar item. The only way to prevent it from classifying text is to delete it. You can change whether or not it is important. You cannot meaningfully use the built-in keyword definition and your new tag definition at the same time.
                    Erik Scooter Software


                    • #11
                      Originally posted by Erik:
                      > You can't "disable" a grammar item. The only way to prevent it from classifying text is to delete it.

                      Shouldn't the Keyword item be disabled by a match on the higher Tag item? As per the documentation:

                      Text Format Grammar Settings
                      ...
                      Items higher on the list take precedence over lower items.


                      • #12
                        Hello Chris,

                        The order is significant in helping to break ties between multiple possible matches. However, other factors can make a section of text match one grammar item over another before the list precedence is used. In this example, the length of the match matters more than the position in the priority list. The 'Keyword' is a longer match than the left side of the delimiter ("<"). In this case, the longest match will be used. If the matches are equal in length, then the list's priority breaks the tie.
                        Aaron P Scooter Software
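
The rule described in this reply (longest opening match wins, list order only breaks ties) can be sketched in Python. This is a toy model, not Beyond Compare's actual matcher: the `GRAMMAR` list, the `classify` function, and in particular the Keyword pattern `</?[A-Za-z]+` are assumptions made for illustration. The thread does not show the real Keyword definition; the sketch only assumes that it starts matching at the same position as "<".

```python
import re

# Hypothetical grammar list, highest priority first. These patterns are
# assumptions for illustration, not Beyond Compare's real definitions.
GRAMMAR = [
    ("Tag", re.compile(r"<"), re.compile(r">")),     # delimited: opens on "<", closes on ">"
    ("Keyword", re.compile(r"</?[A-Za-z]+"), None),  # plain pattern, no closing delimiter
]


def classify(text, grammar):
    """Split text into (element_name, matched_text) spans."""
    spans = []
    pos = 0
    while pos < len(text):
        best = None  # (name, opening match, closing pattern)
        for name, opener, closer in grammar:
            m = opener.match(text, pos)
            # A strictly longer opening match replaces the current best.
            # On a tie, the earlier (higher) list entry is kept.
            if m and m.end() > pos and (
                best is None or len(m.group()) > len(best[1].group())
            ):
                best = (name, m, closer)
        if best is None:
            # No item opens here: treat the character as plain text and
            # merge it into the preceding plain-text span, if any.
            if spans and spans[-1][0] == "Text":
                spans[-1] = ("Text", spans[-1][1] + text[pos])
            else:
                spans.append(("Text", text[pos]))
            pos += 1
            continue
        name, m, closer = best
        end = m.end()
        if closer is not None:
            # A delimited item runs on to its closing delimiter.
            c = closer.search(text, end)
            end = c.end() if c else len(text)
        spans.append((name, text[pos:end]))
        pos = end
    return spans


sample = '<a href="x">hi'
with_keyword = classify(sample, GRAMMAR)
without_keyword = classify(sample, GRAMMAR[:1])  # Keyword item deleted
```

In this model, `with_keyword` starts with a Keyword span for `<a` even though Tag is higher in the list, because the Tag item only offers its one-character opener `<` at that position. With the Keyword item removed, `without_keyword` classifies the whole of `<a href="x">` as Tag, which mirrors the workaround suggested earlier in the thread.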


                        • #13
                          Originally posted by Aaron:
                          > The order is significant in helping to break ties between multiple possible matches. However, other factors can make a section of text match one grammar over another before the list precedence is used.

                          Thanks. That's news to me, despite me having read the Help. Did I miss it somewhere?

                          Originally posted by Aaron:
                          > In this example, the length of the match matters more than the position in the priority list. The 'Keyword' is a longer match than the left side of the delimiter ("<"). In this case, the longest match will be used.

                          Note that in this case the longest match is Tag, and it is not being used.


                          • #14
                            Originally posted by chrisjj:
                            > Thanks. That's news to me, despite me having read the Help. Did I miss it somewhere?
                            >
                            > Note that in this case the longest match is Tag, and it is not being used.

                            The delimited type matches on the left side first, and the left side of Tag is only "<". Making this behavior clearer and improving on it in general is on our wishlist for a future version of Beyond Compare.

                            I recommend creating a copy of the File Format and deleting the Keyword definition from the copy. You can then toggle between the two methods of comparison using the dropdown on the toolbar.
                            Aaron P Scooter Software


                            • #15
                              Originally posted by Aaron:
                              > The delimited type matches on the left side first, and the left side of tag is only "<". Making this behavior clearer and improving on it in general is on our wishlist for a future version of Beyond Compare.

                              Thanks. I suggest precedence should go by what matches the element, not just part of the element "first".
