25118 Text Compare - PDF Conversion Error

  • 25118 Text Compare - PDF Conversion Error

    Some PDFs here give:

    [Image: xcpQdDv.png]

    and no further info.

    These PDFs are created by Chrome Save As PDF and read fine in Adobe Acrobat. Some other such emails don't show this issue.

    Ideas?


  • #2
    Hello,

    If you use Adobe itself to perform the conversion (launch Adobe Reader, File menu -> Save As Text) does it throw any warnings or errors? Is the output file entirely blank? Files that Adobe exports as entirely blank, with no error prompt, contain pictures, not text, so the conversion fails with the above error.
    Aaron P Scooter Software


    • #3
      Originally posted by Aaron
      If you use Adobe itself to perform the conversion (launch Adobe Reader, File menu -> Save As Text) does it throw any warnings or errors?
      No.

      Originally posted by Aaron
      Is the output file entirely blank?
      Yes.

      Originally posted by Aaron
      Files that Adobe exports as entirely blank, with no error prompt, contain pictures, not text, so the conversion fails with the above error.
      OK, so the converter can't handle zero text. Could you pass on my suggestion that this be fixed? Thanks.


      • #4
        Yes, though the issue is that it is common practice for various conversion utilities to generate empty files when they fail (without returning an error code), so this is a generic Conversion Failed state. It's on our wishlist to try and catch this for other known conversion processes and present better error messaging.
        Aaron P Scooter Software
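
To illustrate the ambiguity described in #4, here is a minimal Python sketch of how a caller might classify a converter's result. The names `ConversionResult` and `classify` are hypothetical and this is not BC4's actual logic. It only shows that when the exit code is 0 and the output is empty, the caller cannot tell a silent failure from a genuinely blank source.

```python
from enum import Enum


class ConversionResult(Enum):
    OK = "ok"          # converter produced some text
    FAILED = "failed"  # converter itself reported an error
    EMPTY = "empty"    # exit code 0 but no text: silent failure or blank source?


def classify(returncode: int, output: str) -> ConversionResult:
    """Classify a conversion from its exit code and text output."""
    if returncode != 0:
        return ConversionResult.FAILED
    if not output.strip():
        # Ambiguous: nothing here says why the output is empty.
        return ConversionResult.EMPTY
    return ConversionResult.OK
```

The `EMPTY` state is the one the thread is arguing about: BC4 currently reports it as a generic conversion failure.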


        • #5
          Presuming your PDF converter does not suffer that fault, why not handle empty files correctly in the PDF case?


          • #6
            We use helper libraries for the PDF conversion, so we have to make sure that an empty result always means this case and that there are no edge cases we might misrepresent.
            Aaron P Scooter Software


            • #7
              Thanks, but I don't see how that justifies the reported failure. BC is currently misrepresenting a case. And really it makes no sense to class an empty document as an "edge case".


              • #8
                Empty files are a failed conversion, and a common result of a failed conversion using different command line utilities, which we have experience with. Since either side can be using different File Formats, it's important to catch errors instead of returning Rules-based Equal status for two empty converted files. This allows the user to see and address the issue instead of suppressing it as if it had worked.

                We don't currently detect if the conversion failed because you passed a PDF of only pictures to the Text Compare, but it's better to express that as an error than to treat it as empty, which would allow it to be equal to a truly empty file or another (of any format) failed file.

                Enhancing the error message itself to be more specific than Conversion Failed, when possible, is the item on our wishlist.
                Aaron P Scooter Software


                • #9
                  Originally posted by Aaron
                  Empty files are a failed conversion
                  A valid empty file is a valid empty file and should not cause a conversion failure.

                  Originally posted by Aaron
                  , and a common result of a failed conversion using different command line utilities, which we have experience with. Since either side can be using different File Formats, it's important to catch errors instead of returning Rules-based Equal status for two empty converted files. This allows the user to see and address the issue instead of suppressing it as if it had worked.
                  The only way this user found to address the issue was to ask here.

                  Originally posted by Aaron
                  We don't currently detect if the conversion failed because you passed a PDF of only pictures to the Text Compare
                  You don't need to. You just need a converter that handles empty text properly.

                  Originally posted by Aaron
                  but it's better to express that as an error than to treat it as empty
                  Better still is to treat empty text properly. Just as e.g. in TXT.


                  • #10
                    It really isn't. There's no way to distinguish whether a conversion utility hands us an empty file because it failed to convert the source, because the source was entirely blank, or because the source had no valid data to convert.

                    Your suggestion to treat empty converted files as valid would result in two different scanned PDF files (with different picture data) reporting as equal in a Rules-based comparison in BC4's Text Compare. To avoid that, BC4 would have to be able to detect the different reasons for the empty return value.
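
The false-equal scenario can be modelled in a few lines of Python. This is a simplified sketch, not BC4's comparison logic: `compare_converted` and the `empty_is_valid` switch are hypothetical, and the two scanned PDFs are assumed to yield an empty string because neither has a text layer.

```python
def compare_converted(left_text: str, right_text: str, empty_is_valid: bool) -> str:
    """Simplified model of comparing two converted files; not BC4's actual logic."""
    if not empty_is_valid and (left_text == "" or right_text == ""):
        # Behaviour described in the thread: empty output is reported as an error.
        return "conversion failed"
    return "equal" if left_text == right_text else "different"


# Two different scanned PDFs: different picture data, but text extraction
# yields an empty string for both (assumed here).
scan_a_text = ""
scan_b_text = ""

print(compare_converted(scan_a_text, scan_b_text, empty_is_valid=True))   # equal
print(compare_converted(scan_a_text, scan_b_text, empty_is_valid=False))  # conversion failed
```

Note that with `empty_is_valid=False` a genuinely empty document is also reported as a failed conversion, which is the behaviour the original poster objects to.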

                    Every such scenario, whether the file is only picture data, was blocked from export by an Adobe security setting, or anything else, would have to be detected accurately to prevent a false equal status, and even Adobe itself doesn't handle this well (it outputs a blank text file). And that doesn't even touch on the fact that either side could be a PDF or a file handled by another conversion program. Other conversion utilities may not give any indication that they failed besides returning an empty file.

                    Since BC4 has to handle this scenario, it is better to draw user attention to it than do something that might result in a false equal status.

                    If you feel this strongly, you can create your own conversion layer instead of using the default: one that calls any PDF conversion command line, notices blank files, and writes a message, such as the static string "<empty file detected>", to the temp file it returns to BC4. That text would then appear in the Text Compare, and such files would compare as equal to each other. I do *not* recommend this, since in the various scenarios I've outlined it will result in a false equal status. But BC4 is fully customizable, so we don't prevent you from implementing this kind of change.
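
As a rough illustration of such a conversion layer, here is a minimal Python sketch. It assumes the converter is a command line tool invoked as `command source target`; `pdftotext` (from Poppler) is used only as a stand-in default, and `convert_with_marker` and `EMPTY_MARKER` are hypothetical names. How BC4 passes the source and target paths to an external converter is not covered in this thread, so that wiring is left out.

```python
import subprocess
import sys
from pathlib import Path

EMPTY_MARKER = "<empty file detected>"  # static string suggested in the thread


def convert_with_marker(source, target, command=("pdftotext",)):
    """Run a converter as `command... source target` and flag blank output.

    `command` is an assumption: any CLI that takes an input and an output path.
    Returns True if the converter produced text, False if the marker was written.
    """
    result = subprocess.run([*command, str(source), str(target)])
    if result.returncode != 0:
        # A reported failure stays a failure; do not mask it with the marker.
        raise RuntimeError(f"converter exited with code {result.returncode}")
    out = Path(target)
    text = out.read_text(encoding="utf-8", errors="replace") if out.exists() else ""
    if text.strip():
        return True
    # Blank or missing output: write the marker so the viewer shows why.
    out.write_text(EMPTY_MARKER + "\n", encoding="utf-8")
    return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical invocation: python wrapper.py input.pdf output.txt
    convert_with_marker(sys.argv[1], sys.argv[2])
```

The caveat above still applies: two different image-only PDFs would both come back as the marker text and therefore compare as equal.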

                    Given the current development schedule, the default behavior won't be changing any time soon. I suggest leaving the defaults in place and reviewing any conversion errors for why the file failed.
                    Aaron P Scooter Software


                    • #11
                      Originally posted by Aaron
                      It really isn't. There's no way to distinguish whether a conversion utility hands us an empty file because it failed to convert the source, because the source was entirely blank, or because the source had no valid data to convert.
                      Sounds like you need to get this converter fixed.

                      Originally posted by Aaron
                      Your suggestion to treat empty converted files as valid
                      I didn't suggest that.

                      Originally posted by Aaron
                      If you feel this strongly, you can create your own conversion layer
                      I'd hoped that your company would feel strongly enough to get this working properly, but it seems not.

                      Thanks anyway.
