  1. #1
    Join Date
    Sep 2016
    Posts
    2

    Generated html report has wrong encoding UCS-2 LE BOM

    Generating an HTML report from two XML files encoded as UTF-8 produces a report that Notepad++ identifies as UCS-2 LE BOM.

    Firefox displays the page correctly. However, I first save the file to the db as a CLOB via Python.

    Something like:
    Code:
    print(open("test3.html").read())
    where test3.html is the report generated by BC.
    Here is some output from the above code:

    *■< ! D O C T Y P E H T M L P U B L I C " - / / W 3 C / / D T D H T M L
    4 . 0 1 T r a n s i t i o n a l / / E N " " h t t p : / / w w w . w 3 . o
    g / T R / h t m l 4 / l o o s e . d t d " >
    < h t m l >

    < / h t m l >
    So there is definitely something in the front mucking things up.
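
    For anyone diagnosing the same symptom, here is a minimal sketch of checking the first bytes of a file for a byte order mark. Notepad++'s "UCS-2 LE BOM" label corresponds to a file starting with the bytes FF FE. The helper name sniff_bom is made up for illustration and is not part of BC or of the code above.

```python
import codecs

# Known byte order marks. The UTF-32 entries come first because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading BOM in `data` (bytes), or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

    Reading the file in binary mode, for example sniff_bom(open("test3.html", "rb").read(4)), avoids any decoding before the check. A result of None only means no BOM was found; it does not prove the file is UTF-8.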

    I read that BC should output the report in the format of the left input file, and that works as expected for any files that are not .XML.

    I also double checked and Notepad++ opens the left and right XML files as UTF-8, so the encoding of the input files appears to be ok.

    How can I get my report to be generated in UTF-8?

    Version 3.3.12 b 18981
    Thanks,
    Dennis

  2. #2
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,618


    Hello Dennis,

    Which HTML report layout are you generating, and which Encoding does the upper status bar in the interface show for the files?

    I tested with two files that are detected as UTF-8 in the Text Compare, and the HTML report (Side-by-side layout) was generated as UTF-8.
    I tested this with BC3.3.13 and BC4.1.8. All minor updates are free, so you can update to BC3.3.13 here:
    http://www.scootersoftware.com/download.php?zz=dl3_en

    If you can, please email support@scootersoftware.com with your current BCSupport.zip (Help menu -> Support; Export) and a pair of sample files, and let us know which report to generate so we can test. Please include a link back to this forum thread for our reference.
    Aaron P Scooter Software

  3. #3
    Join Date
    Sep 2016
    Posts
    2


    Aaron,
    I am not sure I would be allowed to send in the files under our company policy. I appreciate the assistance, but for now I just hacked around the issue. For others in the same boat, here is the hack I ended up creating.

    Code:
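            # Requires "import codecs" at the top of the module.
            try:
                # The BC report came out as UTF-16 (UCS-2 LE with BOM), so try that first.
                with codecs.open(bc_path, 'r', encoding='utf16') as report:
                    report_text = report.read()
            except UnicodeError:
                # Not decodable as UTF-16: fall back to a plain read with the default encoding.
                with open(bc_path, 'r') as report:
                    report_text = report.read()
            concomitant_pk = qa_query().add_concomitant(beyond_compare=report_text)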
    So just try utf16 first and fall back to a plain read if that fails (apparently UCS-2 is the fixed-width predecessor of UTF-16, so the utf16 codec reads it).
    http://stackoverflow.com/questions/1...-ucs-2-be-file
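
    One caveat with the try/except approach is that some UTF-8 files can decode as UTF-16 without raising an error and come back as garbage. A more defensive variant, sketched below, decodes as UTF-16 only when a UTF-16 BOM is actually present. The function name decode_report and the UTF-8 fallback are assumptions for illustration, not something BC prescribes.

```python
import codecs


def decode_report(raw):
    """Decode report bytes, using UTF-16 only when a UTF-16 BOM is present."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The 'utf-16' codec consumes the BOM and uses it to pick the byte order.
        return raw.decode("utf-16")
    # Assumption: anything without a UTF-16 BOM is UTF-8.
    # 'utf-8-sig' also strips a UTF-8 BOM if one is present.
    return raw.decode("utf-8-sig")
```

    It would be called on bytes read in binary mode, for example decode_report(open(bc_path, "rb").read()), and the returned text passed on to the database call in place of the two read() calls above.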
