
Thread: ignoring CRLF

  1. #11
    Join Date
    Jan 2018
    Posts
    28

    I also see hex 25 producing line breaks. In EBCDIC, hex 25 is the line feed (LF) character.

    I think records with hex 15 or hex 25 will show up in high-volume files, so it is not an issue for me. I only filter on differences, so I expect the share of the file with these breaks to be microscopic (0.1 percent or less). Thanks.

  2. #12
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    12,005

    It sounds like you might be using an EBCDIC mainframe over FTP? Mainframes usually convert the newline character themselves for FTP clients. If you are in the Text Compare and use the View menu -> Hex Details, what is the hex value at the end of the line, and is it the same hex value as the data you are trying to ignore?
    Aaron P Scooter Software

  3. #13
    Join Date
    Jan 2018
    Posts
    28

    values shown in editor on line breaks...

    Attached are 4 clippings of what I see. I captured the hex values on the Mainframe. I put hex 15 on the first 2 records shown and hex 25 on the 3rd record in the clipping. This is 8 bytes into the record. I type the command 'hex on' from the command line in the editor to see this hex representation and enter these 2 values. For the BC editor in text mode, see the 'last char of lines' clip I attached. The pink circle with 4 marks protruding out is the last character I see on all the line endings. I was only able to see zero in hex mode for this end of line value.

    For the lines that have the random unexpected line break, I attached a snip showing the value. It is the pink 0A in hex mode. The text version looks like a J with a half moon at the top. In the chart I see the control character representation as Control + J. If I do control + J in notepad, it jumps to next line with no value.
    I decided to test the hex representation of CRLF in the editor. I hit enter somewhere in the line and the 0D 0A shows as the hex (CRLF) value.

    In my opinion, the hex 15 and hex 25 on the Mainframe are showing up as hex 0A in the editor. Do you know what is going on?
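
The hypothesis in this post (EBCDIC hex 15, NL, and hex 25, LF, both arriving on the PC as hex 0A) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ebcdic_to_ascii` and the rule "map both control bytes to LF" are hypothetical, `cp037` is just one common EBCDIC code page, and a real FTP server applies whatever translate table it is configured with.

```python
# Hypothetical rule for illustration: both EBCDIC NL (0x15) and LF (0x25)
# become ASCII LF (0x0A). A real server's translate table may differ.
EBCDIC_LINE_CONTROLS = {0x15, 0x25}


def ebcdic_to_ascii(record: bytes) -> bytes:
    """Translate one EBCDIC record to ASCII, byte by byte."""
    out = bytearray()
    for byte in record:
        if byte in EBCDIC_LINE_CONTROLS:
            out += b"\n"
        else:
            # cp037 is an assumption; the host may use another code page.
            text = bytes([byte]).decode("cp037")
            out += text.encode("ascii", errors="replace")
    return bytes(out)


# Seven data bytes, then hex 15 as the 8th byte, as described in the post.
record = "ABCDEFG".encode("cp037") + b"\x15" + "TAIL".encode("cp037")
converted = ebcdic_to_ascii(record)

print(converted.hex(" "))      # hex view of the translated record
print(converted.splitlines())  # how a line-oriented tool would split it
```

Under these assumptions the single mainframe record reaches the PC with an embedded 0A, which a line-oriented viewer then shows as two lines.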

  4. #14
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    12,005

    Hello,

    When acting as an FTP server, the mainframe will accept FTP commands from a client, including ASCII or binary transfers. These usually translate line ending characters into something an FTP client could read. It looks like your file has both LF (0A) and CRLF (0D0A) line ending characters in the file.
    https://en.wikipedia.org/wiki/Newlin...specifications

    We don't have explicit control over this, but should behave similarly to other plain FTP clients. We do have some options under the Tools menu -> Profiles dialog to edit settings (like Transfer type), although I'd expect to see a CRLF or LF when accessing an FTP Server.

    BC4 treats both as line endings and handles files with mixed line endings. In theory, a file should be generated with one style or the other, not both, but this lets you find and fix mixed-ending files. Graphically, we show a different symbol (circle or J) for each type of line ending.
    Aaron P Scooter Software
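
To check outside the editor whether a file really mixes line-ending styles, a short script can count each kind. This is a minimal sketch; the function name `count_line_endings` and the sample bytes are made up for illustration and say nothing about how BC4 detects endings internally.

```python
import re
from collections import Counter

# CRLF is listed first so its two bytes are counted once,
# not as a lone CR followed by a lone LF.
LINE_ENDING = re.compile(rb"\r\n|\r|\n")
NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, lone CR, and lone LF sequences in raw bytes."""
    return Counter(NAMES[m.group()] for m in LINE_ENDING.finditer(data))


# Hypothetical sample: CRLF record terminators plus one embedded LF.
sample = b"rec1\r\nrec2 part a\nrec2 part b\r\nrec3\r\n"
counts = count_line_endings(sample)

print(counts)
print("mixed" if len(counts) > 1 else "consistent")
```

For a real file, read it in binary mode (`open(path, "rb")`) so that Python does not normalize the line endings before they are counted.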

  5. #15
    Join Date
    Jan 2018
    Posts
    28

    more info and suggestion

    The character that I see at the end of each line is the CRLF. That was the circle with 4 marks symbol I was referring to.
    I noticed this after hitting enter inside the record. I also see that the Carriage Return 0D by itself can give a line break.
    So the combinations producing a line break are CR only (0D), LF only (0A), and CRLF (0D0A).

    I copied each of these record scenarios into MS Word. All of these records break the line there too. The editor interprets the CR, LF, and CRLF and does the line break. BC 4 follows this. Notepad does NOT do any breaking when it gets CR, LF, and CRLF. The record length is picked up from the FTP tool, which builds the records using the detected length.

    I am armed with knowledge about what is going on now. Do you think BC can have an option or enhancement to have a hands-off approach like Notepad? Notepad does not line break on CR, LF, CRLF. I can see this being a plus in my case since I want to compare high-volume raw data files. Thanks for your help.
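
One way to get a "hands off" comparison outside the editor is to slice both files by position instead of by line ending, so CR and LF bytes are treated as ordinary data. The sketch below assumes every record has the same known length; `record_length`, the function names, and the sample bytes are all hypothetical, and this is not a Beyond Compare feature.

```python
from typing import List


def split_fixed(data: bytes, record_length: int) -> List[bytes]:
    """Slice data into records by position only; CR/LF bytes stay as data."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


def differing_records(left: bytes, right: bytes, record_length: int) -> List[int]:
    """Return zero-based indexes of records that differ between two files."""
    a = split_fixed(left, record_length)
    b = split_fixed(right, record_length)
    shared = min(len(a), len(b))
    diffs = [i for i in range(shared) if a[i] != b[i]]
    # Records present on only one side count as differences too.
    diffs.extend(range(shared, max(len(a), len(b))))
    return diffs


# Hypothetical 8-byte records; the first one ends in an embedded LF.
left = b"AAAAAAA\nBBBBBBBBCCCCCCC\r"
right = b"AAAAAAA\nBBBBBBBBCCCCCCCX"

print(differing_records(left, right, record_length=8))
```

This only works when the transfer preserves the fixed record layout (for example a binary transfer of fixed-length records); variable-length records need a different approach.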



  6. #16
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    12,005

    Hello,

    I'm not sure how you are seeing that behavior in Notepad. Is it with a copy and paste? Or are you launching Notepad directly, and if so, from which program? A CRLF should be a line break in Notepad (while the other two would not).
    Aaron P Scooter Software
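
The rule described in this reply (CRLF ends a line, a lone CR or lone LF does not) can be sketched as follows. It only illustrates the rule; it is not how Notepad is implemented, and the function name `split_on_crlf_only` and the sample bytes are hypothetical.

```python
from typing import List


def split_on_crlf_only(data: bytes) -> List[bytes]:
    """Split on CRLF pairs only; lone CR and lone LF remain inside the line."""
    lines = data.split(b"\r\n")
    # A trailing CRLF leaves one empty piece at the end; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


# Hypothetical sample: three CRLF-terminated records, with a lone LF
# inside the second and a lone CR inside the third.
sample = b"rec1\r\nrec2 part a\nrec2 part b\r\nrec3 x\ry\r\n"

crlf_lines = split_on_crlf_only(sample)
all_breaks = sample.splitlines()  # breaks on CR, LF, and CRLF

print(len(crlf_lines), crlf_lines)
print(len(all_breaks), all_breaks)
```

The first result keeps one entry per record, while `bytes.splitlines()` breaks on every CR, LF, and CRLF and so returns more pieces than there are records.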

  7. #17
    Join Date
    Jan 2018
    Posts
    28

    notepad behavior

    When I do a 'save as' from the BC editor and save to a file with a .txt extension, there are no line breaks where it sees a CR, LF, or CRLF. I open the saved .txt file and no lines have been broken. I also did the FTP path from the Mainframe. When the FTP'd file lands on the PC, I rename it to a .txt extension. This text file shows no line breaks in Notepad. Using these 2 methods to get the file to the PC as a .txt, I see no line breaking. That is why I feel the CR, LF, and CRLF are ignored.

    If I do a copy (Ctrl + C) of the line with CR, LF, CRLF and paste (Ctrl + V) into MS Word, the line is broken. This leads me to conclude that Word 'processes' the CR, LF, CRLF and does the line break. Notepad, IMO, takes a hands-off approach with no line breaks.


