  1. #1
    Join Date
    Feb 2014
    Posts
    4

    Convert docx files

    I am trying to convert docx files to text prior to loading into BC4 under Ubuntu 13.10. On load, I am using this conversion command (below). I am unable to get the resulting filename.txt to load since it is not part of the conversion statement. How do I load the resultant converted txt file rather than the docx?

    /usr/bin/soffice --headless --convert-to txt:Text %s

    Thank you,
    Rob

  2. #2
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,788

    Hello,

    You would need to wrap the conversion into another, larger script which then takes the converted text file and places it into the %t variable. Where does your script currently put the target/converted .txt file? Can you then write that into a %t target?
    Aaron P Scooter Software

  3. #3
    Join Date
    Feb 2014
    Posts
    4

    Aaron -

    /usr/bin/soffice --headless --convert-to txt:Text %s

    The txt in the command above tells soffice to convert the file named by %s and change the extension to .txt, so the utility chooses the output file name itself.
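To make that naming rule concrete, here is a minimal sketch. It assumes soffice keeps the source's base name and only swaps the extension for .txt, as described above, and that without --outdir the file lands in the current working directory. The function name converted_name is made up for illustration and is not part of soffice or Beyond Compare.

```shell
#!/bin/sh
# converted_name is a hypothetical helper, not part of soffice or Beyond Compare.
# It prints the file name soffice is assumed to produce for a given source:
# same base name, last extension replaced by .txt.
converted_name() {
    base=$(basename "$1")            # drop the directory part
    printf '%s.txt\n' "${base%.*}"   # strip the last extension, append .txt
}

# Example call with an illustrative path
converted_name "/home/rob/docs/report.docx"
```

Knowing the name this way is what lets a wrapper script find the converted file and copy it to wherever %t points.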

    In general, how do I assign a value to %t? Do you have an example?

    Rob

  4. #4
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,788

    Hi Rob,

    I installed OpenOffice to play around with this. Alongside --convert-to, soffice accepts an output directory (--outdir), but not a target file name. We would need a target file somehow.

    The surrounding script would run the conversion with --outdir pointing at a scratch directory such as /bcconvert, then, assuming that directory is otherwise empty, append the converted text to the target:
    cat /bcconvert/*.txt >> "$2"

    Here $2 is the script's second argument, which receives %t when BC3's external conversion calls the script as: script.bash %s %t. My own Bash scripting skills are not quite up to par to write the whole conversion myself. If there were a way to make soffice write to a specific file name instead of a directory, we could use that more directly.
    Aaron P Scooter Software

  5. #5
    Join Date
    Feb 2014
    Posts
    4

    New extension variable

    Is it possible to add a new variable which allows one to specify the new extension (txt) of the same filename?

    It seems I can echo out the resultant destination file, but I don't know how to assign it back to the %t which is not set until after my script would run. It is a chicken/egg issue. I can run my script but it needs to know the %t prior to execution.

    If you supported the destination extension you can then look for the source name with the new extension.

    Rob

  6. #6
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,788

    Thanks for the suggestion. I'll add this to our wishlist and alternatively see if one of our developers can come up with a better wrapper script. If you are able to create one, please do let us know and we'll take a look at it.
    Aaron P Scooter Software

  7. #7
    Join Date
    Apr 2008
    Posts
    70

    Make the file move to %t

    We can't pass a new value of %t back to bcompare, so we have to make the output end up at %t.

    Code:
    #!/bin/sh
    #
    # docx-to-txt
    #
    # Converts MS Office Open XML (MOO-XML) DOCX to txt using soffice
    #
    # Moves output from a temporary batch folder to target
    #
    # Adrian Wilkins 2014 : licensed free for any use
    #
    # params 
    #
    # $1 = source file
    # $2 = target file
    
    # Fail early if either argument is missing
    if [ "$#" -ne 2 ]; then
        echo "usage: $0 source-file target-file" >&2
        exit 1
    fi
    
    SOURCE=$1
    TARGET=$2
    
    # Make some folders
    OUTPUTFOLDER=$(mktemp -d) || exit 1
    TEMPHOME=$(mktemp -d) || exit 1
    
    # Remove the temporary folders on exit, even if the conversion fails
    trap 'rm -rf "$TEMPHOME" "$OUTPUTFOLDER"' EXIT
    
    # Set a new HOME variable for this process or soffice will refuse to start another instance
    # This means this fails silently if you have a document window open
    HOME=$TEMPHOME
    export HOME
    
    # Convert
    soffice --headless --convert-to txt:Text --outdir "$OUTPUTFOLDER" "$SOURCE"
    
    # Rename converted file to supplied target location
    OUTFILE=$(ls "$OUTPUTFOLDER")
    mv "$OUTPUTFOLDER/$OUTFILE" "$TARGET"
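One possible hardening of the final step, offered as a sketch rather than as part of the script above: the ls call yields nothing if the conversion failed and several names if more than one file was written, and in either case the mv does the wrong thing. The helper below, whose name move_single_output is invented here, moves the file only when the output folder holds exactly one regular file. It relies on shell globbing, so it would not see a converted file whose name starts with a dot.

```shell
#!/bin/sh
# move_single_output is a hypothetical helper, not from the script above.
# Usage: move_single_output OUTPUT_DIR TARGET
# Moves the only file in OUTPUT_DIR to TARGET; returns 1 if the directory
# holds zero files or more than one.
move_single_output() {
    outdir=$1
    target=$2
    # Load the directory entries into the function's positional parameters
    set -- "$outdir"/*
    # An unmatched glob stays literal, so also check that the entry is a file
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one converted file in $outdir" >&2
        return 1
    fi
    mv -- "$1" "$target"
}
```

In the script it would stand in for the ls and mv lines, for example: move_single_output "$OUTPUTFOLDER" "$TARGET" || exit 1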

  8. #8
    Join Date
    Oct 2007
    Location
    Madison, WI
    Posts
    11,788

    Thanks for the script! We'll take a look at this and see if we can incorporate it or something similar onto our Formats page.
    Aaron P Scooter Software

  9. #9
    Join Date
    Feb 2014
    Posts
    4

    Thanks, this worked well.
