How to write a packer plugin – extracting file doesn't work

  • roelandsch
    Journeyman
    • Nov 2016
    • 11

    I've written a packer plug-in for an archive file format we're using internally. Listing the files works, but extracting a file always fails with a "file not found" error.

    I can follow along in the debugger how ReadHeaderExW and ProcessFileW are called in a loop to list the files, and then again to unpack the file. However the unpack loop doesn't stop at the desired file, instead it runs through the entire archive and then reports failure.

    I debugged another packer plug-in, and the unpack loop extracted the file and stopped as soon as the desired file is returned from ReadHeaderExW. I looked closely at the data returned by ProcessFileW but didn't find anything set up differently than our plug-in.

    Does Beyond Compare rely on some other behaviour to unpack a file? Is there any other way I can debug this problem?

    --
    PS Here's what the reference from Total Commander says about these operations. BCompare appears to do the same.
    Here is a simple pseudocode description of how Total Commander calls the extraction functions:
    1. Loop to scan for files in the archive:
    Code:
    OpenArchive()          with OpenMode==PK_OM_LIST
    repeat
       ReadHeader()
       ProcessFile(...,PK_SKIP,...)
    until error returned
    CloseArchive()
    2. Loop to extract files from the archive:
    Code:
    OpenArchive()          with OpenMode==PK_OM_EXTRACT
    repeat
       ReadHeader()
       if WantToExtractThisFile()
          ProcessFile(...,PK_EXTRACT,...)
       else
          ProcessFile(...,PK_SKIP,...)
    until error returned
    CloseArchive()
  • Aaron
    Team Scooter
    • Oct 2007
    • 16017

    #2
    Hello,

    Our developer who helps handle the packing implementation is out of the office sick today, but I'll ping them with this as soon as they are back in.
    Aaron P Scooter Software

    • Zoë
      Team Scooter
      • Oct 2007
      • 2666

      #3
      BC uses the same loops that Total Commander does. What you're running into is probably a problem with the paths returned in the header data's FileName field.

      When loading an archive, BC's path parsing code is fairly flexible about the path formats it accepts, since that code is shared among all of the different archive backends. Unfortunately, the WCX support isn't as flexible when it comes to extracting files. Rather than storing the paths as returned, it relies on being able to recreate them on demand, and the check that compares the recreated path with the one returned by the packer plugin isn't forgiving right now.

      To make it work, the paths need to use '\' delimiters and there shouldn't be any leading ., /, or \ characters. It's case insensitive and converts the paths to upper case before comparing them.

      These should work for extract:

      abc.txt
      DIR\abc.txt

      These will show correctly in the folder listing, but won't be extracted:

      .\abc.txt
      DIR/abc.txt
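
A plug-in can bring its paths into the accepted form before returning them. The following is only a minimal sketch of that idea: `normalize_archive_path` is a hypothetical helper, not part of the WCX API, and it works on `char` strings for brevity where a real plug-in would use `WCHAR`. It strips only `./`-style and slash prefixes; whether a name such as `.hidden` is accepted is not stated here, so the sketch leaves such names alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the WCX API): rewrites an
   archive-internal path in place so that it uses '\' delimiters and has
   no leading "./", ".\", "/" or "\" prefix. */
static void normalize_archive_path(char *path)
{
    size_t skip = 0;
    size_t i;

    assert(path != NULL);

    /* Skip leading "./", ".\", "/" and "\" prefixes, possibly repeated. */
    for (;;) {
        if (path[skip] == '/' || path[skip] == '\\') {
            skip += 1;
        } else if (path[skip] == '.' &&
                   (path[skip + 1] == '/' || path[skip + 1] == '\\')) {
            skip += 2;
        } else {
            break;
        }
    }
    memmove(path, path + skip, strlen(path + skip) + 1);

    /* Convert any remaining forward slashes to backslashes. */
    for (i = 0; path[i] != '\0'; ++i) {
        if (path[i] == '/') {
            path[i] = '\\';
        }
    }
}
```

Applied to the two failing examples, `.\abc.txt` becomes `abc.txt` and `DIR/abc.txt` becomes `DIR\abc.txt`.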

      Approximate pseudo-code:

      Code:
      // Build path in archive
      Filename = SrcFile.Name
      SrcFile = SrcFile.Parent
      while SrcFile != ArchiveFile do
         Filename = SrcFile.Name + '\' + Filename
         SrcFile = SrcFile.Parent
      Filename = AnsiUpperCase(Filename)
      
      // Find file to extract
      OpenArchive(PK_OM_EXTRACT)
      while ReadHeader() do
         if Filename == AnsiUpperCase(string(Item.Filename))
            ProcessFile(..., PK_EXTRACT, ...)
            Break
         else
            ProcessFile(..., PK_SKIP, ...)
      CloseArchive(Archive)
      Zoë P Scooter Software
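
For anyone who wants to try the matching rule outside of BC, the pseudo-code above can be modelled roughly as follows. This is a sketch under assumptions, not BC's actual code: `find_entry_to_extract` and `upper_in_place` are hypothetical names, ASCII `toupper` stands in for `AnsiUpperCase`, and a plain array of strings stands in for the sequence of headers a plug-in returns.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Upper-case an ASCII string in place (stand-in for AnsiUpperCase). */
static void upper_in_place(char *s)
{
    for (; *s != '\0'; ++s) {
        *s = (char)toupper((unsigned char)*s);
    }
}

/* Hypothetical model of the host's search. Returns the index of the entry
   that would be extracted, or -1 if the loop reaches the end of the
   archive without an exact (case-insensitive) match. */
static int find_entry_to_extract(const char *wanted,
                                 const char *const *entries, size_t count)
{
    char want[260];
    char have[260];
    size_t i;

    assert(wanted != NULL && strlen(wanted) < sizeof want);
    strcpy(want, wanted);
    upper_in_place(want);

    for (i = 0; i < count; ++i) {
        assert(strlen(entries[i]) < sizeof have);
        strcpy(have, entries[i]);
        upper_in_place(have);
        if (strcmp(want, have) == 0) {
            return (int)i;  /* PK_EXTRACT, then break out of the loop */
        }
        /* otherwise PK_SKIP and read the next header */
    }
    return -1;              /* end of archive: reported as not found */
}
```

With entries `abc.txt` and `DIR\abc.txt`, a request for `DIR\ABC.TXT` matches the second entry. With entries `.\abc.txt` and `DIR/abc.txt`, no request built from `\`-joined names can match, so the loop runs through the whole archive and fails.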

      • roelandsch
        Journeyman
        • Nov 2016
        • 11

        #4
        Hi,

        Thanks for the reply.

        For testing I just return a few hard-coded file names, and it still doesn't work. The file listing works as expected. On extracting, ProcessFileW is always called with Operation set to 0 (PK_SKIP). I'm missing something else.

        The actual archive reader indeed returns paths in the form "path/to/file.txt". It does not return entries for the directories.

        Code:
        // OpenArchiveW returns a pointer to a "new int"
        
        DLLEXPORT int __stdcall ReadHeaderExW (HANDLE hArcData, tHeaderDataExW *hd)
        {
            memset(hd, 0, sizeof(*hd));
            int *n = (int*) hArcData;
            if (*n == 3) { return E_END_ARCHIVE; }
        
            wsprintf(hd->FileName, L"FILE%d.TXT", *n + 1);
            hd->PackSize = 12;
            hd->PackSizeHigh = 0;
            hd->UnpSize = 12;
            hd->UnpSizeHigh = 0;
            hd->FileTime = 0;
            hd->FileAttr = 0x20;
        
            return 0;
        }
        
        DLLEXPORT int __stdcall ProcessFileW (HANDLE hArcData, int Operation, WCHAR *DestPath, WCHAR *DestName)
        {
            int *n = (int*) hArcData;
            // if Operation is PK_EXTRACT, write "Hello World!" to the given destination file
            ++*n;       // advance to the next hard-coded entry
            return 0;   // 0 = success
        }
        --
        Roeland

        • Zoë
          Team Scooter
          • Oct 2007
          • 2666

          #5
          Are you able to provide us with a copy of your DLL and an archive if necessary? I'm out of the office today, but I could debug it from within BC on Monday. Just send a copy to [email protected] with a link back to this thread.

          One note though: if there's a directory structure, you need to use path\to\file.txt, *not* path/to/file.txt.
          Zoë P Scooter Software

          • roelandsch
            Journeyman
            • Nov 2016
            • 11

            #6
            Yes, my bad. The plug-in is returning backslashes.

            I'll send you a simple plug-in.

            --
            Roeland

            • Aaron
              Team Scooter
              • Oct 2007
              • 16017

              #7
              Thanks, I see the email. Zoë is out sick today, but I'll pass on the email when she's back in the office.
              Aaron P Scooter Software

              • roelandsch
                Journeyman
                • Nov 2016
                • 11

                #8
                Hi,

                I’m still stuck. Can you figure out why Beyond Compare decides the opened file is not in the archive?

                --
                Roeland

                • Zoë
                  Team Scooter
                  • Oct 2007
                  • 2666

                  #9
                  Sorry, we've been preparing a new release and I only recently got time to start looking into this.

                  The problem occurs because Beyond Compare's version of tHeaderDataExW was declared incorrectly: it used byte packing instead of natural packing. That didn't cause a problem on 32-bit builds, but it means our structure is 8 bytes short on 64-bit ones. The call to memset you have at the top of ReadHeaderExW therefore writes past the end of our structure and clears the stack variable where we're storing the filename, so BC is trying to match against an empty string.
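
To make the 8-byte difference concrete, here is a small standalone sketch. It rests on several assumptions: the field list is the tHeaderDataExW layout as recalled from the WCX SDK header (check it against your copy of wcxhead.h), `WCHAR16` is a stand-in for the 2-byte Windows `WCHAR`, `int` is 4 bytes, and the compiler supports `#pragma pack` (GCC, Clang and MSVC do). `HostFrame` and its `neighbour` member are hypothetical; they only imitate "some variable that happens to sit right after the structure" and say nothing about BC's real stack layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t WCHAR16;  /* stand-in for the 2-byte Windows WCHAR */

/* Assumed field list of tHeaderDataExW, shared by both declarations. */
#define HEADER_FIELDS            \
    WCHAR16 ArcName[1024];       \
    WCHAR16 FileName[1024];      \
    int Flags;                   \
    unsigned int PackSize;       \
    unsigned int PackSizeHigh;   \
    unsigned int UnpSize;        \
    unsigned int UnpSizeHigh;    \
    int HostOS;                  \
    int FileCRC;                 \
    int FileTime;                \
    int UnpVer;                  \
    int Method;                  \
    int FileAttr;                \
    char *CmtBuf;                \
    int CmtBufSize;              \
    int CmtSize;                 \
    int CmtState;                \
    char Reserved[1024];

/* What the plug-in sees: natural packing. */
typedef struct { HEADER_FIELDS } HeaderNatural;

#pragma pack(push, 1)
/* What the host declared by mistake: byte packing. */
typedef struct { HEADER_FIELDS } HeaderPacked;

/* Hypothetical host-side memory: the header plus whatever follows it. */
typedef struct {
    HeaderPacked hd;
    char neighbour[8];
} HostFrame;
#pragma pack(pop)

/* Imitates the plug-in's memset(hd, 0, sizeof(*hd)) and returns how many
   bytes of the neighbouring data were zeroed as a side effect. */
static size_t simulate_plugin_memset(void)
{
    HostFrame frame;
    size_t i;
    size_t clobbered = 0;

    memset(&frame, 0xAA, sizeof frame);        /* mark everything as in use */
    assert(sizeof(HeaderNatural) <= sizeof frame);
    memset(&frame, 0, sizeof(HeaderNatural));  /* plug-in's idea of the size */

    for (i = 0; i < sizeof frame.neighbour; ++i) {
        if (frame.neighbour[i] == 0) {
            ++clobbered;
        }
    }
    return clobbered;
}
```

On a typical 64-bit build, `CmtBuf` needs 8-byte alignment, so natural packing inserts 4 bytes of padding before it and 4 more at the end of the structure. `sizeof(HeaderNatural) - sizeof(HeaderPacked)` is then 8 and `simulate_plugin_memset()` reports 8 clobbered bytes. On a 32-bit build the two sizes are equal and nothing is overwritten.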

                  Beyond Compare already zeros out the structure before every call to ReadHeader, and the packer plug-ins on the Total Commander website that include source don't do it themselves, so I assume Total Commander handles it internally too. We don't use any of CmtBuf, CmtBufSize, CmtSize, CmtState, or Reserved so the misalignment itself doesn't cause any problems.

                  This will be fixed in an upcoming release, but for now you should be able to just remove the unnecessary memset call and your plugin will work with both current and future releases.
                  Zoë P Scooter Software
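
As a rough illustration of that workaround, the test plug-in's header function could look like the sketch below once the memset is gone. It is deliberately cut down so it compiles anywhere: `MockHeader` is a hypothetical stand-in containing only the fields the test plug-in fills in, `read_header_sketch` is not a real export name, standard `swprintf` replaces `wsprintf`, and the value of `E_END_ARCHIVE` is assumed from the WCX SDK header.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#define E_END_ARCHIVE 10  /* assumed value; check wcxhead.h from the WCX SDK */

/* Cut-down, hypothetical stand-in for tHeaderDataExW: only the fields the
   test plug-in fills in. The real structure has more members. */
typedef struct {
    wchar_t FileName[1024];
    unsigned int PackSize;
    unsigned int PackSizeHigh;
    unsigned int UnpSize;
    unsigned int UnpSizeHigh;
    int FileTime;
    int FileAttr;
} MockHeader;

/* Same logic as the posted ReadHeaderExW, minus the memset. `n` plays the
   role of the int behind hArcData. */
static int read_header_sketch(const int *n, MockHeader *hd)
{
    assert(n != NULL && hd != NULL);
    if (*n == 3) {
        return E_END_ARCHIVE;
    }

    /* The host zeroes the structure before the call, so only assign the
       fields this plug-in actually provides. */
    swprintf(hd->FileName, sizeof hd->FileName / sizeof hd->FileName[0],
             L"FILE%d.TXT", *n + 1);
    hd->PackSize = 12;
    hd->PackSizeHigh = 0;
    hd->UnpSize = 12;
    hd->UnpSizeHigh = 0;
    hd->FileTime = 0;
    hd->FileAttr = 0x20;
    return 0;
}
```

Because the function never writes more than the fields it names, it does not depend on the host and the plug-in agreeing on the total size of the structure.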

                  • roelandsch
                    Journeyman
                    • Nov 2016
                    • 11

                    #10
                    OK, thanks. It's working now.

                    --
                    Roeland
                    Last edited by roelandsch; 20-Dec-2016, 02:15 PM.

                    • Zoë
                      Team Scooter
                      • Oct 2007
                      • 2666

                      #11
                      Yes, we zero out the entire structure before every call to ReadHeaderExW.
                      Zoë P Scooter Software

                      • Chris
                        Team Scooter
                        • Oct 2007
                        • 5538

                        #12
                        The Total Commander packer plug-in bug is fixed in Beyond Compare 4.2 beta.

                        Beta page: http://www.scootersoftware.com/beta
                        Chris K Scooter Software

                        • roelandsch
                          Journeyman
                          • Nov 2016
                          • 11

                          #13
                          Thanks.

                          In general I think the hard part of writing a plug-in like this is finding documentation. I found a .HLP file on totalcmd.net, but this file format cannot be read anymore in current versions of Windows (8.1 and 10).
