Snapshot format (bcss)

  • Snapshot format (bcss)

    Hello,

    is there a format description of the bcss-files (snapshots) available? The bcss looks like a binary format.

    Or, as a feature request: please store the snapshot as XML (or zipped XML). Then I could build my own snapshots.

    It could be a solution for Changeable snapshots.

  • #2
    Hi knut,

    Here it is, updated through the current release. I created an uncompressed example that demonstrates most of the advanced features (UTF-8, symlinks, CRCs, version info). BC will read uncompressed snapshots, but won't produce them. I've also attached compressed samples with mixed languages, from both Windows and Linux, to demonstrate the Unicode handling.

    I apologize in advance for the fact that it wasn't planned as well as it should have been, and that it gained some quirks to work around old bugs.

    If we ever break backwards compatibility I'm pretty sure we'll switch to zipped XML. I'd say making them editable is more likely, but neither is scheduled yet.
    Attached Files
    Last edited by Zoë; 20-Jan-2017, 10:42 AM. Reason: Updated "BCSS Binary Format.txt" and added Unicode samples
    Zoë P Scooter Software


    • #3
      Thanks a lot Craig,
      let's see if I can use it for my purposes.
      Knut


      • #4
        I published a small Ruby gem: http://rubygems.org/gems/bc3
        Documentation can be found at http://rubypla.net/bc3/0.1.0/. I hope it is OK that I added the snapshot description to the documentation of my gem.

        The tool can generate snapshots from the file system, from other snapshots, or "hand coded".

        To use my tool, you must install Ruby 1.9.
        I will check whether I can build an exe from bc3_merge.rb.


        • #5
          Knut,

          Very cool! I'm glad you were able to put it to use. It's perfectly fine that you included the snapshot docs, though if I'd been thinking I would have included our URL in the header for contact info.

          I wish I wasn't so rusty with Ruby, so I could give it a more thorough go through.

          One thing I did notice is that parse.rb::parse_file_extended_headers isn't a loop; it should keep reading type bytes until it finds a value it doesn't recognize, and then break. For example, you could have a file that has both a version string and a UTF-8 encoded name, and your error handling won't detect that. It won't really hurt anything in the current code/snapshots, but I thought I'd mention it.
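
The loop described above could look roughly like the following sketch. The type codes (0x01, 0x02) and the "type byte, length byte, payload" layout are placeholders invented for illustration, and `parse_extended_headers` is a hypothetical name; the real values and layout are in "BCSS Binary Format.txt", and this is not the gem's actual code.

```ruby
require 'stringio'

# Hypothetical type codes, chosen only for this sketch; the real values
# are defined in the BCSS format description.
KNOWN_TYPES = { 0x01 => :version, 0x02 => :utf8_name }.freeze

# Reads consecutive extended headers from io until an unrecognized type
# byte (or end of data) is found. Assumed layout per header:
# one type byte, one length byte, then that many payload bytes.
def parse_extended_headers(io)
  headers = {}
  loop do
    type = io.getbyte
    break if type.nil?          # end of data
    kind = KNOWN_TYPES[type]
    if kind.nil?
      io.ungetbyte(type)        # leave the unknown byte for the caller
      break
    end
    length = io.getbyte
    break if length.nil?        # truncated header
    headers[kind] = io.read(length)
  end
  headers
end

# A record carrying both a version string and a UTF-8 name, followed by
# a byte the parser does not recognize.
data = [0x01, 3].pack('CC') + '1.0' +
       [0x02, 2].pack('CC') + "\xC3\xA9".b +
       [0xFF].pack('C')
p parse_extended_headers(StringIO.new(data)).keys # => [:version, :utf8_name]
```

Because the loop keeps going after the first recognized header, a file with both a version string and a UTF-8 name yields both entries instead of only the first.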

          Anyway, again, it looks neat. Keep us posted if you do anything else with it.
          Zoë P Scooter Software


          • #6
            Oh, I almost forgot. I installed Ruby 1.9.2 from the Windows binary installer and got this output when I tried to install the gem:

            Code:
            Successfully installed log4r-1.1.9
            Successfully installed bc3-0.1.0
            2 gems installed
            Installing ri documentation for log4r-1.1.9...
            Installing ri documentation for bc3-0.1.0...
            
            RDoc::Parser::Ruby failure around line 2 of
            lib/bc3/time.rb
            
            
            "\r  end\r\rend #Time\r\n"
            Before reporting this, could you check that the file you're documenting
            compiles cleanly--RDoc is not a full Ruby parser, and gets confused easily if
            fed invalid programs.
            
            The internal error was:
            
                    (TypeError) can't convert nil into String
            
            ERROR:  While executing gem ... (TypeError)
                can't convert nil into String
            It did extract everything.
            Zoë P Scooter Software


            • #7
              Hello Craig,

              Regarding UTF-8: I ignored parse_file_extended_headers and anything related to UTF-8 (by "ignored" I mean that I read the data but don't use it). Perhaps it will be included in one of my next versions; it's on my todo list :-).

              About the error: I will look into it, but it should be harmless. The gem is installed and you can use it. Only the documentation (RDoc = Ruby documentation) seems to have a problem, possibly a Windows/Unix line-ending issue.


              In the meantime I created an exe for bc3_merge: http://rubypla.net/bc3/bc3_merge.exe

              My todo list for bc3-gem:
              - UTF-8/parse_file_extended_headers
              - Export/Import snapshots as yaml-files.
              - Advanced search functions (duplicates...)
              - Statistics
              - Easier creation of virtual snapshots.

              Knut


              • #8
                Originally posted by Craig
                Very cool! I'm glad you were able to put it to use. It's perfectly fine that you included the snapshot docs, though if I'd been thinking I would have included our URL in the header for contact info.
                If you adapt the doc, you can send it to me and I will replace it in the new version.

                My biggest problem with the doc was the use of little-endian encoding. Example: DOS attributes are UInt32. A directory has the value 16 (hex 10), but you can't store it as "00 00 00 10"; you must store "10 00 00 00". It took some time studying your example file to get it right ;-)
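
As a small illustration of the byte order issue, here is a sketch using Ruby's `Array#pack`, where `'V'` is a 32-bit unsigned little-endian integer and `'N'` is the big-endian equivalent. The constant name and the `hex_bytes` helper are made up for this example; the value is the DOS directory attribute, decimal 16 (0x10).

```ruby
DIRECTORY_ATTRIBUTE = 0x10 # DOS directory flag, decimal 16

# Formats a binary string as space-separated hex bytes.
def hex_bytes(str)
  str.unpack('C*').map { |b| format('%02X', b) }.join(' ')
end

little = [DIRECTORY_ATTRIBUTE].pack('V') # little-endian, as used in .bcss
big    = [DIRECTORY_ATTRIBUTE].pack('N') # big-endian, for comparison

puts hex_bytes(little)         # 10 00 00 00
puts hex_bytes(big)            # 00 00 00 10
puts little.unpack('V').first  # 16
```

Reading the field back with `'V'` returns 16 again, so using the same directive for both writing and parsing keeps the two sides consistent.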

                But it was a good lesson for me. I now better understand some basics I learned in the past.

                Knut


                • #9
                  Originally posted by knut
                  If you adapt the doc, you can send it to me and I will replace it in the new version.

                  My biggest problem with the doc was the usage of little endian encoding.
                  I updated the attachment in my previous post to add copyright/contact info, and clarified the endianness issue.

                  BC has always been i386-only, so little endian was just easier to read and write. I'm more embarrassed that it doesn't store timestamps in UTC, but I knew a lot less then than I do now.
                  Zoë P Scooter Software


                  • #10
                    Hello,
                    I updated my bc3-gem.
                    I added a search tool. You can search inside multiple snapshots and create new snapshots with the result. One search feature is the search for duplicates. With
                    Code:
                    bc3_search.rb -D '*' *.bcss -B duplicates.bcss
                    you should get a new snapshot with duplicates from different snapshots.

                    For details, see http://gems.rubypla.net/bc3/0.1.1/fi...search_rb.html

                    So far I have only run development tests, no real-life tests ;-)

                    There are also binaries for the merge and search tool, see http://gems.rubypla.net/bc3/


                    With the gem I parsed your example snapshot and recreated it. The two now have identical content (at the binary level there are differences in the timestamps, but I think they are at the millisecond level).


                    I have two questions:
                    The unicode features: Are they only for Unicode systems? Up to now I found no need for the features. Is it for the Linux version? I'm using Win XP and Win 7.

                    The path in the header: Can I see the path anywhere in BC3? I didn't know that it was stored until I looked at the internals. (I'm not missing it, I'm just curious.)

                    Knut


                    • #11
                      Originally posted by knut
                      The unicode features: Are they only for Unicode systems? Up to now I found no need for the features. Is it for the Linux version? I'm using Win XP and Win 7.
                      The UTF-8 bit in the .bcss header will only be set when creating snapshots on OS X/Linux. It could be used on Windows to save some space, but BC2 wouldn't decode the filenames correctly.

                      On Windows the UTF-8 name in the extended file headers is included for any file with extended (non-ASCII) characters. If you only use filenames in your native language you don't need them. It's included to support mixing languages (Greek/CJK/Thai/Cyrillic/etc) and for cases where you move snapshots between systems with different native languages.
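
For anyone generating snapshots themselves, the rule above (add a UTF-8 name only when the filename has non-ASCII characters) might be sketched like this. Both method names are hypothetical, and how the bytes are actually framed inside the extended header is not shown here; see the format description for that.

```ruby
# True when the name contains non-ASCII characters and would therefore
# need a UTF-8 name in the extended file headers.
def needs_utf8_name?(filename)
  !filename.ascii_only?
end

# Hypothetical helper: the UTF-8 bytes one would store for such a name.
def utf8_name_bytes(filename)
  filename.encode('UTF-8').bytes
end

["readme.txt", "\u03B1\u03B2.txt"].each do |name|
  if needs_utf8_name?(name)
    puts "#{name.length} chars -> #{utf8_name_bytes(name).length} UTF-8 bytes"
  else
    puts "#{name}: plain ASCII, no UTF-8 name needed"
  end
end
```

Pure ASCII names skip the extra header entirely, which matches the point that the feature only matters for mixed-language or cross-system snapshots.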

                      I updated my previous post again and added snapshots containing mixed filenames from both Windows and Linux so you can see the difference. There's also a second pair showing the case where the source path needs UTF-8 encoding. Those 4 are all compressed. If you want to see what they look like if you don't use the Unicode features just load them in BC2.


                      Originally posted by knut
                      The path in the header: Can I see the path anywhere in BC3? I didn't know that it was stored until I looked at the internals. (I'm not missing it, I'm just curious.)
                      If you double click a .bcss file in Explorer BC reads the header and loads a comparison with the snapshot on one side and the original path on the other.
                      Zoë P Scooter Software
