Roller32
Algorithm
Site Map Feedback

Download:

Up Adler32 CRC32 MD5 Roller32

Rolling Checksum

The following algorithm is from a program which compares blocks of data using this very fast, rollable checksum. Blocks of data are compared using their checksums. This is a weak checksum and is used as a first pass before using a stronger checksum to ensure that any two blocks are identical. Having got the checksum of a block of data, you can scroll the block one Byte by telling the Roller what character to remove from the beginning of the block and what character to append to the end (if any). The data block does not need rescanning - the checksum updates itself using just the two changed Bytes. As with Adler32, it gives better results the larger the block size and 128 Bytes is a good minimum.

Here's an example:
All you know of the "OtherFile" is a Roller (weak) checksum and an MD5 (strong) checksum for each 512 Byte block...
CFile File;
if(!File.Open(Path, CFile::modeRead)) return;
BYTE Buffer[512];
CRoller32 Roller;
DWORD Bytes;
do {
  Bytes=File.Read(Buffer, sizeof(Buffer));
  Roller.Set(Buffer, Bytes);
  while(Bytes && !OtherFile.Find(Roller.Get()) {  // Try to find a block with this checksum
    char Old=Buffer[0]; // If a matching block wasn't found, move 1 Byte and retry
    memcpy(Buffer, Buffer+1, --Bytes); // Scroll the Buffer 1 Byte:
    Bytes+=File.Read(Buffer+Bytes, 1);
    Roller.Roll(Old, Bytes, Bytes==sizeof(Buffer), Buffer[Bytes-1]);
  }
  ... // Either finished the file (Bytes==0) or found a block.
}while(Bytes);

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.