CHECKIT : A file integrity tool

INTRODUCTION

Checkit is a file integrity tool for Linux and Unix systems.

Checkit Copyright (C) 2014 Dennis Katsonis

E-Mail dennisk (at) netspace.net.au

LATEST NEWS

September 14 2014 Sun - v 0.3.0 released. Checkit now allows you to set, on a per-file basis, whether the checksum is read-only or read-write.

June 7 2014 Mon - v 0.2.1 released. Minor bug fixes. Code cleanup.

April 13 2014 Sun - v 0.2.0 released. Some bug fixes. Can now read list of files to process from stdin.

April 5 2014 Sat - v 0.1.1 released. Added support to make crc files on VFAT and NTFS hidden.

March 30 2014 Sun - v 0.1.0 available.

This program has just been made available

DOWNLOAD

WHAT THIS PROGRAM DOES:

Checkit adds additional data assurance capabilities to filesystems which support extended attributes. It lets you detect data integrity issues or file changes that would otherwise go unnoticed. By storing a checksum as an extended attribute, checkit provides an easy way to detect silent data corruption, bit rot or any other unintended modification.

This was inspired by the checksumming that is performed by filesystems like BTRFS and ZFS. These filesystems ensure data integrity by storing a checksum (CRC) of data and checking read data against the checksum. With mirroring of data, they can silently heal the data should an error be found.

This program does not duplicate this ability, but offers rudimentary checksum abilities for other filesystems. It simply calculates a checksum and stores it with the file. The stored checksum can later be verified against the data. Any data corruption or file change will result in a failed check.
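The store-then-verify idea described above can be sketched in a few lines of Python. This is an illustration only, not checkit's actual code. The CRC-64 parameters used here (the ECMA-182 polynomial, zero initial value, no bit reflection) are an assumption, since this document does not say which CRC-64 variant checkit uses, and the function names crc64 and verify are hypothetical.

```python
# Assumed parameters: CRC-64/ECMA-182. Checkit may use a different variant.
POLY = 0x42F0E1EBA9EA3693
MASK = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes, crc: int = 0) -> int:
    """Compute a 64-bit CRC of data, one bit at a time (slow but simple)."""
    for byte in data:
        crc ^= byte << 56                    # feed the next byte into the top bits
        for _ in range(8):
            if crc & (1 << 63):              # top bit set: shift and apply polynomial
                crc = ((crc << 1) ^ POLY) & MASK
            else:                            # top bit clear: shift only
                crc = (crc << 1) & MASK
    return crc


def verify(data: bytes, stored: int) -> bool:
    """Return True if data still matches the checksum stored earlier."""
    return crc64(data) == stored
```

A checksum computed when the file is known to be good is kept alongside it. Running verify later on the same bytes succeeds, while a single flipped bit makes it fail.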

WHY IT WAS CREATED:

Moving data from disk to disk, or simply leaving data on a disk, carries a very small risk of silent data corruption. While rare, the large amounts of data being handled by drives make silent corruption a real possibility. BTRFS and ZFS can handle this, but other filesystems can't. This program was created to add the ability to detect (but not fix) such issues. With this ability, you can easily find out whether a copy or extract operation completed without error, or whether there has been bit rot in the file.

Backups provide a point of reference, but comparison isn't very efficient, as it involves reading two files. Using a CRC, you only need to read the file once, even after a copy operation, to determine whether the file is OK. Also, should the file differ from the backup, which copy is OK: the backup or the original? Checkit will let you know which of the two has changed.
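The backup-versus-original question can be illustrated with a small Python sketch. It assumes each copy carries the checksum that was stored while the file was known to be good. To stay short it uses zlib.crc32 from the standard library as a stand-in for checkit's CRC64, and the function name check_copies is hypothetical.

```python
import zlib
from typing import Dict, Tuple


def check_copies(copies: Dict[str, Tuple[bytes, int]]) -> Dict[str, bool]:
    """Map each copy's name to True if its data still matches its stored checksum.

    Every copy is read once and judged on its own, so no byte-by-byte
    comparison between the copies is needed.
    """
    return {
        name: zlib.crc32(data) == stored
        for name, (data, stored) in copies.items()
    }


# Example: the checksum was stored while the file was good, then one copy changed.
good = b"holiday photo bytes"
stored = zlib.crc32(good)
damaged = b"holiday photo bytez"
report = check_copies({"original": (damaged, stored), "backup": (good, stored)})
```

Here report marks the original as failed and the backup as passed, which tells you which copy to restore from.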

There are other ways to do this. You can use a cryptographic hash and store a SHA-1, MD5 or SHA-256 value in a separate file, or even use GPG and digitally sign the file. The problem is that the value is stored in a separate file, and processing directories recursively, or all files of a particular type (e.g., all JPGs), isn't as straightforward. With checkit, the CRC is stored as an extended attribute. It remains part of the file, and can be copied or archived automatically with the file. There is no need for separate files to store the hash/checksum.

Checkit also has the ability to export this to a hidden file, and import it back into an extended attribute.

HOW TO USE:

Checkit calculates and stores the CRC as an extended attribute (user.crc64) or as a hidden file. To use the extended attribute (recommended), the file must reside on a filesystem which supports extended attributes (XFS, JFS, EXT2, EXT3, EXT4, BTRFS and ReiserFS, among others).
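To show what storing a value in the user.crc64 extended attribute looks like from a program, here is a minimal Python sketch using os.setxattr and os.getxattr, which are available on Linux. The attribute name comes from this document. The value encoding (16 hexadecimal digits as ASCII text) is an assumption and may not match what checkit writes, and the function names are hypothetical.

```python
import os
from typing import Optional

ATTR_NAME = "user.crc64"  # attribute name given in this document


def store_checksum(path: str, checksum: int) -> bool:
    """Try to store checksum in the extended attribute; return True on success."""
    if not hasattr(os, "setxattr"):          # not every platform provides xattr calls
        return False
    try:
        # Assumed encoding: hexadecimal text. Checkit's real format may differ.
        os.setxattr(path, ATTR_NAME, format(checksum, "016x").encode("ascii"))
        return True
    except OSError:                          # e.g. filesystem without xattr support
        return False


def load_checksum(path: str) -> Optional[int]:
    """Return the stored checksum, or None if it is missing or unreadable."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        raw = os.getxattr(path, ATTR_NAME)
        return int(raw.decode("ascii"), 16)
    except (OSError, ValueError):            # no attribute, or not in the assumed format
        return None
```

On a filesystem without extended attribute support, store_checksum returns False. That is the situation in which the hidden file is the alternative.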

OPERATION:

checkit [OPTION] [FILE]

Options:

-s Calculates and stores the checksum.

-c Check file against stored checksum.

-v Verbose. Print more information.

-p Display CRC64 checksum.

-x Remove stored checksum. This simply deletes the extended attribute.

-o Overwrite existing checksum. By default, checkit does not overwrite an existing checksum. This option allows you to update the checksum should the file be deliberately altered.

-r Recurse through subdirectories.

-e Export CRC to a hidden file.

-i Import CRC from a hidden file.

-f Read list of files to process from stdin.

-d Disallow updating of CRC on this file (for files you do not intend to change).

-u Allow CRC on this file to be updated (for files you intend to change).

-V Display license.

[FILE] can include wildcards.

Examples:

Calculates checksum of picture.jpg and overwrites old CRC64

checkit -s -o picture.jpg

Processes current directory and all sub-directories and files.

checkit -s -r .

Check the entire pictures directory. Checkit will report whether all files are OK or not.

checkit -c -r pictures/

Check all JPG images under /media/photos

find /media/photos/ -iname '*.JPG' | checkit -cf

Set the CRC as read-only. Checkit will NOT update the CRC if you try to store the checksum again.

checkit -d dissertation.txt

Set the CRC as read-write. Checkit will update the checksum if you run it with the -s option.

checkit -u dissertation.txt
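
The find pipeline in the examples above can also be driven from a script. The following Python sketch collects matching files and passes them to checkit -cf on standard input. It assumes that checkit is on the PATH and that it reads one path per line, as the find example suggests. The function names are hypothetical, and nothing is assumed about checkit's output format or exit status.

```python
import os
import shutil
import subprocess
from typing import List, Optional


def find_files(root: str, suffix: str) -> List[str]:
    """Collect paths under root whose names end with suffix (case-insensitive)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffix.lower()):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def check_files(paths: List[str]) -> Optional[str]:
    """Run 'checkit -cf' on paths; return its output, or None if it was not run."""
    if not paths or shutil.which("checkit") is None:
        return None
    result = subprocess.run(
        ["checkit", "-cf"],                  # flags as documented above
        input="\n".join(paths) + "\n",       # assumed: one path per line on stdin
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, check_files(find_files("/media/photos", ".jpg")) mirrors the find command shown above.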

LIMITATIONS:

As checkit doesn't repair files, you need to ensure that you have backups of important data. Checkit by default stores the CRC in an extended attribute. This attribute won't be transferred when copying to a filesystem which doesn't support extended attributes, or archived using an archiver which doesn't store them. Note that some file managers may not copy extended attributes by default. You should export the checksums to a hidden file using the -e option beforehand if using an archiver or copying to a filesystem which does not support extended attributes.
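
The export-before-copy workflow described above can be sketched in Python as follows. The hidden file's name and contents are hypothetical (a dot-prefixed file holding hexadecimal text), since this document does not describe the format checkit uses with -e and -i. zlib.crc32 again stands in for checkit's CRC64.

```python
import shutil
import zlib
from pathlib import Path


def sidecar_path(path: Path) -> Path:
    """Hypothetical hidden-file name: '.<filename>.crc' next to the file."""
    return path.with_name("." + path.name + ".crc")


def export_checksum(path: Path, checksum: int) -> Path:
    """Write checksum to a hidden file beside path and return that file's path."""
    sidecar = sidecar_path(path)
    sidecar.write_text(format(checksum, "08x"))
    return sidecar


def import_checksum(path: Path) -> int:
    """Read back the checksum previously exported for path."""
    return int(sidecar_path(path).read_text(), 16)


def copy_with_sidecar(src: Path, dest_dir: Path) -> Path:
    """Copy src and its hidden checksum file to dest_dir; return the new file path."""
    dest = dest_dir / src.name
    shutil.copy(src, dest)                          # copies file contents and permissions
    shutil.copy(sidecar_path(src), sidecar_path(dest))
    return dest
```

Because the checksum travels as an ordinary file, it survives a copy to a filesystem or archive format that drops extended attributes, and can be compared against the data at the destination.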

TODO:

  • Integration with file managers.
  • Repair capability.

Date: 2014-09-14T09:05+1000

Author: Dennis Katsonis
