Re: data corruption
On Thu, 07 Jun 2001 22:39:09 -0500, Tom Droege <tdroege@veriomail.com> wrote:
> Does anyone have a good 18e?
>
> Tom Droege
Some TASS members may know that my business is supporting "very old"
personal computers: in particular computers built with S-100 bus cards
manufactured between 1976 and the mid-1980s. I mention this because
in those times, the reliability of hard drives, floppy drives, and other
media was questionable, and that issue seems to have come up again here.
It also remains an issue whenever files are sent across networks, as a
transfer can be interrupted.
The traditional way that one verified a file was to check it against
the original. This is not an option for TASS as the "original" CD-ROM
is far away. At the least, a separate list of filenames, sizes and dates
could be added to a distribution CD: a "DIR" or "ls" command will produce
such a file. However, this will not verify CONTENTS, just names, lengths
and dates. But it's better than nothing.
The traditional way to verify contents was to use a list of files and
their checksums, and a program which would do one of the following:
read that list, read each file on the list, and compare a computed
checksum vs. the list; OR create a list of files and checksums. A
"checksum" is a single value that is computed by combining all the bytes
of a file in a prescribed way to create a "unique" number. I put
unique in quotes because there is always a slight chance that two
different files will have the same checksum, and likewise a slight chance
that a corrupted file will still produce the checksum of the original.
To keep those chances small, the checksum algorithm is usually some kind
of polynomial expression rather than a plain sum, and the checksum value
is often a 32-bit value (setting the odds of an accidental match to
about one in 2**32).
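To make the difference concrete, here is a minimal C sketch of both kinds
of checksum over a block of bytes. I am assuming the common CRC-32 variant
(reflected polynomial 0xEDB88320, as used by ZIP and Ethernet); nothing
above fixes a particular polynomial, and the names sum32 and crc32_buf
are only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain additive checksum: the sum of all bytes, modulo 2**32.
   Easy to compute, but blind to bytes that merely change places. */
uint32_t sum32(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Polynomial checksum: CRC-32, computed one bit at a time.
   ASSUMPTION: the common reflected polynomial 0xEDB88320 with
   initial value and final XOR of 0xFFFFFFFF. */
uint32_t crc32_buf(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

With these, sum32 gives the same value for the two-byte blocks "ab" and
"ba", while crc32_buf tells them apart. The bit-at-a-time loop is slow
but short; a table-driven version would produce the same values faster.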
(A side point: floppies, hard drives, and other media use checksums at
the sector level for verification. This is how a storage device knows
to report a read error.)
Any comprehensive reference on computer algorithms will have
reasonable programming samples for checksum computation, and a search of
shareware/freeware archives will turn up such programs. It's a smart
idea to include the programs themselves (and their source code for
any long-term archive) with a distribution of the files. In the case
of TASS, it's fair to say that its checksum program set will have to
run in Windows, Unix/Linux (x86 version and perhaps Sparc), and (if
you please) MS-DOS. I'd suggest the source be in C and (perhaps)
BASIC. Least effort may be for one person to grab appropriate C source
and compile for all the above. I myself cannot do this.
IT IS IMPORTANT THAT ALL VERSIONS IN ALL OPERATING SYSTEMS PRODUCE THE SAME
CHECKSUM PER FILE.
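As a rough sketch of what that takes in C: open every file in binary
mode ("rb"), so that MS-DOS and Windows do not translate line endings or
stop at a Ctrl-Z, and use a fixed-width type for the checksum. The CRC-32
variant (polynomial 0xEDB88320) and the one-line list format
"<8 hex digits> <filename>" are my assumptions for illustration, as are
the function names; file names containing spaces are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fold one block of bytes into a running CRC-32.
   ASSUMPTION: reflected polynomial 0xEDB88320. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf,
                             size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc;
}

/* Compute the CRC-32 of a whole file.
   Returns 0 and stores the checksum in *out, or -1 on a read error. */
int crc32_file(const char *path, uint32_t *out)
{
    unsigned char buf[4096];
    uint32_t crc = 0xFFFFFFFFu;
    size_t n;
    FILE *fp = fopen(path, "rb");   /* binary mode: no translation */

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        crc = crc32_update(crc, buf, n);
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    *out = crc ^ 0xFFFFFFFFu;
    return 0;
}

/* Check one line of a (hypothetical) checksum list.
   Returns 1 on a match, 0 on a mismatch, -1 if the line is
   malformed or the file cannot be read. */
int verify_line(const char *line)
{
    unsigned long expected;
    char name[256];
    uint32_t actual;

    if (sscanf(line, "%8lx %255s", &expected, name) != 2)
        return -1;
    if (crc32_file(name, &actual) != 0)
        return -1;
    return actual == (uint32_t)expected;
}
```

A list-checking program would read the list one line at a time, call
verify_line on each, and report any file that returns 0 or -1.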
Finally, I note the following from my old computer experience, AND my prior
experience with radio astronomy data from 10 to 30 years ago: it's actually
a blessing that this problem emerged NOW, when it can be solved, rather
than 10 or 20 years from now, when the CD-ROMs have started to degrade.
Herb Johnson
Herbert R. Johnson http://pluto.njcc.com/~hjohnson
hjohnson@pluto.njcc.com voice 609-771-1503, New Jersey USA
amateur astronomer and telescope tinkerer
reseller of classic Macs & accessories from Plus to PowerMac
S-100 & 8-inch drive manuals and parts, call for "Dr. S-100"
- References:
- Disk 18e
- From: Tom Droege <tdroege@veriomail.com>