
Wish List: Wish #62


Stanislav Pejša

Identification, validation, and characterization of objects in the repository

The versions of the formats stored in the NEES Data repository are largely unknown. The file extension and MIME type provide only approximate information in this regard. Format identification and validation would provide accurate information about the current state of formats in the repository and about possible risks due to format obsolescence.

There are software packages that identify and validate stored formats. The identified formats can later be related to their potential preservation risks, as one can see from the implementation of DROID in EPrints, which is one of the most popular open source institutional repositories.

See document

For identification of formats, the stand-alone application DROID or the PRONOM bundle can be used.
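To illustrate why identification by content is more precise than a file extension, here is a minimal sketch of signature-based identification. It is not how DROID works internally; the `SIGNATURES` table and the `identify` function are hypothetical names made up for this example, and the table holds only a handful of well-known byte signatures, whereas a real registry such as PRONOM holds far more.

```python
from typing import Optional

# Hypothetical, deliberately tiny signature table (assumption for illustration).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF 87a"),
    (b"GIF89a", "GIF 89a"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(header: bytes) -> Optional[str]:
    """Return a format name based on the leading bytes, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            if name == "PDF":
                # A PDF header looks like "%PDF-1.4"; the version follows the dash.
                version = header[5:8].decode("ascii", errors="replace")
                return f"PDF {version}"
            return name
    return None


def identify_file(path: str) -> Optional[str]:
    """Read the first bytes of a file and identify it by signature."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

Note that the extension never enters the decision, and that for PDF the header also reveals the format version, which is exactly the information that is currently missing.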

Once the results are in and the state of the NEES Data repository is known:

a) a registry can be built up

b) sets of supported formats can be identified

c) policies can be written on what to do with unsupported formats.
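The three steps above could be sketched roughly as follows, assuming the identification tool's output has already been reduced to (path, format) pairs. The function names, the sample paths, and the `supported` set are all hypothetical and only illustrate the idea.

```python
from collections import Counter


def build_registry(identified):
    """Count files per identified format; None means the tool found no match."""
    return Counter(fmt or "unidentified" for _path, fmt in identified)


def split_by_support(registry, supported):
    """Separate formats covered by the policy from those that need a decision."""
    covered = {fmt: n for fmt, n in registry.items() if fmt in supported}
    flagged = {fmt: n for fmt, n in registry.items() if fmt not in supported}
    return covered, flagged


# Made-up sample results, not real repository data.
results = [
    ("a/report.pdf", "PDF 1.4"),
    ("a/plot.png", "PNG"),
    ("b/notes.pdf", "PDF 1.4"),
    ("b/sensor.dat", None),
]
registry = build_registry(results)
covered, flagged = split_by_support(registry, supported={"PDF 1.4", "PNG"})
```

The `flagged` dictionary is the list a policy for unsupported formats would have to address.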

The installation of the package itself is relatively easy.

Characterisation and validation of files should be part of the "ingest" procedure, and we should accept only validated formats, because those will work properly and can be preserved. I estimate it won't take more than a day to set it up, but implementation will take some planning.
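A validation gate at ingest might look something like the sketch below. The check shown for PDF is only a crude well-formedness test (header at the start, end-of-file marker near the end), not the full validation a dedicated tool performs, and `VALIDATORS`, `check_pdf` and `accept_for_ingest` are hypothetical names, not part of any existing NEES code.

```python
import os


def check_pdf(data: bytes) -> bool:
    """Crude check: PDF header at the start and an %%EOF marker near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


# Hypothetical mapping from file extension to a check function (assumption).
VALIDATORS = {".pdf": check_pdf}


def accept_for_ingest(filename: str, data: bytes):
    """Return (accepted, reason) for a file offered to the repository."""
    ext = os.path.splitext(filename)[1].lower()
    validator = VALIDATORS.get(ext)
    if validator is None:
        return False, f"no validator registered for '{ext}'"
    if not validator(data):
        return False, "file does not pass the format check"
    return True, "accepted"
```

In practice the check functions would hand off to an external validator; the point is only that the accept/reject decision is made before the file enters the repository.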

FITS looks like a viable option: it encapsulates several of the above-mentioned tools.

Would it work on NFS?
