- 1 Curation
- 2 NEES Experiment Timeline
- 3 NEES Data Lifecycle
- Experiments or projects are not curated yet. They may be new objects (entities) or may not contain any data yet.
- Experiments or projects contain everything required by the Data Archiving and Sharing Policies, but more material is expected to be deposited shortly to satisfy the completeness requirement for archiving a research project. For instance, a project may contain all of its data while the documentation is still missing; because the project is not officially over, it should not be marked as Incomplete. Current is a transient status that will be changed to Incomplete or Complete depending on compliance with the NEEScomm policies.
- Experiments or projects are missing components essential for successful curation and effective archiving. Typically, if a given object is behind the schedule set by the Data Archiving and Sharing Policies, its status is marked as Incomplete.
- Experiments or projects satisfy NSF requirements for archiving, and the object is properly described.
- The project was determined to have historical significance and should be preserved in the NEES Data Repository, but it does not satisfy all requirements for a completely curated project.
Data curation is defined as the activity of organising and validating all relevant data to provide a complete and accurate description of an experiment, including set-up details such as testing methods, specimen drawings, and sensor locations. Curation indicates that an experiment is fit for archiving and future re-use. Files submitted to the repository have to be available in formats that the earthquake engineering community can use for current and future endeavours in research, education, and practice.
Every project or experiment that should be curated MUST contain:
- unprocessed data
- corrected data
- metadata and documentation
- accurate metadata on the ‘About’ page (dates, names of the research team, meaningful description)
- accurate drawing of the specimen
- list of sensors and their locations
- instrumentation plans
- material properties
- equipment used
- final report (project)
- experimental setup report (experiment)
- executive summary (project)
- experimental setup report: a written report that describes, in narrative form, the geometry of the test specimen, the material properties, the sensor locations, boundary conditions, the loading scheme, etc. It represents an updated and expanded version of the Equipment Site Utilization Form (ESUF) that was submitted to the equipment site before the experiment took place. The Experimental Setup Report should be placed in the Documentation folder under the associated Experiment.
- executive summary: every project should contain an executive summary directed at practitioners and researchers looking for specific data. The summary should include the results and impact of the project, key figures, and plots. It should be uploaded as a PDF file of at most 2 pages.
All descriptive metadata are considered essential for easy search, browsing, collocation of, and access to research data. An effort should be made to complete the project and experiment descriptions.
Researchers should enter titles, dates, and descriptions for all levels of a research project (project, experiment, trial, and repetition). These fields:
- Identify each object.
- Anchor objects chronologically.
- Provide the context and the methods used. In the case of experiments, trials, and repetitions, the description should provide information that differentiates objects on the same level.
For more details on individual fields, refer to the User Data Model.
Currently the Project Warehouse accepts most mainstream file formats (with certain exceptions), but the recommended file formats are given below. Researchers are also advised to avoid proprietary and uncommon formats.
- Sensor measurements
- tab-delimited ASCII or CSV
- Reports, publications and other documentation
- PNG, JPG, and GIF (avoid BMP files; they will be accepted but will be displayed on the NEEShub in another format)
- Frame captures
- ZIP, TAR, TAR.GZ – the above-mentioned limitations for images also apply.
- currently there are no restrictions
- Unprocessed data are data as they come out of the data acquisition (DA) system. These data can be collected either in volts or in engineering units and uploaded in tab-delimited or comma-separated ASCII files. For video or photos, unprocessed data are the files produced by the camera; they may be cut to represent a single repetition.
- Converted data are data in tab-delimited or comma-separated ASCII format, converted from volts to engineering units and without zero offsets (if applicable). Each column within the file should provide a sensor label and the units of measure. If the DA system outputs data meeting this description, the Converted Data folder can be left empty and the files placed only in the Unprocessed Data folder. For video/photo data, converted data are files in which the format has been changed.
- Corrected data are data in tab-delimited or comma-separated ASCII format in engineering units (each column within the file should provide a sensor label and the units of measure) and may be corrected to exclude:
- data from sensors that malfunction during the test
- data acquired from long pauses
- errors due to the use of incorrect calibration constants
- correcting for lens distortion (video/photo)
- modifying brightness (video/photo)
- adding a title (video/photo)
- Derived data are files that include the corrected data plus columns of data that are functions of existing columns. These data may be re-sampled if appropriate. Each column within the file should provide a sensor label and the units of measure.
The procedures used to compute the derived data must be documented in enough detail that another researcher could re-create the derived data. The data analysis procedures should be documented in a PDF file placed in the Derived Data folder.
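As an illustration of how converted and derived data relate, the sketch below converts hypothetical voltage readings to engineering units, removes a zero offset, and appends one derived column. The function names, the sensor labels (`LVD001`, `LC001`), the calibration constant, and the stiffness formula are all assumptions chosen for this example; they are not part of any NEES tool or policy.

```python
def volts_to_engineering(volts, calibration_constant, zero_offset=0.0):
    """Convert voltage readings to engineering units and remove the zero offset.

    calibration_constant is in engineering units per volt; zero_offset is in volts.
    """
    return [(v - zero_offset) * calibration_constant for v in volts]


def add_derived_column(rows, name, func):
    """Return new rows that keep every existing column and add one derived column."""
    return [{**row, name: func(row)} for row in rows]


# Hypothetical readings: one displacement sensor (in volts) and one load cell (already in kN).
raw_volts = [0.50, 0.75, 1.00]
displacement_mm = volts_to_engineering(raw_volts, calibration_constant=10.0, zero_offset=0.50)
force_kn = [0.0, 5.0, 10.0]

converted = [
    {"time_s": t, "LVD001_mm": d, "LC001_kN": f}
    for t, d, f in zip([0.0, 0.1, 0.2], displacement_mm, force_kn)
]

# Derived quantity (assumed for illustration): secant stiffness = force / displacement.
# It is left undefined (None) where the displacement is zero.
derived = add_derived_column(
    converted,
    "stiffness_kN_per_mm",
    lambda r: r["LC001_kN"] / r["LVD001_mm"] if r["LVD001_mm"] else None,
)
```

Whatever formula is actually used, it is the kind of step that the analysis-procedure PDF in the Derived Data folder should describe.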
The suggested data file format is comma-separated values (CSV) in ASCII. Each column should correspond to a sensor or a physical quantity, with the first column referring to time. The first row shall include sensor labels (matching those used in sensor maps and sensor lists) and the second row shall include units. Each file shall be labelled using the ID of the specimen it refers to and the trial number (and possibly also the project and experiment), the date of measurement collection, and the software that produced the data. Each file should also include a one-liner explaining the type of data collected, unless the file title is descriptive enough.
The current practice is to precede each metadata line with a symbol that instructs applications to skip that line. The hash sign (#) is common, as are the exclamation mark (!) and the asterisk (*). Below is a modified example based on a real data file. This output can often be provided automatically by the DAQ system.
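A minimal sketch of writing a file in the suggested layout follows. The helper name `write_data_file`, the sensor labels, the units, and the specimen description are hypothetical, and the `#` prefix on the one-line description is only one possible choice of comment symbol.

```python
import csv
import io


def write_data_file(stream, labels, units, rows, description=None):
    """Write sensor data as CSV: optional '#' description, label row, unit row, then data."""
    if len(labels) != len(units):
        raise ValueError("each column needs exactly one label and one unit")
    if description:
        stream.write(f"# {description}\n")
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(labels)   # first row: sensor labels, time in the first column
    writer.writerow(units)    # second row: units of measure
    writer.writerows(rows)    # remaining rows: one sample per line


# Hypothetical example: time plus one displacement sensor and one strain gage.
buffer = io.StringIO()
write_data_file(
    buffer,
    labels=["Time", "LVD001", "ESG001"],
    units=["s", "mm", "microstrain"],
    rows=[[0.0, 0.00, 0.0], [0.1, 0.12, 35.5]],
    description="Displacement and strain, hypothetical specimen S1, trial 1",
)
text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the example self-contained; passing an open text file instead would produce the same content on disk.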
|Example: # NEES Metadata|
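A reader for such files could separate the metadata lines from the data as sketched below. This is an assumption-laden illustration rather than an official parser: the function name `read_data_file`, the sample contents, and the choice to treat a leading `#`, `!`, or `*` as a metadata marker are all illustrative.

```python
import csv
import io

COMMENT_PREFIXES = ("#", "!", "*")


def read_data_file(stream, comment_prefixes=COMMENT_PREFIXES):
    """Split a data file into metadata lines, column labels, units, and numeric rows."""
    metadata, data_lines = [], []
    for line in stream:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith(comment_prefixes):
            metadata.append(stripped[1:].strip())  # drop the marker symbol
        else:
            data_lines.append(stripped)
    reader = csv.reader(data_lines)
    labels = next(reader)  # first non-metadata row: sensor labels
    units = next(reader)   # second non-metadata row: units
    rows = [[float(value) for value in row] for row in reader]
    return metadata, labels, units, rows


# Hypothetical file contents, not taken from a real NEES data file.
sample = io.StringIO(
    "# NEES Metadata\n"
    "# Trial: 1 (hypothetical values)\n"
    "Time,LVD001\n"
    "s,mm\n"
    "0.0,0.00\n"
    "0.1,0.12\n"
)
metadata, labels, units, rows = read_data_file(sample)
```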
(this is a work in progress, table is being continuously updated)
— last update 2011-08-16
|extension||site that creates it||data type||dependencies||binary?||unit output||opens with||produced by|
|beau||NEESBerkeley||Unprocessed_Data||.raw and .beucounts||binary||engineering units||—|
|beucounts||NEESBerkeley||Unprocessed_Data||.raw and .beu||binary||mV/V||—|
|raw||NEESBerkeley||Unprocessed_Data||.raw and .beucounts||binary||mV/V||—|
|mat||Illinois||Unprocessed Data||–||–||engineering units||matlab||produced by dSPACE DAQ|
|vna||Illinois||Unprocessed Data||–||–||mV/V||matlab||produced by Siglab DAQ|
Each piece of equipment needs to be properly described. The item name, ID, and output units are required fields; other attributes such as manufacturer, model, calibration constant, accuracy, and date of calibration are optional.
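The required and optional fields can be pictured as a simple record. The class below is a sketch only: the name `Equipment`, the attribute names, and the field types are assumptions and do not reflect the actual Project Warehouse schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Equipment:
    # Required fields
    name: str
    equipment_id: str
    output_units: str
    # Optional attributes
    manufacturer: Optional[str] = None
    model: Optional[str] = None
    calibration_constant: Optional[float] = None
    accuracy: Optional[str] = None
    calibration_date: Optional[str] = None

    def __post_init__(self):
        # Reject records whose required fields are missing or blank.
        for field_name in ("name", "equipment_id", "output_units"):
            if not str(getattr(self, field_name) or "").strip():
                raise ValueError(f"{field_name} is required")


# Hypothetical entry with only the required fields filled in.
lvdt = Equipment(name="LVDT", equipment_id="LVD001", output_units="mm")
```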
Suggested naming format for sensors:
|Camera (Still Image)||CASØØØ|
|Electrical Strain Gage||ESGØØØ|
|Linear Variable Differential Transformer||LVDØØØ|
|Standard Penetration Test Equipment||SPTØØØ|
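Sensor names that follow this format can be checked mechanically. The sketch below assumes that each `Ø` stands for one digit, so that a name is a three-letter prefix followed by exactly three digits; that reading, the function name `describe_sensor`, and the restriction to the four prefixes listed above are assumptions.

```python
import re

# Prefixes taken from the naming table above.
SENSOR_PREFIXES = {
    "CAS": "Camera (Still Image)",
    "ESG": "Electrical Strain Gage",
    "LVD": "Linear Variable Differential Transformer",
    "SPT": "Standard Penetration Test Equipment",
}

# Three uppercase letters followed by exactly three ASCII digits.
SENSOR_NAME = re.compile(r"([A-Z]{3})([0-9]{3})")


def describe_sensor(name):
    """Return (sensor type, sequence number) for a valid name, or None otherwise."""
    match = SENSOR_NAME.fullmatch(name)
    if match is None or match.group(1) not in SENSOR_PREFIXES:
        return None
    return SENSOR_PREFIXES[match.group(1)], int(match.group(2))
```

For example, `describe_sensor("LVD007")` identifies the seventh LVDT, while a name with an unknown prefix or the wrong number of digits returns `None`.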
There are three important milestones in the data life cycle that all researchers need to be aware of. The unprocessed data need to be uploaded within one (1) month after the experiment is completed. All other data need to be uploaded to the NEES Data Repository within six (6) months after the experiment is completed. Twelve (12) months after completion of the experiment these data need to be made public, which means that researchers have six months to analyze and work on their data.
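The three milestones can be computed from the experiment completion date. The sketch below assumes that a "month" means a calendar month and that a day which does not exist in the target month is clamped to that month's last day; the policy itself may count differently, and the function and key names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamped to month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def curation_deadlines(experiment_completed):
    """Map each milestone to its due date, counted from experiment completion."""
    return {
        "unprocessed_data_uploaded": add_months(experiment_completed, 1),
        "all_data_uploaded": add_months(experiment_completed, 6),
        "data_made_public": add_months(experiment_completed, 12),
    }


# Hypothetical completion date.
deadlines = curation_deadlines(date(2011, 8, 31))
```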
select id,display_name from EXPERIMENT_DOMAIN
|3||Tsunami Wave Basin|
If you have any questions or comments regarding data organization or curation, please leave a comment on this page or send an email to firstname.lastname@example.org