Data Archiver

Data Archiver is an application used to archive data and remove it from the Database. This includes Routines, Subroutines, Alignments, nominal information (e.g. characteristics, limits, process baselines), and samples. All Data Archiver functionality is controlled within a single User Interface.

Data can simply be archived (a blueprint of the data is saved to a file, which may be preserved as a record while the data remains in the Database), or it may be archived AND removed from the Database. Data may also be removed from the Database without storing an archive of that data.

Once the data has been removed, it no longer exists in the Database. However, the archived files (.4ml) can be read back into the Database using the supplied DataArchiver Restore.4jr file. Once data has been archived and restored to the Database, an Archive association flags that data so that, if DataArchiver is run a second time, it will not remove any data flagged with that association. Such data must be removed manually.

Database

The Database menu lists the DataSources which are currently available. This list is obtained from the CM4D.4ds file. Refer to the CM4D Help Documentation for more information on the CM4D.4ds file.

Filters

The data selected for archiving may be narrowed further using Routine filters defined in the Database. Select a Managed DataSource from the menu and then click the Filters button to open the Routine Filters dialog.

Operations

Data Archiver will use one of six options to determine what data will be archived and/or removed from the Database.

Create Archive for Routines

To archive the Routines from the selected Database, enable the Create Archive for Routines check box. Archived Routine data will include any nominal and sample information contained within the selected Routine. For instance, you may wish to archive samples while retaining a "snapshot" of the nominal information that goes with the actuals. You would then archive the Routines along with your samples, but not select the "Remove" option.

Remove Routines from Database

To remove the Routines completely from the Database, select the Remove Routines from Database check box.

Create Archive for Subroutines

To archive Subroutines from the selected Database, enable the Create Archive for Subroutines check box. Archived Subroutine data will include any Routines, alignments and nominal data assigned to the selected Subroutine.

Create Archive for Samples

To archive Samples from the selected Database, enable the Create Archive for Samples check box.

Remove Samples from Database

To remove Samples completely from the Database, select the Remove Samples from Database check box.

Sample Archive Method

Select the Sample Archive Method by which you would like to archive your sample data. All three are based on the translator used to add them to the Database.

Expired Samples

Before samples are read into the Database, an Archive Date or Number of Days (comparable to an expiration date) can be set using the Sample Archive association in DataSmith. When the Expired Samples Archive Type is selected, any samples assigned an Archive Method (None-0, Date-1, or Days-2) will be examined individually by Data Archiver. Any samples with the Archive Method set to None will not be archived. All samples assigned either the Archive Method Date or Archive Method Days will be archived according to the following conditions:

Sample Archive Date

The Sample Archive Date is the user-defined date on which the sample becomes eligible for archiving. All samples with an Archive Date earlier than the date that Data Archiver is run will be archived. For example, if the archive date November 7, 2005 is set for a sample, and Data Archiver is run on November 15, 2005, then Data Archiver will flag any samples with the archive date November 7, 2005 as expired and archive them.

Sample Archive Number of Days

Samples will remain in the Database for a set number of days after the sample create date. The number of Archive Days therefore determines how long the sample remains in the Database. Samples with a create date that is more than the number of Archive Days before the current date will be archived. For example, if the archive number of days is set to 180, when Data Archiver searches the Database, any sample with a create date more than 180 days prior to the current date will be archived.
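The Expired Samples conditions above can be sketched as a small check. This is only an illustration of the rules as described, not Data Archiver's actual implementation; the Sample class, its field names, and the is_expired function are hypothetical, and only the method codes (None-0, Date-1, Days-2) come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum
from typing import Optional


class ArchiveMethod(IntEnum):
    # Codes as listed in the documentation: None-0, Date-1, Days-2.
    NONE = 0
    DATE = 1
    DAYS = 2


@dataclass
class Sample:
    # Hypothetical record; the field names are assumptions for illustration.
    key: str
    create_date: date
    method: ArchiveMethod = ArchiveMethod.NONE
    archive_date: Optional[date] = None
    archive_days: Optional[int] = None


def is_expired(sample: Sample, run_date: date) -> bool:
    """Return True if the sample would be archived on run_date."""
    if sample.method == ArchiveMethod.DATE:
        # Archived when the Archive Date is earlier than the run date.
        return sample.archive_date is not None and sample.archive_date < run_date
    if sample.method == ArchiveMethod.DAYS:
        # Archived when the create date is more than archive_days before the run date.
        if sample.archive_days is None:
            return False
        return sample.create_date + timedelta(days=sample.archive_days) < run_date
    # Archive Method None: never archived by the Expired Samples option.
    return False
```

With the example from the text, a sample whose archive date is November 7, 2005 is reported as expired when the check is run for November 15, 2005.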

Before Date

When the Before Date option is selected, any sample with a create date before the date set in Data Archiver's Before Date field will be removed from the database and archived.

Excess Samples

The Excess Samples option relates to the Max Samples attribute of a routine that is stored in the Database. If you select the Excess Samples option, and Max Samples for any of your routines is set to zero, those routines will be ignored, and any samples in those routines will not be available for archiving. However, if a routine has Max Samples set to any number other than zero, any samples exceeding that routine's Max Samples number will be archived. For example, if Max Samples is set to 5,000, and there are 8,000 samples in the routine, then the oldest 3,000 samples will be archived.
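The Excess Samples rule can be illustrated with a short sketch. It assumes that "oldest" is judged by sample create date and represents each sample as a (key, create date) pair; the function name and data shape are hypothetical and are not part of Data Archiver.

```python
from datetime import date
from typing import List, Sequence, Tuple

# Hypothetical representation: (sample key, create date).
SampleRef = Tuple[str, date]


def excess_samples(samples: Sequence[SampleRef], max_samples: int) -> List[SampleRef]:
    """Return the oldest samples that exceed a routine's Max Samples value."""
    if max_samples <= 0:
        # Max Samples of zero: the routine is ignored entirely.
        return []
    excess = len(samples) - max_samples
    if excess <= 0:
        # The routine holds no more samples than its limit.
        return []
    # Sort oldest first and keep only the surplus.
    return sorted(samples, key=lambda s: s[1])[:excess]
```

For a routine with 8,000 samples and Max Samples set to 5,000, this selection returns the 3,000 samples with the earliest create dates.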

All Samples

The All Samples option will archive and/or remove all samples that exist in the selected Database.

Archive Folder

When using Data Archiver, you need to assign the location where you want your archived data to be stored. There are three ways to create and/or assign your base archive folder:

1.     Browse to a pre-existing folder using the Browse button to the right of the text field. Selecting a folder using this method will assign that folder location as the top-level archive folder.

2.     When you have a pre-existing folder, you can type the archive folder path manually to assign that folder location as the top-level archive folder.

3.     When you do not have a pre-existing folder, you can type the archive folder path and new folder name manually, and Data Archiver will automatically generate the folder.

As sample files are extracted from the Database, Data Archiver generates a folder for each Database, named according to the Database label and the current date. Within each Database folder, another folder is created for each routine, labeled with the routine key. Within each routine's folder, the sample files are labeled and stored by sample key and the date archived (yyyy-mm-dd). If Data Archiver is run more than once per day, any subsequent files generated will be added to the existing folder for the current date (if the data is from the same Database).
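The folder hierarchy described above can be pictured with the following sketch. The exact naming scheme is not documented here, so the underscore separator, the order of label and date, and the archive_path function itself are assumptions made purely for illustration; only the nesting (Database folder, then routine folder, then sample file), the yyyy-mm-dd date format, and the .4ml extension come from the text.

```python
from datetime import date
from pathlib import PurePath


def archive_path(base: str, db_label: str, routine_key: int,
                 sample_key: int, archived_on: date) -> PurePath:
    """Build an illustrative path for one archived sample file."""
    stamp = archived_on.isoformat()  # yyyy-mm-dd
    # Assumed naming: "<label>_<date>" for the Database folder and
    # "<sample key>_<date>.4ml" for the file. The real names may differ.
    database_folder = f"{db_label}_{stamp}"
    routine_folder = str(routine_key)
    file_name = f"{sample_key}_{stamp}.4ml"
    return PurePath(base) / database_folder / routine_folder / file_name
```

Because the Database folder name includes the date, a second run on the same day against the same Database resolves to the same folder, which matches the behavior described for repeated runs.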

Archived Files

The archived sample files are saved in XML format with the extension ".4ml". Data files with the .4ml extension can be read back into DataSmith using the "Archive Restore.4jr" translator provided with the CM4D suite. When archived data files are read back into the Database, the Archive Restore option will be enabled in the Config properties dialog. This option flags the samples as "Restored" in the Database, thus differentiating "Restored" data from "raw" data. Archived data is deleted from the Database once the archiving process is completed, but if an archived file is restored to the Database, that archive file remains in the archive folder. The Restored flag is important because it tells Data Archiver to ignore any data that has already been archived, thus avoiding duplicate archive files. Restored files must be removed from the Database manually using DataUtility.

Performance Tuning

The Performance Tuning switch controls how many samples are archived at a time. When Data Archiver is run, it first examines the Database to identify all of the samples to archive. Data Archiver then performs the archive operation on subsets of the samples identified. Each subset is archived before Data Archiver automatically continues to the next subset, until all of the selected samples have been archived. The Performance Tuning switch controls the size of the subsets.

Setting the Performance Tuning switch to a lower level will decrease the number of samples archived at one time, as well as reduce the amount of memory used during the archiving process. This is a preferential setting and may differ from one machine to the next, depending on the amount of memory you want to allocate to the task, the number of samples to be archived, etc.
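The subset processing described above resembles ordinary batching. The sketch below shows the general idea only; the batches function and the size parameter are hypothetical, and how the Performance Tuning switch maps to an actual subset size is not documented here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive subsets of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        # Only one subset is materialized at a time, which bounds memory use.
        yield list(items[start:start + size])
```

A smaller size means more passes with less data held in memory per pass, which mirrors the trade-off described for lower Performance Tuning levels.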