Data Archiver

Topic Contents
  1. Database
    1. Filters
  2. Operations
    1. Create Archive for Routines
    2. Remove Routines from Database
    3. Create Archive for Subroutines
    4. Create Archive for Samples
    5. Remove Samples from Database
  3. Sample Archive Method
    1. Expired Samples
    2. Before Date
    3. Excess Samples
    4. All Samples
  4. Archive Folder
    1. Archived Files
  5. Performance Tuning

Data Archiver is an application which may be used to remove and archive data from the Database. This includes Routines, Subroutines, Alignments, nominal information (i.e. characteristics, limits, process baselines, etc.), and samples. All Data Archiver functionality is controlled within a single User Interface.

Data can simply be archived (a blueprint of the data is saved to a file, which may be preserved as a record while the data remains in the Database), or it may be archived AND removed from the Database. Data may also be removed from the Database without storing an archive of that data.

Routines, Subroutines, or Samples may be removed from the Database without Archiving any data using the Remove __ from Database option. Be aware that when ONLY the Remove option is selected, the data will be removed from the Database WITHOUT storing an archived copy of that data. There is NO UNDO operation.

Once the data has been removed, it no longer exists in the Database. However, the archived files (.4ml) can be read back into the Database using the supplied DataArchiver Restore.4jr file. Once data has been archived and restored to the Database, an Archive association flags that data so that, if Data Archiver is run a second time, it will not remove any data flagged with that association. Such data must be removed manually.

Database

The Database menu lists the DataSources that are currently available. This list is obtained from the CM4D.4ds file. Refer to the CM4D Help Documentation for more information on the CM4D.4ds file.

Filters

The data selected for archiving may be narrowed further using Routine filters defined in the Database. Select a Managed DataSource from the menu and then click the Filters button to open the Routine Filters dialog.

Operations

Data Archiver uses one or more of the following options to determine what data will be archived and/or removed from the Database. Each option is a check box, so options may be combined.

Create Archive for Routines

To archive the Routines from the selected Database, enable the Create Archive for Routines check box. Archived Routine data will include any nominal and sample information contained within the selected Routine. For instance, you may wish to archive samples while retaining a "snapshot" of the nominal information that goes with the actuals. In that case, archive the Routines along with your samples, but do not select the "Remove" option.

Remove Routines from Database

To remove the Routines completely from the Database, select the Remove Routines from Database check box.

Be aware that when ONLY the Remove Routines from Database option is selected, the Routines will be removed from the Database WITHOUT storing an archived copy of that data. There is NO UNDO operation.

Create Archive for Subroutines

To archive Subroutines from the selected Database, enable the Create Archive for Subroutines check box. Archived Subroutine data will include any Routines, alignments and nominal data assigned to the selected Subroutine.

Create Archive for Samples

To archive Samples from the selected Database, enable the Create Archive for Samples check box.

Remove Samples from Database

To remove Samples completely from the Database, select the Remove Samples from Database check box.

Be aware that when ONLY the Remove Samples from Database option is selected, the Samples will be removed from the Database WITHOUT storing an archived copy of that data. There is NO UNDO operation.

Sample Archive Method

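Select the Sample Archive Method by which you would like to archive your sample data. Four methods are available: Expired Samples, Before Date, Excess Samples, and All Samples. Except for All Samples, each method relies on information stored with the data by the translator used to add it to the Database.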

Expired Samples

Before samples are read into the Database, an Archive Date or Number of Days (comparable to an expiration date) can be set using the Sample Archive association in DataSmith. When the Expired Samples method is selected, Data Archiver examines each sample individually according to its assigned Archive Method (None-0, Date-1, or Days-2). Samples with the Archive Method set to None will not be archived. Samples assigned either the Archive Method Date or the Archive Method Days will be archived according to the following conditions:

Sample Archive Date

The Sample Archive Date is the user-defined date after which the sample becomes eligible for archiving. All samples with an Archive Date earlier than the date that Data Archiver is run will be archived. For example, if the archive date November 7, 2005 is set for a sample, and Data Archiver is run on November 15, 2005, then Data Archiver will flag any samples with the archive date November 7, 2005 as expired and they will be archived.

Sample Archive Number of Days

The Archive Number of Days sets how many days a sample remains in the database after its create date. Samples with a create date that is more than the number of Archive Days before the current date will be archived. For example, if the archive number of days is set to 180, then when Data Archiver searches the database, any sample with a create date more than 180 days prior to the current date will be archived.
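The two conditions above can be expressed as a small check. The following is only a sketch of the rules as described in this topic, not Data Archiver's actual implementation; the Sample record and its field names are hypothetical, and only the Archive Method codes (None-0, Date-1, Days-2) come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum
from typing import Optional


class ArchiveMethod(IntEnum):
    """Archive Method codes as listed in this topic."""
    NONE = 0
    DATE = 1
    DAYS = 2


@dataclass
class Sample:
    # Hypothetical record; these field names are illustrative only.
    create_date: date
    archive_method: ArchiveMethod
    archive_date: Optional[date] = None  # used when the method is DATE
    archive_days: Optional[int] = None   # used when the method is DAYS


def is_expired(sample: Sample, run_date: date) -> bool:
    """Return True if the sample counts as expired on run_date."""
    if sample.archive_method == ArchiveMethod.DATE:
        # Archive Date earlier than the date Data Archiver is run.
        return sample.archive_date is not None and sample.archive_date < run_date
    if sample.archive_method == ArchiveMethod.DAYS:
        # Create date more than Archive Days before the run date.
        if sample.archive_days is None:
            return False
        return sample.create_date < run_date - timedelta(days=sample.archive_days)
    # Archive Method None: the sample is never archived by this method.
    return False
```

With these assumptions, a sample created on January 1, 2005 with 180 Archive Days is not yet expired on June 30, 2005 (exactly 180 days later) and becomes expired on July 1, 2005.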

Before Date

When the Before Date option is selected, any sample with a create date before the date set in Data Archiver's Before Date field will be removed from the database and archived.

If you are running Data Archiver for the first time, be careful when using the Before Date setting. If a recent date is set in the date field, and you have hundreds of samples in your database to be archived, Data Archiver will require a great deal of time and memory to run. It is recommended to set an older date initially, and then gradually move the date forward, to avoid archiving all of your sample files at one time.
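One way to plan the gradual approach recommended above is to list the Before Date values to use on successive runs. This is a planning aid only, under the assumption that you know roughly how old your oldest samples are; the function name and the step size are hypothetical and are not part of Data Archiver.

```python
from datetime import date, timedelta
from typing import List


def staged_cutoffs(oldest: date, target: date, step_days: int) -> List[date]:
    """Return Before Date values stepping from the oldest data towards target."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    cutoffs = []
    current = oldest + timedelta(days=step_days)
    while current < target:
        cutoffs.append(current)
        current += timedelta(days=step_days)
    # The final run uses the Before Date that is actually wanted.
    cutoffs.append(target)
    return cutoffs
```

Each value in the returned list would be entered in the Before Date field for one run, so that every run archives only the samples that fall between two consecutive dates.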

Excess Samples

The Excess Samples option relates to the Max Samples attribute of a routine that is stored in the database. If you select the Excess Samples option, and Max Samples for any of your routines is set to zero, those routines will be ignored, and the samples in those routines will not be available for archiving. However, if a routine has Max Samples set to any number other than zero, any samples exceeding that routine's Max Samples number will be archived. For example, if Max Samples is set to 5,000, and there are 8,000 samples in the routine, then the oldest 3,000 samples will be archived.
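The selection rule described above can be sketched as follows. This is an illustration of the rule only, not Data Archiver's code; representing a sample as a (sample key, create date) pair and ordering by create date to find the "oldest" samples are assumptions made for the example.

```python
from datetime import date
from typing import List, Sequence, Tuple

# Hypothetical representation of a sample: (sample key, create date).
SampleRef = Tuple[str, date]


def excess_samples(samples: Sequence[SampleRef], max_samples: int) -> List[SampleRef]:
    """Return the oldest samples that exceed the routine's Max Samples."""
    if max_samples <= 0:
        # Max Samples of zero means the routine is ignored. Negative values
        # are treated the same way here, which is an assumption.
        return []
    ordered = sorted(samples, key=lambda s: s[1])  # oldest first
    excess = len(ordered) - max_samples
    return ordered[:excess] if excess > 0 else []
```

For a routine holding 8,000 samples with Max Samples set to 5,000, this returns the 3,000 samples with the earliest create dates.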

All Samples

The All Samples option will archive and/or remove all samples that exist in the selected Database.

Archive Folder

When using Data Archiver, you need to assign the location where you want your archived data to be stored. There are three ways to create and/or assign your base archive folder:

  1. Browse to a pre-existing folder using the Browse button to the right of the text field. Selecting a folder using this method will assign that folder location as the top-level archive folder.
  2. When you have a pre-existing folder, you can type the archive folder path manually to assign that folder location as the top-level archive folder.
  3. When you do not have a pre-existing folder, you can type the archive folder path and new folder name manually, and Data Archiver will automatically generate the folder.

As sample files are extracted from the database, Data Archiver generates a folder for each Database, named according to the Database label and the current date. Within each Database folder, a folder is created for each routine, labeled with the routine key. Within each routine's folder, the sample files are labeled and stored by sample key and the date archived (yyyy-mm-dd). If Data Archiver is run more than once per day, any subsequent files generated are added to the existing folder for the current date (if the data is from the same Database).
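The folder layout described above can be pictured with a short sketch. The nesting (base folder, Database folder, routine folder, sample file) follows this topic, but the exact naming patterns are not specified here, so the underscore separators and the function name below are assumptions for illustration only.

```python
from datetime import date
from pathlib import PurePath


def archive_path(base: str, db_label: str, routine_key: int,
                 sample_key: int, archived_on: date) -> PurePath:
    """Build an illustrative path for one archived sample file."""
    stamp = archived_on.isoformat()          # yyyy-mm-dd
    db_folder = f"{db_label}_{stamp}"        # assumed pattern: label + date
    file_name = f"{sample_key}_{stamp}.4ml"  # assumed pattern: key + date
    return PurePath(base) / db_folder / str(routine_key) / file_name
```

Under these assumptions, two runs on the same day for the same Database produce the same Database folder name, which matches the behavior described above where later files are added to the existing folder for the current date.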

Archived Files

The archived sample files are saved in XML format with the extension ".4ml". Data files with the .4ml extension can be read back into the Database through DataSmith using the "Archive Restore.4jr" translator provided with the CM4D suite. When archived data files are read back into the database, the Archive Restore option will be enabled in the Config properties dialog. This option flags the samples as "Restored" in the Database, thus differentiating "Restored" data from "raw" data. Archived data is deleted from the Database once the Archiving process is completed, but if an archived file is restored to the Database, that archive file remains in the archive folder. The Restored flag is important because it tells Data Archiver to ignore any data that has already been archived, thus avoiding duplicate archive files. Restored files must be removed from the Database manually using DataUtility.

If Data Archiver is terminated before it has finished archiving all of the sample files in the queue, you will not lose any data from your Database, since data remains in the Database until all data has been archived. The only consequence you may see is that the same sample(s) may be exported twice when Data Archiver is run again.

Performance Tuning

The Performance Tuning switch controls how many samples are archived at a time. When Data Archiver is run, it first examines the Database to identify all of the samples to archive. Data Archiver then performs the archive operation on subsets of the samples identified. Each subset is archived before Data Archiver automatically continues to the next subset, until all of the selected samples have been archived. The Performance Tuning switch controls the size of the subsets.

Setting the Performance Tuning switch to a lower level will decrease the number of samples archived at one time, as well as reduce the amount of memory used during the archiving process. This is a preferential setting and may differ from one machine to the next, depending on the amount of memory you want to allocate to the task, the number of samples to be archived, etc.
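The subset processing described above is a standard batching pattern and can be sketched as below. How the Performance Tuning levels map to actual subset sizes is not documented in this topic, so the size parameter here is a hypothetical stand-in for the switch setting.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive subsets of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        # Only one subset is materialized at a time, which is what keeps
        # memory use lower when a smaller size is chosen.
        yield list(items[start:start + size])
```

A smaller size means more subsets and less memory held at once; a larger size means fewer subsets but more samples in memory during each pass.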
