Elaboration and discussion of EOX processes and variations: background notes, explanations, considerations, and troubleshooting for end-of-expedition data processing.
The actual scripts to run are here.
Negotiate a time window
The end-of-expedition process may be run at when data collection is complete.
...
Pick a time, then stick to it: any new data applied after this point will not return to shore this cycle. To capture it would require an additional end-of-expedition data-processing cycle.
Bring home a copy
The MCS place database snapshots and files on tape.
You may be requested to carry a tape home.
If you are carrying a travel drive, please bring a copy of both database snapshots home. The additional redundancy has value. Do not attempt to bring back the ASMAN file directory.
Procedural notes
Accounts. Three accounts are used to execute the end-of-expedition data processing. The oracle
user (via the ODA operating system) runs the database exports. The Oracle database accounts your-surname_dba
and transfer
are applied within the export utility. Credentials for oracle
and transfer
are stored in \\novarupta\vol1\tas\AD\pwsafe\ship.psafe3.
...
- They have the same columns as their production namesakes.
- The tables are intended to be empty at the beginning of EOX.
- These tables do not have any indexes, constraints, or foreign key dependencies.
Why multiple exports?
- Redundancy.
- Ensures that content gets home that is not part of routine EOX/BOX handling, including:
- Location-specific configuration [LIMS analysis, components, constants, instrument, and user definitions.]
- DESCINFO raw workspaces that were precursors to DescLogik uploads into the LIMS schema.
- Log content (PITALOG).
- Useful to provide real content for development, demo, test, and training databases.
Troubleshooting Notes
Archive logging--This is typically not an issue for EOX, but is still a very real issue for BOX.
Oracle treats the changes resulting from end-of-expedition processing just like any other transactions. Due to the size of these transactions it is possible (though unlikely) that EOX activities will exceed 20 GiB allocated for archive logging. When this occurs, the database blocks all activity until archive log space is freed. All attempts at new transactions will receive the message ORA-00257: archiver stuck
.
...
Monitor the generation of archive logs using the Enterprise Manager Console or the RMAN tool.
Export fails with permission errors
The export [in step (5) of the short form] requires that the system and transfer schemas have permission from Oracle and the ODA operating system to write files to the directory /u02/exports
.
This SQL verifies the existence of the directory named dmpdir. Inspection verifies whether it points to /u02/exports
.
...
These activities are driven by the MCS and DBA. Coordinate with both.
This script isn't working, why?
The underlying SQL for each procedure is viewable (and modifiable) under SQLDeveloper:
...
Most likely source of error: database model changes during the expedition. Model changes must be replicated to the various transfer tables and the EOX scripts.
Something went wrong, I want to start over
Not a problem. The function of these scripts is to make a copy of data for backup. There is no irreversible damage being done by any of these activities. It is sufficient to begin again at step (2) of the short form.
The source data schemas are not modified in any way by this process. The various stored procedures only copy from a source (LIMS, DESCINFO2, OPS, etc.) to the target TRANSFER schema.
Prior to 324, the cleanout method relied on Oracle delete. Oracle's guarantee of atomic, consistent, isolated, durable transactions translates to considerable overhead when millions of records are involved. The truncate
command bypasses this overhead, but does not guarantee atomicity, consistency, isolation, or durability.
Data cleanout
CLEANOUT_NEW_TABLES.
Conducts a fast and non-recoverable truncation of table content. Only applied on tables in the transfer
schema. DO NOT APPLY THE SAME TECHNIQUE on production tables as this command is not intended to process indexes, constraints, table statistics, and other table-related data objects that are affected by data removal.
The truncation command has been applied since expedition 324. Though overflow of the available archive logger space is still possible, use of the truncation command alleviates that issue for EOX processing.
Data copy architecture
LIMS_EXPEDITION_TO_NEW.
Copies data by expedition [not by project!] for sample, test, result and several other tables. The style is a series of insert into ... select * from ... statements.
- The script does not check that the tables are empty.
- The script may be run multiple times.
- Records that match will only be copied once (changed as of expedition 324).
- When new records are identified, they are accumulated in the transfer tables.
Useful Utilities
How do I recreate a missing TRANSFER table?
Before we give you the commands, some background information:
...
The source table for any given entry may be found by dropping the prefix (NEW_, LEGACY_, CONT_). Source tables are in the schemas LIMS, OPS, DESCINFO2.
*nix to Windows Copy examples
Various methods for transferring files if you aren't initially comfortable on a *nix box.Via PuTTY. From the database host to the local Windows box:
...
Tools such as IPSWITCH_FTP, FileZilla and WinSCP provide GUIs to perform the same task for both secure copy (scp) and secure file transfer (sftp) protocols. On the Unix/Linux side, scp
is the equivalent tool. Type man scp
for command-line help.
Are the copies you made any good? Are you sure?
Checksum tools read every byte of a file to generate a signature for the file. If the signatures differ between copies of a file, then the copies are not identical. The signatures should match, regardless of the platform used. These tools are particularly useful to verify transfers conducted over our (slow, less-reliable-than-i'd-like) satellite connection.
Code Block |
---|
Redhat$ md5sum *.dmpdp.bz2 Solaris> digest -a md5 -v transfer_741.dmpdp md5 (transfer_323.dmpdp) = 9a043b6b899b6eddebc70f30e7df450c DOS> md5sum transfer_323.dmpdp 9a043b6b899b6eddebc70f30e7df450c *transfer_741.dmpdp MacOSX$ md5 transfer_741.dmpdp MD5 (transfer_741.dmpdp) = 9a043b6b899b6eddebc70f30e7df450c |
TECHDOC Cumulus Content
TECHDOC is not brought home. No copy is needed. If it is required, the MCS will manage it.
Labstation / Instrument Host Cleanup
Work with the techs and research folks to get any data needed off of the labstations and cleanup any files on those that need doing. Procedures on this will evolve as we gain more experience.
...