...
Lots of folks want to know: Systems, Data Librarian, Programmers, EPMs, Managers, selected TAS staff.
It is helpful to chat with Curation. If there is a sampling party or post-cruise meeting in progress, it is preferable to postpone the "Copy the content" step to a less busy time. Otherwise, activities up to that point can be completed without any production impact.
1. Retrieve data content from tape
Distribute it. Subsets of the EOX tapes are distributed to Publications, Data Librarian, DBA, Operations, and public access storage locations. - Systems
A copy of the shipboard database export is staged for DBA use at OEM:/backup/eox/
- The database dump is backed up to tape via the host OEMJR:/backup/eox/. TBD--confirm location and match in DEV and MCS SOPs.
- When restored, the file is copied to OEM:/backup/eox/.
There should be one data dump file and an accompanying log of the export activity.
- When the file contains a full shipboard database export, the naming pattern is like limsjr_full-398.dmpdp
- When the file contains selected schema exports, it is named like limsjr_schemas-398.dmpdp
For the shore EOX work, the filename is typically shortened to the expedition designation, e.g. 398.dmpdp.
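The naming patterns above can be checked mechanically before an import is started. The following is a minimal sketch, not part of the rci_work/exp scripts: the function name `classify_dump` and its output labels are hypothetical, and it assumes only the three file name shapes described above.

```shell
#!/bin/sh
# Hypothetical helper: classify an EOX dump file by its name.
# Assumes only the naming patterns described in this SOP.
classify_dump() {
    case "$1" in
        limsjr_full-*.dmpdp)    echo "full" ;;     # full shipboard database export
        limsjr_schemas-*.dmpdp) echo "schemas" ;;  # selected schema exports
        *.dmpdp)                echo "shore" ;;    # shortened shore name, e.g. 398.dmpdp
        *)                      echo "unknown" ;;  # anything else needs a manual look
    esac
}

classify_dump "limsjr_full-398.dmpdp"
classify_dump "limsjr_schemas-398.dmpdp"
classify_dump "398.dmpdp"
```

The order of the `case` arms matters: the two specific ship patterns are tested before the generic `*.dmpdp` arm, which would otherwise match them as well.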
2. Restore and review
Restore routinely selected database content from the full backup to the shore production transfer schema. - DBA
Once the file is staged, the extract, transfer, and load scripts may be run.
- DBA scripting for the EOX (and other) processes is hosted at OEM:/backup/scripts/rci_work/exp/
- This scripting is designed to be run from an ODA command-prompt.
- Privileges to sudo to the ODA oracle user are required.
Other prerequisites.
- What database instance are you connecting to? You must be at a command-line for that ODA instance.
As the OS oracle user, specify which database instance you are using with this syntax:
. oraenv
import_transfer.ksh
When executed, this script
- Prompts for the local name (and extension) of the database export file. Upon <Enter>, shipboard content is imported into the TRANSFER schema. Indexes are created for the largest rowcount tables: new_sample, new_test, new_result.
- Then prompts for the expedition of interest, e.g. 398, 397T, etc. Upon <Enter>, the script imports reference copies of GEODCAT (renamed to include the expedition) and GEODP### schemas.
On completion of the schema imports above, a variety of SQL statements are run to report what data was brought back and how many rows.
- Across multiple tests, this script imported the 398 content in 30 to 40 minutes. Additional data review scripts take another 10-15 minutes to run.
- The script requires no TRANSFER schema preparation. All the required tables are present, and the import methodology truncates the tables before importing new content.
- The content is imported directly from SCHEMAS exported on ship (LIMS, DESCINFO2, OPS, GEODCAT, GEODP###, etc.).
There is no duplication of keys incurred by the ETL cycle now in use.
3. Establish moratorium credentials and controls
For the new expedition data. - DBA
Pick a name and credential. The account is shared by a science party for the moratorium period. The role created with the same name ensures that the science party is granted the exception to see its moratorium data.
eox_gen_in_moratorium_script.ksh [moratorium-group]
This script accepts a moratorium name, e.g. SHACKLETON, SANTORINI, etc.
It generates another script (below) that will be run to modify NEW_TEST, NEW_RESULT, and NEW_GEOD_DATA tables for moratorium concerns.
- If a moratorium group is not provided, the script will prompt.
- The script does not check these requirements:
- The GROUP_NAME field that this string populates is varchar2(10)--10 characters or less. It is conventional to use a memorable uppercase name. Often the EPMs will provide both a name and a credential.
- The string must also be a valid name for a ROLE. Example: should not start with a number; must be valid for naming a Unix file.
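Because the generator script does not check these requirements, a pre-check can be run by hand first. The following is a minimal sketch under stated assumptions: the function name `valid_moratorium_group` is hypothetical, and the allowed character set (letters, digits, underscore, starting with a letter) is a deliberately conservative reading of "valid ROLE name and valid Unix file name", not a rule taken from the EOX scripts.

```shell
#!/bin/sh
# Hypothetical pre-check for a moratorium group name.
# Assumption: letters, digits, and underscore only, starting with a letter.
valid_moratorium_group() {
    name=$1
    # GROUP_NAME is varchar2(10): reject empty names and names over 10 characters.
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 10 ] || return 1
    case "$name" in
        [!A-Za-z]*)      return 1 ;;  # must not start with a digit or symbol
        *[!A-Za-z0-9_]*) return 1 ;;  # no spaces, slashes, or other punctuation
    esac
    return 0
}

for g in SHACKLETON SANTORINI 398GROUP "BAD NAME" MUCHTOOLONGNAME; do
    if valid_moratorium_group "$g"; then
        echo "$g: ok"
    else
        echo "$g: rejected"
    fi
done
```

A name that passes this check should be safe both for the `in_moratorium_[moratorium-group].ksh` file name and for the role, though the database remains the final authority on what it accepts.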
in_moratorium_[moratorium-group].ksh
Running this script
- Ensures the current expedition content is tagged for moratorium control.
- Do review the script. Sometimes the EOX expedition number is cleared before running eox_gen_in_moratorium_script.ksh.
Upon running the script--wait.
- The NEW_RESULT table typically contains 20-30 million rows. This update takes the longest.
- The NEW_TEST and NEW_SAMPLE tables update much faster because they are about 3 orders of magnitude smaller.
- Timing is turned on--the scripts will show how long they took to run.
crt_user_exp.sql
Review the script.
- Make a copy of it specifically for the expedition you are processing, e.g. crt_user_398.sql
- Modify the user credential to reflect EPM requirements.
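The copy step can be scripted so that an existing per-expedition file is never overwritten by accident. This is a minimal sketch only: the function name `make_user_script` is hypothetical, and the directory holding crt_user_exp.sql is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: copy crt_user_exp.sql to crt_user_<expedition>.sql.
# Usage: make_user_script <expedition> [directory]
make_user_script() {
    exp=$1
    dir=${2:-.}
    src="$dir/crt_user_exp.sql"
    dst="$dir/crt_user_${exp}.sql"
    if [ ! -f "$src" ]; then
        echo "template not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        # Refuse to clobber a copy that may already hold edited credentials.
        echo "already exists, not overwriting: $dst" >&2
        return 1
    fi
    cp "$src" "$dst" && echo "$dst"
}
```

For example, `make_user_script 398 /backup/scripts/rci_work/exp` would print the path of the new copy, which still needs its credential edited by hand before it is run.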
Run the script via SQL.
Provide an explicit path to the script to run it.
Pass the expedition as a parameter to the script.
sql / as sysdba
SQL> @/backup/scripts/rci_work/exp/crt_user_398.sql 398
4. Copy the content
Into the publicly accessible LIMS. - DBA
...
To be worked in. The scripts below are applicable to steps 2 through 5.
EOX load and merge - rci_work/exp
The collection of scripts in this location is used to automate the shore components of EOX processing: retrieve the last expedition's content from the database dump, create and apply moratorium supports, merge the data into production, and provide statistics that support quality assurance/quality control of the process.
These are the scripts (in-order) that are typically used
Other script
tbd