UNAVCO 1996 Annual Report

3.3 Data Management and Archive Tasks in FY97


Entering data into the Archive and exporting data from it now frequently involves considerable communication between the DMAG staff and investigators at various stages of the process. This personal contact can start even before a campaign begins or a permanent station is installed, as preparations for data downloading and data communications are undertaken. Upcoming projects are logged, and follow-up contact by phone or e-mail is established to ensure that data are transferred to the Archive, usually by one of several means including tapes, floppy disks and the Internet. For permanent stations, staff assist in the installation and support of UNAVCO's automated downloading scripts and LDM/IDD data transfer programs.

Identification and Retrieval of Outstanding Data

For legacy data (i.e., prior to 1991), however, there was no such organized means of data collection and meta-data recording. At present, it is not even clear exactly which data sets exist from the GPS projects of that time. Identifying and locating existing but misplaced data sets is one of the tasks to be started in FY97. To recover these data, a subaward was made to the University of Texas-Austin and the University of Miami to develop the capability of recovering and reprocessing the legacy data with appropriate high-precision orbit data. Once the outstanding data sets are identified, investigators will be asked to turn them (or copies thereof) in to the Archive. Copies will be requested on new tape media.

Personal contact is being expanded in 1997 to include reminding investigators to submit their outstanding data. Data from 55 campaigns known to UNAVCO have not been submitted to the Archive. In some cases, data are not permitted to leave the host country, but it may still be permissible to archive meta-data that show which points have been occupied, when, and by whom. For the rest of the campaigns, investigators will be encouraged to submit their data as soon as possible. A comprehensive list of these projects and points of contact has recently been compiled, and a mass e-mailing and telephone follow-up will be undertaken to recover the data.

Web User-Interface

The last step in the archival process provides access to the meta-data in the Archive Database and the data in the On-line Repository through a Web User-Interface (WUI). Currently, two products are produced for researchers: a tabular list of data in response to queries, and an ftp-delivered data set in response to a request for data. FY97 improvements to the WUI will include fewer delays in data delivery as data are migrated onto the tape jukebox, and the ability to make queries using a map-oriented data request.

The UNAVCO relational database model was the focus of development and implementation until early 1997, when version 2 of the Archive Database was officially put on-line for the GPS community. This investment has provided a solid foundation for efficiently providing data and accurate meta-data to future investigators who may require these data for their research. Without a robust underlying database, tools used to access the data have little value. UNAVCO has previously recognized, for example, the power of modern Geographical Information Systems (GIS) for helping the user display information and access a database. In 1993, a major GIS vendor was engaged by UNAVCO to help develop a prototype GIS which combined maps, geology, GPS monuments, topography and GPS results. Though the result of this effort was a remarkably capable demonstration, it was clear that it was going to be very difficult to maintain, that the tools did not yet exist to fully develop the product, that it would be difficult for the GPS community to access the system, and, most importantly, that there needed to be a robust database for the system to access.

Based on this lesson, UNAVCO subsequently developed the Archive Database (using an Oracle relational database system) and is now working to improve the user interface, starting with improved graphical reporting and leading toward the elements of GIS which are most important to meet GPS community requirements. Additionally, the UNAVCO interface tools to the Archive Database are easily adaptable to Internet access, which has become the preferred method of GPS community access to the Archive. Prototype temporal and spatial report tools have been built but are not yet integrated with the Archive Database. An overview of this concept is shown in Figure 3-9. These reports still need to be linked with the current database form query and tabular reports so the user can step quickly between tabular and graphical views.

Figure 3-9. Schematic of UNAVCO's Web User Interface Reports. (The graphical map and temporal report are planned for implementation FY97.)

Current Tabular Report

The Archive Database currently delivers to the user a tabular report (Figure 3-10) of the data requested using a Web form. This report presents the UNAVCO unique monument ID, the (non-unique) 4-character monument name, the number of visits and approximate coordinates. The monument and visit fields are hyperlinked to the database so that more detailed information can be retrieved using a browser.

Figure 3-10. Example of the Current Archive Database Tabular Report Generated by a Data Request.

Data management and archival capabilities will be expanded in 1997 to include a temporal and map interface to the Archive Database, improved and expanded data translation, editing and quality control capabilities, improved data communications capability, and expanded support for a GPS Community Seamless Archive. The following sections expand briefly on these activities for FY97.

Temporal and Spatial (Map) Reports

A graphical timeline report showing the visit times for the data selected in an Archive Database query will be implemented in FY97. This gives the user a graphical view of when the monuments were surveyed, including a visit history. A prototype of the temporal report is shown in Figure 3-11. Each horizontal timeline in the display corresponds to an individual GPS data file.

Figure 3-11. Prototype Temporal Report Generated from the Archive Database.
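The idea of one horizontal timeline per GPS data file can be illustrated with a minimal text-only sketch. It is not the prototype's implementation; the record fields (`monument`, `start`, `end`) and the function name are assumptions made purely for illustration.

```python
from datetime import date


def render_timeline(files, width=40):
    """Render one text row per data file, scaled to the span of all files.

    `files` is a list of dicts with hypothetical keys 'monument',
    'start' and 'end' (datetime.date values).
    """
    first = min(f["start"] for f in files)
    last = max(f["end"] for f in files)
    span = max((last - first).days, 1)  # avoid division by zero
    rows = []
    for f in files:
        # Map start and end dates onto column positions 0 .. width-1.
        a = (f["start"] - first).days * (width - 1) // span
        b = (f["end"] - first).days * (width - 1) // span
        bar = " " * a + "#" * (b - a + 1) + " " * (width - 1 - b)
        rows.append(f"{f['monument']:<4} |{bar}|")
    return rows


if __name__ == "__main__":
    demo = [
        {"monument": "ABCD", "start": date(1996, 1, 1), "end": date(1996, 1, 11)},
        {"monument": "EFGH", "start": date(1996, 1, 6), "end": date(1996, 1, 11)},
    ]
    print("\n".join(render_timeline(demo, width=11)))
```

Each row covers the same overall time span, so files collected during the same visit line up vertically.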

Also under development in FY97 is a map report of an Archive Database search to provide a view of where the monuments are located in relation to landform features and political boundaries. An important feature of this map is that it is generated dynamically and contains the most current information in the Archive Database. The user can select a geographic range (e.g., lat/long) and map features such as rivers and political boundaries. A color-contoured topographic map at 5 minute resolution can be generated for anywhere in the world and be included as a basemap (Figure 3-12).

Figure 3-12. Prototype Map Report Generated from the Archive Database (using the ETOPO5 global topographic dataset).
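Selecting monuments by geographic range can be sketched as a simple bounding-box filter. This is only an illustration under stated assumptions: the record layout (`name`, `lat`, `lon` in decimal degrees) and the function name are hypothetical, and the actual map report queries the Archive Database rather than an in-memory list.

```python
def select_by_range(monuments, lat_min, lat_max, lon_min, lon_max):
    """Return the monuments whose approximate coordinates fall in the box.

    Longitudes are assumed to lie in [-180, 180]. If lon_min > lon_max,
    the box is taken to cross the 180-degree meridian.
    """
    selected = []
    for m in monuments:
        lat_ok = lat_min <= m["lat"] <= lat_max
        if lon_min <= lon_max:
            lon_ok = lon_min <= m["lon"] <= lon_max
        else:
            # Box wraps around the date line.
            lon_ok = m["lon"] >= lon_min or m["lon"] <= lon_max
        if lat_ok and lon_ok:
            selected.append(m)
    return selected


if __name__ == "__main__":
    demo = [
        {"name": "AAAA", "lat": 40.0, "lon": -105.0},
        {"name": "BBBB", "lat": -10.0, "lon": 179.5},
    ]
    print(select_by_range(demo, 35.0, 45.0, -110.0, -100.0))
```

Handling the date-line case explicitly matters for a map that can be generated for anywhere in the world.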

The next step in graphical Web User Interface tool development will be a move toward a true Geographical Information Systems (GIS) map interface. The GIS component makes a map a "smart" map that can be used to generate queries to the Archive Database, not just to report the results of a query on a map. Web-based GIS is currently a first-generation product being developed by commercial vendors. Research on the top commercial products shows that they are geared toward more sophisticated applications and are therefore too unwieldy, too expensive, and too difficult to implement, requiring a patchwork of programming languages. More critical to our application, the time element of the display is completely ignored in such commercial packages. The Java programming language, however, is the current tool of choice for generating map-based queries in a multiplatform environment. Using Java, the DMAG staff will add the capability to access information such as monument name and time of occupation in real time, using a browser, by sliding the mouse over a monument point on the map or timeline. Similarly, the user will be able to select a geographic range for zoom-in using the mouse. The second phase of development, which will be initiated in late 1997, will use this type of access information to make new queries of the Archive Database, giving a true GIS capability.

TEQC Software

An essential element of data management is assessing data quality and supplying validated data files to the GPS community. To achieve part of this goal, UNAVCO has recently completely rewritten its QC software as the Translate, Edit, and Quality Check (TEQC) software. The QC portion of TEQC has more complete data statistics, clock offset detection, time windowing capabilities, and results display than the original UNAVCO QC software. TEQC is easier to use and is supported on a wider range of platforms. TEQC also has built-in native binary format readers, can translate to the Receiver INdependent EXchange (RINEX) format, and can be used to edit meta-data in RINEX format. Figure 3-13 shows the three general modes included in TEQC, which can be used individually or in combination with one another.

TEQC has been designed to be used at all phases of GPS data access. For example, raw data collected by a receiver can be quickly quality checked using TEQC at the data collection site, without necessarily translating to RINEX, yielding the QC portion of the meta-data. This can easily be done with either episodic (campaign) data or permanent station data. Additional meta-data (such as the site name, antenna and receiver characteristics) can be extracted from raw data using TEQC and then prepared for transfer back to the UNAVCO Archive along with the raw data. At UNAVCO, TEQC is used for a more detailed quality check and additional meta-data extraction by the DMAG staff and is used for many other purposes by other UNAVCO groups. Usually, some meta-data contained in the raw data file are incorrect, but using validated meta-data from the Archive Database, correct translation to RINEX is accomplished with TEQC using its meta-data editing and translation features. These TEQC-produced RINEX files are then ready for export to the GPS community. Furthermore, because TEQC is portable and is made freely available over the World Wide Web, virtually any researcher in the GPS community can use it to accomplish the same general tasks independent of UNAVCO. In fact, TEQC has been adopted by the NGS for translating and quality checking its CORS network data from all of its Ashtech Z-12 and Trimble 4000 receivers.

Understanding raw native binary formats and the RINEX format is key to UNAVCO's support of TEQC on behalf of the GPS community because native binary formats represent the language of GPS data available directly from receivers, whereas RINEX is currently the accepted standard format for transfer and exchange of GPS data within the GPS community. Complete understanding of these formats allows the DMAG staff to directly address data issues without relying exclusively on other groups for support, such as for translator corrections or enhancements. The alpha release of TEQC in late FY96 handled download and real-time data formats from Trimble 4000 receivers and the real-time data format from the Ashtech Z-12 receivers. In early FY97, TEQC was modified to handle TurboBinary and ConanBinary from the Rogue family of receivers, the download format of Ashtech receivers, and two legacy formats from the old Texas Instruments TI-4100 receivers. Later in FY97, TEQC will be modified to handle two more legacy formats from the TI-4100 and formats from other receivers as the need arises. In addition, TEQC translation of these native binary formats to RINEX has helped uncover at least two critical problems in the RINEX Version 2 documentation from the University of Berne.

Figure 3-13. UNAVCO Developed TEQC Features.

Electronic Log Sheets and Remote Data Pre-archival Tool Development

In the case of campaign data, UNAVCO's data management strategy is to develop electronic logs and data tools that permit the PI to place clean, organized data into the UNAVCO Archive with minimal staff time. Electronic log sheets allow automated entry of project meta-data into the Archive Database and ensure that the critical information that might be needed in the future to process the project data, or to reprocess it with different parameters, is retained. An example of why this is important is antenna mixing, where antenna phase center offsets for a set of antennas might be determined only after a specific campaign. Data from individual projects where antennas with these serial numbers were used could then be identified and annotated with the measured offsets so that future processing could include offset corrections.
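The antenna example can be made concrete with a short sketch that matches visit records to later-measured offsets by antenna serial number. The field names (`antenna_serial`, `phase_center_offset_m`) and the function name are hypothetical and do not reflect the Archive Database schema.

```python
def annotate_offsets(visits, measured_offsets):
    """Return copies of visit records, adding the measured phase center
    offset wherever the antenna serial number has one on file.

    `measured_offsets` maps a serial number to an offset in meters.
    The input records are left unmodified.
    """
    annotated = []
    for visit in visits:
        record = dict(visit)  # shallow copy so the original is untouched
        offset = measured_offsets.get(visit["antenna_serial"])
        if offset is not None:
            record["phase_center_offset_m"] = offset
        annotated.append(record)
    return annotated


if __name__ == "__main__":
    demo_visits = [
        {"monument": "ABCD", "antenna_serial": "A100"},
        {"monument": "EFGH", "antenna_serial": "B200"},
    ]
    print(annotate_offsets(demo_visits, {"A100": 0.003}))
```

This only works if the serial number was recorded at collection time, which is the point of capturing complete meta-data in the electronic log sheets.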

An even simpler example would be the use of site meta-data to repeat an occupation using the same antenna and antenna height at each site to improve precision. Complete records of the data collection process will ultimately allow even those not intimately associated with a specific project to interpret and process the data, thus potentially extending the time record of measurements in areas of interest and facilitating more complete and automated data processing. UNAVCO will continue to work with GPS community representatives in FY97 to expand the use of these electronic forms, both to facilitate entry of data into the Archive and as an element of the "Seamless Archive" activity.

Seamless Archive

The proliferation of GPS data centers has recently led to the concept of a "seamless" GPS data archive. Implementing the concept will establish standardized exchange parameters between centers and will lead to a computer index list and catalogue to access data across different centers. UNAVCO will continue playing a major role, along with the Scripps and CDDIS archives, in implementing the seamless archive concept over the next several years. Consolidating log sheets and defining meta-data have been the first steps in this process. More recently, Scripps and UNAVCO have started a dialogue to establish detailed plans to implement a seamless capability for Scripps' permanent station data by indexing those data at UNAVCO for access through the Web User Interface. UNAVCO is also participating in IGS discussions to help establish minimum standards for capture of critical meta-data. Figure 3-14 shows schematically the seamless archive concept in the context of the UNAVCO and broader GPS community.

Receiver Downloads and the Transport Management Layer

In FY97, the DMAG will interface UNAVCO's receiver downloading scripts to electronic transfer protocols. Presently, these downloading scripts copy the data files that exist in the GPS receiver onto a hard disk in a computer that is connected, either directly or via modem, to the receiver. Copies of these data files are then made and shipped to UNAVCO. Using a transport management layer (TML), the data files will be automatically sent to UNAVCO using protocols such as ftp or LDM/IDD (discussed below). Once the data are secured at the UNAVCO Archive, a receipt message will be sent to the TML indicating that the file on the computer can be purged. If data are not received at the Archive within the expected time period, a re-try message will be sent to the TML indicating that the data file should be re-sent. Other messages to the TML from the Archive might include reconfiguration of the receiver or explicit download requests.
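The receipt and re-try exchange described above can be sketched as follows. This is a minimal illustration, not the TML design: the message names (`RECEIPT`, `RETRY`), the class and function names, and the use of plain numeric deadlines are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveTracker:
    """Archive-side bookkeeping of files expected from a download computer."""
    expected: dict = field(default_factory=dict)  # filename -> deadline
    received: set = field(default_factory=set)

    def expect(self, filename, deadline):
        self.expected[filename] = deadline

    def on_file_received(self, filename):
        """Record a secured file and return the receipt message for the TML."""
        self.received.add(filename)
        return ("RECEIPT", filename)

    def overdue(self, now):
        """Return re-try messages for files not received by their deadline."""
        return [("RETRY", name)
                for name, deadline in sorted(self.expected.items())
                if name not in self.received and now > deadline]


def handle_message(local_files, message):
    """TML-side handler: purge on receipt, name the file to re-send on re-try."""
    kind, filename = message
    if kind == "RECEIPT":
        local_files.discard(filename)  # safe to purge the local copy
        return None
    if kind == "RETRY" and filename in local_files:
        return filename  # caller should transmit this file again
    return None
```

Keeping the local copy until a receipt arrives means a network or computer outage delays delivery but does not lose data.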

Figure 3-14. Concept for a "Seamless Archive". (CDDIS is an example of a global archive and the Southern California Earthquake Center (SCEC) is an example of a regional archive.)

LDM/IDD Automated Data Transportation System

The increased demand for timely continuous data could be met for stations that have Internet access by a variety of systems. UNAVCO has adopted one such system from the atmospheric sciences community, called the Local Data Manager/Internet Data Distribution (LDM/IDD) system. The UCAR-managed Unidata Program developed the Local Data Manager to provide near real-time access to meteorological data via the Internet to over 130 universities. Modifications to LDM needed to manage GPS data have been agreed to between Unidata and UNAVCO. LDM has the potential to support regular, automated delivery of standard data and data products to multiple end users as shown in Figure 3-15.

LDM has been tested by the DMAG staff for GPS data at two download locations and initial results are promising. LDM removes time lags between data download and data reception, contains "smart" software to re-try data transmission for missed files due to network or computer outages, is extendable for any range of download periods (seconds, minutes, hours), and can balance/remove the "peak loading" problem when files are retrieved from archive centers. UNAVCO will continue to evaluate LDM/IDD for solving various GPS data network and distribution problems.

Figure 3-15. The LDM Concept. (Identical data and data products are transported simultaneously to all product users.)


1996 Annual Report - 23 SEP 1997



Last modified Tuesday, 08-Nov-2005 02:34:54 UTC

 
