Data synchronization by network
The Epos-France seismological data center (SDC) offers an on-demand synchronization service (rsync server) that can be accessed at the network level. Data synchronization of embargoed networks is subject to the acceptance of the PI of the experiment.
Summary
Rsync is a popular utility that provides fast incremental file transfer. It is usually shipped with modern Unix-like systems (e.g. Linux, macOS). Ports exist for Windows [1].
Rsync is a convenient way to download and update large datasets of continuous data, whereas other data services are more suitable for tailored data requests (time-windowed, quality-filtered, etc.) and smaller amounts of data. Epos-France allows end users to download datasets via rsync upon specific request. If you wish to download a particular dataset, contact us and describe your needs and goals. The Epos-France seismological datacenter examines requests on a case-by-case basis, according to its data policy.
Usage instructions for open data in PH5 format
To download a complete open dataset in PH5 format, you first need the name of the rsync module. If available, it is advertised on the presentation page of the network (in the comments section).
Listing the distant files:
rsync rsync://rsync.resif.fr/NETWORK_MODULE_NAME
To download the dataset, add a destination directory and some options to the above command:
rsync -rltvh --compress-level=1 rsync://rsync.resif.fr/NETWORK_MODULE_NAME /data
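For readers unfamiliar with the short options used above, the following sketch spells them out with rsync's long option names and prints the resulting command for review instead of running it. The function name `print_download_command` and the module name `EXAMPLE_MODULE` are hypothetical placeholders, not names provided by the service.

```shell
# Long-form equivalent of "-rltvh --compress-level=1":
#   --recursive         descend into directories
#   --links             copy symlinks as symlinks
#   --times             preserve modification times, so later runs can skip unchanged files
#   --verbose           list the files being transferred
#   --human-readable    print sizes in a readable format
#   --compress-level=1  light compression during the transfer
RSYNC_LONG_OPTS="--recursive --links --times --verbose --human-readable --compress-level=1"

# Print (do not run) the download command for a given module and destination.
print_download_command() {
    module="$1"
    destination="$2"
    printf 'rsync %s rsync://rsync.resif.fr/%s %s\n' \
        "$RSYNC_LONG_OPTS" "$module" "$destination"
}

# Example with a placeholder module name:
print_download_command EXAMPLE_MODULE /data
```

Printing the command first makes it easy to check the module name and destination before starting a large transfer.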
Usage instructions for restricted data
Once your request is approved, datacentre operators will provide you with an rsync module name and a temporary login and password. You may then use the following commands at a shell prompt. These commands are compatible with most Unix/Linux systems. Depending on your shell variant, you may need to use a different syntax.
Note: the machine on which you run rsync must be allowed to reach the server rsync.resif.fr on TCP port 873 (check with your IT service).
Note: access to open data does not require a prior request. Just set the RSYNC_MODULE variable in the following steps.
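Before contacting your IT service about TCP port 873, a quick connectivity test can help. The sketch below is one possible check, assuming that bash (with `/dev/tcp` support) and the `timeout` utility from GNU coreutils are installed. The function name `check_rsync_port` is hypothetical.

```shell
# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Assumes bash (with /dev/tcp support) and GNU coreutils `timeout`.
check_rsync_port() {
    host="$1"
    port="${2:-873}"
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "OK: $host:$port is reachable"
    else
        echo "FAILED: cannot reach $host:$port (firewall or proxy?)" >&2
        return 1
    fi
}

# Usage:
#   check_rsync_port rsync.resif.fr 873
```

A failure here only shows that the connection could not be opened from your machine. It does not say whether the block is local or remote.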
STEP 1
Enter your local destination directory (create this directory before running):
DESTINATION="/my/local/directory"
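Since the destination directory must exist before the transfer, it can be created and checked in one go. This is a minimal sketch; the helper name `prepare_destination` is hypothetical.

```shell
# Create the destination directory if needed and verify that it is writable.
prepare_destination() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "Destination $dir is not writable" >&2
        return 1
    fi
    # Show free space on the target filesystem (informational only).
    df -h "$dir" || true
}

# Usage:
#   DESTINATION="/my/local/directory"
#   prepare_destination "$DESTINATION"
```

Checking free space beforehand is worthwhile because continuous datasets can be large.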
STEP 2
Enter your credentials to access the data. These are provided by Epos-France (do not disclose!)
RSYNC_MODULE="xxxx"
RSYNC_USERNAME="xxxx"
export RSYNC_PASSWORD="xxxx"
Enter datacenter-specific parameters:
RSYNC_SERVER="rsync.resif.fr"
RSYNC_OPTS="-rltvh --compress-level=1"
DRYRUN="-n --stats"
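As an alternative way of supplying the password from this step, rsync can read it from a file through its `--password-file` option. The file must not be readable by other users. The sketch below writes such a file; the helper name `write_password_file` and the file location are hypothetical choices.

```shell
# Write the password to a file readable only by the current user.
write_password_file() {
    file="$1"
    password="$2"
    # The subshell keeps the restrictive umask from leaking into the session.
    ( umask 077 && printf '%s\n' "$password" > "$file" ) && chmod 600 "$file"
}

# Usage (the file path is an arbitrary choice):
#   write_password_file "$HOME/.rsync_epos_password" "$RSYNC_PASSWORD"
#   rsync $RSYNC_OPTS --password-file="$HOME/.rsync_epos_password" \
#       rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/ "$DESTINATION"
```

Remove the file once the temporary credentials expire.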
STEP 3
Launch a trial transfer (recommended)
rsync $RSYNC_OPTS $DRYRUN rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/ $DESTINATION
You are now ready to transfer!
STEP 4
Launch full transfer.
Note: running this command repeatedly updates your destination directory with files that are new or modified since the last transfer. It does not delete files on your side that no longer exist on the datacentre side.
rsync $RSYNC_OPTS rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/ $DESTINATION
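Large transfers can be interrupted by network problems. Because rerunning the command only fetches what is missing, it can be wrapped in a simple retry loop. The sketch below is one way to do this; the function name `retry` and the `RETRY_DELAY` variable are hypothetical, and the attempt count is an arbitrary choice.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# RETRY_DELAY (seconds between attempts, default 60) is a hypothetical setting.
retry() {
    max_attempts="$1"
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts" >&2
            return 1
        fi
        echo "Attempt $attempt failed, retrying in ${RETRY_DELAY:-60}s" >&2
        sleep "${RETRY_DELAY:-60}"
        attempt=$((attempt + 1))
    done
}

# Usage: --partial keeps partially transferred files so a later attempt can reuse them.
#   retry 5 rsync $RSYNC_OPTS --partial \
#       rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/ "$DESTINATION"
```

Keep the number of attempts and the delay moderate, since bandwidth and availability on the server side are shared.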
Other usage examples
Ask us about more complex usages, or read the rsync man page: http://rsync.samba.org/ftp/rsync/rsync.html
Listing remote contents (like ‘ls -l’) without transferring:
rsync $RSYNC_OPTS rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/
Using shell-style wildcards to transfer only the files you want (quote the remote path so that your local shell does not try to expand the wildcards):
rsync $RSYNC_OPTS "rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/2012/KES0*" $DESTINATION
rsync $RSYNC_OPTS "rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/2012/*/HHZ.D/*.???" $DESTINATION
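When several selections are needed, the wildcard path can be built by a small helper. The sketch below assumes the layout suggested by the examples above (YEAR/STATION/CHANNEL.D/ with file names ending in a three-character suffix). This layout is inferred from those examples and may differ from one module to another. The function name `remote_pattern` is hypothetical.

```shell
# Build a remote wildcard pattern following the layout of the examples above.
# Assumed layout: YEAR/STATION/CHANNEL.D/<files ending in a 3-character suffix>.
# Station and channel default to "*" (match everything) when omitted.
remote_pattern() {
    year="$1"
    station="${2:-*}"
    channel="${3:-*}"
    printf '%s/%s/%s.D/*.???\n' "$year" "$station" "$channel"
}

# Example: vertical HHZ channel of stations starting with KES0, year 2012.
remote_pattern 2012 'KES0*' HHZ

# Usage (quote the URL so the local shell leaves the wildcards untouched):
#   rsync $RSYNC_OPTS \
#       "rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/$(remote_pattern 2012 '*' HHZ)" \
#       "$DESTINATION"
```

Running the command with the dry-run options first shows which files a pattern actually matches.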
Setting options to remove files in your destination directory that no longer exist on the datacentre side (be careful!):
RSYNC_OPTS="$RSYNC_OPTS --delete"
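Because `--delete` removes local files, it is prudent to preview the deletions first. The sketch below combines a dry run (`-n`) with rsync's `--itemize-changes` output, in which items to be removed are marked `*deleting`, and keeps only those paths. The helper name `list_deletions` is hypothetical.

```shell
# Read rsync --itemize-changes output on stdin and print only the paths
# that rsync reports as "*deleting".
list_deletions() {
    sed -n 's/^\*deleting  *//p'
}

# Usage: dry run (-n) with --delete, showing what would be removed locally.
#   rsync $RSYNC_OPTS --delete -n --itemize-changes \
#       rsync://$RSYNC_USERNAME@$RSYNC_SERVER/$RSYNC_MODULE/ "$DESTINATION" \
#       | list_deletions
```

If the list looks correct, rerun the command without `-n` to apply the deletions.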
Known bugs and limitations
The data is delivered as daily miniSEED files, as an SDS file hierarchy, or as PH5 archives. Finer-grained time window selection is not possible. Rsync access to restricted data is granted to end users for a limited time. Bandwidth and service availability may be adjusted depending on the overall load on the datacentre computing infrastructure.
Metadata, small datasets (typically under 100 GB of data), and fine-grained time-window data selections should be downloaded via FDSN webservices.
[1] For example, Cygwin provides a large collection of tools that give Windows users Unix-like features. Ask your local IT support.