Large datasets synchronization

Data synchronization by network

Résif-DC offers an on-demand synchronization service (rsync server) that can be accessed at the network level. Synchronizing data from embargoed networks is subject to the approval of the PI of the experiment.


Rsync is a popular utility that provides fast incremental file transfer. It usually ships with modern Unix-like systems (e.g. Linux, macOS). Ports exist for Windows [1].

Rsync is a convenient way to download and update large datasets of continuous data, whereas other data services are better suited to tailored data requests (time-windowed, quality-filtered, etc.) and smaller amounts of data. Résif allows end users to download datasets via rsync upon specific request. If you wish to download a particular dataset, contact us and describe your needs and goals. Résif-DC examines requests on a case-by-case basis, according to its data policy.

Usage instructions for open data in PH5 format

To download a complete open dataset in PH5 format, you first need the name of the rsync module. When available, it is advertised on the presentation page of the network (in the comments section).

Listing the remote files:

rsync rsync://

To download the dataset, add a destination directory and some options to the above command:

rsync -rltvh --compress-level=1 rsync:// /data

Usage instructions for restricted data

Once your request is approved, datacentre operators will provide you with an rsync module name and a temporary login and password. You may then use the following commands at a shell prompt. These commands are compatible with most Unix/Linux systems. Depending on your shell variant, you may need to use a different syntax.

Note: the machine you run the download from must be allowed to reach the server on TCP port 873 (check with your IT service).
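The port requirement above can be checked before requesting help from your IT service. The following is a minimal sketch, assuming a netcat (nc) build that supports the -z and -w options; the function name check_rsync_port is hypothetical, and the host name to pass is the server name given to you by the datacentre.

```shell
# Hypothetical helper: succeeds if a TCP connection to the given host
# can be opened within 5 seconds.
# $1: server host name, $2: port (defaults to 873, the standard rsync daemon port)
check_rsync_port() {
    # -z: only test the connection, send no data; -w 5: give up after 5 seconds
    nc -z -w 5 "$1" "${2:-873}"
}
```

Calling check_rsync_port with your server name and testing its exit status (0 means the port is reachable) should be enough to tell a firewall problem apart from a credentials problem.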

Note: access to open data does not require a prior request. Just set the RSYNC_MODULE parameter in the following steps.


Enter your local destination directory (create this directory before running):
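As a minimal sketch of this step, the destination can be kept in a shell variable and created if missing. The variable name RSYNC_DEST and the path ./resif-data are assumptions chosen for illustration; use any directory with enough free space.

```shell
# Hypothetical variable name and path; adapt both to your setup.
RSYNC_DEST="${RSYNC_DEST:-./resif-data}"

# -p: create parent directories as needed, and do not fail if it already exists
mkdir -p "$RSYNC_DEST"
```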



Enter your credentials to access the data. These are provided by Résif (do not disclose them!):
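One possible way to provide the credentials is sketched below. RSYNC_PASSWORD is the environment variable that rsync itself reads for daemon authentication; the variable name RSYNC_USER and both values are placeholders, not real credentials.

```shell
# Placeholder values: replace with the temporary login and password you received.
RSYNC_USER="myuser"

# rsync reads the daemon password from the RSYNC_PASSWORD environment variable,
# which avoids an interactive prompt. It must be exported to be visible to rsync.
export RSYNC_PASSWORD="mypassword"
```

If you would rather not keep the password in your shell environment, rsync also accepts a --password-file option pointing to a file that is not world-readable.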


Enter datacentre-specific parameters:

RSYNC_OPTS="-rltvh --compress-level=1"
DRYRUN="-n --stats"


Launch a trial transfer (recommended)
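A trial transfer could look like the sketch below. It assumes hypothetical variable names (RSYNC_SERVER, RSYNC_MODULE, RSYNC_USER, RSYNC_DEST) and placeholder values, including the server name, which is not the real one; only RSYNC_OPTS and DRYRUN are taken from the parameters above. The function name trial_transfer is also an assumption.

```shell
# Placeholders: replace with the values provided by the datacentre.
RSYNC_SERVER="${RSYNC_SERVER:-rsync.example.org}"   # placeholder, not the real server
RSYNC_MODULE="${RSYNC_MODULE:-MODULE_NAME}"
RSYNC_USER="${RSYNC_USER:-myuser}"
RSYNC_DEST="${RSYNC_DEST:-./resif-data}"
RSYNC_OPTS="-rltvh --compress-level=1"
DRYRUN="-n --stats"

trial_transfer() {
    # $DRYRUN and $RSYNC_OPTS are left unquoted on purpose so that the shell
    # splits them into separate options. -n makes rsync report what it would
    # transfer without writing anything locally.
    rsync $DRYRUN $RSYNC_OPTS \
        "rsync://${RSYNC_USER}@${RSYNC_SERVER}/${RSYNC_MODULE}/" "${RSYNC_DEST}/"
}
```

Running trial_transfer prints the file list and the statistics of the would-be transfer, which is a cheap way to validate credentials, module name and expected volume.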


You are now ready to transfer!


Launch full transfer.
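The full transfer is the same command without the dry-run options. The sketch below makes the same assumptions as the rest of this procedure: the variable names other than RSYNC_OPTS, the function name full_transfer and all values (including the server name) are placeholders.

```shell
# Placeholders: replace with the values provided by the datacentre.
RSYNC_SERVER="${RSYNC_SERVER:-rsync.example.org}"   # placeholder, not the real server
RSYNC_MODULE="${RSYNC_MODULE:-MODULE_NAME}"
RSYNC_USER="${RSYNC_USER:-myuser}"
RSYNC_DEST="${RSYNC_DEST:-./resif-data}"
RSYNC_OPTS="-rltvh --compress-level=1"

full_transfer() {
    # -r recurse, -l keep symlinks, -t keep modification times (needed for
    # efficient incremental updates), -v verbose, -h human-readable sizes
    rsync $RSYNC_OPTS \
        "rsync://${RSYNC_USER}@${RSYNC_SERVER}/${RSYNC_MODULE}/" "${RSYNC_DEST}/"
}
```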

Note: running this command several times updates your destination directory with the files added or modified since the last transfer. It does not delete files on your side that no longer exist on the datacentre side.


Other usage examples

Ask us about more complex use cases, or read the rsync man page.

Listing remote contents (like ‘ls -l’) without transferring:
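A listing can be sketched as follows, using the --list-only option of rsync. The variable names, the function name list_remote and all values (including the server name) are placeholders chosen for illustration.

```shell
# Placeholders: replace with the values provided by the datacentre.
RSYNC_SERVER="${RSYNC_SERVER:-rsync.example.org}"   # placeholder, not the real server
RSYNC_MODULE="${RSYNC_MODULE:-MODULE_NAME}"
RSYNC_USER="${RSYNC_USER:-myuser}"

list_remote() {
    # $1: optional sub-path inside the module (empty lists the module root).
    # --list-only prints the remote entries and never writes files locally.
    rsync --list-only "rsync://${RSYNC_USER}@${RSYNC_SERVER}/${RSYNC_MODULE}/${1:-}"
}
```

For instance, list_remote with no argument lists the top level of the module, and passing a sub-directory followed by a slash lists that directory.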


Using shell-style wildcards to transfer only the files you want:
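One way to apply wildcards is through rsync include/exclude filter rules, as sketched below. This example assumes the module is organised as an SDS hierarchy with file names of the form NET.STA.LOC.CHAN.TYPE.YEAR.DAY; the variable names, the function name transfer_channel and all values (including the server name) are placeholders.

```shell
# Placeholders: replace with the values provided by the datacentre.
RSYNC_SERVER="${RSYNC_SERVER:-rsync.example.org}"   # placeholder, not the real server
RSYNC_MODULE="${RSYNC_MODULE:-MODULE_NAME}"
RSYNC_USER="${RSYNC_USER:-myuser}"
RSYNC_DEST="${RSYNC_DEST:-./resif-data}"
RSYNC_OPTS="-rltvh --compress-level=1"

transfer_channel() {
    # $1: channel code, e.g. HHZ.
    # Filter rules are evaluated in order: keep every directory so that rsync
    # can descend into it, keep files of the requested channel, drop the rest.
    # --prune-empty-dirs avoids creating directories that end up empty.
    rsync $RSYNC_OPTS --prune-empty-dirs \
        --include='*/' --include="*.${1}.D.*" --exclude='*' \
        "rsync://${RSYNC_USER}@${RSYNC_SERVER}/${RSYNC_MODULE}/" "${RSYNC_DEST}/"
}
```

The patterns are quoted so that the local shell passes them to rsync unchanged instead of expanding them against local files.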


Setting options to remove files in your destination directory that no longer exist on the datacentre side (be careful!):
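Mirroring with deletion relies on the --delete option of rsync. The sketch below is a cautious variant that accepts -n as an argument, so that deletions can be previewed first; the variable names, the function name mirror_transfer and all values (including the server name) are placeholders.

```shell
# Placeholders: replace with the values provided by the datacentre.
RSYNC_SERVER="${RSYNC_SERVER:-rsync.example.org}"   # placeholder, not the real server
RSYNC_MODULE="${RSYNC_MODULE:-MODULE_NAME}"
RSYNC_USER="${RSYNC_USER:-myuser}"
RSYNC_DEST="${RSYNC_DEST:-./resif-data}"
RSYNC_OPTS="-rltvh --compress-level=1"

mirror_transfer() {
    # $1: optional, pass -n for a dry run that only reports what would be
    # transferred or deleted.
    # --delete removes local files that are absent from the remote module,
    # so any file you added yourself under the destination will be lost.
    rsync ${1:-} $RSYNC_OPTS --delete \
        "rsync://${RSYNC_USER}@${RSYNC_SERVER}/${RSYNC_MODULE}/" "${RSYNC_DEST}/"
}
```

Running mirror_transfer -n first and reading the lines that start with "deleting" is a reasonable safeguard before running it without arguments.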


Known bugs and limitations

The data is delivered as daily miniSEED files, as an SDS file hierarchy, or as PH5 archives. Finer-grained time window selection is not possible. Rsync access to restricted data is granted to end users for a limited time. Bandwidth and service availability may be adjusted depending on the overall load on the datacentre computing infrastructure.

Metadata, small datasets (typically under 100 GB of data), and fine-grained time-window selections should be downloaded via FDSN web services.

[1] For example, Cygwin provides a large collection of tools that bring Unix-like features to Windows users. Ask your local IT support.