How do I synchronize target filesystems?
Synchronizing directories is the process of making
some part of a filesystem (files, directories and subdirectories)
identical on more than one computer. In Everserve, this means that
multiple peers in a community duplicate a dataset sent to
them by the community’s Publisher.
This document describes the default behavior
of package execution when the delta and version attributes are used. The
process is limited to synchronizing a peer's filesystem
to the publisher's filesystem, and only new or changed files on the
publisher system are propagated. As a result, the directories may not
match exactly: files removed from the publisher are not
removed from the target, and renamed files are duplicated on
the target peer. Everserve easily handles the cases where new
files or subdirectories are added to the dataset or existing
files are modified, and it provides package attributes to send these
limited changes efficiently to the community peers. The current release
resends the entire file when an existing file changes. Future releases
of Everserve will support transmitting only the changes rather
than the complete file.
Files that are deleted on the publisher are not
removed from peer systems with this level of synchronization. If
file or directory names are changed, the objects are simply copied
under their new names, so a copy of the same file under its previous
name remains on the target peer.
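The effect of this additive behavior can be illustrated with a small model. The sketch below is not Everserve code. It treats a filesystem as a dictionary that maps paths to contents, and the function name deliver_delta is hypothetical; it copies only new or changed entries and never deletes anything, as described above.

```python
def deliver_delta(publisher, last_version, peer):
    """Copy entries that are new or changed since last_version; never delete."""
    for path, content in publisher.items():
        if last_version.get(path) != content:
            peer[path] = content  # the whole file is resent, not a diff
    return dict(publisher)  # snapshot recorded as the new 'version'


# First delivery: the peer receives the complete dataset.
publisher = {"a.txt": "v1", "b.txt": "v1"}
peer = {}
version = deliver_delta(publisher, {}, peer)

# On the publisher, rename a.txt to c.txt and modify b.txt.
publisher = {"c.txt": "v1", "b.txt": "v2"}
version = deliver_delta(publisher, version, peer)

print(sorted(peer))  # ['a.txt', 'b.txt', 'c.txt']
```

After the second delivery the peer holds both a.txt and c.txt: the renamed file arrives under its new name, and the old copy stays in place.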
To use Everserve's synchronization features, a
package specification must set its version attribute to true so that
the filesystem state is recorded. After the first delivery of the complete
dataset, Everserve packages that set the delta flag to true
transmit only updates: the filesystem structures that are
new or changed since the last 'version' was sent.
Here is an example package enabling basic updates
for the Windows OS:
<?xml version="1.0"?>
<!DOCTYPE spec-container SYSTEM "package_spec.dtd">
<spec-container name="synch" delta="true" version="true">
  <spec-version version="1.0" earliest="1.0"/>
  <directory-spec source="c:\test" target="c:\test" recurse="true"/>
</spec-container>
A similar package for a Unix environment:
<?xml version="1.0"?>
<!DOCTYPE spec-container SYSTEM "package_spec.dtd">
<spec-container name="synch" delta="true" version="true">
  <spec-version version="1.0" earliest="1.0"/>
  <directory-spec source="/tmp/test" target="/tmp/test" recurse="true"/>
</spec-container>
The following techniques provide more advanced
synchronization. They enable Everserve to propagate
complete filesystems: the packages are designed so that all peers
in the deliver command end up with exactly the same filesystem as the
publisher system sending the package.
Selective deletion using the receipt and a script
After a package specification is executed for
the first time, a receipt that includes the packing list of the
delivered files is stored in the database. The most efficient
and sophisticated method for ensuring complete synchronization with
the publisher is to create a script that iterates over this list and
compares it to the publisher filesystem in advance of the next delivery.
If a file or directory was removed from the publisher, a new package
containing delete command-specs for the missing objects can be created
to correct the peers.
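A script of this kind might look like the following minimal sketch. It rests on several assumptions that are not part of this document: the packing list has already been exported from the receipt as a list of paths relative to the dataset root (the receipt format is not described here), the peers run Unix, and the cleanup package sets version and delta to false like the full-redelivery samples. The function names find_removed and build_delete_spec and the package name "cleanup" are hypothetical.

```python
import os
import shlex
from xml.sax.saxutils import quoteattr


def find_removed(packing_list, publisher_root):
    """Return packing-list entries that no longer exist under publisher_root."""
    return [rel for rel in packing_list
            if not os.path.lexists(os.path.join(publisher_root, rel))]


def build_delete_spec(removed, target_root, name="cleanup"):
    """Build a package spec with one Unix delete command-spec per removed path."""
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE spec-container SYSTEM "package_spec.dtd">',
        # Assumption: version/delta are false, as in the redelivery samples.
        '<spec-container name=%s version="false" delta="false">' % quoteattr(name),
        '  <spec-version version="1.0" earliest="1.0"/>',
    ]
    for rel in removed:
        target = target_root.rstrip("/") + "/" + rel
        # Quote for the shell first, then escape for the XML attribute.
        command = "rm -rf " + shlex.quote(target)
        lines.append("  <command-spec commandline=%s/>" % quoteattr(command))
    lines.append("</spec-container>")
    return "\n".join(lines)
```

For example, if the packing list is ["file1", "file2"] and only file1 still exists under /tmp/test on the publisher, find_removed returns ["file2"], and build_delete_spec(["file2"], "/tmp/test") produces a package with a single command-spec that runs rm -rf /tmp/test/file2 on each peer. Review the generated package before delivering it, since the commands delete data.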
Full directory deletion and redelivery
An alternative means of ensuring that each peer’s
filesystem matches the publisher’s is to remove the previously delivered
dataset and send the complete package every time. In this case,
the version and delta flags are set to false in the package spec.
The package spec below provides an example that you can customize.
It does not take advantage of the Everserve features for more efficient
delivery of datasets via updates. The first set of commands in the
package spec issues delete commands to completely remove the
file or directory structure before redelivering the complete dataset
with file-spec or directory-spec package attributes.
Sample packages to provide this complete synchronization
follow. Note that the use of scripts or success codes might be a
better solution. These samples are simplified for demonstration
purposes.
Windows OS:
<?xml version="1.0"?>
<!DOCTYPE spec-container SYSTEM "package_spec.dtd">
<spec-container name="rmsynch" version="false" delta="false">
  <spec-version version="1.0" earliest="1.0"/>
  <!-- first - remove the entire target directory -->
  <command-spec commandline="rmdir /s /q c:\test"/>
  <!-- now resend the complete directory -->
  <directory-spec source="c:\test" target="c:\test" recurse="true"/>
  <!-- Sending individual files is also possible -->
  <file-spec source="c:\test\file1" target="c:\test\file1"/>
  <file-spec source="c:\test\file2" target="c:\test\file2"/>
</spec-container>
And for the Unix environment:
<?xml version="1.0"?>
<!DOCTYPE spec-container SYSTEM "package_spec.dtd">
<spec-container name="rmsynch" version="false" delta="false">
  <spec-version version="1.0" earliest="1.0"/>
  <!-- first - remove the entire target directory -->
  <command-spec commandline="rm -rf /tmp/test"/>
  <!-- now resend the complete directory -->
  <directory-spec source="/tmp/test" target="/tmp/test" recurse="true"/>
  <!-- Sending individual files is also possible -->
  <file-spec source="/tmp/test/file1" target="/tmp/test/file1"/>
  <file-spec source="/tmp/test/file2" target="/tmp/test/file2"/>
</spec-container>
The key point to remember is that the delta and
version flags enable efficient delivery of filesets through
Everserve by transmitting only the added and changed files of
the dataset. Future releases will support an even more efficient
delivery in which only the changed pieces of a file (rather than the
complete file) are sent out to the community, along with any newly
added data. Because file-spec and directory-spec package attributes
do not delete existing files, extra steps must be taken if
the publisher and peer filesystems are required to match exactly.
Download packages in
a zip file