PST (Personal Storage Table) files are one of the most popular ways of storing copies of messages, calendar events, and other data items in Outlook and other Microsoft products. Most of the time, they are a key repository of important and sometimes confidential corporate messages. They are also unusual in that they are not hosted or stored on a centralized server or service: to save on costly server storage and to allow offline access to mail, they are usually distributed across the many endpoints (laptops and desktops) in an organization. These factors make them a prime candidate for data protection (backup).
Reasons for Backing up PSTs
- They hold critical data and are distributed across endpoints. If an endpoint is lost, corrupted or attacked, the PST is also lost. This usually results in lost user productivity and considerable pain for IT.
- The PST format is very complex and is prone to corruption. Outlook also keeps compacting PSTs on a regular basis. This is especially troublesome in light of the previous and next points.
- PST files are typically large and, in some cases, huge (tens of GBs). This creates performance problems for Outlook, and it makes them a prime candidate for compression (or deduplication) for efficient storage.
Existing Methods for PST Backup
PST backup solutions are based either on MAPI (Messaging API) or on VSS (Volume Shadow Copy Service). According to various studies and customer experiences, VSS-based backups are more robust than MAPI-based backups, which can corrupt PSTs and are very inefficient. MAPI-based backups are also considerably slower than VSS-based ones, and they can slow down Outlook and hamper the user experience.
VSS-based solutions take a consistent point-in-time snapshot ("Shadow Copy") of the PST file (or folder or volume), and then read the snapshot (instead of the actual file) for the backup. Vaultize, in its previous versions, used pure VSS-based backup to take advantage of PST compression and speed up the overall operation without touching the actual PST files. This is non-intrusive for the end user and also keeps Outlook happy.
However, pure VSS-based PST backups have some issues as well. PST is a complex file format: adding or deleting a few items (for example, when some messages arrive) changes its binary structure significantly, so traditional deduplication techniques find it hard to calculate the most efficient incremental differences between subsequent versions of a PST. Also, to calculate the incrementals, the whole snapshot has to be scanned again, and this can be slow for huge files.
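As a rough illustration of why position-based deduplication struggles when a file's binary layout shifts, the toy sketch below hashes a byte string in fixed-size blocks and counts how many blocks differ after a small change. It is not a PST parser and does not reflect how Outlook actually rewrites a PST; the block size, the random stand-in data, and the names `block_hashes` and `changed_blocks` are assumptions made for this example.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of the data, in file order."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks of `new` that differ from the block at the same index in `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return sum(
        1
        for index, digest in enumerate(new_hashes)
        if index >= len(old_hashes) or old_hashes[index] != digest
    )


# Stand-in for a file: 64 blocks of random bytes (not a real PST).
old_version = os.urandom(64 * BLOCK_SIZE)

# Case 1: 11 bytes overwritten in place; later data does not move.
overwritten = old_version[:100] + b"new message" + old_version[111:]

# Case 2: 11 bytes inserted; everything after the insertion point shifts.
inserted = old_version[:100] + b"new message" + old_version[100:]

print("blocks changed by in-place overwrite:", changed_blocks(old_version, overwritten))
print("blocks changed by insertion:", changed_blocks(old_version, inserted))
```

In this toy model the in-place overwrite touches a single block, while the 11-byte insertion makes every one of the 65 resulting blocks look new, even though almost all of the content is unchanged. A backup that compares blocks by position would upload nearly the whole file in the second case.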
Introducing The Most Efficient PST Backup/Restore Solution: Vaultize PST Cracker
Vaultize PST Cracker solves the problem of calculating optimal and efficient incrementals by making use of the Microsoft PST File Format Specification. In addition to MS Office, ZIP, GZIP and PDF, the Vaultize deduplication module is now aware of the PST file format. This means the Vaultize client can find the exact blocks of data that have changed without traversing the whole PST file snapshot. This takes much less time (and CPU) than the earlier technique and also greatly reduces the incremental size, giving our customers more savings in bandwidth and storage.
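The general idea of a format-aware incremental can be sketched as follows. This is not the Vaultize implementation and it does not read real PST files: it assumes, purely for illustration, that a format-aware reader has already exposed the file as a mapping from item identifiers to payload bytes. The names `digest_index` and `incremental` and the sample data are hypothetical.

```python
import hashlib


def digest_index(items: dict) -> dict:
    """Map each item identifier to a digest of its payload."""
    return {
        item_id: hashlib.sha256(payload).hexdigest()
        for item_id, payload in items.items()
    }


def incremental(previous_index: dict, current_items: dict) -> dict:
    """Compare the current items against the index saved by the previous backup."""
    current_index = digest_index(current_items)
    added = sorted(i for i in current_index if i not in previous_index)
    modified = sorted(
        i for i in current_index
        if i in previous_index and current_index[i] != previous_index[i]
    )
    deleted = sorted(i for i in previous_index if i not in current_index)
    return {"added": added, "modified": modified, "deleted": deleted}


# Hypothetical logical contents of a mailbox on two consecutive days.
monday = {1: b"Quarterly report", 2: b"Lunch invite", 3: b"Old newsletter"}
tuesday = {1: b"Quarterly report", 2: b"Lunch invite (moved to 1pm)", 4: b"New message"}

monday_index = digest_index(monday)  # saved alongside Monday's backup
delta = incremental(monday_index, tuesday)
print(delta)  # {'added': [4], 'modified': [2], 'deleted': [3]}
```

Because items are matched by identifier rather than by byte offset, the size of the incremental depends only on what actually changed, not on where the data ends up in the file. Note that this sketch still hashes every current item; avoiding a full traversal, as described above, would need change information from the format's own index structures, which is beyond this example.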
Our internal and customer pilot tests with varying PST sizes have shown that the new Vaultize PST Backup is 4 times faster on average and 3 times more storage efficient. Restores of PSTs have also become faster because there is much less data to download and "replay".
You can try out our ground-breaking PST Backup (for free) at: https://www.vaultize.com/try-it-free.php. To learn more about our data protection and data loss prevention capabilities, see: https://www.vaultize.com/continuous-data-protection.html.
This post was written by Akash Shende, the primary developer of Vaultize PST Cracker. He’s a Software Engineer at Vaultize and works in our Pune, India R&D center.