I identified an issue with old PST files, which started to misbehave when Microsoft Office was upgraded to 2010. PST files created by Outlook 97, 2000 or XP are stored using ANSI, whereas the ones created by Outlook 2003, 2007, and 2010 use Unicode. Microsoft recommend that you don’t use ANSI PST files anymore.
You can determine what format your PST file is by right-clicking it in Outlook (2003 and higher) and selecting Data File Properties… then clicking the Advanced… button and looking in the Format: box. If it says Outlook Data File then you’re already using a Unicode file; if it says Outlook Data File (97-2002) then it’s ANSI. You can also do it programmatically by reading the 11th byte of the PST file (offset 10, the low byte of the two-byte wVer field). If, when converted to an integer, it’s 14 or 15 then it’s an ANSI PST file; if it’s 23 then it’s Unicode. See wVer in the PST file format header specification.
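As a rough illustration of the programmatic check, here is a minimal Python sketch. It reads wVer as a two-byte little-endian integer at offset 10 (which gives the same answer as reading the 11th byte alone for these small values) and also checks the "!BDN" signature at the start of the file. The function name pst_format and the "unknown" fallback are my own choices for the example, not part of any tool mentioned here.

```python
import struct

ANSI_VERSIONS = {14, 15}   # wVer values for ANSI PST files
UNICODE_VERSION = 23       # wVer value for Unicode PST files


def pst_format(path):
    """Return 'ansi', 'unicode' or 'unknown' for the PST file at path."""
    with open(path, "rb") as f:
        header = f.read(12)

    # Every PST file starts with the 4-byte signature "!BDN".
    if len(header) < 12 or header[:4] != b"!BDN":
        raise ValueError(f"{path} does not look like a PST file")

    # wVer is a 2-byte little-endian integer at offset 10.
    (w_ver,) = struct.unpack_from("<H", header, 10)

    if w_ver in ANSI_VERSIONS:
        return "ansi"
    if w_ver == UNICODE_VERSION:
        return "unicode"
    # Any other value is left undecided rather than guessed at.
    return "unknown"
```

Calling pst_format on a file path returns "ansi" for the old-format files that need converting, so it could be dropped into a script that walks users’ drives looking for candidates.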
There are manual methods to solve the problem – you basically create a new PST file, open it in Outlook, and move everything from your old format PST file to the new one. Not something that your busy users (or less technically capable ones) will appreciate.
Luckily there’s a great utility called Upstart to the rescue. It gives your users a nice easy way to upgrade their PST files.
Plus, it has a command line version, which is very useful to system admins in larger organisations. I identified 1200 ANSI format PST files on user personal drives that needed converting. Using the Upstart command line utility I’ve been able to seamlessly upgrade these overnight prior to upgrading my PCs and XenApp servers with Office 2010.
Also, it’s very sensibly (reasonably) priced.
Even better, cupstart.exe (the command line PST upgrade utility) lets you run the Microsoft ScanPST utility, so you can ensure the PST files have no issues before you attempt to upgrade them. As ScanPST has no command line functionality, cupstart is a great way to get this.
Best yet, the guy who wrote it, Pete, is really helpful in the event that you encounter any issues or don’t read the manual properly (ahem…).
So, some hints and tips based on my experiences with the Command Line version, cupstart.exe:
- You need to run cupstart on a machine that has Outlook 2003 or higher installed as it uses Outlook’s resources.
- cupstart by default will run ScanPST to fix problems within the PST file before it converts it. If this isn’t run then any major problems in the file could cause the conversion process to fail. However, ScanPST doesn’t always exit nicely and can crash cupstart. The solution to this is to run cupstart with the scan option first, then repair the file using the repair option if necessary, before finally converting the file to Unicode using the cu option.
- The Outlook profile keeps a record of what PST files a user has open in Outlook, and what type they are. If you convert an ANSI PST file to Unicode then you can sometimes get issues if the Outlook profile still thinks the PST file is ANSI. Luckily, cupstart provides an option, cp, to process the PST entries in the Outlook profile to correct this. I found that it was a good idea to also use the /C (continue on errors) switch: without it, if a user has a PST file specified in their Outlook profile that no longer exists, cupstart will stop and not process any further PST entries.
- cupstart does not seem to multithread. I’m running it on a PC with a fast quad core processor but it only uses 25% CPU maximum.
- To eliminate disk-based or network latency bottlenecks I’ve written a VBScript that copies the user’s PST files to a fast SSD on my processing PC. I then process the files and copy the converted files back to the user’s network storage, also dropping a marker that my logon script picks up to tell it to run cupstart cp /c for the user next time they log on.
- PST files on the whole get smaller when converted to Unicode, but a few get bigger. I’ve yet to work out why (and probably won’t be able to, as trawling through people’s email is both time consuming and probably against some kind of data protection agreement). Here’s a graph of before and after sizes for some of the larger PST files I’ve processed (Y axis is size in MB):