We have had a couple of requests to write a post about readpst, which is in the default path on Backtrack 5 and also appears in the Backtrack menu under Forensics/Forensics Analysis Tools. The readpst application reads PST files, also known as Microsoft Outlook Personal Folders, and converts them to mbox, MH, or KMail formats. Other switches can write each email to a separate file, include attachments, modify contact formats, recurse into subfolders, and so on. I will explain the basic functionality below along with a couple of the formats and switches.
First you need a PST file to use as the input source; we will assume you already have one since you are reading this article. If you need help finding where PST files are stored on a particular operating system, leave a comment below and someone will reply with the default PST location for that OS.
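If the PST file is somewhere on a mounted disk or image and you are not sure where, a search by file extension is one way to track it down. The snippet below is only a sketch: the function name find_pst_files is made up for this example, and the directory you pass in is whatever mount point or home directory you want to search. The -iname test is supported by GNU and BSD find but is not part of strict POSIX.

```shell
# Hypothetical helper: list every file ending in .pst (any letter case)
# underneath the directory given as the first argument.
find_pst_files() {
    # -type f  : regular files only
    # -iname   : case-insensitive match, so archive.pst and OLD.PST both match
    # 2>/dev/null hides "Permission denied" noise from unreadable directories
    find "${1:-.}" -type f -iname '*.pst' 2>/dev/null
}
```

For example, find_pst_files /mnt/evidence would search a mount point named /mnt/evidence, which is an assumed path and not one from this article.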
readpst Version As Of Backtrack v5r3:
- root@bt:~# readpst -V
- ReadPST / LibPST v0.6.41
- Little Endian implementation being used.
- GCC 4.4 : May 10 2011 07:07:57
- root@bt:~#
Above we used the -V switch to print the current version of readpst, which is v0.6.41 as installed on Backtrack Linux version 5 release 3 (BT5r3). I always like to include the version so that, if something has changed, whoever is reading the article knows which version the information was tested against. Below is example output from the most basic readpst usage, with no switches, processing a PST file named archive.pst.
Basic readpst Usage:
- root@bt:~/readpst# readpst archive.pst
- Opening PST file and indexes...
- Processing Folder "Deleted Items"
- Processing Folder "Inbox"
- "Archives" - 2 items done, 0 items skipped.
- Processing Folder "folder1"
- Processing Folder "folder2"
- Processing Folder "folder3"
- Processing Folder "folder4"
- Processing Folder "folder5"
- "folder1" - 3 items done, 1 items skipped.
- "folder2" - 51 items done, 1 items skipped.
- "folder3" - 43 items done, 1 items skipped.
- Processing Folder "folder6"
- Processing Folder "folder7"
- "folder4" - 37 items done, 1 items skipped.
- Processing Folder "folder8"
- "folder5" - 10 items done, 1 items skipped.
- "folder6" - 3 items done, 1 items skipped.
- "folder7" - 7 items done, 1 items skipped.
- "folder8" - 14 items done, 1 items skipped.
- Processing Folder "folder9"
- "folder9" - 10 items done, 1 items skipped.
- root@bt:~/readpst#
The above command generates one file per folder, named after the folder names shown in the output. Each file is in standard mbox format and can be read with something like cat or mail from the command line, or with a GUI application that can read the mbox format. Below is an example of what an mbox file looks like when opened with the command line application mail.
Read mbox File Output By readpst With mail:
- root@bt:~# mail -f /root/readpst/folder6
- "/root/readpst/folder6": 3 messages 3 new
- >N 1 Jim Smith Mon Aug 29 15:18 89/4905 Automatic reply: Corp Server
- N 2 Tom Johnson Mon Aug 29 16:19 127/6205 RE: Corporate Server
- N 3 Jim Smith Mon Aug 29 16:40 98/3697 Re: Corp Server
- ? q
- Held 3 messages in /root/readpst/folder6
- root@bt:~#
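If you only want a quick message count for an mbox file without opening it in mail, you can count the separator lines. This is a rough sketch that assumes the file follows the usual mbox convention where every message starts with a line beginning "From " (with a trailing space) and where body lines that begin that way have been escaped as ">From ". The function name count_mbox_messages is made up for this example.

```shell
# Hypothetical helper: print the number of messages in an mbox file
# by counting the "From " separator lines that start each message.
count_mbox_messages() {
    # -c prints the number of matching lines instead of the lines themselves
    grep -c '^From ' "$1"
}
```

Running count_mbox_messages /root/readpst/folder6 would give a count you can compare against what mail reports for the same file.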
As you can see above, there are three emails in folder6, and you can read each message separately from within mail after opening the file with the -f switch. My personal preference with readpst is to output each mail message individually but keep them organized by folder. This gives me an easier way to recursively grep the entire directory tree and quickly find the individual messages with the content I am seeking. Individual messages can also be opened in Thunderbird and easily forwarded to yourself. Below is an example of readpst using the -S switch, which writes each message to its own file, along with the -r switch to be recursive.
Use readpst To Process PST File Into Single Files Per Email:
- root@bt:~# readpst -rS archive.pst
- Opening PST file and indexes...
- Processing Folder "Deleted Items"
- Processing Folder "Inbox"
- "Archives" - 2 items done, 0 items skipped.
- Processing Folder "folder1"
- Processing Folder "folder2"
- Processing Folder "folder3"
- Processing Folder "folder4"
- Processing Folder "folder5"
- "folder1" - 3 items done, 1 items skipped.
- "folder2" - 51 items done, 1 items skipped.
- "folder3" - 43 items done, 1 items skipped.
- Processing Folder "folder6"
- Processing Folder "folder7"
- "folder4" - 37 items done, 1 items skipped.
- Processing Folder "folder8"
- "folder5" - 10 items done, 1 items skipped.
- "folder6" - 3 items done, 1 items skipped.
- "folder7" - 7 items done, 1 items skipped.
- "folder8" - 14 items done, 1 items skipped.
- Processing Folder "folder9"
- "folder9" - 10 items done, 1 items skipped.
- root@bt:~#
This time all of the output is underneath an Archives folder and each email is its own file.
Sub Folder Contents After Processing PST File Using readpst And -S Switch:
- root@bt:~/Archives/Inbox/folder3# ls
- 10 11-rtf-body.rtf 13 15 18 20 22 25 26-image001.jpg 28 29-image001.jpg 31 34 37 5 8
- 10-rtf-body.rtf 12 14 16 19 20-logo.gif 23 25-image001.jpg 27 28-image001.jpg 3 32 35 38 6 8-rtf-body.rtf
- 11 12-rtf-body.rtf 14-rtf-body.rtf 17 2 21 24 26 27-image001.jpg 29 30 33 36 4 7 9
- root@bt:~/Archives/Inbox/folder3#
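In the listing above, the purely numeric names are messages, and names such as 10-rtf-body.rtf or 26-image001.jpg appear to be attachments belonging to the message with the same leading number. Assuming that naming pattern holds, a small loop can pair each attachment with its message number. The function name list_attachments is made up for this example.

```shell
# Hypothetical helper: for a folder written by readpst -S, print
# "<message number><TAB><attachment filename>" for every attachment.
# Assumes attachments are named "<message number>-<original name>".
list_attachments() {
    for path in "$1"/*-*; do
        # Skip the unexpanded pattern when the folder has no attachments
        [ -e "$path" ] || continue
        name=${path##*/}                        # strip the directory part
        printf '%s\t%s\n' "${name%%-*}" "$name" # text before the first hyphen
    done
}
```

Calling list_attachments Archives/Inbox/folder3 from the directory above Archives would list the attachments in that folder next to the message numbers they came from.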
Notice the RTF, JPG, and GIF files, which have been separated out because they were attachments to the messages whose number precedes their filename. This is the type of command I would use to investigate a PST file; I would then use grep from the root of the Archives folder to search for specific content such as "password". If you want to look at the contents of an email, including the header information, you can simply use cat or a pager such as less, as demonstrated below.
Use less To Display Email Details After Converting PST File With readpst:
- root@bt:~/Archives/Inbox/folder3# less 13
- Received: from EXCHANGE.company.com ([f680::2d33:d782:d999:7d64]) by
- EXCHANGE.company.com ([::1]) with mapi id 14.22.0339.004; Fri, 19 Aug
- 2011 20:12:07 -0600
- From: Don Johnson <djohnson@company.com>
- To: TEAM <TEAM@company.com>, MANAGERS
- <MANAGERS@company.com>
- Subject: Great Work Team
- Thread-Topic: Great Work Team
- Thread-Index: Acxe333333P7kxoSSOOOOO2cOJJtTQ==
- Date: Fri, 19 Aug 2011 20:12:07 -0600
- Message-ID: <8E8336DFA563333333335D61BEC511004444BD@EXCHANGE.company.com>
- Accept-Language: en-US
- Content-Language: en-US
- X-MS-Has-Attach:
- X-MS-Exchange-Organization-SCL: -1
- X-MS-TNEF-Correlator: <8E8746DFA568114193146495D61BEC51100F74BD@EXCHANGE.company.com>
- X-MS-Exchange-Organization-AuthSource: EXCHANGE.company.com
- X-MS-Exchange-Organization-AuthAs: Internal
- X-MS-Exchange-Organization-AuthMechanism: 04
- X-Originating-IP: [10.10.10.10]
- X-Auto-Response-Suppress: DR, OOF, AutoReply
- X-libpst-forensic-sender: /O=COMPANYDOMAIN/OU=EXCHANGE ADMINISTRATIVE GROUP (FY3333323SXYZT)/CN=RECIPIENTS/CN=DON JOHNSON
- MIME-Version: 1.0
- Content-Type: multipart/mixed;
- boundary="--boundary-LibPST-iamunique-438838539_-_-"
- ----boundary-LibPST-iamunique-4344448539_-_-
- Content-Type: text/plain; charset="windows-1252"
- Hello everyone. Just wanted to say great work on the last project!
- Don Johnson
- Company Name
- My Cool Title
- o: +1.123.123.1234
- m: +1.123.123.1235
- ----boundary-LibPST-iamunique-438838539_-_---
- root@bt:~/Archives/Inbox/folder3#
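The recursive grep mentioned earlier can be sketched as a small function. This is a minimal example, not a complete search tool: the function name search_messages is made up here, and the default directory of Archives simply matches the folder readpst created in the example above. Keep in mind that a message body stored as base64 or quoted-printable may not match a plain-text search.

```shell
# Hypothetical helper: print the path of every extracted file that
# contains the search term. $1 = term, $2 = root directory (default: Archives).
search_messages() {
    # -r recurse into subfolders
    # -i ignore letter case
    # -l print only the names of matching files
    # -F treat the term as a fixed string, not a regular expression
    grep -rilF -- "$1" "${2:-Archives}"
}
```

For example, search_messages password run from the directory containing Archives lists the files that mention "password", and each hit can then be opened with less.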
So that's really it. There are more switches for other tasks, such as not outputting the attachments from emails, but overall readpst is simply an application that converts PST files into formats that we can easily search and read on Linux.