Backtrack 4: Information Gathering: Archive: Metagoofil – Extract metadata from public documents

One good thing about writing articles on tools is that you get to try out lots of things you might not otherwise have used. For me, one of these tools was Metagoofil. Metagoofil is a tool written in Python for extracting metadata from public documents (pdf, doc, xls, ppt) available on a target website. This information can be useful because it can yield valid usernames or people's names for later use in brute-force password attacks (VPN, FTP, web apps, etc.). The tool first queries Google for file types that can carry useful metadata (pdf, doc, xls, ppt, etc.), then downloads those documents to disk and runs the program “extract” on every file. It generates an HTML page with the extracted metadata, plus a list of potential usernames.
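To make that workflow a little more concrete, here is a rough Python sketch of two of its steps: building the search query and pulling author names out of the metadata text. This is not Metagoofil's actual source. The function names, the "site:... filetype:..." query shape, and the "keyword - value" line format assumed for the output of “extract” are all my own assumptions for illustration, and the sample names are made up.

```python
import re


def build_query(domain, filetype):
    # Assumed query shape: restrict results to one site and one file type.
    return "site:%s filetype:%s" % (domain, filetype)


def extract_authors(extract_output):
    """Collect unique author-like values from 'keyword - value' lines.

    The line format is an assumption about what the extract program prints;
    adjust the pattern to match the output you actually see.
    """
    pattern = re.compile(r"\s*(author|creator|last saved by)\s*-\s*(.+)",
                         re.IGNORECASE)
    authors = []
    for line in extract_output.splitlines():
        match = pattern.match(line)
        if match:
            name = match.group(2).strip()
            # Keep first-seen order and skip duplicates.
            if name and name not in authors:
                authors.append(name)
    return authors


if __name__ == "__main__":
    print(build_query("www.epa.gov", "pdf"))
    sample = "title - Report\nauthor - Jane Doe\ncreator - Jane Doe\n"
    print(extract_authors(sample))
```

Keeping the query building and the parsing in separate functions means each piece can be tried on its own without touching the network.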

Let's take a look at the help menu:

root@666:/pentest/enumeration/google/metagoofil# ./metagoofil.py -h

*************************************
*MetaGooFil Ver. 1.4b               *
*Coded by Christian Martorella      *
*Edge-Security Research             *
*cmartorella@edge-security.com      *
*************************************

MetaGooFil 1.4

usage: metagoofil options

        -d: domain to search
        -f: filetype to download (all,pdf,doc,xls,ppt,odp,ods, etc)
        -l: limit of results to work with (default 100)
        -o: output file, html format.
        -t: target directory to download files.

        Example: metagoofil.py -d microsoft.com -l 20 -f all -o micro.html -t micro-files
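If you want to script several runs (one per domain or file type, say), the options above map neatly onto an argument list. The helper below is a small sketch of my own, not part of Metagoofil. The function name `build_command` and its default values are hypothetical; only the flags themselves come from the help output.

```python
def build_command(domain, filetype="all", limit=100,
                  output="results.html", target="files"):
    """Build the argument list for one metagoofil.py run.

    Flags are taken from the help menu: -d, -l, -f, -o, -t.
    The script path assumes you run from the tool's own directory.
    """
    return [
        "./metagoofil.py",
        "-d", domain,
        "-l", str(limit),
        "-f", filetype,
        "-o", output,
        "-t", target,
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.call() from the tool's directory to actually execute it.
    print(" ".join(build_command("microsoft.com", "all", 20,
                                 "micro.html", "micro-files")))
```

Building a list rather than a single string avoids shell quoting problems if a directory or file name contains spaces.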

Let's run a check for PDFs on the EPA website, where I am certain they have lots of PDF files:

root@666:/pentest/enumeration/google/metagoofil# ./metagoofil.py -d www.epa.gov -l 100 -f pdf -o example.html -t deleteme

*************************************
*MetaGooFil Ver. 1.4b               *
*Coded by Christian Martorella      *
*Edge-Security Research             *
*cmartorella@edge-security.com      *
*************************************

[+] Command extract found, proceeding with leeching
[+] Searching in www.epa.gov for: pdf
 350000
[+] Total results in google: 350000
[+] Limit:  100
[+] Searching results: 0
[+] Searching results: 20
[+] Searching results: 40
[+] Searching results: 60
[+] Searching results: 80
[+] Directory deleteme already exist, reusing it
        [ 1/100 ] http://www.epa.gov/clearskies/Air_005.pdf
        [ 2/100 ] http://www.epa.gov/cmop/docs/022red.pdf
        [ 3/100 ] http://www.epa.gov/nps/natlstormwater03/25Neiswender.pdf
        [ 4/100 ] http://www.epa.gov/nps/natlstormwater03/41Weinstein.pdf
        [ 5/100 ] http://www.epa.gov/PR_Notices/pr2001-4.pdf
        [ 6/100 ] http://www.epa.gov/nps/natlstormwater03/17Hillegass.pdf
        [ 7/100 ] http://www.epa.gov/waters/tmdldocs/11536_CaneyForkSed_080105.pdf
        [ 8/100 ] http://www.epa.gov/PR_Notices/pr2000-8.pdf
        [ 9/100 ] http://www.epa.gov/nps/natlstormwater03/07Comstock.pdf
        [ 10/100 ] http://www.epa.gov/PR_Notices/pr2000-9.pdf
        [ 11/100 ] http://www.epa.gov/nps/natlstormwater03/08Dorava.pdf
        [ 12/100 ] http://www.epa.gov/fedfac/pdf/uxo_risk_assmnt_rvw_2004.pdf
        [ 13/100 ] http://www.epa.gov/PR_Notices/pr98-4.pdf
        [ 14/100 ] http://www.epa.gov/cmop/docs/pol006.pdf
        [ 15/100 ] http://www.epa.gov/nps/natlstormwater03/40Tuomari.pdf
        [ 16/100 ] http://www.epa.gov/nps/natlstormwater03/29Reese.pdf
        [ 17/100 ] http://www.epa.gov/ttncatc1/dir1/fsetling.pdf
        [ 18/100 ] http://www.epa.gov/nps/natlstormwater03/14Greer.pdf
        [ 19/100 ] http://www.epa.gov/cmop/docs/002red.pdf
        [ 20/100 ] http://www.epa.gov/opp00001/regulating/fifra.pdf
        [ 21/100 ] http://www.epa.gov/nps/natlstormwater03/45Hollister.pdf
        [ 22/100 ] http://www.epa.gov/nps/natlstormwater03/10Duma.pdf
        [ 23/100 ] http://www.epa.gov/lead/pubs/span_web_secure.pdf
        [ 24/100 ] http://www.epa.gov/endo/pubs/notes_for_appendix_9.pdf
        [ 25/100 ] http://www.epa.gov/oppsrrd1/REDs/0630red.pdf
        [ 26/100 ] http://www.epa.gov/nps/natlstormwater03/38Stephens.pdf
        [ 27/100 ] http://www.epa.gov/nhsrc/pubs/600r04065.pdf
        [ 28/100 ] http://www.epa.gov/asbestos/pubs/vairesearchmethodfinal.pdf
        [ 29/100 ] http://www.epa.gov/nps/natlstormwater03/34Shepard.pdf
        [ 30/100 ] http://www.epa.gov/cmop/docs/013red.pdf
        [ 31/100 ] http://www.epa.gov/endo/pubs/male_pubertal_lit_study_descriptions_table_final.pdf
        [ 32/100 ] http://www.epa.gov/ocr/docs/42usc2000d.pdf
        [ 33/100 ] http://www.epa.gov/cpd/pdf/maccppfinal.pdf
        [ 34/100 ] http://www.epa.gov/nhsrc/pubs/vrUltrastrip032704.pdf
        [ 35/100 ] http://www.epa.gov/ttncatc1/dir1/fsprytwr.pdf
        [ 36/100 ] http://www.epa.gov/nps/natlstormwater03/12Gabbard.pdf
        [ 37/100 ] http://www.epa.gov/endo/pubs/appendix_iv_feed_analysis_reports_rti_fp.pdf
        [ 38/100 ] http://www.epa.gov/nps/natlstormwater03/32Sands.pdf
        [ 39/100 ] http://www.epa.gov/nps/natlstormwater03/11Echols.pdf
        [ 40/100 ] http://www.epa.gov/dced/pdf/ptfd_primer.pdf
        [ 41/100 ] http://www.epa.gov/nps/natlstormwater03/02Booth.pdf
        [ 42/100 ] http://www.epa.gov/ocr/docs/40p0007.pdf
        [ 43/100 ] http://www.epa.gov/ttncatc1/dir1/rblc2002.pdf
        [ 44/100 ] http://www.epa.gov/oppsrrd1/REDs/3082red.pdf
        [ 45/100 ] http://www.epa.gov/cmop/docs/pol003.pdf
        [ 46/100 ] http://www.epa.gov/PR_Notices/pr97-2.pdf
        [ 47/100 ] http://www.epa.gov/nps/natlstormwater03/47Strecker.pdf
        [ 48/100 ] http://www.epa.gov/PR_Notices/pr2001-3.pdf
        [ 49/100 ] http://www.epa.gov/nps/natlstormwater03/15Groner.pdf
        [ 50/100 ] http://www.epa.gov/nps/natlstormwater03/31Roa.pdf
        [ 51/100 ] http://www.epa.gov/ttncatc1/dir1/cs6ch2.pdf
        [ 52/100 ] http://www.epa.gov/endo/pubs/notes_for_appendix_6.pdf
        [ 53/100 ] http://www.epa.gov/oust/mtbe/oxytable.pdf
        [ 54/100 ] http://www.epa.gov/nps/natlstormwater03/28Pitt.pdf
        [ 55/100 ] http://www.epa.gov/cmop/docs/001red.pdf
        [ 56/100 ] http://www.epa.gov/cmop/docs/pol002.pdf
        [ 57/100 ] http://www.epa.gov/ogc/china/eis.pdf
        [ 58/100 ] http://www.epa.gov/msbasin/pdf/symposia_ia_presentations.pdf
        [ 59/100 ] http://www.epa.gov/npdescan/FL0000701FP.pdf
Florida Department of Environmental Protection
Title(Rayonier Performance Fibers, LLC)
        [ 60/100 ] http://www.epa.gov/nps/natlstormwater03/27Claytor.pdf
        [ 61/100 ] http://www.epa.gov/nps/natlstormwater03/Johnsposter.pdf
        [ 62/100 ] http://www.epa.gov/ttncatc1/dir1/fwespwpl.pdf
        [ 63/100 ] http://www.epa.gov/ogd/forms/Buy_Am.pdf
        [ 64/100 ] http://www.epa.gov/cmop/docs/pol005.pdf
        [ 65/100 ] http://www.epa.gov/ocr/docs/33usc1251.pdf
        [ 66/100 ] http://www.epa.gov/ttncatc1/dir1/icboiler.pdf
        [ 67/100 ] http://www.epa.gov/nps/natlstormwater03/Mullinposter.pdf
        [ 68/100 ] http://www.epa.gov/nhsrc/pubs/vrWatts062404.pdf
        [ 69/100 ] http://www.epa.gov/watersense/docs/AWWA_Journal_showerheads.pdf
        [ 70/100 ] http://www.epa.gov/asbestos/pubs/aherarequirements.pdf
        [ 71/100 ] http://www.epa.gov/endo/pubs/trc_fr_101.pdf
        [ 72/100 ] http://www.epa.gov/oust/mtbe/omethods.pdf
        [ 73/100 ] http://www.epa.gov/greenchill/downloads/Bohn_Secondary_Loop_WP.pdf
        [ 74/100 ] http://www.epa.gov/nps/natlstormwater03/35Sloan.pdf
        [ 75/100 ] http://www.epa.gov/ttncatc1/dir1/fmechan.pdf
        [ 76/100 ] http://www.epa.gov/nps/natlstormwater03/03Bretsch.pdf
        [ 77/100 ] http://www.epa.gov/opprd001/factsheets/diclosulam.pdf
        [ 78/100 ] http://www.epa.gov/hurricane/pdf/homeleadremodeling_brochure.pdf
        [ 79/100 ] http://www.epa.gov/cmop/docs/red001.pdf
        [ 80/100 ] http://www.epa.gov/nhsrc/pubs/vrSears090704.pdf
        [ 81/100 ] http://www.epa.gov/PR_Notices/pr98-3.pdf
        [ 82/100 ] http://www.epa.gov/agstar/pdf/wefjune2003.pdf
        [ 83/100 ] http://www.epa.gov/nps/natlstormwater03/13Gentile.pdf
        [ 84/100 ] http://www.epa.gov/nps/natlstormwater03/09Dreyfuss.pdf
        [ 85/100 ] http://www.epa.gov/PR_Notices/pr2002-2.pdf
        [ 86/100 ] http://www.epa.gov/PR_Notices/pr2001-6.pdf
        [ 87/100 ] http://www.epa.gov/nps/natlstormwater03/16Hackett.pdf
        [ 88/100 ] http://www.epa.gov/nps/natlstormwater03/33Shapiro.pdf
        [ 89/100 ] http://www.epa.gov/nps/natlstormwater03/36Solek.pdf
        [ 90/100 ] http://www.epa.gov/waters/tmdldocs/22726_HiwasseeSed.pdf
        [ 91/100 ] http://www.epa.gov/endo/pubs/attachment_a1_ama_test_method.pdf
        [ 92/100 ] http://www.epa.gov/nps/natlstormwater03/21Malec.pdf
        [ 93/100 ] http://www.epa.gov/nhsrc/pubs/vsUltrastrip032704.pdf
        [ 94/100 ] http://www.epa.gov/nps/natlstormwater03/20Liptan.pdf
        [ 95/100 ] http://www.epa.gov/glnpo/lakesuperior/epo1998.pdf
        [ 96/100 ] http://www.epa.gov/npdescan/okg950000gfp.pdf
        [ 97/100 ] http://www.epa.gov/region4/waste/martincs.pdf
        [ 98/100 ] http://www.epa.gov/endo/pubs/fish_assay_charge_questions.pdf
        [ 99/100 ] http://www.epa.gov/oust/mtbe/mtbemap.pdf
        [ 100/100 ] http://www.epa.gov/gasstar/documents/cast_iron_mains.pdf

Usernames found:
================
BG34061
"ÂÃòT▒3U
Author(Florida Department of Environmental Protection)Florida Department of Environmental Protection
ÂÂ3Ã^ÃÃÂWÿÃSÂÃéNúB«`Â,ÂÃs
Âë!çì»eú§:ÂÃàLÂ9ÂÃ
Blumenstein
Ã(¤qÃÃÂ`¶Ã°ÃÂWÂÃÂ2þ
Ãt+´HºÂÂ%ÃÃ3»

Paths found:
============

Title(Rayonier Performance Fibers, LLC)/Author(Florida Department of Environmental Protection)/Keywords(NPDES Permit, Rayonier Performance Fibers LLC, Fermandina Beach, Nassau County, Florida, WWTP, Wastewater Treatment Plant, wastewater)/Subject(NPDES Permit for Rayonier Performance Fibers, LLC in Fermandina Beach, Nass
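As you can see, the username list mixes plausible values (BG34061, Blumenstein) with binary junk and whole metadata strings. Before feeding a list like this into anything else, it is worth filtering it. Here is a minimal sketch of one way to do that. The function name and the rule for what counts as "plausible" (short, ASCII letters, digits and a little punctuation) are my own assumptions, not something Metagoofil does.

```python
import re

# Assumption: a plausible username or name is 2-40 characters, starts with a
# letter or digit, and contains only ASCII letters, digits, spaces and . _ ' -
PLAUSIBLE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9 ._'-]{1,39}$")


def plausible_usernames(candidates):
    """Return candidates that look like real names, de-duplicated
    case-insensitively and kept in first-seen order."""
    seen = set()
    kept = []
    for candidate in candidates:
        value = candidate.strip()
        key = value.lower()
        if PLAUSIBLE.match(value) and key not in seen:
            seen.add(key)
            kept.append(value)
    return kept


if __name__ == "__main__":
    found = ["BG34061", "Ãt+´HºÂÂ%ÃÃ3»", "Blumenstein"]
    print(plausible_usernames(found))
```

Note that this ASCII-only rule will also throw away legitimate names containing accented characters, so loosen the pattern if your target is likely to have them.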

Now we can look in the directory we ran the tool from for our HTML file, which is called example.html:

root@666:/pentest/enumeration/google/metagoofil# ls -la
total 76
drwxr-xr-x 3  502 root  4096 May 28 09:16 .
drwxr-xr-x 5 root root  4096 May 28 09:09 ..
-rwxr-xr-x 1  502 root 15238 May 11 19:38 COPYING
-rwxr-xr-x 1  502 root    97 May 11 19:38 LICENSES
-rwxr-xr-x 1  502 root  2226 May 11 19:38 README
drwxr-xr-x 2 root root  4096 May 28 09:19 deleteme
-rw-r--r-- 1 root root 27067 May 28 14:20 example.html
-rwxr-xr-x 1  502 root 11926 May 28 09:10 metagoofil.py

Now we can open the file with any web browser and get a nice graphical metadata breakdown of each file:

[Screenshot: example.html opened in a browser]

Metagoofil can be pointed at any one of the file types listed in the help menu, or you can simply pass the “-f all” argument to have it check the other file types as well. I will most definitely be using this tool in the future.
