
list-urls.py : Backtrack 5: Non Menu Items: Various Scripts: list-urls.py

Backtrack includes a number of items that are not in the menu system, such as the list-urls.py Python script. This script, which was enhanced by Muts, simply queries a specific URL and extracts all of the URLs from the page. Scripts like this are handy, and knowing your way around Backtrack can save you a significant amount of time.

list-urls.py Python Script Code:

#!/usr/bin/python
"""Extract list of URLs in a web page

This program is part of "Dive Into Python", a free Python book for
experienced programmers.  Visit http://diveintopython.org/ for the
latest version.
"""

__author__ = "Mark Pilgrim (mark@diveintopython.org)"
__version__ = "$Revision: 1.2 $"
__date__ = "$Date: 2004/05/05 21:57:19 $"
__copyright__ = "Copyright (c) 2001 Mark Pilgrim"
__license__ = "Python"

from sgmllib import SGMLParser
import sys

if len(sys.argv) != 2:
    print "\n\n+++++++++++++++++++++++++++++++++++++++++++++++++++++"
    print "Extract links form webpage - v.0.1"
    print "+++++++++++++++++++++++++++++++++++++++++++++++++++++"
    print "\nUsage : ./list-urls.py <web-page>"
    print "Eg: ./list-urls.py http://www.whoppix.net"
    print "\n+++++++++++++++++++++++++++++++++++++++++++++++++++++"
    sys.exit(1)

class URLLister(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attrs):
        href = [v for k, v in attrs if k == 'href']
        if href:
            self.urls.extend(href)

if __name__ == "__main__":

    import urllib
    print "\n##########################################################"
    print "#                                                        #"
    print "#             Extract URLS from a web page               #"
    print "#                muts@whitehat.co.il                     #"
    print "#                                                        #"
    print "##########################################################\n"
    link = sys.argv[1]
    try:
        usock = urllib.urlopen(link)
        parser = URLLister()
        parser.feed(usock.read())
        parser.close()
        usock.close()
        for url in parser.urls: print url
    except:
        print "Could not reach " + sys.argv[1] + " !"
        print "Did you remember to put an http:// before the domain name?"

As you can see from the code above, the script is fairly simple, but think how much time it would save you if you were tasked with extracting thousands of URLs from a single web page. Below is an example of the list-urls.py script, which is found in the /pentest/enumeration/list-urls/ directory on Backtrack 5 r2.
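Note that the script above is Python 2 code and depends on sgmllib, which was removed entirely in Python 3. If you want the same behavior on a modern system, a minimal sketch using the standard-library html.parser and urllib.request modules might look like this (the class and decoding choices here are assumptions, not part of the Backtrack script):

```python
#!/usr/bin/env python3
"""Rough Python 3 equivalent of list-urls.py (sgmllib is gone in
Python 3); uses html.parser and urllib.request instead."""

import sys
from html.parser import HTMLParser
from urllib.request import urlopen


class URLLister(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names
        if tag == "a":
            self.urls.extend(v for k, v in attrs if k == "href" and v)


if __name__ == "__main__" and len(sys.argv) == 2:
    parser = URLLister()
    # decode with errors="replace" so odd page encodings do not abort the run
    with urlopen(sys.argv[1]) as usock:
        parser.feed(usock.read().decode("utf-8", errors="replace"))
    for url in parser.urls:
        print(url)
```

The parser class mirrors the original URLLister: it records every href attribute on an anchor tag, including duplicates, in document order.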

list-urls.py Python Command Syntax:

root@bt:~# /pentest/enumeration/list-urls/list-urls.py
+++++++++++++++++++++++++++++++++++++++++++++++++++++
Extract links form webpage - v.0.1
+++++++++++++++++++++++++++++++++++++++++++++++++++++

Usage : ./list-urls.py <web-page>
Eg: ./list-urls.py http://www.whoppix.net

+++++++++++++++++++++++++++++++++++++++++++++++++++++
root@bt:~#

list-urls.py Python Command Example:

root@bt:~# /pentest/enumeration/list-urls/list-urls.py http://www.question-defense.com
##########################################################
#                                                       #
#            Extract URLS from a web page               #
#               muts@whitehat.co.il                     #
#                                                       #
##########################################################

http://httpd.apache.org/

http://www.centos.org/

http://www.centos.org/

http://www.internic.net/whois.html

root@bt:~#

As you can see above, there were 4 URLs available on the www.question-defense.com web page when this example was performed (one of them, http://www.centos.org/, appears twice). The list-urls.py script is easy to use and performs a single task: extracting URLs from a web page.
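Because the script prints one URL per line, its output pipes cleanly into standard shell tools. For example, the duplicate http://www.centos.org/ line in the output above can be collapsed with sort -u; the printf below simply stands in for the script's output so the pipeline can be shown on its own:

```shell
# Against a live target this would be, e.g.:
#   /pentest/enumeration/list-urls/list-urls.py http://www.question-defense.com | sort -u
# Here the same four output lines are simulated with printf:
printf '%s\n' \
  http://httpd.apache.org/ \
  http://www.centos.org/ \
  http://www.centos.org/ \
  http://www.internic.net/whois.html \
  | sort -u
```

sort -u sorts the list and drops exact duplicates, leaving three unique URLs in this case.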
