October 28, 2019 · Basic Pen-Testing
1.3: Basic Bash scripting - curl, cut, grep & regex (Part I)
Example: get sub-domains from www.example.com/index.html
String matching approach
- Download the file.
# download www.example.com to the current dir
wget www.example.com
# show the index.html file: permissions, size, etc.
ls -l index.html
# show the file content
cat index.html
# I prefer curl for the download
curl -L www.example.com > index.html
- Analyse the file content and get all URLs.
# grep all lines containing href
grep "href=" index.html
- Split each line on "/" and extract the 3rd field.
grep "href=" index.html | cut -d "/" -f 3
A brief explanation of the cut command. Take the following line as an example:
<li><a href="//www.example.com/aboutus.html">aboutus</a></li>
The "/" delimiter breaks the line into fields, counted from 1:
1 - <li><a href="
2 - [empty]
3 - www.example.com
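The field numbering above can be checked directly in the shell. This is a minimal sketch that assumes the sample line is stored in a variable; the names line and third are only illustrative.

```shell
#!/usr/bin/env bash
# Sample line from the explanation above (variable names are illustrative)
line='<li><a href="//www.example.com/aboutus.html">aboutus</a></li>'

# Print the first four "/"-delimited fields, each with its index
for i in 1 2 3 4; do
  printf '%s: [%s]\n' "$i" "$(printf '%s\n' "$line" | cut -d "/" -f "$i")"
done

# Keep only the 3rd field, which holds the host name
third=$(printf '%s\n' "$line" | cut -d "/" -f 3)
echo "$third"
```

Field 2 is empty because the two slashes in // sit next to each other, which is why the host name ends up in field 3.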
- Clean up: keep only the lines that contain a ".", so that only domain names remain.
# grep patterns are regexes, so the . needs a \ to match a literal dot
grep "href=" index.html | cut -d "/" -f 3 | grep "\."
returns
www.example.com/abc
abc.example.com
def.example.com
- Use the cut command with " as the delimiter and keep the first field, to get rid of the trailing " and anything after it.
grep "href=" index.html | cut -d "/" -f 3 | grep "\." | cut -d '"' -f 1
- Use sort -u to sort the results and keep only unique entries.
grep "href=" index.html | cut -d "/" -f 3 | grep "\." | cut -d '"' -f 1 | sort -u
returns
abc.example.com
def.example.com
www.appdynamics.com
www.facebook.com
www.instagram.com
www.linkedin.com
www.webex.com
www.youtube.com
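The whole string matching pipeline can be tried offline against a small test page. The sketch below assumes a hypothetical sample file and wraps the steps in an illustrative function named extract_hosts; the real index.html will of course contain different links.

```shell
#!/usr/bin/env bash
# Hypothetical sample page; the real index.html will differ
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ul>
<li><a href="//www.example.com/aboutus.html">aboutus</a></li>
<li><a href="https://abc.example.com/">abc</a></li>
<li><a href="https://def.example.com">def</a></li>
<li><a href="https://abc.example.com/news">news</a></li>
<li><a href="#top">top</a></li>
</ul>
EOF

# extract_hosts FILE: print the unique host names found in href attributes
# (illustrative helper, not part of any tool)
extract_hosts() {
  grep "href=" "$1" |      # keep lines with a link
    cut -d "/" -f 3 |      # 3rd "/"-delimited field holds the host
    grep "\." |            # drop fields without a literal dot
    cut -d '"' -f 1 |      # strip the closing quote and what follows
    sort -u                # sort and de-duplicate
}

hosts=$(extract_hosts "$sample")
echo "$hosts"
rm -f "$sample"
```

One limitation worth keeping in mind: cut works per line, so only the first link on each line is picked up by this approach.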
Regex approach
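Use grep -o to print only the matching part of each line. [^"] is a character class: the leading ^ negates it, so it matches any character except ". The pattern http://[^"]* therefore matches from http:// up to, but not including, the closing quote.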
cat index.html | grep -o 'http://[^"]*' | cut -d "/" -f 3 | sort -u > list.txt
The grep -o stage on its own returns full URLs such as:
http://www.w3.org/2000/svg\
http://www.schema.org
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/1999/xlink
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
http://www.w3.org/2000/svg
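To see why the negated class [^"] matters, it helps to compare it with a greedy .* on a single line. This is a small sketch on a hypothetical input line; the variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical line containing two quoted URLs
line='<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">'

# -o prints each match on its own line; [^"]* stops at the closing quote
matches=$(printf '%s\n' "$line" | grep -o 'http://[^"]*')
echo "$matches"

# With .* the match is greedy and runs to the end of the line
greedy=$(printf '%s\n' "$line" | grep -o 'http://.*')
echo "$greedy"
```

The first command yields the two URLs separately, while the second returns one long match that swallows the quote and everything after it.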
Use grep -E (extended regex) so that s? means "s or no s"; the pattern then matches both http and https.
cat index.html | grep -o -E 'https?://[^"]*' | cut -d "/" -f 3 | sort -u > list.txt
list.txt then contains
abc.example.com
def.example.com
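The effect of s? can be checked on a small test page as well. The sketch below assumes a hypothetical sample file with a mix of http, https and relative links; the host names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample page; the real index.html will differ
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="http://abc.example.com/a">a</a> <a href="https://def.example.com/b">b</a>
<a href="https://ghi.example.com">home</a>
<a href="/relative/path.html">relative</a>
EOF

# -E enables extended regex, so s? means "an optional s"
hosts=$(grep -o -E 'https?://[^"]*' "$sample" | cut -d "/" -f 3 | sort -u)
echo "$hosts"

# Without the ?, only the https links are matched
https_only=$(grep -o 'https://[^"]*' "$sample" | cut -d "/" -f 3 | sort -u)
echo "$https_only"
rm -f "$sample"
```

Because grep -o emits every match separately, both links on the first line are captured here, which the cut-only approach would miss. Relative links carry no scheme and are skipped.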