Searching through an ENTIRE website(Solved, please close)

Post by Yoshiboshi3 on Tue Jun 26, 2012 12:31 pm

EDIT: Solved, please close the topic.
Last edited by Yoshiboshi3 on Wed Jun 27, 2012 11:51 am, edited 2 times in total.
Yoshiboshi3
New User
Posts: 2
Joined: Tue Jun 26, 2012 12:07 pm


Re: Searching through an ENTIRE website(Sorry if wrong section!)

Post by WallShadow on Tue Jun 26, 2012 12:47 pm

What you need, yoshi, is a spider/crawler. A spider is a program that, given a URL, looks through the webpage, finds any URIs it can, and then visits those. It repeats this process recursively until it has a map of every URI it has reached.
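To make the idea concrete, here is a minimal sketch of such a spider in Python, using only the standard library. It is an illustration under assumptions, not the implementation of any particular tool: the names `LinkParser`, `fetch_html`, `crawl`, and `find_phrase` are made up for this example, it uses a queue (breadth-first) rather than literal recursion, it only follows `<a href>` links on the same host, and it ignores robots.txt and rate limiting, which a real crawl should respect.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its text (assumes UTF-8, replaces bad bytes)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl of same-host links; returns {url: html}."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def find_phrase(pages, phrase):
    """Return the URLs whose HTML contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return sorted(url for url, html in pages.items() if needle in html.lower())
```

Usage would look something like `pages = crawl("http://example.com/")` followed by `find_phrase(pages, "some text")`. Note that a spider only reaches pages that are linked from somewhere it has already visited, so unlinked pages will still be missed.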

One good program for this is Paros Proxy, which comes with a built-in spider. All you need to do is download it, start it, configure your browser to route through Paros Proxy, and go to the website (e.g., www.hackthissite.org). In the Paros Proxy window, you will see that Paros recorded the visit. Then right-click the website in Paros and choose the option to crawl the domain with the spider. It can take some time for larger sites, but it starts outputting results immediately. Credit goes to whoever first mentioned it on HTS; I can't remember who it was.

-WallShadow <3
WallShadow
Contributor
Posts: 594
Joined: Tue Mar 06, 2012 9:37 pm


Re: Searching through an ENTIRE website(Sorry if wrong section!)

Post by limdis on Tue Jun 26, 2012 1:08 pm

I think you might be a little mixed up on how the "site:" operator works. You use site: to restrict your search results to a particular site. So, since you are looking for a specific conversation, you could try quoting some of the text just before it, and you will likely find what you seek pretty quickly. Like this:

Code: Select all
"my secret", site:trollmegle.net/logs/


Also review this later.
"The quieter you become, the more you are able to hear..."
"Drink all the booze, hack all the things."
limdis
Moderator
Posts: 1395
Joined: Mon Jun 28, 2010 5:45 pm


Re: Searching through an ENTIRE website(Sorry if wrong section!)

Post by Yoshiboshi3 on Tue Jun 26, 2012 1:29 pm

limdis wrote:I think you might be a little mixed up on how the "site:" operator works. You use site: to restrict your search results to a particular site. So, since you are looking for a specific conversation, you could try quoting some of the text just before it, and you will likely find what you seek pretty quickly. Like this:

Code: Select all
"my secret", site:trollmegle.net/logs/


Also review this later.


Uh, no. Like I said in the first post, site: does not search the entire site. Here, I'll prove it.

Take this random conversation for example:
http://trollmegle.net/logs/2204091.html

Now let's take a line of text from it.
"The one and only Nicholas Cage"

Now search "The one and only Nicholas Cage", site:trollmegle.net/logs/

No results.

Now, let's take this log.
http://trollmegle.net/logs/1698205.html

Take a line of text.
"not the first time"

Search "not the first time", site:trollmegle.net/logs/

And it comes up.

This has me really puzzled. But yeah, Google Search obviously isn't searching everything.

-- Tue Jun 26, 2012 1:30 pm --

WallShadow wrote:What you need, yoshi, is a spider/crawler. A spider is a program that, given a URL, looks through the webpage, finds any URIs it can, and then visits those. It repeats this process recursively until it has a map of every URI it has reached.

One good program for this is Paros Proxy, which comes with a built-in spider. All you need to do is download it, start it, configure your browser to route through Paros Proxy, and go to the website (e.g., http://www.hackthissite.org). In the Paros Proxy window, you will see that Paros recorded the visit. Then right-click the website in Paros and choose the option to crawl the domain with the spider. It can take some time for larger sites, but it starts outputting results immediately. Credit goes to whoever first mentioned it on HTS; I can't remember who it was.

-WallShadow <3

I'll check it out!
Yoshiboshi3


