Search Engine trawler

Cancelled · Posted Dec 16, 2009 · Paid on delivery

The object of the exercise is to increase the number of times a selected group of sites is searched for in selected search engines.

The program works like this: it searches a given list of search engines, using a given list of keywords / phrases, for a given list of URLs.

## Deliverables

I think if you understand why we want this program, it makes your work easier.

Stage One is an update of an in-house system we have been using for two years. The object of the exercise is to increase the number of times a selected group of sites is searched for in selected search engines. We are then able to enhance the sites' keywords to maintain a good ranking in the search engines.

Stage One Description

The program opens its own control window with the default browser embedded in it (this can be Firefox or IE). A small top menu gives access to settings, while a bottom bar shows progress.

The program works like this: it searches a given list of search engines, using a given list of keywords / phrases, for a given list of URLs.

The program would use editable text files: one containing the list of URLs and further instructions, and one containing the list of search engines to use and their parameters.

These could be stored in the same file or in different editable files.

Also, after each cycle, the cookies and cache are cleared.

The following example uses two of our present files:

Example Engine List:

"[url removed, login to view]

"[url removed, login to view]

Example URL list:

perfume world,[url removed, login to view],/anyfolder/[url removed, login to view],

flight center,[url removed, login to view],[url removed, login to view],

The program would first search Google for the keywords 'perfume world'. If it finds the '[url removed, login to view]' URL in the results, it follows it, then goes to '/anyfolder/[url removed, login to view]' on that site. If the perfumeworld URL is not found after searching the user-defined number of result pages, the full URL ([url removed, login to view]) is searched for instead. If it is still not found, the program moves on to the next search.

It will then search Bing for 'perfume world' etc.

It will then search Google for 'flight center' etc.

Then it will search Bing for 'flight center' etc.

That is the end of a complete cycle.

Cookies, cache, and temporary internet files are cleared.

A new cycle starts.
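The ordering in the example above can be sketched in a few lines of Python. This is only an illustration of the described sequence, not the in-house implementation; the function name `search_order` and the plain-string engine names are assumptions made for the sketch.

```python
from itertools import product


def search_order(keywords, engines):
    """Yield (engine, keyword) pairs for one complete cycle.

    Hypothetical helper: each keyword is tried on every engine in turn
    before moving on to the next keyword, as in the example above.
    """
    for keyword, engine in product(keywords, engines):
        yield engine, keyword


# One cycle over the two example entries and two example engines.
order = list(search_order(["perfume world", "flight center"], ["Google", "Bing"]))
for engine, keyword in order:
    print(f"{engine}: {keyword}")
```

Keeping the ordering in one small function makes it easy to change later, for example if a cycle should instead finish all keywords on one engine first.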

There are some user parameters to allow for, such as time delays and randomness.

1. The number of pages deep to search each search engine is user defined.

2. Where the URL is not found on the first page of search engine results, there needs to be a min / max / random time delay before going to the second or subsequent pages of results.

For example: wait between a minimum of XX and a maximum of XX seconds, with the actual wait chosen at random within that range.

3. Once the URL is found, there is a user-set time delay of xxx seconds before going to the second URL.

For example: when arriving at [url removed, login to view], the program waits xxx seconds before going to thispage.html.

4. Further, the program stays on that final page for XXX seconds before starting a new search.

5. A drop-down from the menu to select / de-select the search engines and URLs to use.
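The min / max / random delay in item 2 could be computed as in the following sketch. The function name `pick_delay` and its parameters are hypothetical, and a uniform distribution between the two bounds is an assumption; the brief only asks for a random value between min and max.

```python
import random


def pick_delay(min_seconds, max_seconds, rng=random):
    """Return a wait time in seconds, chosen uniformly in [min_seconds, max_seconds].

    `rng` may be the `random` module or a `random.Random` instance,
    which makes the choice reproducible in tests.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    return rng.uniform(min_seconds, max_seconds)


# Example: a delay between 5 and 10 seconds.
delay = pick_delay(5, 10)
print(f"waiting {delay:.1f} s")
```

The fixed delays in items 3 and 4 are the special case where the minimum and maximum are equal.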

Advanced:

1. If the same IP has been used for XXX minutes, the program halts and then checks every XX minutes whether the IP has changed. If it has changed, the program starts at the beginning of the cycle again. (XXX and XX are user defined.)

2. Miming: the program can contain a standard, up-to-date list of browser identification strings (user-agent strings, referred to here as "mime types"), sourced by you, in approximate order of popularity (IE7, Firefox3, IE8, ..., Safari, Firefox2, etc.). It changes the string randomly on each cycle, though favoring the top ones.

For example:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2 .........)

etc.

**Important Notes**

1. A complete search cycle is defined as the whole list of given URLs to search.

2. The IP address may change while the program is running, so it will need to check IP and adopt new IP if necessary before starting each search (not cycle).

3. Connectivity can break sometimes for several minutes, which should not 'freeze' or break the program. It can either pause or continue with blank pages.

4. If a hyperlink in the control window is manually clicked at any time, it will open in a new browser window. Ideally, user defined Firefox or IE.

5. Able to change and remember the program's control window size.

6. Must be stable against page script errors, timeouts, etc. If it encounters a serious problem, it should restart itself. It also needs good memory management.

Useful Options

1. Option to run automatically at start-up using the previous settings, though with, say, a 3 minute delay to enable Windows and other programs to load fully.

2. It would be good to be able to run it as a self-contained package without having to install the program on our PC.

3. In the 'Example URL list' above, is it possible to enter a string instead of '[url removed, login to view]'?

For example: if instead of 'flight center,[url removed, login to view],[url removed, login to view],' we had 'flight center,[url removed, login to view],~159701,', it would follow any URL containing '159701'.
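One way to read a line of the URL list, including the proposed `~` form, is sketched below. The field names, the helper names `parse_entry` and `target_matches`, and the placeholder values `example.com` and `/anyfolder/page.html` are all assumptions; the real URLs were removed from the posting. Treating a target without `~` as something the followed URL must end with is also an assumption.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    keywords: str  # phrase typed into the search engine
    site: str      # site URL to look for in the results
    target: str    # second page to visit, or "~text" for a substring match


def parse_entry(line):
    """Split one comma-separated list line, ignoring the trailing comma."""
    fields = [field.strip() for field in line.split(",")]
    while fields and fields[-1] == "":
        fields.pop()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    return Entry(*fields)


def target_matches(target, url):
    """Return True if `url` satisfies the entry's target field."""
    if target.startswith("~"):
        return target[1:] in url      # "~159701" matches any URL containing 159701
    return url.endswith(target)       # assumed rule for a plain path


entry = parse_entry("flight center,example.com,~159701,")
print(entry, target_matches(entry.target, "http://example.com/deals/159701/"))
```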

**Possible Stage Two**

To be discussed with coder

Skills: Apple Safari, Engineering, Google Chrome, Microsoft, MySQL, PHP, Project Management, Software Architecture, Software Testing, Windows Desktop

Project ID: #3028489

About the project

4 bids · Remote project · Active Feb 10, 2010

4 freelancers are bidding an average of $452 for this job

delphiprovw

See private message.

$850 USD in 14 days
(79 reviews)
6.0

webexpresszone

See private message.

$233.75 USD in 14 days
(7 reviews)
4.0

sbrtechicon

See private message.

$595 USD in 14 days
(0 reviews)
0.0

acmabpo

See private message.

$127.5 USD in 14 days
(1 rating)
0.0