In progress

PHP website crawler


I need a programmer to set up a crawler for 9 daily deal sites and crawl 7 values on each site.

You must use this class and the functions inside it whenever possible to crawl the websites [url removed, login to view] Each site crawler should be placed in its own file and should just include the parser class, like this:


include_once('../[url removed, login to view]');

// parser stuff


This means the folder/file structure will look like this:

- [url removed, login to view]

- [url removed, login to view]

- Image (folder)

- Crawlers (folder)

--- [url removed, login to view]

--- [url removed, login to view]

--- web....
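As a rough illustration of the per-site layout above, each crawler file would look something like the following. This is a hedged sketch only: the shared parser class's real filename and API were removed from the brief, so `parser.php`, `$site`, and everything else here are placeholders.

```php
<?php
// Crawlers/site1.php — hypothetical skeleton of one site-crawler file.
// The shared parser class lives one folder up; 'parser.php' is a
// placeholder for the real (redacted) filename.
include_once('../parser.php');

// Static site name, defined in a variable at the top, as the brief requires.
$site = 'site1';

// Per-site steps: fetch the front page with the parser class, extract the
// 7 values, download the picture to the Image folder, then insert/update
// the deal row in the `deal_crawler` table.
```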

The script will crawl the site when you visit [url removed, login to view] and the crawled content should be stored in a database table like this:

CREATE TABLE `deal_crawler` (
  `deal_id` int(11) NOT NULL AUTO_INCREMENT,
  `site` varchar(64) NOT NULL,
  `text` text NOT NULL,
  `dealprice` int(11) NOT NULL,
  `dealvalue` int(11) NOT NULL,
  `endtime` datetime NOT NULL,
  `picture` varchar(64) NOT NULL,
  `url` varchar(256) NOT NULL,
  `lastsucces` datetime NOT NULL,
  PRIMARY KEY (`deal_id`)
);

INSERT INTO `deal_crawler` VALUES(1, '[url removed, login to view]', 'This is todays deal...', 100, 200, '2011-05-10 23:31:47', '[url removed, login to view]', '[url removed, login to view]', '2011-05-02 23:31:47');

Where the fields contain:


- `deal_id` – unique id

- `site` – static name, defined in a variable at the top of the crawler PHP file

- `text` – text from the site

- `dealprice` – price from the site

- `dealvalue` – full price of the deal

- `endtime` – the end time (sometimes calculated from a "time left" countdown, sometimes the actual end time can be grabbed from the code). Please make sure the time is correct Danish time [[url removed, login to view]]

- `picture` – the picture will be stored in the Image folder with the deal_id as its name; this field will then contain [url removed, login to view], [url removed, login to view] or something similar.

- `url` – the deal's url.

- `lastsucces` – unless something goes wrong when crawling (a PHP error, or a field that isn't properly filled), this field should be updated each time the crawler finds that the deal is still active on the front page.

If a new deal is active, a new row should be inserted in the database.
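For the countdown case of the `endtime` field, one approach is to add the "time left" to the current Danish time. A minimal sketch, assuming an HH:MM:SS countdown format (the function name and format are assumptions, not part of the brief); PHP's `DateTimeImmutable` with the `Europe/Copenhagen` zone handles the timezone:

```php
<?php
// Hypothetical helper: turn a "time left" countdown (e.g. "03:31:47"
// remaining on the deal) into an absolute end time in Danish time.
function endtimeFromCountdown(string $countdown, ?DateTimeImmutable $now = null): string
{
    list($h, $m, $s) = array_map('intval', explode(':', $countdown));
    // Default to the current time in the Europe/Copenhagen timezone.
    $now = $now ?? new DateTimeImmutable('now', new DateTimeZone('Europe/Copenhagen'));
    $end = $now->add(new DateInterval(sprintf('PT%dH%dM%dS', $h, $m, $s)));
    return $end->format('Y-m-d H:i:s'); // matches the DATETIME column format
}
```

Passing `$now` explicitly makes the function testable; in the crawler it would normally be called with only the countdown string.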

I will highly appreciate it if you build functions/classes wherever the same code is used more than once, and place the new functions/classes in the [url removed, login to view] file. For example, the insert statement for the MySQL database can be a function, and likewise the transfer of the picture.
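A sketch of what such shared helpers in the common file could look like, assuming a PDO connection and the `deal_crawler` table from the brief. The function names, the PDO usage, and the image-folder path are all assumptions for illustration:

```php
<?php
// Hypothetical shared helper: insert a new deal row and return its deal_id.
// `picture` starts empty and is filled in after the image is saved under
// the deal_id; `lastsucces` is set to the time of this successful crawl.
function insertDeal(PDO $db, string $site, string $text, int $dealprice,
                    int $dealvalue, string $endtime, string $url): int
{
    $stmt = $db->prepare(
        'INSERT INTO deal_crawler
            (site, text, dealprice, dealvalue, endtime, picture, url, lastsucces)
         VALUES (?, ?, ?, ?, ?, "", ?, NOW())'
    );
    $stmt->execute([$site, $text, $dealprice, $dealvalue, $endtime, $url]);
    return (int) $db->lastInsertId(); // new deal_id, used to name the picture
}

// Hypothetical shared helper: store the deal picture in the Image folder
// under the deal_id, and return the filename for the `picture` column.
function savePicture(int $dealId, string $pictureUrl, string $imageDir = '../Image'): string
{
    $ext = pathinfo(parse_url($pictureUrl, PHP_URL_PATH), PATHINFO_EXTENSION) ?: 'jpg';
    $filename = $dealId . '.' . $ext;
    file_put_contents($imageDir . '/' . $filename, file_get_contents($pictureUrl));
    return $filename;
}
```

Using prepared statements here also keeps the crawled text safe to insert regardless of quotes in the deal description.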

Furthermore, the code has to be well commented.

You will have to make the crawler work 3 days in a row before your work is done, just to make sure it is not only working with today's deal. If a site changes sometime in the future, I will of course pay you (or someone else) again.

A quick deadline and low price is essential.

----> Please see the attached file for a detailed description of the data that needs to be crawled. <--- (The PDF and the Doc file have the same content)

Best regards


Skills: MySQL, PHP


About the employer:
( 38 reviews ) København Ø, Denmark

Project ID: #1045703