Cancelled

Bot Needed to Extract Email & Save as CSV File w/Existing Data

Read everything below to fully understand the project. Do not bid until you have read in full. Personal messages along with the bid will be given extra attention. If a requested feature drastically increases the price, mention how much it is with and without it so that I can correctly compare your bids to the others...

During the process, it is very important that we stay in contact with one another.

Thanks,

Steve

OUTLINE

I need a program that I can run on Windows to extract email addresses from the URLs in an existing CSV file and save the results into the same file, which also contains other data.

CSV has this column structure:

A- URL

B- Email

C- Company

D- Contact

E- Address

F- Phone
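A minimal sketch of how the column layout above could be handled, assuming the file has no header row and that column A holds the URL exactly as it was scraped. The function name `set_emails` and the sample row are made up for illustration.

```python
import csv
import io

# Column order taken from the posting: A..F
COLUMNS = ["URL", "Email", "Company", "Contact", "Address", "Phone"]

def set_emails(rows, found):
    """Return copies of the rows with column B filled from `found` (URL -> list of emails)."""
    out = []
    for row in rows:
        row = list(row) + [""] * (len(COLUMNS) - len(row))  # pad short rows to six columns
        emails = found.get(row[0], [])
        if emails:
            row[1] = ",".join(emails)  # several emails share one cell, comma separated
        out.append(row)
    return out

# Hypothetical one-row file; the other columns must survive untouched.
sample = "http://example.com,,Example Co,Jane,1 Main St,555-0100\n"
rows = list(csv.reader(io.StringIO(sample)))
updated = set_emails(rows, {"http://example.com": ["a@example.com", "b@example.com"]})

buf = io.StringIO()
csv.writer(buf).writerows(updated)
print(buf.getvalue())
```

Because the email cell can itself contain commas, the `csv` module writes it in quotes, which keeps the remaining columns aligned when the file is reopened.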

EXAMPLE DATABASE

[url removed, login to view]

FEATURES

- I need these TLDs: .com, .co, .net, .biz, .us

- Use comma if more than one email found.

- Multi-threading which can be adjusted by the user (1-30)

- Must load data into a database (e.g., SQLite) for scraping. There are times when I will use this for 100 URLs and times when I will want to use it for 100k URLs. So it is important that the results be saved either in the CSV or the DB in case of a loss of internet or a PC restart.

- Must be able to read URLs in this format; http, www, and [url removed, login to view]

- Scrape email in source code and screen scrape (for email that is output with JavaScript). If this increases price, let me know.
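Two of the features above can be sketched with the standard library alone. This is an illustration under assumptions, not a finished scraper: it reads the TLD list as a filter on the email address itself, it treats any input without a scheme as needing `http://` in front, and the regular expression is a deliberately simple pattern rather than a full address validator. The names `extract_emails` and `normalize_url` are hypothetical. Email written by JavaScript would need a rendered page and is not covered here.

```python
import re

# TLDs listed in the posting (assumed to apply to the email address)
ALLOWED_TLDS = {"com", "co", "net", "biz", "us"}

# Simple pattern: local part, "@", then at least two dot-separated labels
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_emails(html):
    """Return unique, lower-cased emails with an allowed TLD, joined by commas."""
    seen = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        tld = email.rsplit(".", 1)[-1]
        if tld in ALLOWED_TLDS and email not in seen:
            seen.append(email)
    return ",".join(seen)

def normalize_url(raw):
    """Prepend http:// when the input has no http/https scheme."""
    raw = raw.strip()
    if not re.match(r"^https?://", raw, re.IGNORECASE):
        raw = "http://" + raw
    return raw

page = '<a href="mailto:Info@Example.com">x</a> sales@example.net, spam@example.org info@example.com'
print(extract_emails(page))
print(normalize_url("www.example.com"))
```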

FUNCTION

The program will pull the URL (which I can always make column A), scrape the website for email and post the results into the Email column (column B).
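One way to make this pull, scrape, and save cycle survive a lost connection or a PC restart is to keep the work queue in SQLite and commit after every URL. The sketch below assumes a single table named `jobs` with hypothetical columns `url`, `email`, and `done`; the real schema is up to the developer.

```python
import sqlite3

def open_queue(path):
    """Open (or create) the job database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "url TEXT PRIMARY KEY, email TEXT, done INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def load_urls(db, urls):
    """Add URLs from the CSV; rows already present from an earlier run are kept."""
    db.executemany("INSERT OR IGNORE INTO jobs (url) VALUES (?)", [(u,) for u in urls])
    db.commit()

def pending(db):
    """URLs that still need scraping, in the order they were loaded."""
    return [r[0] for r in db.execute("SELECT url FROM jobs WHERE done = 0 ORDER BY rowid")]

def save_result(db, url, email):
    """Store the result and commit at once, so a crash loses at most one URL."""
    db.execute("UPDATE jobs SET email = ?, done = 1 WHERE url = ?", (email, url))
    db.commit()

db = open_queue(":memory:")  # a file path would be used in practice
load_urls(db, ["http://a.com", "http://b.com"])
save_result(db, "http://a.com", "info@a.com")
load_urls(db, ["http://a.com", "http://b.com"])  # simulates reloading the CSV after a restart
print(pending(db))  # ['http://b.com']
```

Committing per URL is slower than batching, but it keeps the loss after an interruption to the page that was in flight.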

The program needs to have three scraping modes to help with speed. Do not scrape external URLs or redirects.

1) Slow - Full scan of entire website (50 URL max)

2) Medium - Scrape only the links found on the initial landing page and stop scraping after 30 URLs

3) Fastest - Scrape only these pages; landing page, contact-us, contact, contactus, about, about-us, aboutus, staff. If these pages have extensions (php, jsp, htm, apx, html, etc), that means that case does matter. So we also have to have Contact-us, Contact, Contactus, About, About-us, Aboutus, Staff, ContactUs, Contact-Us, About-Us. And sometimes, the "contact" page is a folder such as [url removed, login to view] (max 15 domains)
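For the Fastest mode, the list of pages to try can be generated up front. The sketch below takes the page names and the extension list as written in the posting (both would presumably be configurable), treats an empty suffix and a trailing slash as the "folder" case, and includes a strict host comparison as one possible reading of "do not scrape external URLs". The function names are hypothetical, and no cap on the number of pages is applied here.

```python
from urllib.parse import urljoin, urlparse

LOWER = ["contact-us", "contact", "contactus", "about", "about-us", "aboutus", "staff"]
MIXED = ["Contact-us", "Contact", "Contactus", "About", "About-us", "Aboutus",
         "Staff", "ContactUs", "Contact-Us", "About-Us"]
# "" and "/" cover pages without an extension and pages that are folders
SUFFIXES = ["", "/", ".php", ".jsp", ".htm", ".apx", ".html"]

def candidate_urls(landing_url):
    """Landing page first, then every name and suffix combination at the site root."""
    root = urljoin(landing_url, "/")
    urls = [landing_url]
    for name in LOWER + MIXED:
        for suffix in SUFFIXES:
            urls.append(urljoin(root, name + suffix))
    return urls

def is_internal(landing_url, link):
    """True when the link resolves to the same host as the landing page."""
    target = urlparse(urljoin(landing_url, link)).netloc.lower()
    return target == urlparse(landing_url).netloc.lower()

urls = candidate_urls("http://example.com/index.html")
print(len(urls))
print(is_internal("http://example.com/", "/about"))
print(is_internal("http://example.com/", "http://other.com/x"))
```

The host comparison is exact, so `www.example.com` and `example.com` count as different sites in this sketch; whether they should be treated as one is a decision for the client.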

I will use as many threads as I can and run all URLs in ‘Fastest’ mode. Then, if there are domains for which no email was found, I will run those in Slow or Medium (since it will take longer).
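This two-pass workflow, together with the adjustable thread count of 1 to 30 from the feature list, could look roughly like the sketch below. `fake_scrape` is a stand-in for the real per-URL scraper and the other names are hypothetical as well.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_THREADS, MAX_THREADS = 1, 30  # range given in the feature list

def clamp_threads(requested):
    """Keep the user's setting inside the allowed range."""
    return max(MIN_THREADS, min(MAX_THREADS, requested))

def run_all(urls, scrape, threads):
    """Scrape every URL on a thread pool and return a URL -> result mapping."""
    with ThreadPoolExecutor(max_workers=clamp_threads(threads)) as pool:
        # pool.map yields results in the same order as the input URLs
        return dict(zip(urls, pool.map(scrape, urls)))

def fake_scrape(url):
    """Placeholder: pretends b.com has no email on its Fastest-mode pages."""
    return "" if "b.com" in url else "info@" + url.split("//")[-1]

results = run_all(["http://a.com", "http://b.com"], fake_scrape, threads=100)
retry_in_slower_mode = [url for url, email in results.items() if not email]
print(results)
print(retry_in_slower_mode)
```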

One GUI where I will select the file, watch the process, and if possible, specify the URL/time limit for each option (Slow, Medium, Fastest). If that increases the price, let me know. I may later decide that it is better to have a time limit instead of URL limit and will want the ability to change this without rewriting the program.
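Keeping the limits in data rather than in code is one way to allow switching between a URL limit and a time limit later without a rewrite. In the sketch below the class `ModeLimit` and the `LIMITS` table are hypothetical; the numbers come from the mode descriptions, and the value of 15 for Fastest assumes that the stated maximum refers to pages.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModeLimit:
    max_urls: Optional[int] = None       # stop after this many pages, if set
    max_seconds: Optional[float] = None  # stop after this much time, if set

    def exceeded(self, urls_done, started_at, now=None):
        """True once either configured limit has been reached."""
        now = time.monotonic() if now is None else now
        if self.max_urls is not None and urls_done >= self.max_urls:
            return True
        if self.max_seconds is not None and now - started_at >= self.max_seconds:
            return True
        return False

# Defaults the GUI could expose as editable fields
LIMITS = {
    "slow": ModeLimit(max_urls=50),
    "medium": ModeLimit(max_urls=30),
    "fastest": ModeLimit(max_urls=15),
}

# Switching a mode to a time limit is a data change only
LIMITS["slow"] = ModeLimit(max_seconds=120)
print(LIMITS["slow"].exceeded(urls_done=5, started_at=0.0, now=121.0))
```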

The program will save the results into a new CSV file which defaults to the original file name with the word RESULTS added to the end of it. If it cannot default to the original file name, it should call itself [url removed, login to view]
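The default output name can be derived from the input path. The sketch assumes an underscore before the word RESULTS, which the posting does not specify, and leaves the fallback name out; `results_path` is a hypothetical helper.

```python
from pathlib import Path

def results_path(input_path, sep="_"):
    """Same folder and extension as the input, with RESULTS appended to the file name."""
    p = Path(input_path)
    return p.with_name(p.stem + sep + "RESULTS" + p.suffix)

print(results_path("data/leads.csv").name)  # leads_RESULTS.csv
```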

Since many websites have forms, it would be nice to know this so that I do not continue trying to process those. Maybe the program can detect the <form> code and put FORM in column B so that I can skip those and keep it for my records.
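Detecting a `<form>` tag is straightforward with the standard HTML parser. The sketch assumes that FORM is written to column B only when no email was found on the page; `FormDetector` and `email_cell` are hypothetical names.

```python
from html.parser import HTMLParser

class FormDetector(HTMLParser):
    """Sets has_form when any <form> start tag is seen."""
    def __init__(self):
        super().__init__()
        self.has_form = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case
        if tag == "form":
            self.has_form = True

def email_cell(html, emails):
    """Value for column B: the emails if any, else FORM if the page has a form."""
    if emails:
        return emails
    detector = FormDetector()
    detector.feed(html)
    return "FORM" if detector.has_form else ""

print(email_cell('<FORM action="/send"><input name="msg"></FORM>', ""))
```

A site search box is also a `<form>`, so this check would flag more pages than just contact forms; narrowing it (for example by looking for a textarea inside the form) is a possible refinement.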

DEMO

I will want to test this along the way. The demo you provide will need the ability to test at least 50-100 URLs. It’s much harder to get a good idea of performance with a smaller list.

SOURCE CODE

I want the source code once the project has been completed. As long as you are available, I will continue to work with you if changes are needed, but if you are unable to be reached, I will need to take it to someone else to receive help.

SUPPORT

Two weeks of support once the project is finalized. There are emails that will be missed, so revisions will be needed.

Skills: C Programming, JavaScript, MySQL, PHP, Web Scraping

About the employer:
(44 reviews) Lexington, United States

Project ID: #2336622

8 freelancers bid an average of $500 for this job

SigmaVisual

We can help in your project, please check PMB and our ratings/reviews to get idea of our experience.

$250 USD in 7 days
(239 reviews)
7.8
gaffapi

....................

$250 USD in 0 days
(87 reviews)
6.5
ehsankayani

Hi, kindly see details in PMB. Thank you.

$550 USD in 4 days
(39 reviews)
6.1
TetySoft

If you are looking for an expert - I am the person for the job. Please check your PMB.

$650 USD in 7 days
(13 reviews)
5.0
sikandermandal

Please see PMB.

$700 USD in 10 days
(11 reviews)
4.8
rofreelance

Please check your private messages.

$600 USD in 7 days
(1 review)
2.0
PS507VrCw

Custom software development - Removed by Admin

$750 USD in 1 day
(0 reviews)
0.0
bsHJAgaPM

Please check the PMB

$250 USD in 1 day
(0 reviews)
0.0