Completed

67187 Research Tool

This script is needed as a research tool. It will query several United Kingdom-specific sources, such as search engines and business directories, and retrieve and store basic information from the results for particular search strings. The information should be recorded simply, for example in a tab-delimited text file. I am led to believe the project mainly involves a relatively straightforward adaptation of a number of existing Perl modules. The script would be written in Perl and run from the command line on a UNIX server. It would work through a text file containing a list of search strings, checking the results for each one in turn at the various sources and recording them.

The sources to be used would be:

UK Business Listing Sites

Companies House ([url removed, login to view] SearchStringGoesHere)
Yahoo! UK & Ireland - Business Finder ([url removed, login to view])
[url removed, login to view] ([url removed, login to view])

For all three, a simple count of the results returned and a record of the company names containing the keyword string (whole words only) is required; no additional information from the results is needed. This will often involve retrieving two or more consecutive pages of results from these sites. However, I have been told (although I have not been able to verify this) that a Perl module is already available on CPAN which deals with this and other issues and was written specifically for lookups such as this on Companies House.

As well as similarities, there are differences among the three sites that will require slightly different handling. For example, Companies House returns all results starting with the search string, so for the string 'chris' it would return 'christopher ..', 'christian ..', etc., whereas the only results that should be recorded are those containing the search string as a word in its own right: a full word within the name (if not the full name itself), defined as the string followed by a space. So a search on the string 'chris' should record, for example, 'chris ltd.' and 'chris cars ltd.' but not 'christophorus cars ltd.'. The same is true for results from [url removed, login to view], whereas Yahoo Business Finder only returns results matching the single word anyway.

UK Results From Search Engines

Also required are searches through search engine listings such as [url removed, login to view] to find a given string in the .[url removed, login to view] domain names of the URLs of UK listings, and to record these. For example, if the string were 'chris' and there were listings for [url removed, login to view] and [url removed, login to view] in the search engine, it would record both of these as the domain name only, i.e. chrisbrown.co.uk. The match must be based on the string's occurrence in the domain name itself, not simply on its presence anywhere in the full URL, which is all the advanced searches of these search engines provide.

Also relating to search engines: a simple record of the number of results (Matching Sites) obtained for a given search string on a different UK search engine, probably AOL search of UK-only domain names ([url removed, login to view]).

Whois Lookup

A standardised whois module should be used to check the availability/registered status of domain names equal to the search string (simply a domain name equal to the string, rather than one containing it or anything more complicated). This requires a lookup of the majority of the international domain extensions, including a check of the alternative whois registries such as [url removed, login to view], which provide .[url removed, login to view] etc., a popular alternative extension in the United Kingdom. Where available, some brief address information from the administrative contact should also be recorded. This would be limited to the country field, simply to determine the country of ownership (whether the registrant is UK based or not).

Namedroppers Search Lookup

Also required is the counting and recording of results for the specified search string on [url removed, login to view] (with the keyword-only selection made - [url removed, login to view]). The results of this lookup would then ideally be passed through the whois lookup to again identify domains with UK registrants; these domain names would be recorded to the file as a separate field.
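
To make the two matching rules above concrete, here is a minimal Perl sketch of the whole-word filter for company names and the domain-only match for search engine listings. It is an illustration under stated assumptions, not a finished implementation. The subroutine names are hypothetical, matching is assumed to be case-insensitive, and the .co.uk suffix is inferred from the chrisbrown.co.uk example rather than stated by the brief.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $keyword appears in $name as a word in its
# own right, i.e. preceded by the start of the name or whitespace and
# followed by whitespace or the end of the name.
# Assumption: matching is case-insensitive.
sub name_has_whole_word {
    my ($name, $keyword) = @_;
    return $name =~ /(?:^|\s)\Q$keyword\E(?=\s|\z)/i ? 1 : 0;
}

# Hypothetical helper: return the bare domain (e.g. chrisbrown.co.uk) if
# $keyword occurs in the domain label itself; return undef otherwise.
# Assumption: only .co.uk hosts are of interest.
sub couk_domain_matching {
    my ($url, $keyword) = @_;
    # Capture the label directly before ".co.uk", skipping any scheme,
    # leading subdomains such as "www.", and ignoring the path.
    my ($label) = $url =~ m{^(?:[a-z][a-z0-9+.-]*://)?(?:[^/\s:]+\.)?([a-z0-9-]+)\.co\.uk(?::\d+)?(?:[/?#]|\z)}i
        or return;
    return unless index(lc $label, lc $keyword) >= 0;
    return lc($label) . '.co.uk';
}

# Usage: write keyword and matching company name as tab-delimited lines.
# Only the first two names pass the whole-word test.
for my $name ('chris ltd.', 'chris cars ltd.', 'christophorus cars ltd.') {
    print join("\t", 'chris', $name), "\n"
        if name_has_whole_word($name, 'chris');
}
```

The strict "followed by a space" rule means a name such as 'chris, ltd.' would be rejected; whether punctuation should count as a word boundary is a decision the brief leaves open.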

Skills: Anything Goes, Perl


About the employer:
(0 reviews)

Project ID: #1815345

Awarded to:

jfreyre

Just accept my bid

USD in 10 days
(312 reviews)
7.3