Using Python, reproduce a classifier that was built for a research paper: create a binary Naive Bayes (NB) classifier for the DMOZ (ODP) dataset (the dataset will be provided) using the BOW toolkit.
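To make the "binary NB classifier" concrete, here is a minimal pure-Python sketch of a multinomial Naive Bayes with add-one smoothing. It is only an illustration: the class name `BinaryNaiveBayes`, the choice of smoothing, and the handling of unseen features are assumptions, and it does not use or imitate the BOW toolkit.

```python
import math
from collections import Counter


class BinaryNaiveBayes:
    """Multinomial Naive Bayes over bags of features (hypothetical sketch)."""

    def fit(self, docs, labels):
        # docs: list of feature lists; labels: list of bools (True = in topic).
        # Assumes both classes occur at least once in the training data.
        self.counts = {True: Counter(), False: Counter()}
        self.doc_totals = Counter(labels)
        for features, label in zip(docs, labels):
            self.counts[label].update(features)
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, features, label):
        # log P(label) + sum of log P(feature | label), add-one smoothed.
        total = sum(self.counts[label].values())
        score = math.log(self.doc_totals[label] / sum(self.doc_totals.values()))
        for f in features:
            if f in self.vocab:  # features never seen in training are ignored
                score += math.log(
                    (self.counts[label][f] + 1) / (total + len(self.vocab))
                )
        return score

    def predict(self, features):
        # True means "belongs to the topic".
        return self._log_score(features, True) > self._log_score(features, False)
```

One such classifier would be trained per topic, with the URI features of that topic as positives and features from the other topics as negatives.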
The DMOZ dataset contains (category, URI, title, description) records. Training uses the (category, URI) pairs and testing uses the URI only. Each URI should be represented as all-grams, i.e. the 4-, 5-, 6-, 7-, and 8-grams combined (for more details on all-grams, see the research paper). The dataset is in RDF format and can be converted to CSV using the tool [url removed, login to view] found in [url removed, login to view].
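As an illustration of the all-gram representation, the sketch below extracts every 4- to 8-character n-gram from a URI. The tokenisation (lowercasing, splitting on non-letter characters, taking n-grams within each token) is an assumption and should be checked against the research paper; the function name `url_all_grams` is hypothetical.

```python
import re


def url_all_grams(url, n_values=(4, 5, 6, 7, 8)):
    """Return the combined 4..8-grams of a URL's letter tokens (assumed scheme)."""
    # Assumption: lowercase the URL and split on anything that is not a letter.
    tokens = [t for t in re.split(r"[^a-z]+", url.lower()) if t]
    grams = []
    for token in tokens:
        for n in n_values:
            # Slide a window of width n over the token; short tokens yield nothing.
            for i in range(len(token) - n + 1):
                grams.append(token[i:i + n])
    return grams
```

For example, `url_all_grams("http://www.news.com/sports")` yields `http`, `news`, and the 4-, 5-, and 6-grams of `sports`. Common tokens such as `http` are not filtered here; whether they should be is left to the paper's description.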
The sizes of the test and training sets follow the method in the research paper: for testing, 1K URIs for each topic; for training, the same number of positive examples (in the category) and negative examples drawn from all the other categories (not in the topic). For example, if 1000 positive examples are in the News category, we have to collect 1000/(number of other categories) negative examples from each of the other categories. (Note: this can be done easily using a tool called [url removed, login to view], found in [url removed, login to view].)
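The sampling scheme described above could be sketched as follows. This is not the tool mentioned in the note; `build_training_set` and the `uris_by_category` dictionary (category name mapped to a list of URIs) are hypothetical names, and splitting the negatives evenly across the other categories is the assumed reading of the paper's method.

```python
import random


def build_training_set(uris_by_category, topic, n_positive=1000, seed=0):
    """Return (uri, is_positive) pairs for one binary topic classifier."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    positives = rng.sample(uris_by_category[topic], n_positive)

    others = [c for c in uris_by_category if c != topic]
    # Integer division: any remainder is dropped, so the negative set can be
    # slightly smaller than the positive set.
    per_category = n_positive // len(others)

    negatives = []
    for category in others:
        # random.sample raises ValueError if a category has too few URIs.
        negatives.extend(rng.sample(uris_by_category[category], per_category))

    return [(u, True) for u in positives] + [(u, False) for u in negatives]
```

The test URIs for each topic should be set aside before calling this, so that training and test sets do not overlap.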
The result should be a table matching the table on page 10 of the research paper: for the ODP dataset, each category has a precision (P), recall (R), and F score, plus the overall average.
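A minimal sketch of how the per-category P, R, and F values and their average might be computed is shown below. It assumes F means the balanced F1 measure and that the "total average" is an unweighted mean over categories; both assumptions should be confirmed against the paper's table. The function names are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute (P, R, F1) for one binary classifier from boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(scores_by_category):
    """Unweighted mean of (P, R, F1) tuples across categories (assumed)."""
    n = len(scores_by_category)
    return tuple(
        sum(scores[i] for scores in scores_by_category.values()) / n
        for i in range(3)
    )
```

Each row of the results table would then be one category's `(P, R, F1)` tuple, with `macro_average` supplying the final row.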
I will need all the code created for the classifier, along with the results.
The research paper used is: A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification
I'm a novice freelancer with enough experience in the ML sphere, so I'm ready to do this task for symbolic pay.
8 freelancers are bidding an average of $158 for this job
I have worked on web data mining, web harvesting, email address and contact detail extraction from the web, web data collection, plain data entry, JPG/PDF to DOC file conversion, data entry in Excel/ACT, and link exchange on the web. Data
Hi. I am a <WEB EXPERT!!!> I have 10+ years of experience in web development. I am familiar with Python. I can certainly help you with this project. I am a friendly person and always open to discussion.
Hi! I am a Python expert and have 7 years of experience with Python. I know how to complete your project, and can get this developed for you quickly! If you hire me, I will give you excellent results with a smal