Closed

Word replacement engine - sophisticated content generator

Dear freelancers,

Summary: I need a system that produces text for a given keyword. The text needs to be unique, targeted to the keyword, and halfway readable.

The good news: I have a concept that will probably work.

The bad news: It's not easy to develop and requires handling large amounts of data.

==============

Goal: A text production machine where I can enter a keyword and it generates many texts for it. Those texts should pass Copyscape and look halfway legitimate at first glance. I know that real "sense" is not possible, and I don't expect it.

Input: $keyword (e.g. "credit card")

Output: $text (string with ~400 words unique text about "credit card")

Please read the project description at least twice before you bid. THIS IS A HARD TASK. If you have any questions, please let me know. The project has high priority for me, and I'm available 24/7 for the developer.

==============

Project specification:

Step 0 [preparations]: We collect large amounts of human-written content (German texts). I have a list of 1.7 million .de domains. Let's crawl them (including subpages) and extract all text into a database/semantic cloud.

If you choose a database, I'd suggest MongoDB, as it is much faster than MySQL with that amount of data. Our main business is hosting, so we can provide you with custom server technology (such as 24 GB of RAM to load parts of the cloud into memory, or SSD RAIDs to speed up access). I also have a good crawler for web pages available, but it's written in Python.
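As a rough illustration of what the "cloud" from Step 0 could store, the sketch below counts word sequences (n-grams) in crawled text. It is only an assumption about one possible layout: the names `tokenize` and `build_ngram_counts` are hypothetical, and an in-memory `Counter` stands in for whatever database is finally chosen.

```python
import re
from collections import Counter

# Word tokens only; punctuation is dropped (a simplifying assumption).
TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return TOKEN_RE.findall(text.lower())


def build_ngram_counts(documents, max_n=3):
    """Count every n-gram of 1..max_n words across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


# Tiny stand-in for the crawled pages.
corpus = [
    "Mark studies law at Harvard University.",
    "Anna teaches law at Harvard University.",
]
counts = build_ngram_counts(corpus)
```

With 1.7 million domains the counts would not fit in a single Python process, so the same keys (word tuples) and values (frequencies) would have to live in the database instead.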

Step 1 [generation process]: The user inputs a keyword. Generate a random number between 1 and 30. Let's assume it is 15.

Step 2: Run a Google query for the keyword we want to generate text for. Parse Google result number 15, remove tags and navigation (a script for this already exists), and extract the remaining content.
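Steps 1 and 2 could look roughly like the sketch below. It does not perform the Google query itself; it assumes the result URLs are already available as a list. `pick_result` and `TextExtractor` are hypothetical names, and the tag stripping is only a stand-in for the existing script mentioned above.

```python
import random
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring script, style and nav elements."""

    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the text content of an HTML page as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def pick_result(result_urls, rng=random):
    """Pick a random rank between 1 and 30 (or fewer if the list is shorter)."""
    rank = rng.randint(1, min(30, len(result_urls)))
    return rank, result_urls[rank - 1]


page = ("<html><nav><a href='/'>Home</a></nav>"
        "<p>Credit cards <b>explained</b>.</p>"
        "<script>var x = 1;</script></html>")
snippet = extract_text(page)
```

`pick_result` raises a `ValueError` for an empty result list, so the caller would need to handle keywords without search results.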

Step 3: We now have a snippet of relevant text for our keyword. But of course, it's only a copy - this is where the rewriting begins.

Step 4 [rewriting]: Split the snippet into single sentences. A new sentence begins after any of these characters: . , : ! ? ;
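A minimal sketch of the splitting rule in Step 4, assuming a delimiter only counts when it is followed by whitespace (so a number like "3.5" stays intact). The function name `split_sentences` is hypothetical.

```python
import re

# Split after any of . , : ! ? ; when followed by whitespace.
SPLIT_RE = re.compile(r"(?<=[.,:!?;])\s+")


def split_sentences(text):
    """Split text into segments, keeping each delimiter with its segment."""
    return [part for part in SPLIT_RE.split(text.strip()) if part]


segments = split_sentences("Mark studies law. He likes it, really! Why?")
# segments == ["Mark studies law.", "He likes it,", "really!", "Why?"]
```

Abbreviations that end in a period would also trigger a split under this rule, which may or may not matter for the rewriting quality.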

Step 5: This is where it gets tricky. We need to find out which words form a block. Take the sentence "Mark studies law at Harvard University." The system needs to detect that "Harvard University" is a block: "University" must not be replaced with "school", but "Harvard University" may be replaced with "Stanford". So the words in a block need to be replaced together. How do we find out which words belong together? We check how "near" they are in our word cloud, that is, how often they stand next to each other: "Mark | studies | law | at | Harvard University."

Step 5 (continued): OK, we now have the blocks. The next step is to replace as many blocks as possible in order to make the text unique. To do this, we query our natural content cloud with something like: "$left" * "$right".

In this practical example, the query is: "Mark" * "law". As you can see, we took three consecutive blocks and replaced the middle one with a placeholder. Our natural content database should now return legitimate blocks for *, as they were used in natural language, for example: "Mark teaches law", "Mark demands law", "Mark is still searching for law", etc.
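One way to approximate "how often words stand next to each other" is a simple adjacency score. The sketch below is an assumption, not a fixed part of the concept: the `cohesion` formula, the 0.8 threshold and the toy corpus are all hypothetical choices for illustration.

```python
from collections import Counter


def count_ngrams(sentences):
    """Count single words and adjacent word pairs in tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def cohesion(a, b, unigrams, bigrams):
    """Share of the rarer word's occurrences that appear inside the pair (a, b)."""
    rarer = min(unigrams[a], unigrams[b])
    return bigrams[(a, b)] / rarer if rarer else 0.0


def find_blocks(tokens, unigrams, bigrams, threshold=0.8):
    """Merge neighbouring words into one block when their cohesion is high."""
    blocks = []
    for token in tokens:
        if blocks and cohesion(blocks[-1][-1], token, unigrams, bigrams) >= threshold:
            blocks[-1].append(token)
        else:
            blocks.append([token])
    return [" ".join(block) for block in blocks]


# Toy corpus standing in for the crawled word cloud.
corpus = [s.split() for s in [
    "mark studies law at harvard university",
    "anna studies medicine at harvard university",
    "mark teaches law in berlin",
    "anna likes law",
    "tom works at night",
    "tom studies music",
    "harvard university is old",
]]
unigrams, bigrams = count_ngrams(corpus)
blocks = find_blocks("mark studies law at harvard university".split(),
                     unigrams, bigrams)
# blocks == ["mark", "studies", "law", "at", "harvard university"]
```

On real data a raw ratio like this tends to over-merge rare words that were seen only once, so a minimum pair count or a different association measure would probably be needed.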
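The placeholder query can be sketched as a scan for everything that occurs between a left and a right neighbour. This is a minimal sketch under two assumptions: both neighbours are single words here (real blocks may span several words), and the corpus fits in a Python list. `find_fillers` and `max_gap` are hypothetical names.

```python
from collections import Counter


def find_fillers(sentences, left, right, max_gap=4):
    """Count the word sequences (1..max_gap words) seen between left and right."""
    fillers = Counter()
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token != left:
                continue
            for gap in range(1, max_gap + 1):
                j = i + gap + 1  # index where `right` would have to be
                if j >= len(tokens):
                    break
                if tokens[j] == right:
                    fillers[" ".join(tokens[i + 1:j])] += 1
    return fillers


corpus = [s.split() for s in [
    "mark teaches law",
    "mark demands law",
    "mark is still searching for law",
    "mark studies law at harvard university",
]]
fillers = find_fillers(corpus, "mark", "law")
# fillers contains "teaches", "demands", "studies" and "is still searching for"
```

The counts give a natural ranking: fillers that were seen more often between the same neighbours are more likely to read well.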

Not all of them will make perfect sense, but it's a start, and it is far better than working with synonyms, because you can also replace single words with blocks of multiple words and vice versa.

We should use this replacement system multiple times in each sentence. It also works for the beginning and the end of sentences. In my manual tests with Google, this worked pretty well, and it might work even better with our own data pool. The system works identically for all languages, so you don't need to speak German.
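Applying the replacement several times per sentence could be sketched as below. The idea of replacing only every second block, so that both neighbours of a replaced block stay original, is an assumption made for this example. The `<s>` and `</s>` markers for sentence start and end, the function `rewrite_blocks` and the lookup table are all hypothetical.

```python
import random

START, END = "<s>", "</s>"  # assumed markers for sentence start and end


def rewrite_blocks(blocks, lookup, offset=0, rng=random):
    """Replace every second block, using its original neighbours as context.

    lookup(left, right) must return a list of candidate blocks that were
    seen between `left` and `right` in the content cloud.
    """
    padded = [START, *blocks, END]
    result = list(blocks)
    for i in range(offset, len(blocks), 2):
        left, current, right = padded[i], padded[i + 1], padded[i + 2]
        candidates = [c for c in lookup(left, right) if c != current]
        if candidates:
            result[i] = rng.choice(candidates)
    return result


# Hand-made lookup table standing in for the real content cloud.
table = {
    (START, "studies"): ["mark", "anna"],
    ("at", END): ["harvard university", "stanford"],
}
blocks = ["mark", "studies", "law", "at", "harvard university"]
rewritten = rewrite_blocks(blocks, lambda l, r: table.get((l, r), []))
# rewritten == ["anna", "studies", "law", "at", "stanford"]
```

A second pass with `offset=1` on the original blocks would produce a different variant of the same sentence.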

Step 6: Output rewritten text.

=====================

You can use any programming language you like.

I can't write more description text here due to [url removed, login to view], so happy bidding & discussion!

Best,

Steve

Skills: C++ Programming, Data Mining, Python, Software Architecture, Web Scraping

See more: sophisticated word generator, sophisticated language generator, word replacement generator, sophisticated sentence generator, sophisticated text generator, sophisticated word replacement, sophisticated words generator, word replace sophisticated, words replacing sophisticated, freelancer engine replacement, python content generator, word replacement sophisticated, replacing words synonyms, written content, write sentences find, working web crawler, working freelancer web developer, work freelancer german, programming language best, write work freelancer, find sophisticated web developer, find freelancer write project, make webpages, web python freelancer, web programming technology

About the employer:
(29 reviews) Fürstenfeldbruck, Germany

Project ID: #1077111

13 freelancers bid an average of $2500 for this job

synl0rd

I'm very interested. Check PMB, boss.

$1500 USD in 15 days
(8 reviews)
5.0
sentromed

Hello. I have a lot of experience in MySQL, Python and C++. I can also use any other database. I can code this project, but I need to know how to do this. I may code all your ideas to see if this works or not. Regards, Va...

$3000 USD in 40 days
(4 reviews)
5.0
expertMan

Please check message board.

$3500 USD in 60 days
(4 reviews)
4.7
LiveConnector

Please check PMB

$3000 USD in 30 days
(5 reviews)
4.3
priboy

Please check PMB

$1500 USD in 10 days
(4 reviews)
4.3
Dutchstudent7750

Hi. This seems like a very interesting summer project. I have a lot of experience in programming in Python, Java and C. I have made software that uses the Google search engine to extract data from websites. Also, I can...

$2000 USD in 60 days
(6 reviews)
3.0
AlexandrP

I am very experienced in writing text-based engines in C.

$2800 USD in 30 days
(1 review)
2.4
Hangleton

Dear sir, This is a really interesting project, close to computer language compilation challenges. Compilation theory is not only used for computer languages; its applications also extend to natural lan...

$1500 USD in 30 days
(0 reviews)
0.0
Twirlie

Hello there! This sounds like a fascinating idea, and we would love to work with you on it! Please check your PM for our bid!

$1500 USD in 30 days
(0 reviews)
0.0
meisel

I'm an expert in the field of natural language theory, and a native speaker of English! Contact me for more details.

$1700 USD in 8 days
(1 review)
0.0
Jeandasse

Hi, I would like to work on this project. I usually work on this kind of parsing project. Best regards, Jean

$3000 USD in 30 days
(0 reviews)
0.0
TavishiSystems

We can do it.

$2500 USD in 30 days
(0 reviews)
0.0
itbscompany

Hi! Please have a look at a private message.

$5000 USD in 30 days
(0 reviews)
0.0