A library for Windows to extract the plaintext of several file formats
I need a library, delivered as both a .dll and a .lib, that extracts the plaintext from the file formats listed below:
• Microsoft Word Files (.doc, .docx)
• Microsoft Excel Files (.xls, .xlsx)
• Microsoft Access (.mdb, .accdb)
• Adobe Acrobat Files (.pdf)
• IBM Lotus Notes database file (.nsf)
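As an illustration of how an implementation might route a file to the right parser, here is a minimal sketch that classifies a path by its extension. The names FileFormat, AsciiLower, EqualsIgnoreCase and DetectFormatByExtension are hypothetical and not part of the requested API, and the non-Windows typedefs exist only so the sketch compiles elsewhere. Checking file signatures would be more robust than trusting the extension; this sketch does not attempt that.

```c
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <tchar.h>
#else
/* Stand-ins so the sketch compiles off Windows (assumption: char-based build). */
typedef char TCHAR;
typedef const TCHAR *PCTSTR;
#define _T(s) s
#define _tcsrchr strrchr
#endif

/* Hypothetical format identifiers, one per supported family. */
typedef enum {
    FMT_UNKNOWN, FMT_WORD, FMT_EXCEL, FMT_ACCESS, FMT_PDF, FMT_NOTES
} FileFormat;

/* Lowercases ASCII letters only; all other characters pass through. */
static TCHAR AsciiLower(TCHAR c)
{
    return (c >= _T('A') && c <= _T('Z')) ? (TCHAR)(c - _T('A') + _T('a')) : c;
}

/* Returns 1 when both strings match, ignoring ASCII case. */
static int EqualsIgnoreCase(PCTSTR a, PCTSTR b)
{
    while (*a != 0 && *b != 0) {
        if (AsciiLower(*a) != AsciiLower(*b)) return 0;
        ++a;
        ++b;
    }
    return *a == *b; /* both must end at the same position */
}

/* Maps the text after the last '.' in FilePath to a format family. */
static FileFormat DetectFormatByExtension(PCTSTR FilePath)
{
    static const struct { PCTSTR ext; FileFormat fmt; } table[] = {
        { _T(".doc"),  FMT_WORD   }, { _T(".docx"),  FMT_WORD   },
        { _T(".xls"),  FMT_EXCEL  }, { _T(".xlsx"),  FMT_EXCEL  },
        { _T(".mdb"),  FMT_ACCESS }, { _T(".accdb"), FMT_ACCESS },
        { _T(".pdf"),  FMT_PDF    }, { _T(".nsf"),   FMT_NOTES  }
    };
    PCTSTR dot;
    size_t i;

    if (FilePath == NULL) return FMT_UNKNOWN;
    dot = _tcsrchr(FilePath, _T('.'));
    if (dot == NULL) return FMT_UNKNOWN;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); ++i) {
        if (EqualsIgnoreCase(dot, table[i].ext)) return table[i].fmt;
    }
    return FMT_UNKNOWN;
}
```

A path whose last dot belongs to a directory name (for example a file with no extension inside a folder called "dir.v2") falls through to FMT_UNKNOWN, which a caller would treat as an error.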
The library must export a callable function with the following signature:
BOOL ExtractPlaintextFromFile(PCTSTR FilePath, TextCallback Callback);
The first parameter will be a pointer to a Unicode string containing the path of the file to extract the plaintext from (e.g. D:\[url removed, login to view])
The second parameter will be a plaintext processing callback, explained below.
The return value must be TRUE on success and FALSE on error.
The callback function must have the following signature, with each parameter explained.
typedef BOOL (*TextCallback)(PCTSTR Text, SIZE_T TextLength, PCTSTR SourceFile);
Text: Pointer to a buffer that contains all or part of the extracted plaintext in Unicode. If the file is extracted in chunks or parts, the callback can safely be called again with a pointer to each new chunk or part.
TextLength: Length, in characters, of the buffer pointed to by Text.
SourceFile: Pointer to a buffer that contains the path of the originating source file (e.g. D:\[url removed, login to view])
Return value: TRUE on success, FALSE on error.
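To make the chunking contract concrete, here is a minimal sketch of how an extractor could hand text to the callback in pieces and stop as soon as the callback reports an error. Only the TextCallback typedef comes from the specification above; DeliverInChunks, CountingCallback and the two counters are hypothetical names used for illustration, and the non-Windows typedefs are stand-ins so the sketch compiles elsewhere.

```c
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <tchar.h>
#else
/* Stand-ins so the sketch compiles off Windows (assumption: char-based build). */
typedef int BOOL;
typedef char TCHAR;
typedef const TCHAR *PCTSTR;
typedef size_t SIZE_T;
#define TRUE 1
#define FALSE 0
#define _T(s) s
#define _tcslen strlen
#endif

/* Callback type exactly as given in the specification. */
typedef BOOL (*TextCallback)(PCTSTR Text, SIZE_T TextLength, PCTSTR SourceFile);

/* Hypothetical helper: passes Text to Callback in pieces of at most
   ChunkChars characters. The pieces are not NUL-terminated, which is
   why the callback must rely on TextLength. */
static BOOL DeliverInChunks(PCTSTR Text, SIZE_T ChunkChars,
                            PCTSTR SourceFile, TextCallback Callback)
{
    SIZE_T remaining;

    if (Text == NULL || Callback == NULL || ChunkChars == 0) return FALSE;

    remaining = _tcslen(Text);
    while (remaining > 0) {
        SIZE_T n = (remaining < ChunkChars) ? remaining : ChunkChars;
        if (!Callback(Text, n, SourceFile)) return FALSE; /* propagate failure */
        Text += n;
        remaining -= n;
    }
    return TRUE;
}

/* Hypothetical callback that only counts what it receives. */
static SIZE_T g_totalChars = 0;
static SIZE_T g_calls = 0;

static BOOL CountingCallback(PCTSTR Text, SIZE_T TextLength, PCTSTR SourceFile)
{
    (void)Text;
    (void)SourceFile;
    g_totalChars += TextLength;
    g_calls += 1;
    return TRUE;
}
```

Calling DeliverInChunks(_T("hello world"), 4, _T("sample.doc"), CountingCallback) would invoke the callback three times, with 4, 4 and 3 characters.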
The project must be delivered as one or two .sln files (Visual Studio solution files), at the developer's choice.
If only one .sln file is provided, it must compile everything from scratch, up to and including a demo application.
If two .sln files are provided, one must build all the dependencies of the project (external libraries and such) and the other must build the main library and the demo application.
The demo application must be a simple application that calls the extracting function with a provided sample file for each of the supported file formats.
The demo's callback function must simply save the extracted contents to a file named after the source file with the .txt extension added (e.g. D:\[url removed, login to view]).
The goal of the demo is to extract all the plaintext from all the included sample files.
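A demo callback along these lines could look like the following sketch. SaveToTxtCallback and DEMO_MAX_PATH are hypothetical names, the path limit is an arbitrary assumption, and the non-Windows definitions are stand-ins so the sketch compiles elsewhere. It appends each chunk to the source path plus ".txt", so repeated calls for one file build up a single output.

```c
#ifdef _WIN32
#define _CRT_SECURE_NO_WARNINGS /* silence MSVC warnings for _tcscpy/_tfopen */
#include <windows.h>
#include <tchar.h>
#endif

#include <stdio.h>
#include <stddef.h>
#include <string.h>

#ifndef _WIN32
/* Stand-ins so the sketch compiles off Windows (assumption: char-based build). */
typedef int BOOL;
typedef char TCHAR;
typedef const TCHAR *PCTSTR;
typedef size_t SIZE_T;
#define TRUE 1
#define FALSE 0
#define _T(s) s
#define _tcslen strlen
#define _tcscpy strcpy
#define _tcscat strcat
#define _tfopen fopen
#endif

#define DEMO_MAX_PATH 1024 /* assumed upper bound for the output path */

/* Hypothetical demo callback: appends the chunk to "<SourceFile>.txt". */
static BOOL SaveToTxtCallback(PCTSTR Text, SIZE_T TextLength, PCTSTR SourceFile)
{
    TCHAR outPath[DEMO_MAX_PATH];
    FILE *out;
    size_t written;

    if (Text == NULL || SourceFile == NULL) return FALSE;

    /* Leave room for ".txt" and the terminating NUL. */
    if (_tcslen(SourceFile) + 4 >= DEMO_MAX_PATH) return FALSE;
    _tcscpy(outPath, SourceFile);
    _tcscat(outPath, _T(".txt"));

    /* Append in binary mode so later chunks follow earlier ones unchanged. */
    out = _tfopen(outPath, _T("ab"));
    if (out == NULL) return FALSE;

    written = fwrite(Text, sizeof(TCHAR), TextLength, out);
    fclose(out);

    return (written == TextLength) ? TRUE : FALSE;
}
```

Because the file is opened in append mode, the demo would need to delete any previous output before extracting a file again. In a UNICODE build this writes the raw UTF-16 characters with no byte order mark; whether a BOM or another encoding is wanted is left open by the specification.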
The included sample files were uploaded as a multipart RAR archive due to upload file size limitations.
As expected, converting from formats with special formatting like PDFs to plaintext can lead to loss of text positioning or format. This is no problem for my requirements. As long as all available text from the document is extracted, superfluous whitespace is not a problem.
Additionally, the library must meet the following technical specifications:
• It must be coded in C or C++ (Avoid using C++0x/C++11)
• It must be able to run on any version of Windows from Windows XP SP1 to the latest version. (Windows XP SP1 to SP3, Windows Vista Retail to SP2, Windows 7 Retail and SP1, Windows 8, Windows 8.1 and Windows 10)
• The library must be self-contained. This means that it should not depend on any external libraries, installed programs, DLLs or frameworks that are not included in a clean installation of Windows XP SP1 (That is, an installation of Windows XP with SP1 with no extra programs or system updates installed).
• It must not have any graphical interface, play any sound or generate any kind of alert to the user
• You must deliver all the source code that generates the final library; no precompiled libraries will be accepted.
• You must document all the external libraries used by the library, including the version used, direct download link and detailed notes about any changes to the original source code of such libraries.
• The final binaries should be compiled using Visual Studio 2010 or higher and compiled with the Runtime Library option set to Multi-threaded (/MT).
If you are interested in the job please answer this request with the following information:
• Estimated time of development.
• Your favorite pet animal.
Your proposal will be subject to approval.
10 freelancers are bidding an average of $587 for this job
Hello, I'm very interested in your project. I'm a good C++/C#, Java, math, and algorithm expert. I understand your requirements exactly and am quite experienced in these jobs. Let's go ahead with me. I want to service ...
Ready to work!
I am very proficient in C and C++. I have 15 years of C++ development experience, and I have worked for 5 years. My work is online game development, mainly focused on the server side, in C++ under Windows. I use ...
Hi there! I just read the proposal; it is pretty detailed. I have one offer to make: if it interests you, please contact me for a demo. I already made a simple command line application ...
Hello, I have a partial solution for your problem. Some time ago I developed a C++ DLL that extracts text from *.doc files (not *.docx). The library is self-contained, works on raw .doc files, and doesn't require MS W...
Hi, I have read your post and understood your requirements. I have great experience in C programming/C++ programming/PHP/MySQL/HTML5/jQuery/Wordpress/Magento/Joomla/Drupal/AngularJS/node.js/CSS3/Java/Pyt...
Hi, friend. I read your requirements carefully and found that I can complete the work. I have 8+ years of experience in C/C++ programming. I have also completed some projects similar to this one. I think I can finish ...