
Closed
Posted
Paid on delivery
I need a ChatGPT-style assistant that can read every file my organisation generates (PDFs, Word docs, Excel sheets, HTML pages and anything else we store), then answer questions using that private knowledge base. The assistant must recognise the language of the query and reply in the same tongue; Hindi, Marathi, English and Telugu are the priority.

The job covers the full pipeline: ingesting and parsing documents, building a semantic index or vector store, connecting it to a generative model and exposing everything through a simple chat interface. Whichever mix of OpenAI, open-source LLMs, embeddings or retrieval-augmented generation you choose is up to you, as long as the final experience feels as fluid as ChatGPT while keeping my data secure on our own servers.

Please send a detailed project proposal describing:
• how you will extract and normalise content from the various formats
• your multilingual strategy (tokenisers, embeddings, fine-tuning, or any other approach)
• the tech stack for search, orchestration and the front-end chat window
• milestones, delivery timeline and post-delivery support

Deliverables
1. Automated pipeline that continuously ingests new documents and updates the index
2. Multilingual Q&A model wired to that index, with language auto-detection and same-language response
3. Web-based chat UI (desktop and mobile friendly)
4. Deployment guide and source code

Acceptance will be based on accurate answers drawn from my files and correct language handling across all four languages.
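
For illustration only, the "recognise the language of the query" requirement could start with a script check like the sketch below. It is not part of the brief: the function name `detect_script` and the labels it returns are hypothetical. Note that Hindi and Marathi both use Devanagari, so a script check alone cannot tell them apart; a real system would presumably need a language-identification step on top.

```python
from collections import Counter

# Unicode block ranges (inclusive) for the non-Latin scripts in the brief.
# Hindi and Marathi share the Devanagari block, so they map to one label here.
SCRIPT_RANGES = {
    "devanagari": (0x0900, 0x097F),
    "telugu": (0x0C00, 0x0C7F),
}


def detect_script(text: str) -> str:
    """Return the dominant script of `text` (hypothetical helper)."""
    counts = Counter()
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[name] += 1
                break
        else:
            # Count plain ASCII letters as Latin; digits and punctuation are ignored.
            if ch.isascii() and ch.isalpha():
                counts["latin"] += 1
    if not counts:
        return "unknown"
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for query in ["What is the leave policy?", "नमस्ते", "నమస్కారం"]:
        print(query, "->", detect_script(query))
```

A script label is only a first routing signal; it says nothing about the quality of the answer in that language.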
Project ID: 40276321
8 bids
Remote project
Active 9 days ago
8 freelancers are bidding an average of ₹6,581 INR for this job

✔ I deliver 100% work; 99.9% is not for me.

✔ Workflow Diagram
Document Ingestion ⟶ Content Parsing & Normalization ⟶ Semantic Indexing / Vector Store ⟶ Multilingual Embeddings & RAG Pipeline ⟶ Chat Interface Integration ⟶ Continuous Update & Monitoring

Key Highlights
✔ Universal Document Parsing: PDFs, Word, Excel, HTML, and other file formats automatically extracted, normalized, and structured.
✔ Multilingual Support: Hindi, Marathi, English, Telugu handled via language detection, dedicated tokenizers, and embeddings for each language.
✔ Semantic Search & RAG: Retrieval-Augmented Generation ensures accurate, context-aware answers from your private knowledge base.
✔ Secure On-Premise Deployment: All data stays on your servers; no cloud leakage of sensitive documents.
✔ Chat Interface: Desktop and mobile-friendly web-based chat window with responsive design.
✔ Continuous Ingestion: New documents automatically indexed and incorporated into the knowledge base.
✔ Extensible Architecture: Modular pipeline for easy addition of new languages, document types, or LLM models.
✔ Documentation & Support: Clear setup, deployment guide, and post-delivery support included.

Best Regards,
Asad
AI & NLP Engineer | Chatbot & RAG Specialist | Multilingual Knowledge Systems
₹7,000 INR in 3 days
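
The parsing and normalization step named in the bid above could be organised as a registry of per-format parsers. The following is a minimal sketch, not the bidder's implementation: `PARSERS`, `extract` and `normalise` are hypothetical names, and only HTML and plain text are covered because they need nothing beyond the Python standard library. PDF, Word and Excel parsers would be registered the same way using third-party libraries of the implementer's choice.

```python
import unicodedata
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def normalise(text: str) -> str:
    # NFC puts combining marks (common in Devanagari and Telugu) in one canonical form.
    text = unicodedata.normalize("NFC", text)
    # Collapse all runs of whitespace into single spaces.
    return " ".join(text.split())


def parse_html(raw: bytes) -> str:
    extractor = _TextExtractor()
    extractor.feed(raw.decode("utf-8", errors="replace"))
    extractor.close()
    return normalise(" ".join(extractor.parts))


def parse_text(raw: bytes) -> str:
    return normalise(raw.decode("utf-8", errors="replace"))


# Hypothetical registry: file suffix -> parser. Other formats plug in here.
PARSERS = {".html": parse_html, ".htm": parse_html, ".txt": parse_text}


def extract(filename: str, raw: bytes) -> str:
    suffix = Path(filename).suffix.lower()
    if suffix not in PARSERS:
        raise ValueError(f"no parser registered for {suffix!r}")
    return PARSERS[suffix](raw)
```

Keeping the registry as plain data is one way to get the "easy addition of new document types" the bid mentions: supporting a new format means adding one entry.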

I’d love to help you build a private, ChatGPT-style assistant that truly understands your organisation’s documents and answers naturally in Hindi, Marathi, English, or Telugu. I’ll take care of the complete pipeline from document parsing and multilingual semantic search to a secure on-prem deployment with a smooth, mobile-friendly chat interface. My focus will be accuracy, data security, and a fluid user experience, plus clear milestones and reliable post-delivery support so you’re never left stuck after handover. Let’s connect and resolve your query.
₹11,000 INR in 7 days

Hey — saw your post about building a multilingual ChatGPT-style assistant over your internal documents. The tricky part here is getting reliable multilingual retrieval (Hindi/Marathi/English/Telugu) from mixed file types, not just making the chat UI look good. Quick question before I map an approach: Are all your documents stored in a single system (like SharePoint, Google Drive, on-prem file server), or are they spread across multiple repositories? I’ve built RAG systems on top of PDFs, Office docs, and HTML with multilingual embeddings and on-prem deployment, so I’m familiar with both the parsing headaches and the security side. If you share a short spec, example files, or a link describing your current storage setup, I can outline a concrete stack, milestones, and timeline tailored to your environment.
₹7,000 INR in 7 days

You’re looking to build a ChatGPT-style assistant that can ingest and understand a wide range of document formats—PDFs, Word docs, Excel sheets, HTML—and respond accurately in Hindi, Marathi, English, and Telugu. Your focus on seamless language detection and secure on-premise data handling is clear, along with a need for a continuous ingestion pipeline and a user-friendly chat interface. With over 15 years of experience and 200+ projects, I specialize in AI development, natural language processing, and full-stack solutions using Python and Node.js. I’ve worked extensively with OpenAI models, custom embeddings, and multilingual text processing, which aligns well with your requirements for language tokenization and retrieval-augmented generation. I will build an automated pipeline to parse and normalize all your document types using Python libraries and tailored preprocessors, then create a vector store with language-specific embeddings. The chat interface will be React-based for responsiveness, connected to a secure backend that handles language detection and query routing. I expect to deliver a working prototype within 6 weeks, followed by deployment support and detailed documentation. Let’s discuss how to tailor this solution precisely to your organisation’s needs and timelines.
₹1,650 INR in 7 days

With over a decade of experience in backend and AI development, I am confident that I can create a groundbreaking multilingual AI system for your organization. My experience in building complex applications, such as blockchain-related apps, has equipped me with the skills to understand and address the unique challenges that come with raw data processing and handling proprietary information, as your project entails. For your multilingual strategy, I propose a combination of tokenizers, embeddings, and fine-tuning techniques to ensure accurate translations and maximum language coverage. Moreover, I have deep knowledge of incorporating different language models and integrating them properly in a single system, which allows us not only to auto-detect languages but also to respond in the same tongue. Handling a full AI pipeline from data ingestion to generative model development is right up my alley. My proficiency in various tech stacks enables me to create a well-orchestrated architecture for managing your data securely while maintaining a fluid experience for end-users. The milestones and delivery timeline will be tailored to your needs, with continuous integration and deployment built in. Finally, rest assured that my support doesn't end with delivery; I will remain available whenever you need assistance after handover.
₹7,000 INR in 7 days

I believe I’m a strong candidate for this project. I would build a RAG-based assistant that ingests documents such as PDFs, Word files, Excel sheets, and HTML pages, extracts and normalises their content, and indexes it in a vector database. This will allow a generative model to retrieve relevant information from your private files and answer questions through a ChatGPT-style interface. To support Hindi, Marathi, English, and Telugu, I will implement automatic language detection and use multilingual embeddings so the system understands queries in any of these languages and responds in the same language. The pipeline will continuously ingest new documents and update the index to keep the assistant up to date. I’ve already built similar document-intelligence and RAG systems, so I’m familiar with creating secure, efficient pipelines and deploying them on private infrastructure while delivering a smooth chat experience.
₹5,000 INR in 7 days
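
The bid above promises to "continuously ingest new documents and update the index". One possible way to decide which files need re-indexing is to compare content hashes against a stored manifest, sketched below. This is an assumption about how such a step might look, not the bidder's design; `scan_for_changes` and the manifest layout (relative path mapped to SHA-256 digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used to detect edits."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def scan_for_changes(root: Path, manifest: dict) -> tuple:
    """Compare files under `root` with `manifest` (relative path -> digest).

    Returns (changed, removed): `changed` lists new or modified files to
    (re)index, `removed` lists manifest keys whose files no longer exist.
    The manifest is updated in place to reflect the current state.
    """
    seen = {}
    changed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = str(path.relative_to(root))
        digest = file_digest(path)
        seen[key] = digest
        if manifest.get(key) != digest:
            changed.append(path)
    removed = sorted(key for key in manifest if key not in seen)
    manifest.clear()
    manifest.update(seen)
    return changed, removed
```

A scheduler or file-system watcher could call this periodically, pass `changed` to the parsers and drop the index entries listed in `removed`. Hashing whole files is simple but reads every byte on each scan, so large stores might compare size and modification time first.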

Hello, I understand the project clearly: the core goal is building a private multilingual RAG-based assistant that continuously reads organisational documents, converts them into searchable knowledge, and answers naturally in the same language as the user query. My approach would start with a document ingestion pipeline that handles PDFs, Word, Excel, HTML, and structured files using dedicated parsers, followed by content cleaning and chunking before storing embeddings in a vector database. For multilingual support, language detection will be applied at query time, with embeddings chosen to preserve semantic meaning across Hindi, Marathi, English, and Telugu, while responses are generated in the detected language through an LLM connected to retrieval. The stack can include Python, FastAPI, a vector store such as FAISS or Chroma, document loaders for multiple file formats, and a web chat interface designed for desktop and mobile access. The system can also be structured to support continuous indexing when new files are added, with deployment on your own server for full data control. I also focus on keeping retrieval accurate by improving chunk strategy, metadata filtering, and prompt orchestration so responses remain grounded in your private files rather than generic model output.
₹7,000 INR in 15 days
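
The "content cleaning and chunking" step in the bid above is often done with overlapping windows, so that a sentence cut at a chunk boundary still appears whole in a neighbouring chunk. Below is a minimal sketch under that assumption; `chunk_words` and its default sizes are hypothetical, and it splits on whitespace rather than on model tokens.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list:
    """Split `text` into chunks of at most `size` words.

    Consecutive chunks share `overlap` words. Sizes are in whitespace-separated
    words, which is only a rough proxy for model tokens.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g", size=4, overlap=2):
        print(chunk)
```

Each chunk would then be embedded and stored with metadata such as source file and position, which is what makes the metadata filtering mentioned in the bid possible.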

I will build a secure, multilingual AI assistant tailored to your organization’s private knowledge base. The system will automatically ingest and normalize PDFs, Word files, Excel sheets, HTML pages, and other document formats, then create a semantic vector index for accurate retrieval. Using a Retrieval-Augmented Generation pipeline, the assistant will answer questions grounded strictly in your data. It will auto-detect user language and respond fluently in Hindi, Marathi, English, or Telugu. The solution includes an automated ingestion pipeline, scalable vector database, LLM integration, and a responsive web chat interface. Deployment will run on your servers with full source code, documentation, and post-delivery support.
₹7,000 INR in 7 days
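
Answers "grounded strictly in your data" and returned in the user's language, as the bid above describes, usually come down to how the prompt is assembled before it reaches the model. The sketch below shows one possible shape. `build_prompt`, the language codes and the prompt wording are all hypothetical, and the call to the actual LLM is deliberately left out.

```python
# Hypothetical mapping from detected language code to the name used in the prompt.
LANGUAGE_NAMES = {"hi": "Hindi", "mr": "Marathi", "en": "English", "te": "Telugu"}


def build_prompt(question: str, passages: list, lang_code: str) -> str:
    """Assemble a retrieval-grounded prompt that requests a same-language answer.

    `passages` are the text chunks returned by the retriever. Unknown language
    codes fall back to English, which is an assumption of this sketch.
    """
    language = LANGUAGE_NAMES.get(lang_code, "English")
    context = "\n\n".join(
        f"[{number}] {passage}" for number, passage in enumerate(passages, start=1)
    )
    return (
        f"Answer in {language} using only the numbered passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("What is the notice period?", ["Notice period is 30 days."], "en"))
```

Numbering the passages also lets the chat UI show which source chunk an answer relied on.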

Nagpur, India
Member since June 26, 2024