
Closed
Posted
Paid on delivery
Title: Set up offline AI research and drafting tool on macOS (Ollama + Kotaemon + local RAG)

Project: "Legal X"

I am building a fully offline, zero-API-cost AI research and drafting assistant for legal work, called Legal X. It will run entirely on my MacBook Air M-series (16 GB unified RAM, 10-core GPU, 512 GB), with no cloud dependency and no ongoing subscription cost.

The system needs to do three things over a private corpus of approximately 40,000 readable PDF and HTML files (~5 GB):
1. Answer research queries with numbered footnotes, citing the exact source file and highlighting the passage in the original document for verification.
2. Produce drafts grounded in retrieved source material, in a consistent style.
3. Support long-form writing (articles, book chapters) drawing from the same corpus as source material.

I need a freelancer to set this up remotely on my machine. I will provide remote access (AnyDesk / TeamViewer / Chrome Remote Desktop) and full cooperation throughout.

Scope of work
1. Install and configure Ollama. Pull and verify three models: qwen2.5:14b, mistral, nomic-embed-text. Confirm Metal GPU acceleration is active.
2. Install Kotaemon from the official GitHub release [login to view URL] installer. Change default credentials.
3. Connect Kotaemon to local Ollama: register Qwen 2.5 14B as primary LLM, Mistral as secondary, and nomic-embed-text as the embedding model. Configure and test hybrid (BM25 + vector) search.
4. Create a dedicated project workspace inside Kotaemon. Run a sanity-test index on 50 sample files and verify highlighted-passage citations work end to end.
5. Run full indexing of the ~40,000-file corpus (PDF + HTML). Confirm successful completion and disk-resident index.
6. Tune retrieval settings: chunk size, top-K, hybrid weighting. Run 10 test queries I will provide and confirm at least 8/10 produce accurate footnoted answers with verifiable source highlighting.
7. Set up a preferences/style file inside the index (content I will provide) so outputs reflect my drafting style and citation format.
8. Pull and register deepseek-r1:14b as a third selectable LLM.
9. Write a short handover document: how to start and stop the system, how to add new files, how to re-index, how to switch models, common troubleshooting.

Out of scope
Cross-session memory layer (Letta / MemGPT): to be added later in a separate engagement.

Deliverables
- A fully working Legal X installation on my machine, all three use cases tested.
- Plain-English handover document (Word or Markdown).
- One 30-minute video walkthrough at the end of the engagement.

Requirements
- Hands-on experience with Ollama on Apple Silicon.
- Prior Kotaemon, AnythingLLM, LangChain, or LlamaIndex deployment experience. Please cite a specific past project.
- Comfort with macOS Terminal, Python virtual environments, and remote-access setup.

What I provide
- Remote access to the machine throughout.
- The full corpus already on disk.
- The 10 test queries.
- The preferences file content.
- Prompt response on Slack/WhatsApp during your working hours.

Budget and bidding
Fixed-price preferred. Please quote your price, expected total hours, and your timezone. Do not bid if you have not deployed a local RAG system end to end before; I will ask for proof.

To apply
In your first message, tell me:
1. One prior local-RAG project you built, and what stack.
2. Your plan for verifying Metal GPU acceleration on Ollama.
3. How you would handle a failed embedding step mid-indexing without restarting from zero.

Generic AI-written pitches will be ignored.
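Scope items 1 and 8 name four Ollama models to pull and verify. A minimal Python sketch of that check follows, assuming Ollama's HTTP API is listening on its default port 11434 and that `GET /api/tags` returns the locally installed models under a `models` key. The helper names `normalize`, `missing_models`, and `fetch_tags` are illustrative only and are not part of Ollama or Kotaemon.

```python
import json
from urllib.request import urlopen

# Models named in the scope of work (items 1 and 8).
REQUIRED = ["qwen2.5:14b", "mistral", "nomic-embed-text", "deepseek-r1:14b"]


def normalize(name: str) -> str:
    """Untagged names are listed with a ':latest' tag, so add it before comparing."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags_payload: dict, required=REQUIRED) -> list:
    """Return the required models that do not appear in an /api/tags payload."""
    installed = {normalize(m["name"]) for m in tags_payload.get("models", [])}
    return [r for r in required if normalize(r) not in installed]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask a running local Ollama server which models are installed."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `missing_models(fetch_tags())` should return an empty list once all four models have been pulled.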
Project ID: 40406853
22 proposals
Remote project
Active 15 days ago
22 freelancers are bidding on average ₹12,420 INR for this job

Hello, I trust you're doing well. I am well experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
₹37,000 INR in 7 days
7.2

As a team that specializes in building dependable, production-grade AI systems, our skills and experience directly align with the unique needs of your Offline Legal AI Assistant Project. We have deployed various local RAG systems end-to-end, most notably for a legal research institute that presented us with an even bigger corpus than yours. Our expertise with Kotaemon, AnythingLLM, LangChain, and LlamaIndex enables us to tackle your project - from installing and configuring Ollama on Apple Silicon (with GPU acceleration) to optimizing retrieval settings and running thorough tests on your 40k-file corpus. What makes us uniquely qualified is our expansive knowledge beyond just AI. Apart from hands-on experience with Ollama on Apple Silicon and the entire macOS Terminal setup, we have solid backgrounds with Python virtual environments and remote-access configuration - skills essential for the successful execution of your project. Similarly, our deployment of Odoo ERP complements your requirement for a zero-API-cost solution as we prioritize non-reliance on multiple services. During rare occasions of failures in the embedding step mid-indexing, our problem-solving competencies play a vital role. Instead of starting the process from scratch, which is time-consuming and inefficient, we opt for pinpointed troubleshooting approaches that minimize disruptions without sacrificing quality or accuracy.
₹27,000 INR in 7 days
6.3

Hi, this is a solid setup and I’ve worked on local RAG-style systems using Ollama and vector-based retrieval pipelines.

1. Prior project
I built a local document Q&A system using Ollama + LlamaIndex with a PDF corpus (~10k files), where queries returned source-grounded answers with chunk-level references. It included embedding pipelines, local indexing, and query-time retrieval tuning.

2. Metal GPU verification
After installing Ollama, I verify Metal acceleration by:
=> checking model load logs (GPU layers offloaded)
=> monitoring activity via Activity Monitor (GPU usage spikes)
=> running a test inference with a larger model (e.g., 14B) and confirming expected latency vs CPU-only fallback

3. Handling failed embedding mid-index
I avoid full restarts by:
=> using incremental indexing (chunk-level persistence)
=> storing embedding state/checkpoints
=> resuming only failed batches instead of rebuilding entire index

Approach for your setup
=> configure Ollama with required models and verify GPU usage
=> install and connect Kotaemon with hybrid retrieval (BM25 + vector)
=> test small dataset first to validate highlighting + citations
=> run full indexing with monitoring and recovery strategy
=> tune chunk size, top-K, and weighting based on your test queries
=> integrate your style/preferences file for consistent drafting

I understand this needs to be clean, fully offline, and reliable on your Mac setup.
₹12,000 INR in 4 days
4.0

Hi, I’ve deployed a local RAG system using Ollama + LlamaIndex on Apple Silicon (M2, 16GB), indexing ~25k documents with hybrid search (BM25 + embeddings) and citation grounding—very similar to your Legal X setup. I’ll verify Metal GPU acceleration via ollama run logs + sysctl/Activity Monitor checks, and handle failed embeddings using checkpointed indexing (persisted vector store + resume logic) so we never restart from zero. Ready to start immediately; I estimate 10–14 hours total to deliver a fully working setup, tuning, and handover with video walkthrough.
₹3,000 INR in 3 days
4.0

I’ll set up Legal X as a fully offline macOS RAG workflow on your M-series MacBook, with Ollama connected to Kotaemon and configured for Qwen 2.5 14B, Mistral, nomic-embed-text, and deepseek-r1:14b. I’ll verify Metal acceleration, test hybrid retrieval on a sample workspace, and tune chunking, top-K, and weighting so citations point back to exact source files with highlighted passages. After the 40,000-file index is built and validated, I’ll add your style/preferences file, run the 10 provided queries against the corpus, and document the setup in a clear handover guide. I’ll also include the recovery approach for partial indexing failures so the process can resume cleanly without starting over.
₹12,500 INR in 12 days
3.9
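The hybrid weighting and top-K tuning that the posting and this bid mention can be pictured with a small score-fusion sketch in Python. It assumes each retriever returns a score per chunk ID, rescales both score sets to the 0 to 1 range, and blends them with a weight `alpha`. This is one common fusion scheme chosen for illustration; it is not taken from Kotaemon's internals, and the names `min_max` and `hybrid_rank` are hypothetical.

```python
def min_max(scores: dict) -> dict:
    """Rescale scores to [0, 1] so BM25 and cosine scores become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; treat them as equally relevant.
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_rank(bm25: dict, vector: dict, alpha: float = 0.5, top_k: int = 5) -> list:
    """Blend keyword and vector scores.

    alpha = 1.0 uses only the vector scores, alpha = 0.0 only BM25.
    A chunk missing from one retriever contributes 0 for that side.
    """
    b, v = min_max(bm25), min_max(vector)
    fused = {
        chunk_id: alpha * v.get(chunk_id, 0.0) + (1 - alpha) * b.get(chunk_id, 0.0)
        for chunk_id in set(b) | set(v)
    }
    # Highest fused score first; ties broken by chunk ID for stable output.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:top_k]
```

Running the same test queries at a few values of `alpha` and `top_k` and counting how many answers cite the right passage is one simple way to compare settings against the 8/10 target.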

Drawing on my extensive experience as a web and software developer, particularly in the realm of building Smart, Scalable & Future-Ready Digital Solutions, I strongly believe I am the perfect fit for your project. In terms of your specific needs, I have prior hands-on experience with Ollama on Apple Silicon and have successfully deployed local RAG systems end to end before, using similar stacks such as AnythingLLM, LangChain, and LlamaIndex. One of your stipulated requirements is to verify Metal GPU acceleration on Ollama. To handle this, I’ll employ effective techniques like running benchmark tests and leveraging performance profiling tools specific to the macOS platform. This will help confirm Metal GPU acceleration is active, ensuring optimal performance for your system. Finally, I want to highlight my commitment to quality assurance and efficient troubleshooting. In the unlikely event of a failed embedding step mid-indexing, rest assured that I would methodically diagnose the issue without restarting from zero. My approach prioritizes data backup strategies, ensuring that steps performed so far are repeatable and minimizing downtime.
₹10,000 INR in 3 days
2.2

I have successfully deployed a local RAG system for a tech startup using Ollama and Kotaemon on macOS, ensuring seamless Metal GPU acceleration. I will verify GPU use via terminal commands and system queries, and implement checkpoints during the indexing to handle failures without restarting. Excited to set up Legal X on your MacBook, I am confident of delivering a robust, offline AI assistant tailored to your legal drafting needs.
₹7,000 INR in 7 days
2.3

Hi, Setting up a local legal research system on macOS that keeps sensitive documents offline and doesn't bleed into cloud API costs is the right move—but Ollama, Kotaemon, and RAG integration often get stuck at the environment setup stage, especially on macOS. I've built local RAG systems with Ollama using Chroma as the vector store and tailored model selection for domain-specific accuracy. For legal documents, I'd start with Mistral 7B or Llama 2 13B—they handle complex legal language better than smaller models, particularly for case law and contract analysis. Kotaemon's pipeline orchestration makes ingestion and retrieval straightforward once the pieces are connected. First step: I'll validate your macOS setup (RAM, storage, GPU if available) and scope your document corpus. That determines model choice and RAG tuning. Ollama plus a working Kotaemon pipeline with initial retrieval can run within 24 hours—then I iterate quality with your actual documents. What's your document volume and format? (PDFs, plain text, mixed?) Best regards, Val
₹1,500 INR in 7 days
1.8

Hi there, I read your Legal X requirements carefully, and I can set up the full offline RAG workflow on your MacBook Air using Ollama + Kotaemon + local hybrid search. I have worked on a local RAG setup before using Ollama, LlamaIndex/Chroma, local embedding models, and PDF document indexing for private document Q&A with cited retrieval. For your setup, I’ll configure qwen2.5:14b, mistral, deepseek-r1:14b, and nomic-embed-text, connect them inside Kotaemon, test BM25 + vector retrieval, and tune chunk size/top-K/hybrid weighting against your 10 legal test queries. For Metal GPU verification, I’ll check Ollama logs/activity while running inference, confirm Apple Silicon acceleration is active, and test model response speed/resource usage. If embedding/indexing fails midway, I’ll isolate the failed files, review logs, resume from completed indexed chunks where possible, and re-run only the failed batch instead of restarting the full 40,000-file corpus. I’ll deliver the working Legal X installation, indexed corpus, tested citations/highlighted source passages, style preference setup, handover document, and 30-minute walkthrough. Cost: ₹10,000 || Timeline: 1 day Timezone: IST Payment and timeline details can be discussed further to align with your expectations. I’d be happy to help build this cleanly and make sure you can operate it confidently after handover. Best regards,
₹10,000 INR in 1 day
0.0

Hi, this is exactly the kind of system I like building: fully local, practical, and tuned for real workflows.

1. Prior project:
I recently set up a local RAG stack for a research-heavy use case (academic + technical PDFs, ~8GB corpus) using Ollama (Mistral + Qwen), LlamaIndex, and a hybrid retrieval pipeline (BM25 + embeddings via nomic). It supported citation-grounded answers and long-form drafting.

2. Metal GPU verification:
After installing Ollama, I verify GPU usage via:
* `ollama run` with a large model (qwen2.5:14b) and monitor activity using `Activity Monitor` (GPU History)
* Check logs (`OLLAMA_DEBUG=1`) to confirm Metal backend
* Benchmark token/sec vs CPU fallback to ensure acceleration is active

3. Handling failed embedding mid-index:
I avoid full restarts by:
* Chunk-level indexing with persistent checkpoints
* Storing processed document IDs + embedding hashes
* Resuming only failed batches (idempotent pipeline)
* Logging failures separately for retry

Execution plan: install + wire Ollama and Kotaemon, validate hybrid retrieval on sample set, then run full indexing with checkpoints and tuning (chunk size, top-K, weighting). I’ll ensure citation highlighting works reliably before final delivery.

Timeline: 5–7 days
Estimate: ~25–30 hours
Timezone: IST

Ready to start immediately and work live with you during setup.
₹7,000 INR in 7 days
0.0
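The tokens-per-second benchmark mentioned in this bid can be sketched in Python against Ollama's HTTP API. The sketch assumes a non-streaming `POST /api/generate` response carries `eval_count` (generated tokens) and `eval_duration` (nanoseconds), and that passing `"num_gpu": 0` in `options` keeps all layers on the CPU for the baseline run; the latter should be confirmed against the installed Ollama version. The function names are illustrative.

```python
import json
from urllib.request import Request, urlopen


def generate(model: str, prompt: str, cpu_only: bool = False,
             base_url: str = "http://localhost:11434") -> dict:
    """Run one non-streaming generation and return the JSON response."""
    body = {"model": model, "prompt": prompt, "stream": False}
    if cpu_only:
        # Assumption: offloading zero layers forces a CPU-only baseline.
        body["options"] = {"num_gpu": 0}
    request = Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=600) as resp:
        return json.load(resp)


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    seconds = response["eval_duration"] / 1e9
    if seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / seconds
```

Comparing `tokens_per_second(generate("qwen2.5:14b", prompt))` with the same call using `cpu_only=True` gives a rough speed ratio; a ratio near 1 would suggest the default run is not using the GPU.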

Hi, I’ve deployed a local RAG system using Ollama + LlamaIndex + vector DB (FAISS) for document search with citation grounding, so your “Legal X” setup aligns well with my experience.

Plan:
- Install & verify Ollama (qwen2.5:14b, mistral, nomic-embed-text) with Metal GPU via OLLAMA_METAL=1 + activity monitor validation
- Deploy Kotaemon, connect models, configure hybrid search (BM25 + vector)
- Run staged indexing (sample → full 40k docs) with checkpointing
- Tune chunking/top-K for accurate footnoted answers + highlighting
- Add style layer + register deepseek-r1:14b
- Final testing + handover doc + walkthrough

Failed embedding handling: I’ll use incremental indexing with persisted state, so failures resume from last processed batch (no restart).

Quick question: Are your PDFs mostly text-based or scanned (for OCR considerations)? I can also show a quick demo flow before starting.
₹7,000 INR in 2 days
0.0

Hi there, I specialize in local LLM deployments and agentic RAG architectures. I can set up Legal X remotely on your M-series MacBook Air, ensuring complete offline privacy and zero ongoing API costs.

Answering your required questions first:
1. Prior local-RAG project: I built and actively maintain the "Internet Researcher Agent," an autonomous RAG architecture utilizing LangChain, LlamaIndex, and local LLMs. I have deep, hands-on experience tuning chunk sizes, top-K retrieval, and embedding strategies.
2. Metal GPU verification: I will verify Apple Silicon acceleration by checking the Ollama server logs for the [metal] compute tag during inference, running ollama ps to ensure the model is fully offloaded, and tracking your 10-core GPU utilization via macOS Activity Monitor during prompt execution.
3. Handling mid-indexing failure: For a 40,000-file corpus, I implement incremental indexing. I hash the documents and maintain a local manifest (e.g., a lightweight SQLite DB) of successfully embedded document IDs. If the process drops, the system checks this manifest upon restart and resumes exactly where it left off.

I already work extensively with deepseek-r1:14b locally via Ollama and will cleanly register it alongside Qwen and Mistral. I will ensure your hybrid (BM25 + vector) search in Kotaemon is highly accurate for legal citations.
₹10,000 INR in 7 days
0.0
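The manifest-based resume described in this bid (hash each document, record successes in a small SQLite database, skip them on restart) can be sketched in Python as follows. The `embed_file` callback is a hypothetical stand-in for whatever actually sends a file to the indexer, and the table layout is an assumption made for illustration; none of this reflects how Kotaemon tracks indexed files internally.

```python
import hashlib
import sqlite3
from pathlib import Path
from typing import Callable, Iterable


def file_hash(path: Path) -> str:
    """SHA-256 of the file contents, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def open_manifest(db_path: str) -> sqlite3.Connection:
    """Open (or create) the manifest of successfully embedded files."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS done (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def index_corpus(files: Iterable[Path], conn: sqlite3.Connection,
                 embed_file: Callable[[Path], None]) -> list:
    """Embed only files that are new or changed; return (path, error) failures."""
    failures = []
    for path in files:
        digest = file_hash(path)
        row = conn.execute(
            "SELECT sha256 FROM done WHERE path = ?", (str(path),)
        ).fetchone()
        if row and row[0] == digest:
            continue  # already embedded with identical content
        try:
            embed_file(path)
        except Exception as exc:  # log and move on; retried on the next run
            failures.append((path, str(exc)))
            continue
        conn.execute(
            "INSERT OR REPLACE INTO done (path, sha256) VALUES (?, ?)",
            (str(path), digest),
        )
        conn.commit()  # commit per file so a crash loses at most one entry
    return failures
```

Calling `index_corpus` again after an interruption re-processes only the files that are absent from the manifest or whose hash has changed, so the run is safe to repeat.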

Hi, I can set up your Legal X offline AI research system end-to-end on your MacBook exactly as described, using a fully local RAG pipeline with Ollama + Kotaemon + hybrid retrieval (BM25 + vector search). The goal will be a stable, GPU-accelerated system that produces verifiable, citation-backed legal research and drafting without any cloud dependencies.

Deliverables
- Fully working Legal X offline system on your Mac
- Complete indexed corpus (40,000 files)
- Verified RAG + citation system
- Model routing (Qwen / Mistral / DeepSeek)
- Performance-tuned retrieval pipeline
- Plain-English setup + maintenance guide
- 30-minute walkthrough session

Relevant Experience
I have worked on:
- Local RAG systems using Ollama + LangChain
- Hybrid search pipelines (BM25 + embeddings)
- Document QA systems for large legal/technical datasets
- Offline AI deployments on Apple Silicon (M-series optimization)

Timeline
- Setup + base configuration: 1–2 days
- Full indexing + tuning: 2–4 days (depending on corpus speed)
- Testing + final optimization: 1 day

What I Need From You
- Remote access (AnyDesk / TeamViewer / Chrome Remote Desktop)
- Corpus folder access
- 10 test queries
- Your citation + writing style preferences

If everything is clear, I can start immediately and first deliver a working prototype (50-file index + citation demo) before scaling to full deployment.
₹10,000 INR in 5 days
0.0

Hi there, While I am new to Freelancer.com, I have tons of off-site projects under my belt and I think I am the perfect fit for your project. I will try to keep this short to not waste your valuable time :) I've deployed local RAG pipelines using Ollama, LlamaIndex, and ChromaDB on Apple Silicon - verifying Metal acceleration via ollama run with OLLAMA_METAL=1 and confirming GPU layer offload in the verbose output. For a failed embedding mid-index, I'd use checkpoint-based ingestion so only unprocessed files are retried, not the full corpus. I'm comfortable with the full scope here - Kotaemon setup, hybrid retrieval tuning, corpus indexing, and the handover doc and walkthrough at the end. I'd love to chat more about your project. Best, Atish.M
₹7,000 INR in 7 days
0.0

Answers to your three questions first.

Prior RAG project: Built a document retrieval assistant for a legal/compliance use case using LangChain, ChromaDB, and Ollama (llama3 + nomic-embed-text) over ~8,000 PDFs. Implemented hybrid BM25 + vector search with a citation formatter returning document name, page, and matched passage; this maps directly to your use case.

Metal GPU verification: After pulling the model I run "ollama ps"; if Metal is active, it shows VRAM usage. I also check ~/Library/Logs/Ollama/ where the server prints "using Metal" on load, and confirm via Activity Monitor GPU History.

Failed embedding mid-index: I script a checkpoint before full indexing: hash all file paths, log successful embeddings. On failure, filter the corpus to only un-indexed files using the hash log and resume. Never restart from zero; Kotaemon's ChromaDB backend preserves what's already embedded.

Timezone UTC+3:30. Estimated 6-8 hours total. Comfortable with macOS Terminal and remote-access sessions. What date works for you to schedule the remote session?
₹7,000 INR in 5 days
0.0
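The `ollama ps` check mentioned in this bid has an HTTP counterpart that is easier to script. The Python sketch below assumes `GET /api/ps` on the default port 11434 lists loaded models with `size` and `size_vram` fields in bytes, and treats the ratio of the two as the share of the model held in GPU memory. How that maps onto Apple Silicon unified memory is an assumption to confirm on the actual machine, and the function names are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_ps(base_url: str = "http://localhost:11434") -> dict:
    """Ask a running local Ollama server which models are currently loaded."""
    with urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp)


def gpu_share(ps_payload: dict) -> dict:
    """Map each loaded model to the fraction of its size reported in GPU memory."""
    shares = {}
    for model in ps_payload.get("models", []):
        size = model.get("size", 0)
        shares[model["name"]] = model.get("size_vram", 0) / size if size else 0.0
    return shares


def partially_offloaded(ps_payload: dict, threshold: float = 0.999) -> list:
    """Names of loaded models whose GPU share falls below the threshold."""
    return [name for name, share in gpu_share(ps_payload).items() if share < threshold]
```

A model must be loaded (for example right after a prompt) to appear in the payload; an empty result from `partially_offloaded(fetch_ps())` at that point suggests every loaded model is fully offloaded.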

Hello, This project fits my background well because it focuses on what matters most in local AI systems: reliable RAG setup, retrieval quality, and production-minded configuration—not just connecting an LLM. I’m a Backend & AI Engineer with hands-on experience building RAG-based workflows, AI automation systems, and production AI pipelines. I’ve worked on systems that required structured retrieval, backend orchestration, API integrations, and performance optimization under real constraints. In one production AI project, I reduced processing time by 70%+ and cut cloud costs by 50%+ through optimization and caching. For Legal X, I can help set up the full offline workflow on macOS, connect Ollama and Kotaemon correctly, tune retrieval, validate highlighting/citations, and make the system stable and maintainable for daily use. I also understand the importance of resumable indexing, retrieval tuning, and verification when working with a large private corpus. My value here is not just installation, but making sure the system is usable, efficient, and dependable for real legal research and drafting. Estimated time: 20–24 hours Timezone: GMT+2 Pricing: 100 $ Best regards, Abdelrahman Emad
₹10,000 INR in 5 days
0.0

Hi, I have built a very similar platform for the cybersecurity domain. I can provide you with a highly efficient solution for this.
₹45,000 INR in 21 days
0.0

Hi there, I am a Lead AI Engineer and I have already built the core of what you are calling "Legal X." I recently developed a Multi-Agent AI Courtroom that handles complex legal-style debating and research using local orchestration.

Answers to your requirements:

Prior Local-RAG Project: I built an "AI Courtroom" using LangGraph and Ollama. It orchestrates specialized agents through a state machine to deliver verdicts based on indexed legal texts. I also built a Self-Correcting RAG pipeline that uses "Critic Agents" to detect hallucinations, achieving 95% accuracy.

Verifying Metal Acceleration: I verify this by checking Ollama server logs to confirm layer offloading and using "sudo powermetrics --samplers gpu_power" to monitor real-time Metal GPU utilization during inference.

Handling Mid-Indexing Failure: I use a checkpointing system. I script the ingestion to track processed file hashes in a local manifest. If a failure occurs, the system resumes by comparing the corpus against the manifest, only processing the remaining files.

I am an IEEE researcher. I can optimize your 14B models for 16GB RAM.

Best, Priyanka Shah
₹10,000 INR in 3 days
0.0

New Delhi, India
Payment method verified
Member since Jun 3, 2014