    344 multimodal jobs found

    Description of Project - HRID-AI is a low-cost handheld device aimed at estimating ejection fraction using multimodal biosignals (ECG, seismocardiography, and cardiac acoustics). It uses a combination of biosensors to detect ECG abnormalities, identify and interpret abnormal heart sounds, and use seismocardiographic signals to provide an estimate of ejection fraction. The principle has been verified through multiple studies done in reputed institutions. The goal is rapid point-of-care triage for echo referral in emergency and low-resource settings in patients with heart failure with reduced ejection fraction. We need your support, skills and expertise for sensor integration, embedded design, and signal acquisition. Current Status - We have so far been able...

    €20 / hr Average bid
    6 bids

    Description of Project - HRID-AI is a low-cost handheld device aimed at estimating ejection fraction using multimodal biosignals (ECG, seismocardiography, and cardiac acoustics). It uses a combination of biosensors to detect ECG abnormalities, identify and interpret abnormal heart sounds, and use seismocardiographic signals to provide an estimate of ejection fraction. The principle has been verified through multiple studies done in reputed institutions. The goal is rapid point-of-care triage for echo referral in emergency and low-resource settings in patients with heart failure with reduced ejection fraction. We need your support, skills and expertise for sensor integration, embedded design, and signal acquisition. Current Status - We have so far been able...

    €353 Average bid
    6 bids

    Build an end-to-end AI application that can reliably solve JEE-style math problems, explain solutions step-by-step, and improve over time. The goal of this assignment is not just model usage, but to evaluate whether you can: design a RAG pipeline; build a multi-agent system; handle image, text, and audio inputs; introduce human-in-the-loop (HITL); implement memory and self-learning; package everything into a working application; and deploy it. You do not need to be a DevOps or full-stack expert, but you must be able to build, run, and deploy a simple app.

    €93 Average bid
    7 bids

    I need an AI-powered chatbot fully integrated with the WhatsApp Business API. It must converse fluently via text, understand incoming voice notes, and react appropriately to images or short video clips sent by users. I’m open on the underlying stack (Dialogflow, Microsoft Bot Framework, IBM Watson, or any other platform you believe best fits WhatsApp’s constraints) so long as latency stays low and the solution can scale as traffic grows. Core deliverables: • End-to-end WhatsApp Business API setup (webhook, number verification, cloud or on-prem hosting). • NLP pipeline that handles: – Text intent recognition and response generation. – Speech-to-text for voice messages, with the transcript feeding the same intent flow. ...

    €23 Average bid
    14 bids

    Title: Senior Android Developer: Multimodal AI Pipeline (Real-time Video & Audio) **Project Description:** We are seeking a Senior Android Engineer to develop a modular, high-performance infrastructure for **real-time Multimodal capture (Video + Audio)**. The core of the project is a "Hybrid AI Routing Engine" that intelligently switches between on-device local processing and cloud analysis using Gemini 2.0. This application is an R&D prototype that must be **Play Store Ready**, with a heavy focus on background stability and thermal management. **Important:** The ultimate goal is to port this architecture to an Android-based smart glasses OS (AugmentOS), so the code must be hardware-agnostic. **Technical Specifications:** **1. Multimodal Pipeline...

    €206 Average bid
    111 bids
    Audio-Physio Prediction Model
    Ended

    I’m building a supervised classifier that learns jointly from audio recordings and accompanying physiological signals. My end goal is a robust prediction model that can generalise to new subjects, so every modelling choice, from feature pipeline through network architecture and hyper-parameter search, has to be evidence-driven and repro... • End-to-end training code, neatly commented • Saved model weights plus an inference script that takes new audio + physio files and outputs class probabilities • Brief report (accuracy, precision, recall, F1, confusion matrix) and guidance on further improvement Clean, modular code and explain-as-you-go communication matter more to me than glossy presentations, so if classification of multimodal signals is yo...

    €93 Average bid
    17 bids
    Sleep Stages ML Model Development
    Ended

    I'm looking for a skilled machine learning expert to help with my final year university project. The goal is to identify different sleep stages using multimodal data, specifically ECG patterns and blood pressure signals. Key Requirements: - Analyze ECG and blood pressure data - Develop a machine learning model to estimate sleep stages - Utilize existing dataset Ideal Skills and Experience: - Strong background in machine learning - Experience with ECG and blood pressure signal analysis - Proficiency in data processing and model development - Familiarity with sleep stage identification techniques

    €581 Average bid
    106 bids

    ...will expand it through public sources or augmentation, perform rigorous cross-validation, and refine the model until we consistently exceed 90 % precision and recall on an unseen hold-out set. When you apply, show me past work—links to papers, GitHub repos, Kaggle solutions, or shipped features—demonstrating experience with cry detection, sound-event recognition, emotion analysis, or any other multimodal perception problem. A concise paragraph with links is enough; no full proposal is needed at this stage. Deliverables • Well-documented training pipeline and source code • Trained model file(s) plus lightweight export (ONNX/TFLite) • Inference script or microservice, ready for product integration • Evaluation report: confusion matrix, per-...

    €236 Average bid
    14 bids
    Oil & Gas SCM Research Paper
    Ended

    ...discussion in current academic thinking while weaving in up-to-date industry reports and real-world company case studies from the Kingdom. • Map out typical lead-time challenges, illustrate bottlenecks at ports, yards, or in-country corridors, and quantify the schedule or cost impact when logistics falter. • Highlight proven mitigation tactics—expedited shipping models, strategic stockpiling, multimodal routing, digital tracking, customs clearance strategies—and evaluate their effectiveness. • Conclude with actionable recommendations tailored to Saudi project environments and their regulatory frameworks. Research Approach Prioritise peer-reviewed academic journals first, reinforce findings with reputable industry reports, and enrich the analysi...

    €340 Average bid
    89 bids

    ...active and evolving: Multimodal Query: The operator submits a query via text, audio (voice notes), photo, or short video of the incident. Hybrid RAG: The system searches the internal knowledge base (manuals, previous videos, technical history). If there is no answer, it acts as an internet Search Agent (standard manuals, technical forums), with network isolation and source validation, translating the information into Spanish. Human Escalation: If the AI does not know the solution, it notifies the supervisors. Active Learning: 1–2 days after a query with no validated answer, the system sends a non-invasive reminder to the operator: “...

    €11 / hr Average bid
    25 bids
    Multimodal Safety Forecast ML Model
    Ended

    ...architecture (or a rigorously justified adaptation of cutting-edge multimodal papers) that fuses image, text, and numeric signals into a single forecasting pipeline and demonstrably outperforms strong baselines. Key expectations • End-to-end experimentation code (Python, PyTorch or TensorFlow) with clear data loaders for each modality • Custom model implementation with commented rationale for design decisions • Reproducible training scripts, hyper-parameter configs, and a validation notebook that plots forecast accuracy against standard baselines • Final technical report summarizing methodology, results, and potential publication avenues Acceptance criteria • Forecast MAE or MAPE improvement over baseline multimodal fusion of at least X...

    €920 Average bid
    17 bids
    Multimodal Safety Forecast ML Model
    Ended

    ...architecture (or a rigorously justified adaptation of cutting-edge multimodal papers) that fuses image, text, and numeric signals into a single forecasting pipeline and demonstrably outperforms strong baselines. Key expectations • End-to-end experimentation code (Python, PyTorch or TensorFlow) with clear data loaders for each modality • Custom model implementation with commented rationale for design decisions • Reproducible training scripts, hyper-parameter configs, and a validation notebook that plots forecast accuracy against standard baselines • Final technical report summarizing methodology, results, and potential publication avenues Acceptance criteria • Forecast MAE or MAPE improvement over baseline multimodal fusion of at least X...

    €5 / hr Average bid
    21 bids

    Lead AI / Fullstack Engineer: Project "AZIZA" (Voice-to-Voice AI). Project Name: AZIZA. Format: Project-based / Remote (with access to local GPU clusters). Tech Stack: PersonaPlex (Moshi-based architecture), PyTorch, TensorRT-LLM, FastAPI, WebRTC, Telegram Mini App (TMA). Hardware Location: Uzbekistan & Turkey clusters powered by NVIDIA L40S. Project Overview: AZIZA is an innovative multimodal "Speech-to-Speech" (S2S) ecosystem designed to simulate natural human interaction. We are building an AI assistant that seamlessly transitions between roles: an expert tutor (Chemistry, History, Biology), an empathetic companion, and a simultaneous translator. By processing audio tokens directly, the system achieves unprecedented interaction speeds. Current Statu...

    €1060 Average bid
    62 bids

    ...Questions Questions for you? * "For the deepfake detection, will you be training a model from scratch, or do you plan to use a pre-trained model like XceptionNet or MesoNet? Why?" (A good dev will suggest pre-trained models to save time/cost). * "How will you handle the latency? If we use Whisper for audio transcription, will it be fast enough for a live alert?" * "Do you have experience with 'Multimodal' analysis (combining audio and video data), or will these run as separate independent modules?" Option A: The Screen-Reflection Test Implement a feature where the screen flashes a random color sequence. Build a CV model that attempts to detect this color change in the reflection of the caller's eyes/glasses. Goal: Prove the calle...

    €102 Average bid
    16 bids

    .../ Fullstack Engineer: Project "AZIZA" (Voice-to-Voice AI). Project Name: AZIZA. Format: Project-based / Remote (with access to local GPU clusters). Tech Stack: PersonaPlex (Moshi-based architecture), PyTorch, TensorRT-LLM, FastAPI, WebRTC, Telegram Mini App (TMA). Hardware Location: Uzbekistan & Kazakhstan (TAS-IX), clusters powered by NVIDIA RTX 4090. Project Overview: AZIZA is an innovative multimodal "Speech-to-Speech" (S2S) ecosystem designed to simulate natural human interaction. We are building an AI assistant that seamlessly transitions between roles: an expert tutor (Chemistry, History, Biology), an empathetic companion, and a simultaneous translator. By processing audio tokens directly, the system achieves unprecedented interaction speeds. ...

    €3500 Average bid
    76 bids
    Modern ZORO AI Logo Design
    Ended

    I’m refreshing the visual identity of my research project, ZORO. The name nods to Roronoa Zoro from One Piece, so a modern logo that borrows his signature palette—forest-to-emerald greens with dark accents—will immediately resonate with our audience. What we do: ZORO applies AI to analyse multimodal robot data (video, audio, text) and verify that each robot behaves exactly as expected. The logo will appear on our official site, in academic papers, and on large screens at international conferences, so it must stay sharp and readable from thumbnail to banner size. What I need from you • A clean, modern word-mark or combination-mark featuring the name “ZORO”. • Colour treatment inspired by Roronoa Zoro; feel free to weave in subtle tech or ...

    €50 Average bid
    Guaranteed
    1844 entries
    AI Social Video Generator Needed
    Ended

    ...visuals, adds dynamic captions in brand colours, synthesises the voice-over, mixes in background music, then renders and exports the final MP4. • I choose the target platform(s) and it automatically applies the right format and duration limits (15–60 s for Reels/Shorts, up to 3 min for Facebook/YouTube feed posts). I’m open to the underlying stack—Python, Node, ffmpeg, OpenAI or similar multimodal models, TTS engines such as ElevenLabs, and royalty-free music libraries are all acceptable so long as licensing remains clear. A lightweight web dashboard or command-line tool is fine for the first version; clean, documented code is crucial. Deliverables 1. Working MVP that runs locally or on a modest cloud instance and outputs ready-to-publish videos wit...

    €202 Average bid
    25 bids
    AI-Powered Platform Development
    Ended

    We are looking for experienced AI Developers to help design and build an advanced AI-powered platform. The role involves developing intelligent chatbots, Retrieval-Augmented Generation (RAG) systems, multimodal AI capabilities, and scalable backend architectures. You will work closely with the founding team to bring innovative ideas to life—from concept to production-ready systems. Key Responsibilities Build and deploy AI chatbots using modern LLM frameworks Design and implement RAG pipelines for document and knowledge-base querying Integrate OCR and Vision models for document and image understanding Implement Text-to-Speech (TTS), Speech-to-Text (STT), and Speech-to-Speech (STS) pipelines Fine-tune LLMs to create offline, self-hostable AI models Architect and develop a...

    €85 Average bid
    25 bids

    I am looking for a freelancer to assist with the implementation of my graduation project. I already have a clear research idea and an initial proposed methodology, but please note that the methodology is flexible and open to refinement since this is still a proposal an...Writing the graduation thesis and paper You are NOT expected to: • Design a completely new research idea from scratch • Train the model yourself • Write the thesis or academic paper This is an academic project, so clarity, correctness, and reproducibility are very important. Experience in the following is a strong plus: • Deep Learning / PyTorch • Research-oriented implementations • Multimodal models (audio & visual) If you are interested, please share relevant experience...

    €1021 Average bid
    17 bids

    I am working on a graduation-level academic research project in the area of AI and Computer Vision, specifically related to multimodal media analysis. I am looking for an experienced AI/ML research writer to help write a full academic paper, while I focus on the implementation, experiments, and code development. The research idea, experimental design, and results will be provided privately after selecting the freelancer. The role primarily involves translating technical concepts and experimental findings into clear, publication-quality academic writing. Responsibilities: * Writing all paper sections (Introduction, Related Work, Methodology, Experiments, Results, Discussion, Conclusion) * Structuring the paper according to academic standards * Ensuring originality, clarity, and prope...

    €793 Average bid
    NDA
    11 bids

    Project Overview: We are looking for an experienced AI Automation Specialist to develop advanced multimodal AI agents. The ideal candidate has deep expertise in Google Cloud (Vertex AI/Agent Builder) and/or n8n workflow automation. You will be responsible for building agents capable of processing various data types (text, audio, images). Key Responsibilities: Design and deploy AI agents using Google Cloud Vertex AI (Agent Builder) or n8n. Implement multimodal capabilities (e.g., analyzing medical images, processing voice commands, and handling complex text queries). Integrate agents with external APIs and databases. Ensure workflows are robust, scalable, and secure. Requirements: Proven experience building AI Agents and workflows. Strong knowledge of...

    €340 Average bid
    96 bids

    I am building a clinically robust, retrieval-augmented framework that produces structured radiology reports from chest-x-ray images and associated text. Accuracy and clinical relevance drive every design choice, so I want the system to learn equally from both the IU X-ray and MIMIC-CXR datasets. The pipeline I envision looks like this: • Visual encoding with ViT-B16 to obtain global image embeddings. • Retrieval of the top-k similar studies from the training corpus to steer generation toward clinically plausible language and findings. • Text generation with Clinical T5, producing both the “Findings” and “Impression” sections. • Relation-aware validation using RadGraph, with a specific focus on analyzing relationships between clinical enti...

    €56 Average bid
    7 bids
    Auto Dealer AI Employee Best BDC
    Ended

    ...a single AI agent that becomes the first point of contact for my dealership on every channel customers already use—voice calls, website chat/SMS, and email. The goal is for this agent to greet prospects, answer their questions, book test-drive or service appointments, and handle day-to-day customer service without human intervention unless the inquiry is escalated. Core capabilities I need • Multimodal communication: the same agent must work over Voice, Text/SMS, and Email, preserving context when a customer switches among them. • Full customer-service coverage: technical support, sales inquiries, and general questions about our inventory, financing, or policies. • Appointment setting: real-time scheduling into our existing calendar so customers can lock...

    €7236 Average bid
    102 bids
    Custom ERP for Logistics
    Ended

    3A Logistics OS – End-to-End ERP, Control Tower & AI Operating System 1. Company Overview 3A International is a multimodal freight forwarding and logistics group in Egypt, operating: Air & sea freight (import/export, FCL/LCL, consolidation) Customs clearance & brokerage Inland multimodal transport (rail, river, road) Terminals, depots, CFS and value-added logistics We are ISO 9001 / 14001 / 45001 certified We want a custom, AI-native ERP / “Logistics Operating System” that becomes the central brain of the company. 2. Project Goal Build a web-based ERP platform that: Centralises all shipments and operations (air, sea, rail, river, road, customs, terminals). Manages customers, partners, carriers, contractors, rates and contracts in one...

    €8486 Average bid
    103 bids
    Multimodal LLM
    Ended

    ...GPU. 2. Captions pass a basic grammar checker with ≥ 95 % accuracy and follow supplied style rules. 3. At least 80 % of generated media assets meet resolution and duration specs for major platforms (Instagram, TikTok, X). 4. Codebase installs from scratch with one command and all tests pass. If this aligns with your skill set, let’s discuss timelines and milestones so we can bring this multimodal content engine to life....

    €122 Average bid
    23 bids
    Multimodal LLM -- 2
    Ended

    ...GPU. 2. Captions pass a basic grammar checker with ≥ 95 % accuracy and follow supplied style rules. 3. At least 80 % of generated media assets meet resolution and duration specs for major platforms (Instagram, TikTok, X). 4. Codebase installs from scratch with one command and all tests pass. If this aligns with your skill set, let’s discuss timelines and milestones so we can bring this multimodal content engine to life....

    €558 Average bid
    32 bids
    Real-Time Object Detection App
    Ended

    ...Expertise in AI and machine learning - Experience with live video processing - Proficiency in mobile app development - Background in computer vision technologies Real-Time Multimodal Vision & Wearable Platform. Project Overview: We are building a cutting-edge, real-time "Action-Analysis" platform. The app uses a device’s camera to monitor high-speed activity, provides instant AI-driven verbal/visual verdicts, and allows for retrospective "highlight" clipping. We are moving toward a multi-camera ecosystem involving external hardware and wearable integration. Key Technical Requirements for Initial and Future Developments: Multimodal AI: Implementation of Gemini 2.0 Flash / Live API for real-time video/audio reasoning. Audio/Voice Logic: ...

    €505 Average bid
    162 bids
    AI Interview Platform Build
    Ended

    ...the user can upload either a résumé or a job description in PDF or Word format. Your backend should parse the document, identify key skills and context, and instantly generate a tailored set of interview questions. The next step is an AI-powered mock interview, ideally with real-time voice (and, if practical, video) so the system can follow up naturally. After the session finishes, I want a multimodal analysis engine (text, audio and video) to rate performance, uncover sentiment cues, and surface constructive feedback on a dashboard that’s clear and actionable. Deliverables • Fully tested social-login module for Facebook, Google and LinkedIn • Upload component that accepts PDF and Word files and feeds the question generator ...

    €2133 Average bid
    28 bids

    This project covers preprocessing of a breast cancer mammography dataset strictly following the methodology as discussed. Tasks include lesion cropping using ground-truth masks, image resizing to 224×224, normalization, and augmentation (rotation, flipping). Clinical features will be encoded as one-hot vectors with proper handling of missing data to ensure full compatibility with downstream multimodal fusion models.

    €434 Average bid
    1 bid

    Project Overview: I am looking for a freelancer to draft a base research paper that consolidates concepts from a specific project (Causal Multimodal Diagnostic Agent) and several reference IEEE papers. The goal is to create a unified paper that synthesizes the observations, methodologies, and results from the provided materials into a single cohesive document. What Will Be Provided: Main Project Details: Documentation/summary of the "Causal Multimodal Diagnostic Agent" project. Reference Papers: A list of IEEE-standard papers related to the topic. Scope of Work: You are required to: Review: Read the provided project details and the additional reference papers. Synthesize: Combine the observations, methods, and findings from all provided sources. Draft: Write a stru...

    €36 Average bid
    11 bids
    Explainable AI for Classification
    Ended

    Build a high-performance binary classifier using multimodal data: • images • tabular features. The model must incorporate Explainable AI (XAI) in training and use an advanced fusion technique.

    €255 Average bid
    37 bids

    I have a half-finished manuscript on MedXpert AI, our multimodal clinical decision assistant, that needs to be transformed into a fully developed research paper. The core emphasis must remain on the system’s technical implementation details, written in a formal academic style with clear sections, solid citations and polished language suitable for submission to a peer-reviewed venue. In parallel, I also need a compact, five-page survey paper that distils and showcases the most innovative features of MedXpert AI. This survey is meant to sit alongside the main article as a quick, literature-backed overview that highlights why our approach is novel compared with existing clinical decision assistants. Deliverables • Finalised technical paper on MedXpert AI’s implemen...

    €29 Average bid
    5 bids

    ...(SPP profile) between Head Unit and Pocket Unit. Wi-Fi disabled on head unit. • Image Preprocessing: Grayscale conversion and JPEG compression to minimize data size. • Network Logic: 4G/LTE preference. If signal drops or timeout (>10s) occurs, trigger the error vibration immediately. • Target Latency: <5 seconds end-to-end (from capture to audio start). D. Software Architecture • Function: Multimodal Image Analysis. o Instead of local OCR, the system must send the compressed image directly to a Vision-capable Cloud AI (e.g., GPT-4o, Gemini Pro Vision). o This allows the logic of "where to start/stop reading" to be controlled via the prompt based on visual layout and finger position. • AI: Cloud AI supported (API-based). No API keys hardc...

    €488 Average bid
    130 bids
    Offline Raspberry Pi AI Tutor
    Ended

    I want a self-contained AI tutor that runs entirely on a Raspberry Pi Zero W. Once installed it should let students ask anything, from world facts to coding techniques, web-design tips, image-gener...and image formats on demand. • Local inference only: TensorFlow Lite, ONNX-runtime, Stable Diffusion-Lite or similar lightweight frameworks are fine, as long as startup scripts and dependencies are provided. Acceptance for hand-over – Ready-to-run model files and optimized weights. – Python (or Bash) launcher that handles user input by voice or text and returns multimodal output. – Example session demonstrating a coding question, an image-based question, and an auto-generated mixed quiz. – Clear setup guide tested on a fresh Raspb...

    €539 Average bid
    13 bids

    ...upload, retrieval, and Q&A Integrate functionality into our Angular front end and Laravel backend Enable the bot to display screenshots, images, or short instructional clips when helpful guide us in generating screenshots or visual steps on the fly after learning our application workflow Preferred Skills Strong experience with RAG pipelines, vector databases, and LLM tuning Familiarity with multimodal AI (text + images) Ability to create or guide demonstration clips or step-by-step visuals To Apply Please provide: Examples of similar AI or RAG projects A brief outline of how you would approach improving our bot Your hourly rate or project-based pricing...

    €19 / hr Average bid
    77 bids
    Senior AI Engineer
    Ended

    ...years multi-agent systems Type: Contract ROLE SUMMARY We are seeking a highly experienced Senior AI Engineer to lead the development of production-grade multi-agent AI systems, backend services, LLM orchestration, and full-stack AI-driven product experiences. The ideal candidate possesses deep technical expertise across Python backends, multi-agent workflows, LLM integrations, RAG pipelines, multimodal processing, and frontend engineering. KEY RESPONSIBILITIES ● Design and implement scalable multi-agent architectures: supervisor patterns, orchestrators, shared memory/state, workflow dependencies, checkpointing, retries, and debuggability. ● Build agent-driven coding workflows with hooks, background tasks, and toolchains integrating AI coding tools. ● Develop high-performance Pyth...

    €14 / hr Average bid
    40 bids

    Describe what you need I’m building a system that can sense emotional signals in a live conversation — from audio, video and speech — and return a synchronized emotional stream for a weekly podcast. I need one engineer who can build a real-time multimodal pipeline from scratch. The role is hands-on: prototype fast, ship weekly improvements, and make it work end-to-end. This is inference only, not model training. The System (High-Level) The pipeline will: Capture 2 video feeds from cameras extract facial/body emotional signals timestamp frames Capture audio input from a dual mic receiver run emotion model track tone/tension/stress cues timestamp stream Run Whisper (or similar) in real time speech-to-text confidence scores timestamped text segments S...

    €24 / hr Average bid
    73 bids

    need this done in ONE day 60-Second Fast-Paced Product Demo (Travel Tech Platform) PROJECT OVERVIEW BookSmart24 creates multimodal routes (Train replaces Flight / Train→Plane combos). We need a 60-second horizontal demo showing these functions in a fast, clean, TikTok-paced style – but with a professional, investor-grade look. WHAT YOU WILL DO 1. Record our UI (we give route instructions): – search input – loading animation (train + plane) – SmartChoice results – train→plane combined itinerary – unified checkout 2. Edit the video: – TikTok-style pacing (fast, crisp, smooth) – clean modern transitions – light zooms on UI elements – minimal text overlays – AI voice-over (script provided) – soft te...

    €129 Average bid
    49 bids

    ...must be reported with standard clinical metrics—AUC, sensitivity, specificity—on a held-out test set. • I need concise documentation so that hospital staff can reproduce the results, plus a short technical report explaining the architecture choices and how the attention maps can be visualised for clinical insight. If you have prior experience with medical imaging, EEG feature engineering, or multimodal transformers, I’d like to see examples. Otherwise, let me know how you plan to tackle regulatory-grade data handling and the small-sample challenges inherent to psychiatric datasets. Deliverables that will mark the job complete: 1. Full, commented source code and environment file. 2. Trained model weights and a reproducible inference notebook. 3. Docu...

    €376 Average bid
    3 bids

    Project Title: Causal Multimodal Diagnostic Agent (Medical AI) – Code + Frontend + Research Paper. Budget: 8,000 (Fixed Price). Deadline: December 20, 2025 (Strict). Project Overview: I am looking for an experienced AI/ML developer and researcher to build a Causal Multimodal Diagnostic Agent (CMDA). This system must integrate medical imaging (Chest X-rays) and clinical text reports to diagnose diseases, using causal graph learning to eliminate spurious correlations. The project requires delivering a fully working codebase, a basic interactive frontend for testing, and a complete, high-quality research paper suitable for publication. Key Technical Requirements (Based on Project Design): Multimodal Inputs: Image Encoder: ResNet50 or Vision Transformer (ViT) for...

    €70 Average bid
    16 bids

    We need a developer to build a fully offline AI companion pipeline that integrates directly with Unity. The system must include a Python function that uses Qwen2.5-VL to generate a clean, one-sentence caption from a base64-encoded screenshot using the correct Hugging Face multimodal workflow. A local RAG component (FAISS or Chroma) should preload our text documents, embed them locally, and retrieve the most relevant chunks using both the scene caption and the player’s question. A final response generator must then combine the caption, the retrieved RAG context, and the player’s query to produce a concise, grounded, one-sentence answer from the AI companion. On the Unity side, we need InputActions for the Q key, screenshot capture and base64 encoding, a UnityWebRequest PO...

    €123 Average bid
    29 bids

    I need a skilled expert to implement and publish a CNN and GNN-based fusion model on a multimodal dataset of images and text. The primary goal is to improve classification accuracy. Requirements: - Expertise in CNNs and GNNs - Experience with multimodal datasets - Strong background in image and text data processing - Proven track record in model implementation and publication Please include relevant past work and experience in your application.

    €509 Average bid
    11 bids

    We are an AI consulting services company with several potential clients in our sales pipeline. We aim to be the single 'source of truth' for clients when it comes to creating bespoke AI automation strategies and products. For our first client, we are developing a multimodal conversational AI app with real-time chat, secure payments, session-based billing, wallet logic, transcript storage, and strong privacy controls. We already have the frontend foundation; we now need a skilled engineer to harden the backend, integrate payments, and make the application secure, scalable, and deployable. This role requires someone comfortable with Node/Express or FastAPI, secure payment integrations, LLM proxying, and production-grade backend architecture. Here are the core responsibili...

    €1150 Average bid
    57 bids

    We need a developer to build a fully offline AI companion pipeline that integrates directly with Unity. The system must include a Python function that uses Qwen2.5-VL to generate a clean, one-sentence caption from a base64-encoded screenshot using the correct Hugging Face multimodal workflow. A local RAG component (FAISS or Chroma) should preload our text documents, embed them locally, and retrieve the most relevant chunks using both the scene caption and the player’s question. A final response generator must then combine the caption, the retrieved RAG context, and the player’s query to produce a concise, grounded, one-sentence answer from the AI companion. On the Unity side, we need InputActions for the Q key, screenshot capture and base64 encoding, a UnityWebRequest PO...

    €134 Average bid
    41 bids

    I am looking for an experienced Deep Learning / Medical AI Expert to develop a complete multimodal pipeline for early-stage (prodromal) Parkinson’s disease classification. The project consists of four phases: MRI data curation + 3D CNN imaging-only model; explainability using LRP; clinical feature-based model with attention

    €142 Average bid
    16 bids

    We are looking for a technology commercialization and IP licensing specialist to help bring a patent‑pending, multi‑module AI system to market through licensing, joint ventures, or strategic partnerships. The invention introduces a predictive AI engine that anticipates user intent, refines prompts before execution, and delivers real‑time, multimodal results—defining a new class of adaptive AI. The system is patent pending (U.S. Non‑Provisional) and supported by a trademark filing for “IQ Prompt.” Key Modules: Predictive Interaction Engine, Recursive Refinement Core, IQPROMPT Widgets, Predictive Keyboard, and Developer Integration API. Markets: SaaS, AI assistants, workflow automation, and adaptive interface systems. Your Role: Identify markets and potent...

    €30 / hr Average bid
    46 bids

    Qualitative research to code 10 short TikTok videos using ATLAS.ti. The task involves watching each video, transcribing spoken and written text, and coding multimodal features (linguistic, visual, auditory, gestural, and spatial modes). The goal is to identify and organize recurrent multimodal patterns across the videos and to deliver a completed project file with a summary of the codes and patterns.

    €174 Average bid
    1 bid

    Top articles from the multimodal community