
Closed
Posted
Paid on delivery
We are currently looking for pre-recorded call center conversation datasets with the following requirements:
• English – 500 hours
• Hindi – 500 hours
• Spanish – 500 hours

Requirements:
* Agent-side audio only
* No background noise
* No long silence segments
* Audio format: WAV
* 16 kHz, 16-bit or higher
* Unidirectional audio
* Transcription text required

Additionally, please confirm the following details:
1. Is transcription text available?
   * If yes, is it AI-generated or human-annotated?
   * If AI-generated, can it be manually refined?
2. If transcripts are available, please share details regarding:
   * Speaker tags
   * Timestamp information
   * Accuracy rates for:
     • Speaker diarization
     • WER (Word Error Rate)
     • Timestamp alignment
3. Please confirm the audio sampling rate specifications.
4. Has personal information within the recordings been de-identified?
   * If yes, please explain the de-identification process.
5. Scope of Data Use:
   * The dataset will only be used for internal model training purposes.
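As a rough illustration of how the audio format requirements above could be spot-checked on delivery, here is a minimal sketch using Python's standard `wave` module. The function name `check_wav_spec` and its parameters are hypothetical. It assumes that "16 kHz, 16-bit or higher" means at least 16,000 Hz and at least 2 bytes per sample, and that "unidirectional" means a single (mono) channel; neither reading is confirmed by the brief. It does not check background noise or silence.

```python
import wave


def check_wav_spec(path, min_rate_hz=16000, min_sample_width_bytes=2, channels=1):
    """Return a list of problems found; an empty list means the file passes.

    Thresholds are assumptions drawn from the brief, not confirmed values.
    """
    problems = []
    # wave.open raises wave.Error if the file is not a readable PCM WAV file.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        n_channels = wav.getnchannels()

    if rate < min_rate_hz:
        problems.append(f"sample rate {rate} Hz is below {min_rate_hz} Hz")
    if width < min_sample_width_bytes:
        problems.append(f"sample width {width * 8}-bit is below {min_sample_width_bytes * 8}-bit")
    if n_channels != channels:
        problems.append(f"expected {channels} channel(s), found {n_channels}")
    return problems
```

Running this over a sample of the delivered files and collecting any non-empty results would give a quick first-pass report before a deeper review of noise, silence, and transcript quality.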
Project ID: 40476637
2 proposals
Remote project
Active 21 hours ago
2 freelancers are bidding on average ₹6,260,000 INR for this job

I can supply pre-recorded call centre conversation datasets meeting your specifications and am happy to confirm the details you have asked for. Transcription text is available. Our transcripts are produced with AI generation as a first pass and then refined through human annotation, giving you both speed and accuracy. Speaker tags and timestamp information are included as standard, and we can share accuracy benchmarks for speaker diarization, word error rate, and timestamp alignment on request. All audio is delivered in WAV format at 16 kHz 16-bit or higher, unidirectional, with agent side audio only, no background noise, and silence segments removed. Personal information within recordings is fully de-identified prior to delivery through a combination of automated PII detection and manual review, covering names, account numbers, addresses, and contact details. We can supply English, Hindi, and Spanish at 500 hours each. The dataset is licensed for internal model training use only as you have specified. Pricing is negotiable depending on exact volume, annotation depth, and delivery timeline. Please get in touch so we can discuss your requirements and turn around a formal proposal quickly.
₹12,500,000 INR in 7 days

Hello, I have read your description and am interested in your project. I am an expert in audio datasets and have worked with them in the past. If you need quality work, feel free to contact me. Thanks.
₹20,000 INR in 7 days

Jagun, India
Member since May 10, 2026