
Closed
Posted
LLM – AI Quality Analyst (Personalization)
Language Specialist: [Chinese, Korean, Japanese, Thai]
Remote | Contract Opportunity

Special Requirements
• [Chinese, Korean, Japanese, Thai] Proficiency: Ability to read and write in [Chinese, Korean, Japanese, Thai] with a high degree of comprehension, as [Chinese, Korean, Japanese, Thai] is the focus language for this project.
• Personal Account Usage: Willingness to use your primary personal Google account (not a testing account) and enable personal data sources for a genuine assessment.
• Schedule Flexibility: Full-time availability in your local time zone is required. We are staffing a global, 24-hour operations team. Must maintain 4 hours of overlap with the PST time zone.
• Exceptional Analytical Thinking: Demonstrated ability to evaluate nuanced and ambiguous AI responses, specifically assessing personalization quality.

Vetting Process
• Interest check form + Assessment

Role Overview
As an AI Quality Analyst, you will evaluate a new personalization feature for Gemini. You will assess how well the model uses information from your past Gemini conversations, Gmail, Google Search, and YouTube activity to make responses more relevant and helpful. This role requires a unique blend of creativity and analytical rigor. You will actively design prompts from the perspective of your own personal experiences, then use your analytical skills to assess the quality of the model's personalized responses, evaluating dimensions like Grounding, Integration, and Helpfulness.

Key Qualifications
• [Chinese, Korean, Japanese, Thai] Proficiency: Ability to read and write in [Chinese, Korean, Japanese, Thai] with a high degree of comprehension; this is the core requirement for this role.
• Personal Account Usage: Willingness to use your primary personal Google account (not a testing account) and enable personal data sources for genuine assessment.
• Schedule Flexibility: Full-time availability (8 hrs/day) in your local time zone, with 4 hours of overlap with PST.
• Exceptional Analytical Thinking: Ability to evaluate nuanced and ambiguous AI responses, specifically around personalization quality.
• Creative Prompt Engineering: Experience designing creative, multi-turn prompts based on personal context to thoroughly test model capabilities.
• Strong Evaluation Acumen: Understanding of personalization concepts, including ability to identify incorrect personalization, poor inferences, and forced connections.
• Meticulous Attention to Detail: Ability to review Side-by-Side (SxS) model responses and spot subtle differences in naturalness and overnarrating.
• Excellent Written Communication: Superior ability to write clear, concise, and structured rationales for model rankings, explicitly referencing specific turn numbers.
• Feedback Skills: Ability to provide constructive feedback and detailed annotations.
• Collaboration: Strong communication and teamwork skills with the ability to work independently in a remote setting.
• Technical Setup: Desktop or laptop with a reliable internet connection.

Day-to-Day Responsibilities
As part of a dynamic team focused on evaluating the quality of personalized AI interactions, your daily work will include:
• Designing and executing multi-turn conversational prompts (typically 1–5 turns) that require the AI to utilize your personal information and experiences.
• Evaluating model responses based on your intent from the starting prompt, checking if personalization was appropriately applied.
• Analyzing responses for Grounding issues, ensuring claims about you are supported by evidence, not flawed inferences or hallucinations.
• Assessing Integration quality to ensure personal data is woven naturally into responses without robotic "overnarrating".
• Rigorously evaluating and stack-ranking two model responses side-by-side (SxS) to determine which is more helpful, easy to use, and enjoyable.
• Writing clear, defensible rationales for your comparisons, explicitly referencing where issues or positive aspects occurred in the conversation.
• Extracting and verifying "Debug Info" from the model to confirm that chat summaries and data sources were properly utilized.
• Maintaining strict data hygiene by deleting evaluation conversations to prevent them from polluting your future chat history.

Education & Experience
• BS/BA degree or equivalent experience in a relevant field (e.g., Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field).
• Experience in data annotation, AI quality evaluation, content moderation, or a related role is strongly preferred.
Project ID: 40292163
5 proposals
Remote project
Active 28 days ago
5 freelancers are bidding an average of ₹935 INR/hour for this job

Hello, I see you need a Language Specialist proficient in Chinese, Korean, Japanese, or Thai to evaluate AI personalization quality for Gemini using your personal Google account. Your requirement for full-time remote availability with PST overlap and strong analytical skills to assess nuanced AI responses really stands out. You want someone who can design multi-turn prompts based on personal experiences and rigorously evaluate model responses for grounding, integration, and helpfulness, including side-by-side comparisons with clear justifications. The role also demands meticulous attention to detail and a strong grasp of personalization concepts to identify subtle errors or forced connections. I have worked on AI quality evaluation projects where I created personalized prompts and analyzed LLM responses for accuracy and relevance, focusing on grounding and natural integration of user data. I also prepared detailed feedback reports referencing specific conversation turns, which aligns closely with your need for clear rationales and debug info verification. I can commit to full-time availability with the required PST overlap and deliver thorough evaluations and reports within a 4-week timeframe. I’m ready to discuss how I can contribute to your team and help enhance Gemini’s personalization feature.
₹825 INR in 7 days

Hi, Japanese has been part of my life for over 18 years — not just as a language, but as a lens I research, teach, and think through. At the Max Planck Institute I spent two years inside Japanese legal texts, where precision wasn't a preference — one misread inference could unravel an entire argument. That kind of close reading becomes a reflex. I also read Classical Japanese (古文) and Sino-Japanese (漢文訓読), which keeps me honest about how deep and layered the language really is. Modern Japanese is just the surface. What draws me to this role is the chance to bring that same attention to how AI handles Japanese — where cultural nuance, register, and inference matter enormously and are easy to get subtly wrong. Based in Germany, full-time available, 4+ hours PST overlap, real personal Google account. Happy to jump into the assessment whenever you're ready. Looking forward to hearing from you.
₹1,100 INR in 40 days

Indore, India
Member since Feb 23, 2026