
voice-localization

guia-matthieu

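About

This skill provides guidance on AI-based voice synthesis for localizing audio content into multiple languages while preserving a brand's voice identity. It is well suited to video dubbing, localizing marketing materials, and producing multilingual training content. Developers can use it to keep audio quality and character consistent as they expand into new languages.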

Quick Install

Claude Code

Recommended (default):
npx skills add guia-matthieu/clawfu-skills -a claude-code

Alternative (plugin command):
/plugin add https://github.com/guia-matthieu/clawfu-skills

Alternative (Git clone):
git clone https://github.com/guia-matthieu/clawfu-skills.git ~/.claude/skills/voice-localization

Copy one of these commands and paste it into Claude Code to install the skill.

Documentation

AI Voice Localization

Scale your brand voice across multiple languages using AI voice synthesis, maintaining consistent character and quality for global content.

When to Use This Skill

  • Expanding video content to new language markets
  • Creating multilingual courses or training
  • Localizing ads and marketing videos
  • Dubbing existing content for international audiences
  • Building consistent global brand voice
  • Deciding between dubbing vs. subtitles

Methodology Foundation

Source: ElevenLabs Multilingual + Global Content Best Practices

Core Principle: True localization means the same perceived person speaks each language natively—not a translated voice, but a voice that sounds local while maintaining brand character. AI voice synthesis enables this at scale by preserving voice identity while adapting pronunciation and rhythm to each language.

Why This Matters: Global content traditionally required a separate voice actor per language, which cost brand consistency. AI voice localization keeps the same "person" across 29+ languages, creating a unified brand experience worldwide while cutting production costs by roughly 70% or more (see the cost comparison in Step 6).

What Claude Does vs What You Decide

| Claude Does | You Decide |
|-------------|------------|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |

What This Skill Does

  1. Maintains voice identity across languages - Same character, different language
  2. Handles cultural adaptation - Beyond translation to localization
  3. Manages multilingual production - Efficient workflows for many languages
  4. Ensures quality per market - Native speaker validation
  5. Calculates ROI - Traditional dubbing vs. AI localization costs

How to Use

Plan Localization Project

Help me plan voice localization for [content].
Source language: [original]
Target languages: [list]
Content type: [video/audio/course]
Volume: [duration/number of assets]

Evaluate Localization Approach

Should I use AI voice localization or traditional dubbing?
Content: [describe]
Markets: [target countries]
Budget: [range]
Timeline: [deadline]

Instructions

When localizing voice content, follow this methodology:

Step 1: Assess Localization Needs

Determine the right approach for your content.

## Localization Decision Matrix

### When to Use AI Voice Localization

✓ Same brand voice needed across markets
✓ Frequent content updates (efficiency matters)
✓ Educational/informational content
✓ Budget constraints
✓ Quick turnaround needed
✓ 5+ languages needed

### When to Use Traditional Dubbing

✓ Character-driven content (emotions critical)
✓ One-time major production
✓ Markets expect dubbed content (Germany, France)
✓ Complex lip-sync requirements
✓ Budget allows $1,000+ per language

### When to Use Subtitles Instead

✓ Documentary/interview content
✓ Authenticity of original voice matters
✓ Lowest budget option
✓ Markets prefer subtitles (Nordics, Netherlands)
✓ Legal/compliance content (exact words matter)

### Hybrid Approach
Hero content → Traditional dubbing
Supporting content → AI localization
Supplementary → Subtitles
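
One way to read the matrix above is as a small rule set. The sketch below is a hypothetical encoding, not part of the skill: `ContentProfile`, its fields, and `suggest_approach` are illustrative names, and the rule order (exact wording first, then emotion plus budget, then AI by default) is an assumption. Only the $1,000 threshold comes from the matrix.

```python
from dataclasses import dataclass


@dataclass
class ContentProfile:
    """Hypothetical summary of one piece of content (illustrative fields)."""
    emotion_critical: bool      # character-driven, emotional delivery matters
    exact_words_matter: bool    # legal/compliance, or original voice must stay
    budget_per_language: float  # USD available per target language


def suggest_approach(profile: ContentProfile) -> str:
    """Return a first-pass suggestion; a human still makes the final call."""
    # Assumption: when exact wording or the original voice matters, keep the
    # source audio and add subtitles.
    if profile.exact_words_matter:
        return "subtitles"
    # The matrix lists $1,000+ per language as the bar for traditional dubbing.
    if profile.emotion_critical and profile.budget_per_language >= 1000:
        return "traditional dubbing"
    # Everything else falls through to AI voice localization.
    return "AI voice localization"
```

For a hybrid project, the same function could be called once per asset so that hero content and supporting content get different answers.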

Step 2: Select Languages Strategically

Prioritize languages based on market opportunity.

## Language Prioritization Framework

### Tier 1: High Volume Languages (500M+ speakers)
| Language | Global Speakers | Key Markets |
|----------|----------------|-------------|
| English | 1.5B | Global |
| Mandarin | 1.1B | China |
| Spanish | 550M | LATAM, Spain |
| Hindi | 600M | India |

### Tier 2: High Value Languages
| Language | Economic Value | Markets |
|----------|---------------|---------|
| German | High GDP | DACH |
| French | Wide geographic reach | France, Africa |
| Japanese | High spending | Japan |
| Portuguese | Large market | Brazil |

### Tier 3: Strategic Languages
| Language | Strategic Value | Markets |
|----------|----------------|---------|
| Arabic | Growing middle class | MENA |
| Korean | Tech-forward | South Korea |
| Italian | Fashion/luxury | Italy |
| Dutch | High English proficiency | Benelux |

### ElevenLabs Supported Languages (29+)
English, Spanish, French, German, Italian, Portuguese,
Polish, Dutch, Hindi, Arabic, Chinese, Japanese, Korean,
Turkish, Swedish, Indonesian, Filipino, Malay, Russian,
Czech, Danish, Finnish, Greek, Romanian, Ukrainian,
Vietnamese, Norwegian, Hungarian, Tamil, and more.

Step 3: Prepare Content for Localization

Translation alone isn't enough—prepare for voice adaptation.

## Content Preparation Checklist

### Script Adaptation

**Text expansion/contraction**:
| Language | vs English |
|----------|-----------|
| German | +30% longer |
| French | +15-20% longer |
| Spanish | +15-25% longer |
| Chinese | -30% shorter |
| Japanese | Variable |

**Implications**:
- Video may need re-timing
- Allow flexibility in pacing
- Consider sentence splitting for longer languages
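
The expansion table can be turned into a rough re-timing check. This is a minimal sketch under stated assumptions: spoken duration is assumed to scale with text length, the factors are midpoints of the ranges in the table, Japanese is left out because the table lists it as variable, and the 10% tolerance and the function names are illustrative.

```python
# Midpoints of the ranges in the expansion table (assumption: midpoint is a
# fair single estimate). Japanese is omitted because it is listed as variable.
EXPANSION_VS_ENGLISH = {
    "German": 0.30,
    "French": 0.175,
    "Spanish": 0.20,
    "Chinese": -0.30,
}


def estimated_duration(source_seconds: float, language: str) -> float:
    """Scale the English duration by the language's expansion factor."""
    return source_seconds * (1 + EXPANSION_VS_ENGLISH[language])


def needs_retiming(source_seconds: float, language: str,
                   tolerance: float = 0.10) -> bool:
    """Flag a language whose estimated drift exceeds the tolerance."""
    estimate = estimated_duration(source_seconds, language)
    drift = abs(estimate - source_seconds) / source_seconds
    return drift > tolerance


if __name__ == "__main__":
    for lang in EXPANSION_VS_ENGLISH:
        print(lang, round(estimated_duration(60, lang), 1),
              needs_retiming(60, lang))
```

A flagged language is a prompt to split sentences or adjust pacing, not proof that the video must be re-cut.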

**Localization notes to provide**:
□ Brand terms (don't translate, keep English)
□ Product names (pronunciation guide)
□ Numbers (format varies by locale)
□ Dates (format varies by locale)
□ Currency (localize amounts)
□ Cultural references (adapt or explain)

### Voice Consistency Notes

**Preserve across languages**:
- Character/personality
- Energy level
- Authority/warmth balance
- Pace relative to content

**Adapt per language**:
- Natural rhythm and cadence
- Pronunciation of brand terms
- Formal/informal register (varies by culture)

Step 4: Production Workflow

Efficient process for multilingual voice production.

## Multilingual Production Pipeline

### Phase 1: Source Production
1. Finalize English script
2. Record/generate English voice
3. Lock timing and pacing
4. Create master video/audio

### Phase 2: Translation
1. Professional translation (not machine)
2. Localization review (cultural adaptation)
3. Timing adaptation (fit original duration)
4. Brand term glossary enforcement

### Phase 3: Voice Generation

**Per language**:
  1. Load translated script
  2. Apply same voice settings as source
  3. Generate voice in target language
  4. Check pronunciation of brand terms
  5. Adjust pacing if needed
  6. Review for naturalness
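
The per-language steps above can be sketched as a loop. The synthesis call is deliberately injected as a parameter: `synthesize` is a hypothetical callable standing in for whatever voice tool is used, and its `(text, language, settings)` signature is an assumption, not the API of ElevenLabs or any other provider.

```python
from typing import Callable, Dict

# Hypothetical signature: (script_text, language_code, voice_settings) -> audio.
Synthesizer = Callable[[str, str, dict], bytes]


def generate_all(scripts: Dict[str, str], voice_settings: dict,
                 synthesize: Synthesizer) -> Dict[str, dict]:
    """Generate one audio track per language with identical voice settings."""
    results = {}
    for language, text in scripts.items():
        # Pass a copy so no call can mutate the shared source settings.
        audio = synthesize(text, language, dict(voice_settings))
        results[language] = {
            "audio": audio,
            # Pronunciation and naturalness cannot be verified in code,
            # so every track starts out pending native-speaker review.
            "needs_native_review": True,
        }
    return results
```

Passing the synthesizer in keeps the workflow testable with a stub and independent of any one vendor.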

### Phase 4: Quality Control

**Native speaker review checklist**:
□ Natural pronunciation
□ Correct emphasis and intonation
□ Brand terms handled correctly
□ No awkward phrasing
□ Appropriate formality level
□ Cultural appropriateness

### Phase 5: Integration
1. Replace audio track in video
2. Re-sync if timing changed
3. Update text overlays
4. Localize captions/subtitles
5. Final review per language

Step 5: Quality Assurance

Ensure each language meets standards.

## Localization QA Framework

### Technical QA
□ Audio levels consistent across languages
□ No clipping or distortion
□ Background music balanced correctly
□ Transitions smooth
□ Sync with video acceptable
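
The first two technical checks can be approximated in code. This is a simplified sketch: it assumes each track is already decoded to float samples in [-1.0, 1.0], uses plain RMS level rather than a perceptual loudness measure, and the 2 dB spread and 0.999 clipping threshold are illustrative values, not standards from this skill.

```python
import math
from typing import Dict, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dBFS for float samples in [-1.0, 1.0]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def has_clipping(samples: Sequence[float], threshold: float = 0.999) -> bool:
    """True if any sample reaches (or nearly reaches) full scale."""
    return any(abs(s) >= threshold for s in samples)


def levels_consistent(tracks: Dict[str, Sequence[float]],
                      max_spread_db: float = 2.0) -> bool:
    """True if the loudest and quietest language tracks are close in level."""
    levels = [rms_dbfs(samples) for samples in tracks.values()]
    return max(levels) - min(levels) <= max_spread_db
```

A production pipeline would more likely measure integrated loudness with a dedicated audio tool; the sketch only shows the shape of the check.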

### Linguistic QA
□ Translation accuracy (spot check 10%)
□ Natural flow and rhythm
□ Brand voice maintained
□ Technical terms correct
□ No machine-translation artifacts
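
The 10% translation spot check can be drawn reproducibly. A minimal sketch, assuming the script is already split into segments with sortable IDs; `spot_check_sample` and its parameters are illustrative names, and rounding up with at least one segment is an assumption.

```python
import random
from typing import List, Optional, Sequence


def spot_check_sample(segment_ids: Sequence, percent: int = 10,
                      seed: Optional[int] = None) -> List:
    """Pick about `percent`% of segments (rounded up, at least one)."""
    n = len(segment_ids)
    if n == 0:
        return []
    # Integer ceiling of n * percent / 100 avoids float rounding surprises.
    k = max(1, (n * percent + 99) // 100)
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return sorted(rng.sample(list(segment_ids), k))
```

Recording the seed alongside the QA report lets a second reviewer audit exactly the same segments.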

### Cultural QA
□ No offensive content for market
□ References appropriate
□ Humor/idioms adapted correctly
□ Visual content appropriate
□ Call-to-action localized

### Native Speaker Sign-Off
For each language:
- [ ] Spanish (Reviewer: _____) ☐ Approved
- [ ] French (Reviewer: _____) ☐ Approved
- [ ] German (Reviewer: _____) ☐ Approved
- [ ] [Add languages...]

Step 6: Calculate ROI

Compare AI localization to traditional approaches.

## Localization Cost Comparison

### Traditional Dubbing (per language)

| Component | Cost |
|-----------|------|
| Translation | $0.15/word |
| Voice talent | $300-1,000/hour finished |
| Studio time | $100-200/hour |
| Direction | $50-100/hour |
| Engineering | $50-100/hour |

**Example**: 10-minute video (1,500 words)
- Translation: $225
- Voice talent: $400
- Studio: $200
- Direction: $150
- Engineering: $100
- **Total: ~$1,075 per language**

### AI Voice Localization

| Component | Cost |
|-----------|------|
| Translation | $0.15/word |
| ElevenLabs Pro | $99/mo (plan quota applies; check current pricing) |
| QA review | $50-100/language |

**Example**: 10-minute video (1,500 words)
- Translation: $225
- Voice generation: ~$0 (within plan)
- QA review: $75
- **Total: ~$300 per language**

### ROI Summary

| Languages | Traditional | AI Localization | Savings |
|-----------|-------------|-----------------|---------|
| 5 | $5,375 | $1,500 | 72% |
| 10 | $10,750 | $3,000 | 72% |
| 20 | $21,500 | $6,000 | 72% |

**Bottom line**: In these examples AI localization saves about 72% vs. traditional dubbing, before adding the $99/mo subscription fee.
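
The figures in the summary table can be reproduced from the two per-language examples. The sketch below uses only the dollar amounts given above; the function name and the optional `subscription_months` parameter are illustrative additions that show how the monthly plan fee shifts the result.

```python
# Per-language costs for the 10-minute (1,500-word) example, in USD.
TRADITIONAL = {"translation": 225, "voice_talent": 400, "studio": 200,
               "direction": 150, "engineering": 100}
AI = {"translation": 225, "voice_generation": 0, "qa_review": 75}
PLAN_FEE_PER_MONTH = 99  # subscription price quoted in the table above


def compare_costs(languages: int, subscription_months: int = 0):
    """Return (traditional_total, ai_total, savings_percent)."""
    traditional = sum(TRADITIONAL.values()) * languages
    ai = sum(AI.values()) * languages
    # The summary table leaves the plan fee out; add it here on request.
    ai += PLAN_FEE_PER_MONTH * subscription_months
    savings = round(100 * (1 - ai / traditional))
    return traditional, ai, savings


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, compare_costs(n), compare_costs(n, subscription_months=1))
```

Because both totals scale linearly with the number of languages, the savings percentage stays flat unless a fixed cost such as the subscription is included.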

Examples

Example 1: E-Learning Course Localization

Context: Localizing a 20-video course to 5 languages.

Input:

We have a 20-video SaaS training course in English (total 3 hours). Need to localize to Spanish, French, German, Portuguese, and Japanese. What's the best approach?

Output:

Course Localization Plan

Project Scope

  • Content: 20 videos, 3 hours total
  • Source: English
  • Targets: Spanish, French, German, Portuguese, Japanese
  • Content type: Educational (clarity > emotion)

Recommended Approach: AI Voice Localization

Why AI is ideal here:

  • Educational content (fact delivery, not drama)
  • 5 languages (efficiency critical)
  • Likely ongoing updates (need repeatable process)
  • Consistent "instructor" voice across all versions

Implementation Plan

Platform: ElevenLabs Pro ($99/mo)

Voice Strategy:

  • Clone original instructor OR design consistent voice
  • Same voice speaks all 6 languages (English plus the 5 targets)
  • Maintains authority and teaching style

Workflow per Video:

  1. Translation (outsource)

    • Professional translation (not machine)
    • Provide glossary of product terms
    • Timing guidance for text expansion
  2. Voice Generation

    • Generate all 5 languages from same voice
    • ~15 minutes per language per video
    • Total: ~25 hours voice generation
  3. QA Review

    • Native speaker review per language
    • Focus: pronunciation, naturalness, accuracy
    • Budget 1 hour review per language per video
  4. Video Integration

    • Replace audio tracks
    • Adjust timing if needed
    • Verify captions match

Timeline

| Phase | Duration | Notes |
|-------|----------|-------|
| Translation (all) | 2 weeks | Parallel |
| Voice generation | 1 week | ~5 hours/day |
| QA review | 2 weeks | Parallel per language |
| Integration | 1 week | Parallel |
| Total | 5-6 weeks | With buffer |

Budget Estimate

| Item | Cost |
|------|------|
| Translation (5 lang × 27k words) | $20,250 |
| ElevenLabs (2 months) | $200 |
| QA review (5 lang × 20 videos × $30) | $3,000 |
| Integration (20 hours @ $50) | $1,000 |
| Total | ~$24,500 |

vs. Traditional dubbing: ~$107,500 (20 videos × 5 languages × ~$1,075 each), about 77% savings
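
The budget arithmetic for this example can be checked with a short script. It is a sketch under two assumptions: the 27k word count comes from 150 words per minute (the rate implied by 1,500 words per 10-minute video in Step 6), and the traditional figure is 20 videos × 5 languages at the ~$1,075 per-video total from Step 6. The function name is illustrative.

```python
WORDS_PER_MINUTE = 150  # implied by Step 6: 1,500 words per 10-minute video


def course_budget(minutes: int = 180, videos: int = 20, languages: int = 5):
    """Return (line_items, ai_total, traditional_total, savings_percent)."""
    words = minutes * WORDS_PER_MINUTE
    items = {
        # $0.15/word, computed in integer cents to avoid float drift.
        "translation": languages * words * 15 // 100,
        "elevenlabs": 200,                        # 2 months, as budgeted above
        "qa_review": languages * videos * 30,     # $30 per language per video
        "integration": 20 * 50,                   # 20 hours at $50/hour
    }
    ai_total = sum(items.values())
    # Assumption: each video costs ~$1,075 per language when dubbed.
    traditional_total = videos * languages * 1075
    savings = round(100 * (1 - ai_total / traditional_total))
    return items, ai_total, traditional_total, savings


if __name__ == "__main__":
    print(course_budget())
```

The exact AI total comes to $24,450, which the plan rounds to ~$24,500.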


Example 2: Global Ad Campaign

Context: 30-second ad needs localization for global launch.

Input:

We're launching a 30-second video ad globally. Need 10 language versions in 2 weeks. Budget is $15,000.

Output:

Ad Localization: Rapid Global Launch

Constraints

  • 30-second ad (tight timing)
  • 10 languages
  • 2-week deadline
  • $15,000 budget (~$1,500/language)

Recommended: Hybrid Approach

Tier 1 (Hero Markets) - Traditional Dubbing

  • English (source)
  • Spanish (largest reach)
  • German (high value)
  • French (high value)

Tier 2 (Scale Markets) - AI Localization

  • Portuguese, Italian, Dutch, Polish, Japanese, Korean

Rationale

  • Hero markets get premium treatment
  • AI handles scale efficiently
  • Both meet deadline

Production Schedule

Week 1:

| Day | Task |
|-----|------|
| 1-2 | All translations complete |
| 2-3 | Traditional dubbing sessions (4 languages) |
| 3-4 | AI voice generation (6 languages) |
| 4-5 | QA review all versions |

Week 2:

| Day | Task |
|-----|------|
| 1-2 | Revisions and fixes |
| 3-4 | Video integration all versions |
| 5 | Final review and delivery |

Budget Allocation

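| Item | Cost |
|------|------|
| Translation (10 × ~120 words; budgeted at ~$180 per language, above the $0.15/word rate in Step 6) | $1,800 |
| Traditional dubbing (4 lang) | $4,800 |
| AI generation (6 lang) | $600 |
| QA review (10 lang) | $2,000 |
| Integration (10 lang) | $2,500 |
| Buffer | $3,300 |
| Total | $15,000 |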

Checklists & Templates

Localization Project Checklist

## Pre-Production
□ Languages selected and prioritized
□ Budget allocated per language
□ Timeline established
□ Translation vendor selected
□ Brand glossary prepared
□ Voice consistency plan defined

## Production
□ Translations complete
□ Translations reviewed for brand terms
□ Voice generated per language
□ Pronunciation verified
□ Timing adjusted if needed

## Quality Assurance
□ Native speaker review complete
□ Technical QA passed
□ Brand guidelines verified
□ Cultural review passed
□ Legal/compliance check (if needed)

## Delivery
□ Files named correctly per language
□ All formats delivered
□ Captions/subtitles provided
□ Documentation complete
□ Source files archived

Brand Glossary Template

## [Brand] Localization Glossary

### Never Translate
| English | Note |
|---------|------|
| [Brand Name] | Keep English, pronunciation: [X] |
| [Product Name] | Keep English |
| [Feature Name] | Keep English, explain in context |

### Translate Consistently
| English | Spanish | French | German |
|---------|---------|--------|--------|
| Dashboard | Panel | Tableau de bord | Dashboard |
| Workflow | Flujo de trabajo | Flux de travail | Arbeitsablauf |
| [Term] | | | |

### Pronunciation Guide
| Term | Pronunciation |
|------|--------------|
| [Brand] | /brănd/ |
| [Feature] | /fē-chər/ |

Skill Boundaries

What This Skill Does Well

  • Structuring audio production workflows
  • Providing technical guidance
  • Creating quality checklists
  • Suggesting creative approaches

What This Skill Cannot Do

  • Replace audio engineering expertise
  • Make subjective creative decisions
  • Access or edit audio files directly
  • Guarantee commercial success

References

  • ElevenLabs. "Multilingual Voice Synthesis" - Platform documentation
  • CSA Research. "Global Content Strategy" - Localization best practices
  • Unbabel. "The State of Localization" - Industry benchmarks
  • Nimdzi. "Localization Market Research" - Cost and ROI data


Skill Metadata (Internal Use)

name: voice-localization
category: audio
subcategory: voice
version: 1.0
author: MKTG Skills
source_expert: ElevenLabs, Localization Best Practices
source_work: Multilingual Content Production
difficulty: intermediate
estimated_value: 70%+ cost savings vs. traditional dubbing
tags: [localization, multilingual, dubbing, ai-voice, global]
created: 2026-01-26
updated: 2026-01-26

GitHub Repository

guia-matthieu/clawfu-skills
Path: skills/audio/voice-localization
Topics: ai-skills, anthropic, claude-code, claude-skills, marketing, mcp-server
