Audio Extractor
Extracts audio from video files to WAV format for further analysis. Converts to mono 16kHz PCM format optimized for speech/energy analysis.
Use Cases
- Extracting audio for speech analysis
- Preparing audio for energy calculation
- Converting video audio to standard format
Usage
python3 /root/.claude/skills/audio-extractor/scripts/extract_audio.py \
--video /path/to/video.mp4 \
--output /path/to/audio.wav
Parameters
--video: Path to input video file
--output: Path to output WAV file
--sample-rate: Audio sample rate in Hz (default: 16000)
--duration: Optional duration limit in seconds (default: full video)
Output Format
- Format: WAV (PCM 16-bit signed)
- Channels: Mono
- Sample rate: 16000 Hz (default)
Dependencies
Example
# Extract first 10 minutes of audio
python3 /root/.claude/skills/audio-extractor/scripts/extract_audio.py \
--video lecture.mp4 \
--duration 600 \
--output audio.wav
Notes
- Output is always mono for consistent analysis
- 16kHz sample rate is sufficient for speech analysis and reduces file size
- Supports any video format that ffmpeg can read
Skill Companion Files
Additional files collected from the skill directory layout.
scripts/extract_audio.py
#!/usr/bin/env python3
"""
Audio extractor - extracts audio from video to WAV format.
"""
import argparse
import os
import subprocess
def extract_audio(video_path, output_path, sample_rate=16000, duration=None):
"""Extract audio from video to WAV format."""
cmd = [
"ffmpeg",
"-i",
video_path,
"-vn", # No video
"-acodec",
"pcm_s16le", # 16-bit PCM
"-ar",
str(sample_rate), # Sample rate
"-ac",
"1", # Mono
]
if duration:
cmd.extend(["-t", str(duration)])
cmd.extend([output_path, "-y"])
subprocess.run(cmd, check=True, capture_output=True)
return output_path
def main():
parser = argparse.ArgumentParser(description="Extract audio from video")
parser.add_argument("--video", required=True, help="Path to input video")
parser.add_argument("--output", required=True, help="Path to output WAV file")
parser.add_argument("--sample-rate", type=int, default=16000, help="Sample rate (default: 16000)")
parser.add_argument("--duration", type=int, default=None, help="Duration limit in seconds")
args = parser.parse_args()
print(f"Extracting audio from: {args.video}")
print(f"Sample rate: {args.sample_rate} Hz")
if args.duration:
print(f"Duration limit: {args.duration}s")
extract_audio(args.video, args.output, args.sample_rate, args.duration)
# Get output file size
size_mb = os.path.getsize(args.output) / (1024 * 1024)
print(f"\nAudio extracted to: {args.output}")
print(f"File size: {size_mb:.2f} MB")
if __name__ == "__main__":
main()