🤖 AI Summary
Rhyme CTRL (open-source repo: rap-rhyme-highlighter) is a web app that analyzes rap lyrics with phoneme-aware rhyme detection and uses OpenAI Whisper to align lyrics to audio at the word level, producing karaoke-style, exportable videos. It maps words to CMU Pronouncing Dictionary phonemes, groups rhyme "families" by pronunciation rather than spelling, and offers five modes: Text (analysis only), Auto (upload audio and lyrics for Whisper alignment), Finetune (manual timing and rhyme edits), Perform (live display), and Capture (cinematic 1280×720 MP4 export via Playwright and FFmpeg). The project ships as a Flask app (Python 3.9+), stores tracks in SQLite, and includes a capture_video.py script that automates frame capture and encoding.
Technically, rhyme grouping uses a multi-factor similarity score: Stressed Vowel Match 65% (primary nucleus), Tail Similarity 25% (post-vowel consonants), and Head Similarity 15% (prefix patterns). Words whose score exceeds a configurable threshold (default 0.6; 0.4 for loose matching, 0.8 for strict) are clustered, and families with fewer than three members are filtered out to reduce noise. Whisper supplies the word-level timestamps that drive synchronized highlighting and video export; the model size is configurable from tiny to large, trading speed for accuracy. Running the tool requires disk space and RAM for the Whisper model (~2GB for small) plus FFmpeg. It is useful for lyricists, producers, visualization researchers, and dataset curators who need pronunciation-aware rhyme analysis and aligned audiovisual outputs.
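The scoring and clustering described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names, the Jaccard overlap used for head and tail similarity, and the single-link (union-find) clustering are all hypothetical choices, and the tiny `PHONES` table is a hand-copied excerpt in CMU dictionary style rather than the full dictionary. Because the weights as listed sum to 1.05, the sketch divides by their total so scores stay in the 0 to 1 range.

```python
from typing import Dict, List, Tuple

# Weights as listed in the summary; they sum to 1.05, so scores are normalized.
WEIGHTS = {"vowel": 0.65, "tail": 0.25, "head": 0.15}

# Small hand-written excerpt in CMU dictionary style (ARPAbet, stress digits).
PHONES: Dict[str, List[str]] = {
    "through": ["TH", "R", "UW1"],
    "blue": ["B", "L", "UW1"],
    "crew": ["K", "R", "UW1"],
    "tough": ["T", "AH1", "F"],
    "rough": ["R", "AH1", "F"],
    "stuff": ["S", "T", "AH1", "F"],
    "though": ["DH", "OW1"],
}


def split_at_stress(phones: List[str]) -> Tuple[List[str], str, List[str]]:
    """Split into (head, stressed vowel, tail) at the last primary stress."""
    idx = max((i for i, p in enumerate(phones) if p.endswith("1")), default=None)
    if idx is None:  # no primary stress: fall back to the last vowel of any stress
        idx = max((i for i, p in enumerate(phones) if p[-1].isdigit()), default=None)
    if idx is None:  # no vowel at all
        return phones, "", []
    return phones[:idx], phones[idx][:-1], phones[idx + 1:]


def overlap(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of two phoneme lists; two empty lists count as a match."""
    if not a and not b:
        return 1.0
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def similarity(w1: str, w2: str) -> float:
    """Weighted rhyme score in [0, 1] for two words present in PHONES."""
    h1, v1, t1 = split_at_stress(PHONES[w1])
    h2, v2, t2 = split_at_stress(PHONES[w2])
    vowel = 1.0 if v1 and v1 == v2 else 0.0
    raw = (WEIGHTS["vowel"] * vowel
           + WEIGHTS["tail"] * overlap(t1, t2)
           + WEIGHTS["head"] * overlap(h1, h2))
    return raw / sum(WEIGHTS.values())


def rhyme_families(words: List[str], threshold: float = 0.6,
                   min_size: int = 3) -> List[List[str]]:
    """Link words scoring above threshold, then drop families below min_size."""
    parent = {w: w for w in words}

    def find(w: str) -> str:
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if similarity(a, b) > threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


if __name__ == "__main__":
    print(rhyme_families(list(PHONES)))
```

In this sketch, "through", "tough", and "though" share the spelling "ough" but fall into different groups because their stressed vowels differ, while "though" is dropped entirely since it has no partners and its family falls below the three-member minimum.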