Video Preview API — Building a video preview API
Ever wanted to quickly preview a song from Spotify or YouTube without watching the entire video? That's what the Video Preview API does: it's a service I built that generates 7-second clips from the middle of music videos.
The Challenge
The idea seemed simple: take a Spotify track or YouTube video and create a short preview. But as I dove deeper, the complexity multiplied:
- Multiple input sources: Spotify IDs, YouTube video IDs
- Video quality optimization: Different resolutions and encoding settings
- Smart video matching: Finding the right video (official, not covers or live versions)
- Performance at scale: Caching and processing optimization
How It Works
The API runs each request through a three-step pipeline that handles multiple input types:
1. Input Processing
// Accepts Spotify and YouTube input (simplified).
let videoId;
if (input.includes('spotify.com') || /^[a-zA-Z0-9]{22}$/.test(input)) {
  // Spotify URL or bare 22-character track ID: resolve it to a YouTube video
  const spotifyId = extractSpotifyId(input);
  const searchQuery = await spotifyService.getYoutubeSearchQuery(spotifyId);
  videoId = await youtubeService.searchVideo(searchQuery);
}
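To make the detection step a little more concrete, here is a minimal sketch of how a raw input string could be classified. The function name `classifyInput` and the returned shape are hypothetical and not taken from the project. The 22-character check mirrors the regex above, while the 11-character YouTube ID pattern is an assumption based on what YouTube IDs currently look like.

```javascript
// Hypothetical helper: classify a raw input as a Spotify track or a YouTube video.
// The patterns are assumptions for illustration, not the project's actual rules.
function classifyInput(input) {
  const value = input.trim();

  // spotify:track:<id> URIs and spotify.com/track/<id> URLs
  const spotifyMatch = value.match(/(?:spotify:track:|spotify\.com\/track\/)([a-zA-Z0-9]{22})/);
  if (spotifyMatch) return { source: 'spotify', id: spotifyMatch[1] };

  // youtube.com/watch?v=<id> and youtu.be/<id> URLs
  const youtubeMatch = value.match(/(?:youtube\.com\/watch\?(?:.*&)?v=|youtu\.be\/)([a-zA-Z0-9_-]{11})/);
  if (youtubeMatch) return { source: 'youtube', id: youtubeMatch[1] };

  // Bare IDs: 22 alphanumeric characters for Spotify, 11 URL-safe characters for YouTube
  if (/^[a-zA-Z0-9]{22}$/.test(value)) return { source: 'spotify', id: value };
  if (/^[a-zA-Z0-9_-]{11}$/.test(value)) return { source: 'youtube', id: value };

  return null; // Unrecognized input
}
```

Returning `null` for anything unrecognized lets the caller respond with a clear client error instead of passing a bad ID further down the pipeline.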
2. Smart Video Matching
The trickiest part was finding the right YouTube video for a Spotify track: the official music video, not just any cover or live version. The system:
- Extracts artist and track info from Spotify
- Searches YouTube with intelligent filtering
- Prioritizes official channels and verified uploads
- Filters out covers, remixes, and live versions (unless the requested track is itself one of these)
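One way to picture the matching steps above is as a scoring function over search results. The sketch below is an assumption about how such a filter could look, not the project's actual algorithm: the field names (`title`, `channelTitle`, `isVerified`), the weights, and the list of variant terms are all made up for illustration.

```javascript
// Hypothetical scoring of YouTube search results for a given Spotify track.
// Field names and weights are illustrative assumptions.
const VARIANT_TERMS = ['cover', 'remix', 'live', 'karaoke'];

function scoreCandidate(candidate, track) {
  const title = candidate.title.toLowerCase();
  const wanted = track.name.toLowerCase();
  let score = 0;

  if (title.includes(wanted)) score += 3; // Title mentions the track name
  if (candidate.channelTitle.toLowerCase().includes(track.artist.toLowerCase())) score += 3; // Artist's own channel
  if (candidate.isVerified) score += 2; // Verified upload
  if (title.includes('official')) score += 1;

  // Penalize variants unless the requested track is itself that variant
  for (const term of VARIANT_TERMS) {
    const pattern = new RegExp('\\b' + term + '\\b');
    if (pattern.test(title) && !pattern.test(wanted)) score -= 5;
  }
  return score;
}

// Return the highest-scoring candidate, or null when there are none.
function pickBestVideo(candidates, track) {
  if (candidates.length === 0) return null;
  return candidates.reduce((best, c) =>
    scoreCandidate(c, track) > scoreCandidate(best, track) ? c : best);
}
```

The word-boundary match keeps a title like "Alive" from being mistaken for a live version.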
3. Video Processing with FFmpeg
At the heart of the system, FFmpeg extracts a high-quality clip:
ffmpeg()
  .input(videoInfo.url)
  .seekInput(startTime) // Jump to the middle of the video
  .duration(options.duration) // Clip length (7 seconds by default)
  .videoCodec('libx264')
  // Scale once, with Lanczos resampling, to the requested size
  .videoFilter([`scale=${options.width}:${options.height}:flags=lanczos`])
  .outputOptions(['-crf', '23', '-preset', 'medium'])
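The snippet above takes `startTime` as given. As a rough sketch of how it might be derived, the helper below centers the clip on the video's midpoint. The function name is hypothetical, and whether the project centers the clip or simply starts at the midpoint is an assumption on my part.

```javascript
// Hypothetical helper: choose a start offset (in seconds) so the clip
// is centered on the video's midpoint.
function computeClipStart(videoDurationSec, clipDurationSec = 7) {
  // If the video is no longer than the clip, start at the beginning
  if (videoDurationSec <= clipDurationSec) return 0;
  return videoDurationSec / 2 - clipDurationSec / 2;
}
```

For a 212-second video and the default 7-second clip, this gives a start offset of 102.5 seconds.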
The Tech Stack
- Node.js + Express: Fast, lightweight API server
- FFmpeg: Video processing powerhouse
- Spotify Web API: Track metadata and artist information
- YouTube APIs: Video discovery and stream extraction
- Custom Caching Layer: Separate metadata and video caching
What Was Hard
1. YouTube's Moving Targets
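YouTube constantly changes its internal APIs, which breaks extraction libraries without warning. To stay reliable, I implemented multiple fallback strategies across different extraction libraries (ytdl-core, youtube-dl-exec, @distube/ytdl-core), so one broken extractor doesn't take the whole service down.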
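The fallback idea can be sketched independently of any particular library. In the example below, `extractWithFallback` and the `{ name, extract }` shape are hypothetical; in practice each `extract` function would wrap one of the libraries mentioned above, but those wrappers are left out here rather than guessed at.

```javascript
// Try each extractor in order and return the first successful result.
// `extractors` is an array of { name, extract } objects (an assumed shape).
async function extractWithFallback(videoId, extractors) {
  const errors = [];
  for (const { name, extract } of extractors) {
    try {
      return { source: name, info: await extract(videoId) };
    } catch (err) {
      // Remember why this extractor failed, then move on to the next one
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All extractors failed for ${videoId}: ${errors.join('; ')}`);
}
```

Collecting the individual error messages makes it much easier to see which extractor broke when YouTube changes something.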
2. Video Quality vs. Performance
Balancing quality with processing speed was crucial. I implemented three quality presets:
- Low: 800k bitrate, CRF 28 (fast processing)
- Medium: 2.5M bitrate, CRF 23 (balanced)
- Max: 6M bitrate, CRF 18 (broadcast quality)
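As a sketch, the presets above could live in a small lookup table that gets turned into FFmpeg output options. The object shape and the helper name are hypothetical, and using the bitrate as a cap (`-maxrate`/`-bufsize`) alongside CRF is my assumption about how the two values combine, not something the numbers above spell out.

```javascript
// Quality presets from the list above; the structure is a hypothetical illustration.
const QUALITY_PRESETS = {
  low: { bitrate: '800k', crf: 28 },
  medium: { bitrate: '2.5M', crf: 23 },
  max: { bitrate: '6M', crf: 18 },
};

// Build libx264 output options for a preset, falling back to "medium"
// when the requested quality is unknown.
function presetToOutputOptions(quality = 'medium') {
  const known = Object.prototype.hasOwnProperty.call(QUALITY_PRESETS, quality);
  const preset = known ? QUALITY_PRESETS[quality] : QUALITY_PRESETS.medium;
  return ['-crf', String(preset.crf), '-maxrate', preset.bitrate, '-bufsize', preset.bitrate];
}
```

The returned array could be passed straight to `.outputOptions(...)` in the FFmpeg chain.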
3. Intelligent Caching
Simple caching wasn't enough. I built a two-tier system:
- Metadata cache: Track info, video IDs, thumbnails
- Video cache: Actual processed clips with size limits
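Here is a minimal in-memory sketch of the two tiers, assuming clips are held as byte arrays and the size limit is enforced by evicting the least recently used clip. The class and variable names are hypothetical, and the real system may well store clips on disk instead.

```javascript
// Tier 1: small metadata entries (track info, video IDs, thumbnails) in a plain Map.
const metadataCache = new Map();

// Tier 2: processed clips, evicted least-recently-used once a byte budget is exceeded.
class VideoCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.totalBytes = 0;
    this.entries = new Map(); // Map keeps insertion order, used here as recency order
  }

  get(key) {
    const clip = this.entries.get(key);
    if (!clip) return undefined;
    // Re-insert to mark the clip as most recently used
    this.entries.delete(key);
    this.entries.set(key, clip);
    return clip;
  }

  set(key, clip) {
    if (this.entries.has(key)) {
      this.totalBytes -= this.entries.get(key).length;
      this.entries.delete(key);
    }
    this.entries.set(key, clip);
    this.totalBytes += clip.length;
    // Evict the oldest clips until the cache is back under budget
    for (const [oldKey, oldClip] of this.entries) {
      if (this.totalBytes <= this.maxBytes) break;
      this.entries.delete(oldKey);
      this.totalBytes -= oldClip.length;
    }
  }
}
```

Keeping metadata separate means a Spotify-to-YouTube lookup can still be served from cache even after the clip itself has been evicted.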
4. Error Handling Across Services
When Spotify fails, fall back to YouTube. When one YouTube extractor breaks, try another. The system needed to be resilient across multiple external dependencies.
The Results
The API now handles multiple input formats seamlessly:
# Spotify track
GET /api/preview/spotify:track:4PTG3Z6ehGkBFwjybzWkR8
# YouTube URL
GET /api/preview/https://youtube.com/watch?v=dQw4w9WgXcQ
# Direct video ID
GET /api/preview/dQw4w9WgXcQ?quality=max&duration=10
Response times average 2-6 seconds for new videos and under 500ms for cached content.
What's Next
This project taught me about building resilient APIs that depend on external services. The key insights:
- Always have fallbacks: External APIs will fail
- Cache intelligently: Not just the result, but intermediate steps too
- Optimize the critical path: Video processing is CPU-intensive
I'm considering adding support for other platforms like SoundCloud or Apple Music. Maybe even a web interface for easier testing.
Try It Out
The API is designed to be simple yet powerful. Whether you're building a music app, creating previews for social media, or just extracting clips for personal use, it handles the complexity so you don't have to.
Check out the GitHub repository to see the full implementation, including the smart video matching algorithms and caching strategies.