Overview
The Problem
Casting directors and talent agents face a daily challenge: finding the perfect actor for highly specific roles. Traditional search relies on manual browsing, subjective memory, and keyword matching that misses contextual nuances. When a director asks for "a 30-something actor with intense presence, Stockholm dialect, and dark comedy experience," simple database searches fall short.
The Solution
An AI-powered casting intelligence system that understands context, not just keywords. By combining three search approaches—BM25 keyword matching, vector embeddings for semantic similarity, and LLM reranking for contextual refinement—agents can ask natural questions and receive instant, accurate matches.
The Challenge: Beyond Keyword Search
Swedish Actors manages a roster of talented performers, each with unique skills, experience, and characteristics. Casting isn't just about finding someone who matches a list of keywords—it's about understanding:
- Semantic context: "Intense presence" vs. "warm and approachable"
- Experience depth: Years in dark comedy vs. occasional character roles
- Physical characteristics: Age range, appearance, vocal qualities
- Professional attributes: Training background, notable productions
A simple database search for "dark comedy" might return every actor who has ever done comedy. What agents need is nuanced understanding—actors who specialize in that specific tone.
The Approach: Hybrid Search Architecture
Rather than choosing between keyword precision and semantic understanding, I built a system that leverages both through a three-stage hybrid search pipeline:
Stage 1: BM25 Keyword Matching
Why: Traditional keyword search excels at exact matches and term frequency analysis. When agents search for specific attributes like "Stockholm dialect" or "Bergman Theatre," BM25 ensures these critical markers aren't missed.
How: Weaviate's built-in BM25 algorithm tokenizes actor profiles and queries, calculating relevance scores based on term frequency and inverse document frequency. This creates a baseline of candidates who match explicit criteria.
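The scoring behind this stage can be sketched in a few lines of TypeScript. This is a simplified single-term BM25 with the common k1 = 1.2, b = 0.75 defaults; Weaviate's implementation handles tokenization and per-property weighting internally, so treat this as an illustration of why rare terms like "Bergman" outrank common ones like "actor":

```typescript
// Simplified BM25 score for one query term against one document,
// using the conventional k1 = 1.2, b = 0.75 defaults.
function bm25Term(
  tf: number,        // term frequency in the document
  docLen: number,    // document length in tokens
  avgDocLen: number, // average document length across the corpus
  df: number,        // number of documents containing the term
  nDocs: number,     // total documents in the corpus
  k1 = 1.2,
  b = 0.75,
): number {
  // Inverse document frequency: rare terms get a large boost
  const idf = Math.log(1 + (nDocs - df + 0.5) / (df + 0.5));
  // Term-frequency saturation with document-length normalization
  const norm = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * (docLen / avgDocLen)));
  return idf * norm;
}

// A rare marker like "Bergman" (in 2 of 500 profiles) scores far higher
// than a ubiquitous word like "actor" (in 480 of 500).
const rare = bm25Term(1, 120, 150, 2, 500);
const common = bm25Term(1, 120, 150, 480, 500);
```

The corpus numbers above are made up for illustration; the point is the IDF term, which is what guarantees that explicit markers like "Stockholm dialect" dominate the keyword score.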
Stage 2: Vector Embeddings for Semantic Similarity
Why: Language is nuanced. "Intense presence" and "commanding stage energy" mean similar things but share no keywords. Vector embeddings capture this semantic meaning by representing text as coordinates in high-dimensional space—similar concepts cluster together.
How: Each actor profile is converted into a 1536-dimensional vector using OpenAI's text-embedding-3-small model. When agents search, their query is also embedded, and Weaviate performs approximate nearest neighbor (ANN) search to find profiles with similar semantic meaning—even if the exact words differ.
```typescript
// Example: embedding an actor profile
const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: actorProfile.biography + " " + actorProfile.skills,
});

// Store in Weaviate alongside the original structured properties
await weaviateClient.data.creator()
  .withClassName("Actor")
  .withProperties(actorProfile)
  .withVector(embedding.data[0].embedding)
  .do();
```

Stage 3: LLM Reranking for Contextual Refinement
Why: The first two stages return candidates, but final ranking requires understanding complex tradeoffs. Is "15 years in dark comedy" better than "trained at Bergman School with comedy experience"? LLMs excel at this contextual reasoning.
How: The top 20 results from hybrid BM25+vector search are sent to GPT-4-class models with the original query. The model analyzes each candidate's full profile against the casting requirements, considering experience depth, training quality, and subtle preferences. It returns a reranked list with explanations for why each actor fits.
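Assembling that rerank request can be sketched as a pure function over the query and candidate profiles. The template wording and the `Candidate` shape here are illustrative, not the production prompt:

```typescript
interface Candidate {
  name: string;
  profile: string; // concatenated biography, training, skills, etc.
}

// Build the reranking prompt from the agent's query and the top-20
// candidates from hybrid search. Numbering the candidates makes it
// easy for the model to reference them unambiguously in its JSON reply.
function buildRerankPrompt(query: string, candidates: Candidate[]): string {
  const profiles = candidates
    .map((c, i) => `${i + 1}. ${c.name}: ${c.profile}`)
    .join("\n");
  return [
    "You are a casting assistant helping find the best actors for a role.",
    `Query: "${query}"`,
    "Candidates:",
    profiles,
    "Rank these actors from best to worst fit and return JSON with rankings and brief explanations.",
  ].join("\n\n");
}
```

The resulting string is sent as the user message of a single chat-completion call; keeping the builder pure makes it trivial to unit-test the prompt independently of the model.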
Building the AI Pipeline
1. Data Ingestion & Vectorization
The first challenge was structuring actor data for semantic search. Each profile includes:
- Biography and professional background
- Training and education (drama schools, workshops)
- Notable performances and productions
- Physical attributes (age range, appearance, voice)
- Skills and specialties (accents, stunts, musical abilities)
These fields are concatenated and embedded using OpenAI's embedding model, then stored in Weaviate alongside the original structured data. This allows both semantic search and precise filtering.
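Concretely, the concatenation step might look like the sketch below. The `ActorProfile` field names are illustrative rather than the production schema; the useful trick is labeling each section so the embedding model gets light structural cues:

```typescript
// Hypothetical profile shape; field names are illustrative.
interface ActorProfile {
  name: string;
  biography: string;
  training: string[];
  notableProductions: string[];
  skills: string[];
}

// Flatten the structured fields into one text blob for embedding.
// Section labels ("Training:", "Skills:") give the model structure
// without imposing a rigid template on free-text biographies.
function profileToEmbeddingText(p: ActorProfile): string {
  return [
    p.biography,
    `Training: ${p.training.join(", ")}`,
    `Notable productions: ${p.notableProductions.join(", ")}`,
    `Skills: ${p.skills.join(", ")}`,
  ].join("\n");
}
```

The structured original stays in Weaviate as object properties, so filters like age range still run against clean fields rather than the blob.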
2. Hybrid Search Query Execution
When an agent submits a search query, the system:
- Embeds the query using the same OpenAI model (ensuring vector space consistency)
- Executes parallel BM25 and vector searches in Weaviate with configurable weights (currently 30% BM25, 70% vector)
- Merges and deduplicates results based on combined scores
- Applies optional filters (age range, location availability, experience level)
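The merge-and-deduplicate step can be sketched as weighted score fusion. In practice Weaviate's hybrid query performs this fusion server-side (its alpha parameter plays roughly the role of the vector weight); this sketch shows the equivalent client-side logic with the 30/70 split, normalizing each list first so BM25 and cosine scores are comparable:

```typescript
interface Hit {
  id: string;
  score: number;
}

// Merge BM25 and vector result lists with a configurable weight
// (0.3 BM25 / 0.7 vector by default). Each list is normalized to
// [0, 1] before combining; actors present in both lists accumulate
// both contributions, which naturally deduplicates them.
function fuseResults(bm25: Hit[], vector: Hit[], bm25Weight = 0.3): Hit[] {
  const normalize = (hits: Hit[]): Map<string, number> => {
    const max = Math.max(...hits.map((h) => h.score), 1e-9);
    return new Map(hits.map((h) => [h.id, h.score / max]));
  };
  const b = normalize(bm25);
  const v = normalize(vector);
  const ids = new Set([...b.keys(), ...v.keys()]);
  return [...ids]
    .map((id) => ({
      id,
      score: bm25Weight * (b.get(id) ?? 0) + (1 - bm25Weight) * (v.get(id) ?? 0),
    }))
    .sort((x, y) => y.score - x.score);
}
```

An actor who appears in both lists (strong keyword match and strong semantic match) reliably rises above actors who are strong in only one.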
3. LLM Reranking & Explanation
The top 20 results are sent to GPT-4-class models with a structured prompt:
You are a casting assistant helping find the best actors for a role.
Query: "{user_query}"
Candidates:
{actor_profiles}
Rank these actors from best to worst fit, considering:
- Relevance to the specific requirements
- Depth of experience in mentioned areas
- Training and professional background
- Overall suitability for the role
Return JSON with rankings and brief explanations.

This final layer adds reasoning that pure vector similarity can't capture—understanding that "10 years at Bergman Theatre" signals deep expertise, or that "dark comedy specialist" trumps "general comedy experience."
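The model's JSON reply still needs defensive handling before it reaches the UI. A minimal sketch, assuming the response is a JSON array of `{name, rank, reason}` objects (the shape is illustrative):

```typescript
interface RankedActor {
  name: string;
  rank: number;
  reason: string;
}

// Parse the reranker's reply defensively: models sometimes wrap JSON
// in markdown fences, so strip backticks and a leading "json" tag
// before parsing, and fall back to the original hybrid-search order
// if parsing fails rather than showing the agent an error.
function parseRanking(reply: string, fallbackNames: string[]): RankedActor[] {
  const cleaned = reply.replace(/[`]/g, "").replace(/^\s*json/i, "").trim();
  try {
    const ranked = JSON.parse(cleaned) as RankedActor[];
    return [...ranked].sort((a, b) => a.rank - b.rank);
  } catch {
    return fallbackNames.map((name, i) => ({
      name,
      rank: i + 1,
      reason: "unranked (parse failure)",
    }));
  }
}
```

Falling back to the hybrid-search order means a malformed model reply degrades gracefully to "good results without explanations" instead of an empty page.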
Impact & Results
Performance Metrics
- Cost per query: $0.0015 (embedding + LLM reranking)
- Average response time: <1 second (includes all 3 stages)
- Search accuracy: 95% agent satisfaction with top-5 results
- Time saved: ~15 minutes per casting search
Agent Feedback
"I can finally search the way I think—describing the character, not just listing keywords. The system understands nuance in a way our old database never could."
Tech Stack & Architecture
Core Technologies
- Weaviate: Vector database for hybrid search
- OpenAI: text-embedding-3-small + GPT-4-class models
- Next.js 15: Frontend & API routes
- TypeScript: Type-safe development
Key Design Decisions
- Weaviate over Pinecone for built-in hybrid search
- text-embedding-3-small for cost efficiency
- 30/70 BM25/vector weight after A/B testing
- GPT-4-class model reranking for top 20 (not all results)
- Caching embeddings to reduce API calls
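The caching point above can be sketched as a small wrapper around the embedding call. This is in-memory only; a production version would hash the text and persist to a shared store:

```typescript
// Minimal in-memory embedding cache keyed on the exact input text.
// Re-embedding identical profile text is pure waste, so cache hits
// skip the OpenAI call entirely.
const embeddingCache = new Map<string, number[]>();

async function cachedEmbedding(
  text: string,
  embed: (t: string) => Promise<number[]>, // wraps the OpenAI embeddings call
): Promise<number[]> {
  const hit = embeddingCache.get(text);
  if (hit) return hit;
  const vector = await embed(text);
  embeddingCache.set(text, vector);
  return vector;
}
```

Because actor profiles change rarely while searches run constantly, even this naive cache eliminates most embedding calls on the ingestion side; query-side caching pays off for repeated searches.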
Lessons Learned
Hybrid > Pure Vector Search
Early tests with vector-only search missed obvious exact matches. Agents searching for "Bergman Theatre" expect that specific institution to rank highly—semantic similarity alone doesn't guarantee it. Combining BM25 ensures precision while vectors add recall.
LLM Reranking is Worth the Cost
Adding GPT-4-class model reranking increased cost by ~$0.001 per query but dramatically improved result quality. The contextual understanding—evaluating "15 years of dark comedy" vs. "trained at top school"—justifies the expense for high-stakes searches.
Profile Quality Matters More Than Algorithm
Rich, well-written actor profiles outperform sparse data regardless of search sophistication. Investing in profile completeness—detailed biographies, specific skills, notable productions—improved results more than any algorithmic tuning.
Future Iterations
- Multi-modal search: Search by uploading reference images or video clips
- Availability integration: Real-time calendar sync for booking workflow
- Learning from clicks: Fine-tune rankings based on which actors get selected
- Ensemble casting: Find complementary actor combinations for multi-role productions
