Media-to-Music Generation

Generate original, royalty-free music from text, images, videos, and audio

Overview

Wubble's media-to-music generation creates custom, original music from four input types: a text description, a reference image, video footage, or an audio sample. Each input type is analyzed for mood, pacing, and style so that the generated track fits the material you provide.

Text-to-Music

Describe your desired music in natural language and let AI create it

Image-to-Music

Generate music that captures the mood and emotion of your images

Video-to-Music

Create soundtracks that sync perfectly with your video content

Audio-to-Music

Generate complementary music based on existing audio samples

What You Can Create

Background music for videos, podcasts, and presentations
Custom soundtracks for games, films, and interactive media
Branded audio identities and sonic logos
Ambient music for physical and digital spaces
Production music for commercials and advertising

Text-to-Music

Text-to-music is the most versatile way to generate music: describe what you want in natural language, and Wubble creates it. The AI understands musical concepts such as genre, mood, tempo, instrumentation, and structure.

How to Write Effective Prompts

The more specific and detailed your description, the better the results. Include information about:

Genre & Style

Electronic, rock, jazz, classical, hip-hop, lo-fi, cinematic, ambient, etc. Be as specific as possible with subgenres.

Mood & Emotion

Upbeat, melancholic, energetic, calm, dramatic, mysterious, triumphant, nostalgic, tense, peaceful, etc.

Instrumentation

Specify instruments you want: piano, guitar, strings, synthesizers, drums, bass, brass, woodwinds, etc.

Tempo & Rhythm

Fast, slow, moderate, or specific BPM (e.g., "120 BPM"). Include rhythm patterns like "driving beat" or "syncopated rhythm".

Duration

How long the track should be (30 seconds, 2 minutes, 5 minutes, etc.)

Use Case

What the music will be used for helps the AI understand context: "for a tech product demo", "background for a meditation app", etc.

Example Prompt

Text Prompt
"Create upbeat electronic music for a tech product demo. 
Duration: 2 minutes. 
Style: Modern, energetic, with synthesizers and electronic drums. 
Mood: Inspiring and innovative."
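
If you build prompts in code, the components listed above can be assembled by a small helper. The following is a minimal sketch, not part of the Wubble API: the function name `buildPrompt` and its field names are hypothetical and only mirror the headings in this section.

```javascript
// Assemble a text-to-music prompt from the components described above.
// The function and its field names are illustrative, not part of the Wubble API.
function buildPrompt({ genre, mood, instruments = [], tempo, duration, useCase }) {
  const parts = [];
  if (genre) parts.push(`Genre: ${genre}.`);
  if (mood) parts.push(`Mood: ${mood}.`);
  if (instruments.length > 0) parts.push(`Instrumentation: ${instruments.join(', ')}.`);
  if (tempo) parts.push(`Tempo: ${tempo}.`);
  if (duration) parts.push(`Duration: ${duration}.`);
  if (useCase) parts.push(`Use case: ${useCase}.`);
  return parts.join(' ');
}

const demoPrompt = buildPrompt({
  genre: 'upbeat electronic',
  mood: 'inspiring and innovative',
  instruments: ['synthesizers', 'electronic drums'],
  tempo: '120 BPM',
  duration: '2 minutes',
  useCase: 'tech product demo',
});
console.log(demoPrompt);
```

Omitted fields are skipped, so the same helper works for short exploratory prompts and for fully specified ones.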

Using the API

Text-to-Music API
const response = await fetch('https://prod-backup-backend.wubble.ai/v1/music/songs', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
    'Content-Type': 'application/json',
    'Idempotency-Key': crypto.randomUUID(),
  },
  body: JSON.stringify({
    prompt:
      'Upbeat electronic music with modern feel, energetic mood, synth leads and tight electronic drums',
  }),
});

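// Fail early on HTTP errors instead of parsing an error body as a result
if (!response.ok) {
  throw new Error(`Song request failed with HTTP ${response.status}`);
}

const payload = await response.json();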
// payload.data.request_id -> poll with GET /v1/requests/:requestId
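
The comment above points to a polling step. Here is a minimal sketch of one way to do it. The endpoint path comes from that comment, but the response shape is an assumption: `data.status` and its values `'completed'` and `'failed'` are hypothetical and should be checked against the actual API reference. The helper name `waitForRequest` and its options are also illustrative.

```javascript
// Poll GET /v1/requests/:requestId until the request settles.
// ASSUMPTION: the response carries data.status with the values
// 'completed' or 'failed'; the real field names and values may differ.
const BASE_URL = 'https://prod-backup-backend.wubble.ai';

async function waitForRequest(
  requestId,
  { apiKey, fetchImpl = fetch, intervalMs = 2000, maxAttempts = 30 } = {},
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const res = await fetchImpl(`${BASE_URL}/v1/requests/${encodeURIComponent(requestId)}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Status check failed with HTTP ${res.status}`);

    const body = await res.json();
    const status = body.data && body.data.status; // hypothetical field
    if (status === 'completed') return body.data;
    if (status === 'failed') throw new Error(`Request ${requestId} failed`);

    // Still in progress: wait before the next attempt
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Request ${requestId} did not finish after ${maxAttempts} attempts`);
}
```

A typical call would be `await waitForRequest(payload.data.request_id, { apiKey: process.env.WUBBLE_API_KEY })`. The `fetchImpl` option exists only so the helper can be exercised with a stub in tests.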

Pro Tip

Start with a clear description, then refine iteratively. You can ask the AI to "make it faster", "add more bass", or "make it more emotional" to fine-tune your track.

Image-to-Music

Transform visual aesthetics into audio. Upload an image and Wubble analyzes the colors, composition, mood, and visual elements to generate music that captures the essence of your image.

How It Works

Our AI vision model analyzes your image to understand:

  • Color palette: Warm colors → energetic music, cool colors → calm music
  • Composition: Busy images → complex arrangements, minimal images → sparse instrumentation
  • Subject matter: Nature → organic sounds, urban → electronic elements, etc.
  • Lighting & mood: Bright → uplifting, dark → moody or dramatic
  • Movement & energy: Dynamic compositions → faster tempos, static → slower tempos
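
If you derive the prompt in your own application, the mappings above can be read as a simple rule table. The sketch below is not Wubble's vision model. It assumes you already have coarse image features normalized to 0..1 from your own analysis step, and the feature names and the 0.5 cut-off are hypothetical.

```javascript
// Turn coarse image features into prompt descriptors using the mappings above.
// ASSUMPTION: features are normalized to 0..1 by your own analysis step;
// the feature names and the 0.5 cut-off are illustrative only.
function imageFeaturesToPrompt({ warmth, busyness, brightness, motion, subject }) {
  const descriptors = [
    warmth >= 0.5 ? 'energetic' : 'calm',
    busyness >= 0.5 ? 'complex arrangement' : 'sparse instrumentation',
    brightness >= 0.5 ? 'uplifting' : 'moody, dramatic',
    motion >= 0.5 ? 'fast tempo' : 'slow tempo',
  ];
  if (subject === 'nature') descriptors.push('organic sounds');
  if (subject === 'urban') descriptors.push('electronic elements');
  return descriptors.join(', ');
}

const neonCity = imageFeaturesToPrompt({
  warmth: 0.3,
  busyness: 0.8,
  brightness: 0.2,
  motion: 0.7,
  subject: 'urban',
});
console.log(neonCity);
// calm, complex arrangement, moody, dramatic, fast tempo, electronic elements
```

The resulting string can be sent as the `prompt` field of a song request or combined with a hand-written description.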

Use Cases

Brand Visual Identity

Convert your brand's visual identity into a sonic identity

Album Art Sonification

Create music that matches your album artwork

Product Launches

Generate music from product images for promotional content

Art Installations

Create soundscapes for visual art exhibits and installations

Using the API

Image-to-Music API
// 1) Analyze your image externally (or in your app) and build a musical prompt
const derivedPrompt = 'Cinematic, atmospheric score inspired by a neon night city image';

// 2) Generate music using the prompt
const response = await fetch('https://prod-backup-backend.wubble.ai/v1/music/songs', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ prompt: derivedPrompt }),
});

const payload = await response.json();

Supported Image Formats

JPG, PNG, WebP, GIF (first frame). Maximum file size: 10MB. Recommended resolution: 1920x1080 or higher for best analysis results.

Video-to-Music

Generate music matched to your video content. Wubble analyzes your video's pacing, scene changes, action, and mood to create a soundtrack that supports your visual storytelling.

Advanced Video Analysis

Our AI analyzes multiple aspects of your video:

Scene Detection

Automatically identifies scene transitions and creates musical transitions that match

Action & Movement

Fast-paced action scenes get energetic music, slow scenes get calmer music

Emotion Recognition

Detects emotional content in scenes and matches music mood accordingly

Visual Energy Matching

Analyzes visual complexity and motion to match musical intensity

Color Grading Analysis

Understands the visual tone from color grading to inform musical style

Synchronization Options

Auto-Sync

COMING SOON

Automatically synchronizes music to video length and key moments

Hit Points

COMING SOON

Mark specific moments for musical accents or changes

Energy Matching

COMING SOON

Music intensity follows the visual energy throughout the video

Custom Timing

COMING SOON

Specify exact timing for intros, builds, and outros

Perfect For

  • YouTube videos, vlogs, and social media content
  • Product demos and promotional videos
  • Film and documentary soundtracks
  • Corporate presentations and training videos
  • Wedding videos and event highlights

Using the API

Video-to-Music API
// 1) Upload the video's extracted audio track first
const form = new FormData();
form.append('file', videoAudioFile);

const uploadRes = await fetch('https://prod-backup-backend.wubble.ai/v1/music/uploads', {
  method: 'POST',
  headers: { Authorization: `Bearer ${process.env.WUBBLE_API_KEY}` },
  body: form,
});
const upload = await uploadRes.json();

// 2) Use the remix route to generate a soundtrack variation
const remixRes = await fetch('https://prod-backup-backend.wubble.ai/v1/music/songs/remix', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    upload_audio_id: upload.data.upload_audio_id,
    lyrics: 'Instrumental cinematic cue that follows rising scene energy',
    prompt: 'Epic trailer-style progression synced to visual pacing',
  }),
});

const remixPayload = await remixRes.json();

Supported Video Formats

MP4, MOV, AVI, WebM. Maximum file size: 500MB. Maximum duration: 10 minutes. Processing time varies based on video length.

Audio-to-Music

Generate new music based on existing audio samples. Upload an audio file and Wubble analyzes its characteristics to create complementary or matching music that works alongside your original audio.

How It Works

Similar to how we analyze images and videos, our AI analyzes your audio input to understand its characteristics and generate music that complements it.

Audio Analysis

The analysis covers:

  • Key & scale: Ensures harmonic compatibility
  • Tempo & rhythm: Matches or complements timing
  • Instrumentation: Avoids clashing frequencies
  • Mood & energy: Creates appropriate atmosphere
  • Structure: Aligns with your audio's arrangement

Common Use Cases

Podcast Music

Add background music that doesn't compete with speech

Layered Production

Build complex arrangements by layering generated tracks

Sample Variation

Create multiple versions from a single audio sample

Remix Foundation

Generate new elements for remixing existing tracks

Using the API

Audio-to-Music API
// 1) Upload reference audio
const form = new FormData();
form.append('file', audioFile);

const uploadRes = await fetch('https://prod-backup-backend.wubble.ai/v1/music/uploads', {
  method: 'POST',
  headers: { Authorization: `Bearer ${process.env.WUBBLE_API_KEY}` },
  body: form,
});
const upload = await uploadRes.json();

// 2) Generate complementary music from uploaded reference
const response = await fetch('https://prod-backup-backend.wubble.ai/v1/music/songs/remix', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    upload_audio_id: upload.data.upload_audio_id,
    lyrics: 'Create a complementary instrumental layer',
    prompt: 'Warm, modern production that harmonizes with source audio',
  }),
});

const payload = await response.json();
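
The two-step flow above can be wrapped in a single helper that stops as soon as either request fails. This is a minimal sketch under stated assumptions: the endpoints and body fields are taken from the example above, while the helper names `ensureOk` and `uploadAndRemix` are hypothetical, and it is assumed that both endpoints signal failure with a non-2xx status.

```javascript
// Upload a reference file, then request a remix, failing fast on HTTP errors.
// ASSUMPTION: both endpoints signal failure with a non-2xx status; the error
// body format is not documented here, so it is surfaced as raw text.
const API_BASE = 'https://prod-backup-backend.wubble.ai';

async function ensureOk(res, step) {
  if (!res.ok) {
    const detail = await res.text();
    throw new Error(`${step} failed with HTTP ${res.status}: ${detail}`);
  }
  return res.json();
}

async function uploadAndRemix(file, { prompt, lyrics }, { apiKey, fetchImpl = fetch } = {}) {
  const form = new FormData();
  form.append('file', file);

  // Step 1: upload the reference audio
  const upload = await ensureOk(
    await fetchImpl(`${API_BASE}/v1/music/uploads`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    }),
    'Upload',
  );

  // Step 2: request the remix using the id returned by the upload
  return ensureOk(
    await fetchImpl(`${API_BASE}/v1/music/songs/remix`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ upload_audio_id: upload.data.upload_audio_id, lyrics, prompt }),
    }),
    'Remix',
  );
}
```

Because the upload is checked before the remix call is made, a rejected file never triggers a second request with an undefined `upload_audio_id`.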

Supported Audio Formats

MP3, WAV, FLAC, AAC, OGG. Maximum file size: 50MB. Maximum duration: 10 minutes. Higher quality input yields better analysis results.
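
Checking a file against the documented limits before uploading avoids a wasted round trip. The sketch below encodes the image, video, and audio limits stated on this page. The names `LIMITS` and `validateMedia` are hypothetical, "MB" is assumed to mean 1024 × 1024 bytes, `.jpeg` is assumed to be accepted wherever JPG is, and the file extension stands in for real format detection.

```javascript
// Pre-flight check against the limits documented on this page.
// ASSUMPTION: "MB" is treated as 1024 * 1024 bytes, and the file extension is
// used as a stand-in for the real format; the server may check differently.
const MB = 1024 * 1024;
const LIMITS = {
  // '.jpeg' is assumed to be equivalent to the documented JPG format
  image: { extensions: ['jpg', 'jpeg', 'png', 'webp', 'gif'], maxBytes: 10 * MB },
  video: { extensions: ['mp4', 'mov', 'avi', 'webm'], maxBytes: 500 * MB, maxSeconds: 600 },
  audio: { extensions: ['mp3', 'wav', 'flac', 'aac', 'ogg'], maxBytes: 50 * MB, maxSeconds: 600 },
};

// Returns a list of problems; an empty list means the file passes the checks.
function validateMedia(kind, { name, sizeBytes, durationSeconds }) {
  const limit = LIMITS[kind];
  if (!limit) return [`Unknown media kind: ${kind}`];

  const problems = [];
  const extension = name.includes('.') ? name.split('.').pop().toLowerCase() : '';
  if (!limit.extensions.includes(extension)) {
    problems.push(`Unsupported ${kind} format: .${extension}`);
  }
  if (sizeBytes > limit.maxBytes) {
    problems.push(`File exceeds ${limit.maxBytes / MB}MB`);
  }
  if (limit.maxSeconds !== undefined && durationSeconds > limit.maxSeconds) {
    problems.push(`Duration exceeds ${limit.maxSeconds / 60} minutes`);
  }
  return problems;
}

console.log(validateMedia('video', { name: 'clip.mkv', sizeBytes: 600 * MB, durationSeconds: 700 }));
```

Returning a list rather than throwing lets a UI show every problem at once instead of one per attempt.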

Best Practices

Be Specific

Whether using text prompts or other media, provide clear direction about what you want. Detailed descriptions and high-quality inputs yield better results.

Iterate & Refine

Generate multiple variations and use conversational refinement. Ask the AI to adjust specific elements rather than starting from scratch.

Combine Input Types

You can combine inputs! Upload an image and add a text description, or provide audio with additional prompt guidance for more precise control.

Use Brand Kit Settings

Set up your Brand Kit to apply consistent style preferences across all generations, regardless of input type.

Consider Context & Usage

Think about where and how the music will be used. Background music needs different characteristics than featured soundtracks.

Test Different Lengths

Start with shorter durations (30-60 seconds) to validate the style and direction before generating longer pieces.
