Environment Setup Guide: Installing FFmpeg & Configuring APIs
This post walks you through building a full AI-powered video generation pipeline using n8n, ComfyUI, FFmpeg, Google Gemini, and Azure TTS. The final result is a fully automated workflow that transforms a short story into an animated video in the style of Studio Ghibli—complete with generated voiceover and seamless YouTube publishing.
👉 Download the ready-to-use n8n workflow here: Get it on Gumroad
To ensure this workflow runs correctly, you need to make some key configurations to your n8n deployment environment. As the workflow relies on external services, please follow the steps below.
1. Install FFmpeg for n8n (via a Custom Docker Image)
The official n8n Docker image does not include FFmpeg by default, so we need to build a custom image that adds it. This is a prerequisite for the `Execute Command` node to run `ffmpeg` commands successfully.
Step 1: Create a Dockerfile
In the same directory where your `docker-compose.yml` file is located, create a new file named `Dockerfile` and add the following content:
```dockerfile
# Use a specific n8n version as the base image for stability
FROM n8nio/n8n:1.99.0

# Switch to the root user for permission to install software
USER root

# Install ffmpeg with apk; the --no-cache option keeps the final image small
RUN apk add --no-cache ffmpeg

# Switch back to the non-privileged node user, recommended for running n8n securely
USER node
```
Description: This file defines how to install the FFmpeg package on top of a standard n8n image.
💡 If you’d prefer to skip setup and test a working pipeline first, you can download the full n8n workflow here.
Step 2: Modify the docker-compose.yml File
Next, edit your `docker-compose.yml` file so that it builds the image from the `Dockerfile` we just created instead of pulling one directly from Docker Hub. Add the `build` directive to the `n8n` service under `services`:
```yaml
services:
  n8n:
    # Use the build directive to build a custom image
    build:
      context: .              # the Dockerfile is in the current directory
      dockerfile: Dockerfile
    # container_name and image are optional but help with management
    container_name: n8n-custom-ffmpeg
    image: my-n8n-with-ffmpeg:latest
    restart: always
    ports:
      - "5678:5678"
    env_file:
      - .env
    volumes:
      - n8n_data:/home/node/.n8n
    # ... other configurations remain the same ...

volumes:
  n8n_data:
```
Description: When you run `docker-compose up --build`, Docker Compose first builds a new image named `my-n8n-with-ffmpeg:latest` based on the `Dockerfile`, and then starts the n8n service using this new image, which includes FFmpeg.
2. Configure the ComfyUI Server Address
Your n8n workflow needs to know the ComfyUI server address to send requests to it. Using environment variables is the recommended way to configure this, as it is more flexible and secure.
Recommended Method: Use an `.env` file
- Confirm the `docker-compose.yml` configuration: Ensure the `n8n` service section in your `docker-compose.yml` file includes the line `env_file: - .env`.
- Create or edit the `.env` file: In the same directory as your `docker-compose.yml`, create a file named `.env` (if it doesn’t already exist) and add the following content, replacing the URL with your actual address:
```ini
# .env file
# Replace this URL with the access address of your ComfyUI instance
COMFYUI_BASE_URL_1=http://your-comfyui-server-ip:8188

# ... other environment variables ...
```
How it works: When n8n starts, it automatically loads all variables from the `.env` file. The expression `{{ $env.COMFYUI_BASE_URL_1 }}` in the workflow can then read this address.

Note for local ComfyUI users: If you are running ComfyUI on the same machine as your Docker-based n8n instance, you cannot use `localhost` or `127.0.0.1` directly, because inside the container those resolve to the n8n container itself. Instead, you may need to use the special Docker address `host.docker.internal` so the n8n container can reach the ComfyUI service running on the host machine. For example: `COMFYUI_BASE_URL_1=http://host.docker.internal:8188`.
3. Configure API Credentials
This workflow requires calls to several external service APIs. Please go to the Credentials section in your n8n instance to create and configure credentials for the following services.
Set up Google Gemini API Key
- Purpose: Used for story generation, scene breakdown, and creating video prompts.
- Action:
- In n8n, choose to create a new credential.
- Search for and select “Google Gemini (PaLM) API”.
- Enter your API key as prompted.
- Reference Document: https://docs.n8n.io/integrations/builtin/credentials/google/
Set up YouTube API Credentials
- Purpose: Used to automatically upload the final composite video to your YouTube channel.
- Action:
- In n8n, choose to create a new credential.
- Search for and select “YouTube OAuth2 API”.
- This requires you to create a project in the Google Cloud Console, enable the YouTube Data API v3, and obtain an OAuth 2.0 Client ID and Secret.
- Reference Document: https://docs.n8n.io/integrations/builtin/credentials/google/
Set up Azure TTS API Credentials
- Purpose: Used to convert text narration and dialogue into high-quality speech.
- Action:
- In n8n, choose to create a new credential.
- Search for and select “Microsoft Azure Speech API”.
- You will need to create a “Speech service” resource in the Azure portal to get the required Subscription Key and Region.
- Reference Document: https://docs.n8n.io/integrations/builtin/credentials/microsoft-azure-speech/
🚀 Ready to Try It Instantly?
You can skip the setup and get started right away with a pre-built, battle-tested n8n workflow—including nodes for scene generation, voiceover synthesis, video merging, and upload logic.
👉 Download the full project on Gumroad
How to Use Your Own ComfyUI Workflow
The `Set Workflow Payload` node is the “blueprint” for video generation. You can replace the JSON content inside it with any workflow you create in ComfyUI. Follow these steps:
Step 1: Prepare and Export Your Workflow in ComfyUI
- Design and Debug: First, ensure your workflow runs correctly without errors in the ComfyUI interface. This is the most important step and will prevent many future issues.
- Export API Format: In the ComfyUI sidebar menu, click the “Save (API Format)” button. This downloads a `.json` file containing the API call format for your workflow.
- Copy the Content: Open the downloaded `.json` file with a text editor (like VS Code or Notepad) and copy its entire content.
Step 2: Replace the Content in the n8n Node
- Find the Node: In n8n, open the `Set Workflow Payload` node.
- Clear and Paste: Delete all the existing JSON code in its `JSON Output` field, then paste in the new code you just copied from the file.
Step 3: Reconnect the Dynamic Prompt (The Most Crucial Step!)
When you paste your own workflow, the original connection for the dynamic prompt will be broken. You need to reconnect it manually.
- Find Your “Positive Prompt” Node: In the new JSON code you just pasted, find the node that handles the positive prompt. Its `class_type` should be `CLIPTextEncode`.
  - Hint: It usually connects to the `positive` input of a KSampler node.
- Modify the `text` Field: Inside this node, find the `text` field within the `inputs` object. It will likely contain a fixed, hardcoded prompt, for example `"text": "a beautiful landscape"`.
- Replace it with an n8n Expression: Replace this hardcoded text with exactly the following n8n expression: `{{ $json.output.video_prompt }}`
Before:

```json
"6": {
  "inputs": {
    "text": "a cat sitting on a bench, best quality", // <-- the fixed text you wrote in ComfyUI
    "clip": [ "38", 0 ]
  },
  "class_type": "CLIPTextEncode"
},
```
After:

```json
"6": {
  "inputs": {
    "text": "{{ $json.output.video_prompt }}", // <-- change it to this expression
    "clip": [ "38", 0 ] // make sure the clip connection remains unchanged
  },
  "class_type": "CLIPTextEncode"
},
```
Please note that your node numbers (e.g., `"6"` or `"38"`) may differ depending on your own workflow structure. The key is to find the correct `CLIPTextEncode` node and modify its `text` input.
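If you make this edit often, the substitution can also be scripted. The following Python sketch patches a ComfyUI API-format workflow dict; the function name is mine, and it simply patches the first `CLIPTextEncode` node it finds unless you pass the node id explicitly (which you should when the workflow has separate positive and negative prompt nodes).

```python
N8N_EXPRESSION = "{{ $json.output.video_prompt }}"

def inject_prompt_expression(workflow, node_id=None):
    """Replace the hardcoded prompt text in a ComfyUI API-format workflow.

    workflow: dict parsed from the 'Save (API Format)' JSON export.
    node_id:  optional id (e.g. "6") of the positive-prompt node; if omitted,
              the first CLIPTextEncode node found is patched.
    """
    candidates = ([node_id] if node_id else
                  [k for k, v in workflow.items()
                   if v.get("class_type") == "CLIPTextEncode"])
    if not candidates:
        raise ValueError("No CLIPTextEncode node found in workflow")
    # Only the text input changes; the clip connection is left untouched.
    workflow[candidates[0]]["inputs"]["text"] = N8N_EXPRESSION
    return workflow
```

This performs exactly the Before/After change shown above, without the risk of breaking the JSON by hand-editing.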
After completing these three steps, your n8n workflow will now use your own ComfyUI design to generate videos.
4. Debugging & Optimization Tips
Handling Slow ComfyUI Generation
On machines with lower GPU performance, ComfyUI video generation can be very time-consuming, which can significantly impact the efficiency of debugging subsequent processes (like video/audio merging, uploading, etc.). Here are two methods to speed up the debugging process:
- Method 1: Use a Fixed Test Video (disabled by default). The workflow includes two pre-configured nodes that download landscape and portrait test videos. You can disable the ComfyUI generation flow and enable one of these download nodes to quickly get a video sample for testing.
- Method 2: Generate Only the First Scene. This is an excellent trade-off if you want to test the complete end-to-end flow (from AI generation to final upload) but don’t want to wait for all scenes to be generated.
  - Find the `Debug a data entry` node: this is a `Code` node that is disabled by default.
  - Activate this node: right-click the node and select “Activate”.

How it works: the node takes the array of scenes generated by the AI and keeps only the first one. As a result, all subsequent steps (video generation, audio generation, merging, etc.) run only once, for that single scene, which drastically reduces the testing time for the entire workflow and lets you quickly verify the overall logic. Remember to disable this node again when you’re done debugging so the full video is generated.
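The logic of that Code node amounts to a one-line list truncation. As a sketch (n8n Code nodes can run JavaScript or Python; the function name here is illustrative):

```python
def keep_first_scene(items):
    """Debug helper: keep only the first scene item.

    n8n passes data between nodes as a list of items; truncating to the
    first element means every downstream node (video generation, audio
    generation, merging, upload) executes exactly once.
    """
    return items[:1]
```

For example, a five-scene story collapses to a single-scene test run, while the item shape stays unchanged for downstream nodes.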
Time Calibration
In this workflow, the Extract Scenes & Dialogue node, in its system prompt, requests that the generated dialogue text corresponds to a reading time of approximately 8 seconds. Therefore, in your ComfyUI workflow settings, it is best for the generated video duration to match this (for example, by setting an appropriate number of frames) to avoid mismatches between video and audio length during the final merge.
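The matching frame count is simple arithmetic: duration times frame rate. A minimal sketch, assuming an example frame rate of 16 fps (use whatever your ComfyUI video model actually outputs):

```python
def frames_for_duration(duration_s, fps=16):
    """Frame count needed so the generated clip matches the target
    dialogue length. fps=16 is only an example value; check the frame
    rate of your own ComfyUI video model."""
    return round(duration_s * fps)
```

For the ~8-second dialogue target, this gives 128 frames at 16 fps, or 192 frames at 24 fps.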
After completing all the above configurations, restart your n8n container (`docker-compose up -d --build`), and your environment will be ready.
📬 Follow me for more updates