  1. Create Avatars and Train Lip-sync Models

Create A Custom Avatar by a Video or an Image

Global Server
https://video.a2e.ai
POST
/api/v1/userVideoTwin/startTraining
This API is equivalent to "⚡️Instant Avatar" on our website.
Read the tutorial for how to film the best training videos.
Video Avatar (Recommended): provide "video_url"
Upload a video of yourself using "video_url"; this video will serve as the base asset for all your subsequent AI-generated character videos. Use MP4 or MOV format, horizontal or vertical, with no size limitation, and a length of at least 5 seconds. Ensure that the person in the video is clear.
(Optional) If you are unsatisfied with the lip-sync quality of the AI video, check whether your video meets the requirements (read below). If your uploaded video meets the requirements, you can use the "continueTraining" API (requires 100 credits). Wait about 30 minutes, and the avatar AI model will be automatically updated for better lip-sync quality.
Common issues:
The MIME type of your video URL must be set correctly (e.g. video/mp4). We use the MIME type from the URL's response headers, not the URL suffix, to determine the file type. If you use an object storage service from a major cloud provider (e.g. AWS S3), the MIME type is usually set automatically.
No spaces are allowed in the URL.
Redirects are not allowed (i.e. a 3xx HTTP response code). This is a common issue when someone provides an http link whose server later redirects to an https address.
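The common issues above can be caught before submitting. The sketch below is an illustrative pre-flight helper (`check_video_url_headers` is a hypothetical name, not part of the A2E API): you would first issue an HTTP HEAD request yourself with redirects disabled, then pass the resulting status code and Content-Type in.

```python
# Illustrative pre-flight check for the "Common issues" above.
# check_video_url_headers is a hypothetical helper, not part of the A2E API:
# it validates a URL string plus the status code and Content-Type returned
# by a HEAD request (made with redirects disabled) to that URL.

def check_video_url_headers(url: str, status_code: int, content_type: str) -> list[str]:
    """Return a list of problems; an empty list means the URL looks usable."""
    problems = []
    if " " in url:
        problems.append("URL contains a space")
    if 300 <= status_code < 400:
        problems.append("URL redirects (3xx); redirects are not allowed")
    elif not (200 <= status_code < 300):
        problems.append(f"unexpected status code {status_code}")
    if not content_type.startswith("video/"):
        problems.append(f"Content-Type is {content_type!r}, expected video/* (e.g. video/mp4)")
    return problems

# A correctly configured S3-style URL passes:
print(check_video_url_headers("https://bucket.s3.amazonaws.com/clip.mp4", 200, "video/mp4"))
# An http link with a space that redirects to https is flagged:
print(check_video_url_headers("http://example.com/my clip.mp4", 301, "text/html"))
```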
Image Avatar: provide "image_url"
Note: The video / image must contain only one face.
Price:
Video startTraining is free, but each "Continue Training" in video mode takes 100 credits. If you do not have enough credits, check the pricing page to learn how to recharge.
Image avatars have been updated to version 2, and startTraining now takes 100 credits in image mode.
When training completes, the avatar is automatically added to your avatar list.

Request

Authorization
Provide your bearer token in the
Authorization
header when making requests to protected resources.
Example:
Authorization: Bearer ********************
Header Params
x-lang
string 
required
The interface language you use on the A2E platform. This is not the language your avatar speaks. If you are unsure about this parameter, use "en-US" by default.
Example:
en-US
Body Params application/json
name
string 
required
The name of the video to be uploaded
gender
string 
required
The gender of the video to be uploaded. The value is either female or male
video_url
string 
optional
The URL of the video of the avatar you want to clone.
Either image_url or video_url must be provided. No spaces are allowed in the URL.
Video requirements:
Do not use videos with multiple faces appearing.
Ensure the face is neither too large nor too small. The entire face should be within the screen area and not cropped out. It is recommended that the face width occupy between one-tenth and one-third of the overall frame width.
Make sure facial features are not obscured, ensuring the clarity of facial features and contours.
The recommended video resolution is 720P or 1080P, with a maximum resolution not exceeding 4K.
The video duration should be no less than 5 seconds and no more than 5 minutes (5s–5min).
For better lip-sync generation results, it is recommended to use videos of people speaking normally. The audio and lip movements in the video must be synchronized, and background noise or sounds other than speech should be avoided. Maintain a moderate speaking speed: speech that is too slow may reduce lip-sync accuracy, while speech that is too fast may cause lip-sync jitter.
This URL must be permanent and fast to access. For privacy reasons, we do not automatically store the content of your URL; instead, each time you generate an AI avatar video, our system retrieves the video directly from the provided URL. If you cannot provide a permanent and fast URL, use /api/v1/tos/transferToStorage to store your content on our server.
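The duration and resolution limits above can likewise be checked up front. This is a hypothetical helper, not part of the A2E API; the duration, width, and height would come from your own media pipeline (e.g. ffprobe), and treating "4K" as a 4096-pixel long edge is our assumption.

```python
# Illustrative check of the video requirements listed above.
# validate_training_video is a hypothetical helper; duration is in seconds,
# width/height in pixels, obtained from your own tooling (e.g. ffprobe).

def validate_training_video(duration_s: float, width: int, height: int) -> list[str]:
    """Check a candidate training video against the documented limits."""
    problems = []
    if duration_s < 5:
        problems.append("video is shorter than 5 seconds")
    if duration_s > 5 * 60:
        problems.append("video is longer than 5 minutes")
    if max(width, height) > 4096:  # assumption: "4K" means a 4096-pixel long edge
        problems.append("resolution exceeds 4K")
    return problems

print(validate_training_video(12.0, 1920, 1080))  # a 12-second 1080p clip
```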
image_url
string 
optional
The URL of the image of the avatar you want to clone.
Either image_url or video_url must be provided. No spaces are allowed in the URL.
Image requirements:
1. The person in the image should be facing forward (at an angle where both ears are visible).
2. Ensure there is only one face in the image; images containing multiple faces are not supported.
3. The face should be neither too large nor too small. Ensure the entire face is within the frame and not cropped out. It is recommended that the face width occupy between one-tenth and one-third of the overall image width.
4. Make sure facial features are not obscured, ensuring clear visibility of the facial features and contours.
5. The image size should not exceed 10MB, and the dimensions should not exceed 4000 pixels in either width or height.
6. Our backend algorithm is optimized for vertical or horizontal (rectangular) images. It automatically detects the image's orientation and crops accordingly. However, if you upload a nearly square image, the algorithm may cut off the top and bottom portions, potentially cropping out the head. To ensure the best results, use vertical or horizontal images.
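These image constraints can also be sketched as a pre-flight check. `validate_avatar_image` is a hypothetical helper, and the 10% threshold used to flag a "nearly square" image (item 6) is our assumption, not a documented cutoff.

```python
# Illustrative check of the image requirements listed above.
# validate_avatar_image is a hypothetical helper, not part of the A2E API.

def validate_avatar_image(width: int, height: int, size_bytes: int) -> list[str]:
    """Check a candidate avatar image against the documented limits."""
    problems = []
    if size_bytes > 10 * 1024 * 1024:
        problems.append("image exceeds 10MB")
    if width > 4000 or height > 4000:
        problems.append("a dimension exceeds 4000 pixels")
    # Item 6: nearly square images may be cropped badly. Flag aspect ratios
    # within 10% of 1:1 (this threshold is an assumption, not from the docs).
    if 0.9 <= width / height <= 1.1:
        problems.append("image is nearly square; prefer clearly vertical or horizontal")
    return problems

print(validate_avatar_image(1080, 1920, 2_000_000))  # a vertical portrait image
```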
video_backgroud_image
string 
optional
The background image of the video. See our web UI for how the background image is used.
If you set this value, you do not need to set "video_backgroud_color".
video_backgroud_color
string 
optional
The RGB string of the background color of the video. For example, if your avatar video is filmed against a green background, set a color value close to that green; you can then do background matting in the subsequent video-synthesis stage.
You can film your model against a green, white, blue, or any solid-color background. You must set this color value close to the actual background color in your video.
If you set this value, you do not need to set "video_backgroud_image".
ideal_short_edge
integer 
deprecated
The short-edge pixel count of the avatar; only used in image mode. The default is 640.
skipPreview
boolean 
optional
Whether to skip the preview and directly perform "Continue Training".
If set to true, it is equivalent to "Continue Training 💠" on our website.
Example
{
    "name":"dubbing demo",
    "gender":"female",
    // "image_url": "https://d1tzkvq5ukphug.cloudfront.net/adam2eve/stable/video_twin/63076d83-d345-4caa-be8a-19fc7c9338c8.png",
    "video_url":"http://XXXXX/cache/videoplayback%20%2816%29.mp4",
    "video_backgroud_color":"rgb(61,165,82)"
}
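The "rgb(61,165,82)" value in the example is a CSS-style color string. A tiny helper for building it (the function name is ours, not part of the A2E API):

```python
def rgb_string(r: int, g: int, b: int) -> str:
    """Format a color as the "rgb(r,g,b)" string video_backgroud_color expects."""
    for v in (r, g, b):
        if not 0 <= v <= 255:
            raise ValueError("RGB components must be in 0..255")
    return f"rgb({r},{g},{b})"

print(rgb_string(61, 165, 82))  # the green-screen color from the example above
# prints: rgb(61,165,82)
```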

Request samples

Shell
curl --location --request POST 'https://video.a2e.ai/api/v1/userVideoTwin/startTraining' \
--header 'x-lang: en-US' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name":"dubbing demo",
    "gender":"female",
    // "image_url": "https://d1tzkvq5ukphug.cloudfront.net/adam2eve/stable/video_twin/63076d83-d345-4caa-be8a-19fc7c9338c8.png",
    "video_url":"http://XXXXX/cache/videoplayback%20%2816%29.mp4",
    "video_backgroud_color":"rgb(61,165,82)"
}'
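The same request sketched in Python using only the standard library. This is an illustrative translation of the curl sample, not official client code; the token and video URL are placeholders you would replace with your own.

```python
import json
import urllib.request

API_BASE = "https://video.a2e.ai"
TOKEN = "********************"  # placeholder: your bearer token

payload = {
    "name": "dubbing demo",
    "gender": "female",
    # Alternatively pass "image_url" instead of "video_url".
    "video_url": "https://example.com/my-training-clip.mp4",  # placeholder URL
    "video_backgroud_color": "rgb(61,165,82)",  # field name as spelled by the API
}

req = urllib.request.Request(
    f"{API_BASE}/api/v1/userVideoTwin/startTraining",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "x-lang": "en-US",
        "Content-Type": "application/json",
    },
    method="POST",
)
# with urllib.request.urlopen(req) as resp:   # network call; uncomment to run
#     body = json.load(resp)
#     print(body["code"], body["data"]["_id"])
```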

Responses

🟢 200 startTraining
application/json
Body
code
integer 
required
data
object 
required
name
string 
required
gender
string 
required
image_url
string 
required
video_url
string 
required
current_status
string 
required
close_mouth_path
string 
required
task_status
string 
required
video_backgroud_image
string 
required
video_backgroud_color
string 
required
image_result_url
string 
required
sent_time
null 
required
preview_result_url
string 
required
version
integer 
required
isSilent
boolean 
required
hasVideoClone
boolean 
required
_id
string 
required
a.k.a. user_video_twin_id. The ID of the video-twin record; use it when querying /api/v1/anchor/character_list.
createdAt
string 
required
updatedAt
string 
required
Example
{
  "code": 0,
  "data": {
    "_id": "67bd583f7b4ae10c76393899",
    "name": "test",
    "gender": "female",
    "image_url": "",
    "video_url": "https://d1tzkvq5ukphug.cloudfront.net/adam2eve/beta/users/665da3d7bcf6ab778bad0f6a/1e5fa88b-806c-4777-b2c0-0595be2a0e1e.mp4",
    "current_status": "initialized",
    "failed_code": "",
    "failed_message": "",
    "wl_model": "live_protrait",
    "close_mouth_path": "",
    "task_id": "",
    "task_status": "",
    "task_result": {},
    "video_backgroud_image": "",
    "video_backgroud_color": "rgb(61,165,82)",
    "image_result_url": "",
    "image_error_code": "",
    "sent_time": null,
    "preview_result_url": "",
    "version": 1,
    "isSilent": false,
    "hasVoiceClone": false,
    "hasVideoClone": false,
    "skipPreview": false,
    "isToPublicPool": false,
    "image_version": 2,
    "hasEyecontact": false,
    "eyecontact_result_url": ""
  }
}
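The field that subsequent calls need most is data._id (a.k.a. user_video_twin_id). A minimal sketch of reading it from a successful response, using an abbreviated copy of the example above:

```python
import json

# The example response from above, abbreviated to the fields we read.
raw = '''{"code": 0, "data": {"_id": "67bd583f7b4ae10c76393899",
          "current_status": "initialized", "name": "test"}}'''

body = json.loads(raw)
if body["code"] != 0:
    raise RuntimeError(f"startTraining failed with code {body['code']}")

user_video_twin_id = body["data"]["_id"]  # keep this for later status queries
print(user_video_twin_id, body["data"]["current_status"])
# prints: 67bd583f7b4ae10c76393899 initialized
```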
Modified at 2025-04-08 22:40:51