Fable, a San Francisco startup, has introduced its SHOW-1 AI technology, capable of writing, producing, directing, animating, and even voice-acting entire episodes of TV shows.
The technology is an intricate amalgamation of several AI models, including Large Language Models (LLMs) for writing, custom diffusion models for image creation, and multi-agent simulation for story progression and characterization.
To demonstrate the capability of SHOW-1, the startup produced a 20-minute episode of South Park, entirely written, produced, and voiced by AI. Interested viewers can access the episode and explore the project’s details on their GitHub page.
The significance of this development lies in addressing the limitations of current generative AI systems, which excel at short-term tasks but often fall short in long-form content creation, especially within established IPs. Additionally, the entertainment industry is currently navigating writers’ and actors’ strikes, leading to concerns about AI replacing jobs across the TV and movie spectrum.
SHOW-1 aims to reach the holy grail of AI-generated content: matching the quality of existing IPs. Its custom content approach lets viewers influence story parameters, potentially transforming the entertainment experience.
How it works
The magic behind SHOW-1’s technology involves multi-agent simulation, enabling rich character history and coherent story generation, along with the utilization of GPT-4-based Large Language Models for natural language processing. The diffusion models, trained on South Park’s IP, contribute to rendering characters and background images, while voice-cloning tech provides character voices.
- Multi-agent simulation allows for rich character history, goal setting, emotional expression, and coherent story development.
- Large Language Models (GPT-4) handle natural language processing and generation, benefiting from having ingested numerous South Park episodes during training. Prompt-chaining techniques preserve story coherence from scene to scene.
- Diffusion models, trained on 1200 characters and 600 background images from South Park’s IP, render the characters and backgrounds. DreamBooth fine-tunes the models, and Stable Diffusion renders the outputs.
- Voice-cloning tech delivers authentic character voices.
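The multi-agent simulation and prompt-chaining loop described above can be sketched roughly as follows. This is a minimal illustration, not Fable's actual system: the `Agent` fields, the `fake_llm` stand-in (used in place of a real GPT-4 call so the sketch runs offline), and the `generate_episode` function are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """A simulated character with a goal, mood, and memory (hypothetical schema)."""
    name: str
    goal: str
    emotion: str = "neutral"
    history: list = field(default_factory=list)

def fake_llm(prompt: str) -> str:
    # Stand-in for a GPT-4 call; returns a canned beat derived from the prompt.
    return f"[scene beat derived from: {prompt[:40]}...]"

def generate_episode(agents, n_scenes=3, llm=fake_llm):
    """Prompt chaining: each scene prompt carries the running summary forward,
    so later scenes stay coherent with earlier ones."""
    summary = "Episode start."
    scenes = []
    for i in range(n_scenes):
        cast = "; ".join(
            f"{a.name} (goal: {a.goal}, mood: {a.emotion})" for a in agents
        )
        prompt = (
            f"Story so far: {summary}\n"
            f"Characters: {cast}\n"
            f"Write scene {i + 1}."
        )
        beat = llm(prompt)
        for a in agents:
            a.history.append(beat)      # agents accumulate character history
        summary = f"{summary} {beat}"   # chain the result into the next prompt
        scenes.append(beat)
    return scenes

agents = [
    Agent("Stan", "find the lost bike"),
    Agent("Cartman", "avoid homework", emotion="annoyed"),
]
episode = generate_episode(agents)
```

In a real pipeline, each returned scene beat would then be passed to the fine-tuned diffusion models for image rendering and to the voice-cloning stage for dialogue audio.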
This achievement is the result of combining various off-the-shelf frameworks into a unified system, highlighting the exciting potential of AI in entertainment. However, the convergence of the right tools also raises concerns about AI’s disruptive force in the industry.
Although we are currently in the “science projects” phase of AI in entertainment, it is crucial to recognize that significant advancements have been made in a relatively short period. A future where generative AI enables customized, personalized, and limitless entertainment might not be far off, but it also prompts us to question the implications of such a development.