Musical Agent Systems: MACAT and MACataRT

Our research explores the development and application of musical agents: human-in-the-loop generative AI systems designed to support music performance and improvisation within co-creative spaces. We introduce MACAT and MACataRT, two distinct musical agent systems crafted to enhance interactive music-making between human musicians and AI. MACAT is optimized for agent-led performance, employing real-time synthesis and self-listening to shape its output autonomously, while MACataRT provides a flexible environment for collaborative improvisation through audio mosaicing and sequence-based learning. Both systems emphasize training on small, personalized datasets, fostering ethical and transparent AI engagement that respects artistic integrity. This research highlights how interactive, artist-centred generative AI can expand creative possibilities, empowering musicians to explore new forms of artistic expression in real-time, performance-driven improvisation contexts.

Musical agent system workflow comparison.

Interface of Musical Agent Systems: MACAT (left) and MACataRT (right).

Overview of MACAT and MACataRT Systems

MACAT and MACataRT are two musical agent systems designed to foster interactive music-making by integrating generative AI with human musicians. Both systems emphasize co-creativity and real-time improvisation.

MACAT System:

  • Functionality: MACAT focuses on agent-led performance. It uses real-time synthesis and self-listening to shape its output autonomously. The system generates musical material by selecting and synthesizing audio segments from a trained corpus, organized by a self-organizing map (SOM) that clusters the segments by their audio features (see the SOM sketch after this list).

  • Training and Workflow: During training, MACAT uses a small dataset of personalized music, allowing it to develop a highly customized agent. It combines self-organizing maps, Factor Oracles, and Variable Markov Models to manage musical structure in real time, making it responsive to live input and capable of dynamic, context-sensitive performances (a minimal Factor Oracle sketch also follows this list).

  • Interface: The MACAT interface lets users visualize and control the self-organizing map, adjust parameters such as tempo and congruence, and interact with the generative process in real time. Users can manipulate sound-synthesis features such as playback, resampling, and pitch shifting.
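
To make the SOM component concrete, here is a minimal sketch of how a corpus of per-segment feature vectors (e.g., MFCCs) can be clustered onto a 2-D grid. This is an illustrative NumPy implementation written under our own assumptions, not MACAT's actual code: the grid size, decay schedules, and feature dimensionality are all placeholders.

```python
import numpy as np

def train_som(features, grid=(8, 8), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Cluster audio-segment feature vectors onto a 2-D self-organizing map."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.normal(size=(h, w, features.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    total, step = epochs * len(features), 0
    for _ in range(epochs):
        for x in rng.permutation(features):
            frac = step / total
            lr = lr0 * (1.0 - frac)              # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking neighbourhood radius
            # Best-matching unit (BMU): the node whose weight vector is closest to x.
            bmu = np.unravel_index(
                np.argmin(np.linalg.norm(weights - x, axis=-1)), (h, w))
            # Pull the BMU and its Gaussian neighbourhood toward the input.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2.0 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

# Usage with hypothetical data: each row describes one audio segment (e.g., 13 MFCCs).
segment_feats = np.random.default_rng(1).normal(size=(200, 13))
som = train_som(segment_feats)
# A segment's cluster is the grid cell of its best-matching unit.
cell = np.unravel_index(
    np.argmin(np.linalg.norm(som - segment_feats[0], axis=-1)), som.shape[:2])
```

The Factor Oracle used as a temporal model is equally compact. The construction below follows the standard online algorithm of Allauzen, Crochemore, and Raffinot; the `improvise` walk, which trades linear continuation against suffix-link jumps, is a simplified illustration of how such an oracle can drive generation, not the exact navigation strategy of MACAT or MACataRT.

```python
import random

def build_factor_oracle(symbols):
    """Online Factor Oracle construction over a symbol sequence (e.g., segment labels)."""
    n = len(symbols)
    trans = [dict() for _ in range(n + 1)]  # forward transitions per state
    sfx = [-1] * (n + 1)                    # suffix links
    for i, s in enumerate(symbols):
        trans[i][s] = i + 1                 # linear "spine" transition
        k = sfx[i]
        while k > -1 and s not in trans[k]:
            trans[k][s] = i + 1             # forward shortcut to the new state
            k = sfx[k]
        sfx[i + 1] = 0 if k == -1 else trans[k][s]
    return trans, sfx

def improvise(symbols, sfx, steps=32, continuity=0.85, rng=random.Random(0)):
    """Walk the oracle: mostly continue linearly, sometimes jump via a suffix link."""
    state, out = 0, []
    for _ in range(steps):
        if state >= len(symbols) or (rng.random() > continuity and sfx[state] > 0):
            state = max(sfx[state], 0)      # jump: recombine with an earlier context
        if state >= len(symbols):           # still at the end of the sequence: restart
            state = 0
        out.append(symbols[state])          # emit the segment to play next
        state += 1                          # linear continuation
    return out

trans, sfx = build_factor_oracle(list("abaababb"))  # toy symbol sequence
print(improvise(list("abaababb"), sfx, steps=16))
```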

MACataRT System:

  • Functionality: MACataRT is built on the foundation of IRCAM's CataRT system, integrating it with a temporal model (the Factor Oracle) to manage musical structure. The system supports collaborative improvisation through two modes: reactive improvisation (responding to live input) and proactive improvisation (learning patterns from a pre-trained corpus).

  • Training and Workflow: Like MACAT, MACataRT trains on small, personalized datasets. The system employs audio mosaicing, which reassembles audio fragments based on features such as pitch, timbre, and rhythm (a minimal sketch of this match-and-concatenate loop follows this list). Musicians steer the flow of the generative process through fine-grained parameters, keeping real-time interaction responsive.

  • Interface: The interface offers a 2D scatter plot where users can visualize and manipulate audio segments based on selected features. The system provides flexibility through trigger modes, temporal control, and interactive sound synthesis tools, facilitating a fluid, improvisational creative space.
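
As an illustration of the mosaicing idea, the sketch below rebuilds a target signal from the nearest-matching frames of a corpus recording, using MFCCs and chroma as stand-ins for the timbre and pitch features mentioned above. It is a deliberately simplified, frame-based approximation under our own assumptions (CataRT-style systems typically work with onset- or unit-segmented corpora and richer descriptors), written with librosa and NumPy:

```python
import numpy as np
import librosa

def segment_features(y, sr, frame=2048, hop=1024):
    """Describe each analysis frame by timbre (MFCC) and pitch content (chroma)."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=frame, hop_length=hop)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=frame, hop_length=hop)
    feats = np.vstack([mfcc, chroma]).T               # one row per frame
    return (feats - feats.mean(0)) / (feats.std(0) + 1e-8)

def mosaic(target_y, corpus_y, sr, frame=2048, hop=1024):
    """Rebuild the target by overlap-adding the nearest-matching corpus frames."""
    tf = segment_features(target_y, sr, frame, hop)
    cf = segment_features(corpus_y, sr, frame, hop)
    out = np.zeros(len(target_y) + frame)
    window = np.hanning(frame)
    for i, row in enumerate(tf):
        j = int(np.argmin(np.linalg.norm(cf - row, axis=1)))  # nearest corpus frame
        seg = corpus_y[j * hop : j * hop + frame]
        out[i * hop : i * hop + len(seg)] += window[: len(seg)] * seg
    return out[: len(target_y)]

# Usage with two hypothetical recordings:
# target, sr = librosa.load("live_input.wav", sr=None)
# corpus, _ = librosa.load("personal_corpus.wav", sr=sr)
# reconstruction = mosaic(target, corpus, sr)
```

In MACataRT itself, the unit segmentation, descriptor set, and matching behaviour are controlled interactively through the interface described above; the sketch only isolates the core match-and-concatenate loop.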

Research-Creation and Musical Practice

The research-creation approach for these systems combines technical innovation with artistic exploration. By integrating AI into the creative process, these systems allow musicians to engage with AI agents not as mere tools but as collaborative partners. This methodology supports:

  • Interactive Collaboration: The human musician interacts with AI agents that listen, adapt, and respond in real time, enabling an ongoing, co-creative dialogue.

  • Personalization: The use of small datasets tailored to the musician’s specific style allows the systems to become highly responsive and customized to the individual artist’s needs and preferences.

  • Artistic Freedom: Both systems expand creative possibilities, facilitating new forms of expression in improvisation and composition.

Small Data Mindset and Environmentally Friendly Approach

Both systems adopt a small data mindset, prioritizing personalized, high-quality datasets over large, generic ones. This approach offers two key benefits:

  • Ethical AI: Training on consented, clearly attributed data promotes transparency and respects artistic integrity.

  • Energy Efficiency: Small datasets make training far less computationally intensive than large-scale models, requiring fewer resources and less energy, and thereby reducing the carbon footprint associated with AI training.

Conclusion

MACAT and MACataRT are innovative systems that enhance human creativity through AI collaboration in music. Their use of small, curated datasets ensures that the systems are not only artistically flexible and personalized but also ethically responsible and environmentally sustainable. These systems represent a new frontier in interactive, co-creative music, where musicians and AI work together to create novel, real-time compositions.

GitHub: https://github.com/Metacreation-Lab/Musical-Agent-Systems

Reference

Keon Ju Maverick Lee, Philippe Pasquier, and Jun Yuri. Musical Agent Systems: MACAT and MACataRT. NeurIPS Workshop on Creativity & Generative AI, December 2024.

Paper Link: https://arxiv.org/pdf/2502.00023
