We're excited to announce the release of the initial version of our Content-Aware Model Orchestration (CAMO) framework for deepfake detection. In this post, we'll discuss the motivation for developing this framework, showcase the novel features of its design, and introduce several open source datasets and pre-trained generalist and expert models. We'll also explore the future directions we are considering.

Highlights

Novel hard mixture-of-experts framework leveraging generalist and specialist/expert models for deepfake detection
Modular image generation pipeline designed to produce synthetic datasets with semantic balance
Open-source datasets created by the aforementioned pipeline, containing synthetic data that semantically mirror open-source real image datasets
Reproducible with open-source training code, datasets, and pre-trained weights
GitHub repository: https://github.com/BitMind-AI/bitmind-subnet
HuggingFace models and datasets: https://huggingface.co/bitmind

[Figure: CAMO framework diagram]

Why Content Aware Model Orchestration?

As deepfakes grow more sophisticated, traditional single-model detection techniques fall short. Because different generative algorithms leave behind their own subtle forgery artifacts, state-of-the-art detectors have been shown to generalize poorly to the outputs of generative models they were not trained on. Our CAMO framework tackles this challenge by leveraging a hierarchy of expert models, each attuned to a different aspect of deepfake detection. This strategy allows specialized models to learn content-specific forgery features without being burdened by the expectation to generalize to other content. It also improves interpretability, since the routing decisions offer insight into the decision-making process, and offers adaptability by allowing easy integration of new expert models as deepfake methods evolve.

Improving Scalability and Managing Complexity

The CAMO architecture scales better than traditional monolithic approaches because it breaks deepfake detection into more manageable subproblems, each handled by a specialized model. During inference, content-aware routing allocates resources only to the downstream models selected for the specific input. This design offers the following benefits:

Parallel Processing - Various gating functions operate asynchronously, selecting the appropriate models to invoke based on specific criteria.
Subproblem Specialization - CAMO breaks down deepfake detection into manageable subproblems, enabling specialized models to excel within defined constraints. This approach avoids the need for models to generalize beyond their intended expertise.
Modular, Extensible Architecture - CAMO’s architecture allows contributors to easily integrate or update individual expert models, addressing the latest deepfake challenges without needing to retrain the entire system. Additionally, its detection and gating mechanism is designed to be readily adaptable to new categories.
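
To make the routing idea concrete, here is a minimal Python sketch of hard, content-aware routing. It is an illustration only, not the CAMO implementation: the `Expert` dataclass, the gate and detector functions, the placeholder scores, and the choice to fall back to the generalist when no gate fires are all assumptions made for this example. The input "image" is stood in for by a dict of content flags so the sketch runs without any model weights.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List

# Hypothetical stand-in: an "image" is a dict of content flags.
Image = Dict[str, bool]
Gate = Callable[[Image], Awaitable[bool]]   # hard yes/no routing decision
Detector = Callable[[Image], float]         # score in [0, 1], higher = more likely fake


@dataclass
class Expert:
    name: str
    gate: Gate        # decides whether this expert applies to the input
    detect: Detector  # content-specific detector


async def face_gate(image: Image) -> bool:
    # Placeholder for a real content detector, such as a face detector.
    return image.get("has_face", False)


async def text_gate(image: Image) -> bool:
    return image.get("has_text", False)


def generalist(image: Image) -> float:
    return 0.5  # placeholder score


def face_expert(image: Image) -> float:
    return 0.9  # placeholder score


def text_expert(image: Image) -> float:
    return 0.2  # placeholder score


# Registry of experts; adding a content category means appending an entry.
EXPERTS: List[Expert] = [
    Expert("face", face_gate, face_expert),
    Expert("text", text_gate, text_expert),
]


async def route_and_detect(
    image: Image, experts: List[Expert], fallback: Detector
) -> Dict[str, float]:
    # Run every gate concurrently; each one returns a hard decision.
    decisions = await asyncio.gather(*(e.gate(image) for e in experts))
    selected = [e for e, fired in zip(experts, decisions) if fired]
    if not selected:
        # Assumption for this sketch: use the generalist when no gate fires.
        return {"generalist": fallback(image)}
    # Only the selected experts are invoked.
    return {e.name: e.detect(image) for e in selected}


if __name__ == "__main__":
    print(asyncio.run(route_and_detect({"has_face": True}, EXPERTS, generalist)))
```

In this sketch, supporting a new content category only requires appending another `Expert` entry to `EXPERTS`; the existing detectors are left untouched. How the per-model scores are combined into a final decision is deliberately left out, since the text above does not specify it.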

Theoretical Backing

Our approach is supported by both general computer vision research and recent work on deepfake detection.

Mixture of Experts for Face Forgery Detection: This study validates the use of multiple expert models for detecting face forgeries.