https://mellow-trader-6de.notion.site/152ba9bb589180a4adfde02bfad742e6
SynthGen Agent: Enhancing AI Models with Synthetic Data
The SynthGen Agent augments models trained on FLock's AI Arena by generating high-quality synthetic data. Starting from the initial datasets of FLock's training tasks, SynthGen produces additional data intended to improve the robustness and performance of machine learning models. It is aimed at hackathon participants who want to push the boundaries of AI capabilities and explore new approaches to synthetic data generation.
Scoring Criteria:
Participants' solutions will be evaluated based on the following criteria:
Benchmarking Methodology:
To quantify the impact of synthetic data on model performance, we will fine-tune a specified model (e.g., Llama 3.2 3B or Phi-3) twice: once on the original dataset and once on the dataset augmented with synthetic data. Each fine-tuned model will then be evaluated with the same relevant metric (e.g., accuracy, F1 score) on a designated test set.
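As a rough illustration of the evaluation step, the sketch below scores two sets of predictions against the same test labels using accuracy and binary F1. The labels, predictions, and function names are hypothetical placeholders, not the official benchmark code; in practice the predictions would come from the two fine-tuned models.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test-set labels and predictions (assumed values for illustration).
labels = [1, 0, 1, 1, 0, 1, 0, 0]
preds_initial = [1, 0, 0, 1, 0, 0, 1, 0]    # model fine-tuned on the original data
preds_synthetic = [1, 0, 1, 1, 0, 0, 0, 0]  # model fine-tuned on the augmented data

p_initial = accuracy(labels, preds_initial)
p_synthetic = accuracy(labels, preds_synthetic)
print(f"accuracy: initial={p_initial:.3f}, synthetic={p_synthetic:.3f}")
print(f"F1: initial={f1_score(labels, preds_initial):.3f}, "
      f"synthetic={f1_score(labels, preds_synthetic):.3f}")
```

Both models are scored on the same test set with the same metric, so any difference in score can be attributed to the training data rather than to the evaluation setup.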
Performance Improvement Calculation:
The improvement in model performance due to synthetic data can be calculated using the following formula:
Performance Improvement (%) = ((P_synthetic - P_initial) / P_initial) × 100
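Where:
P_initial is the metric score of the model fine-tuned on the original dataset.
P_synthetic is the metric score of the model fine-tuned on the augmented dataset.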
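A minimal Python sketch of this calculation, assuming both scores are measured with the same metric on the same test set. The function name and the sample scores are illustrative only and are not part of the official scoring code.

```python
def performance_improvement(p_initial, p_synthetic):
    """Relative improvement (%) of the augmented-data score over the original-data score."""
    if p_initial == 0:
        # The formula divides by p_initial, so a zero baseline is undefined.
        raise ValueError("p_initial must be non-zero")
    return (p_synthetic - p_initial) / p_initial * 100


# Hypothetical scores: 0.625 with the original data, 0.875 with the augmented data.
improvement = performance_improvement(0.625, 0.875)
print(f"Performance improvement: {improvement:.1f}%")
```

Because the result is relative to the baseline score, the same absolute gain counts for more when the baseline is low; a negative value means the synthetic data reduced performance.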
Sample Dataset
The test data that will be used to benchmark the model across different scenarios is available here: