AI-Augmented Data Modeling: Enhancing Star Schema Design for Modern Analytics
Keywords:
Star Schema, Dimensional Modeling, Artificial Intelligence, Data Warehousing, LLMs, GPT-4, Google Gemini, Meta LLaMA, Automation
Abstract
The star schema remains a foundational dimensional modeling approach in business intelligence, valued for its simplicity, performance, and compatibility with OLAP queries. However, manual schema design is labor-intensive and error-prone in large-scale or rapidly evolving data environments. This study investigates the application of Artificial Intelligence (AI), particularly large language models (LLMs), in automating and optimizing star schema generation. Models such as OpenAI’s GPT-4, Google Gemini, and Meta’s LLaMA 3 were evaluated for their ability to infer schema structures, enforce relational integrity, and enhance semantic alignment. Experimental results demonstrated that AI-assisted modeling can reduce development time by over 80%, while increasing accuracy and consistency. These findings highlight the growing potential of AI in streamlining enterprise data modeling processes.
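To make the idea of enforcing relational integrity on a generated schema concrete, the following is a minimal sketch and not the study's method. It assumes a candidate star schema has already been parsed from a model's output into plain Python objects; the class names, the `validate_star_schema` helper, and the sample tables are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    # Hypothetical container for one dimension table.
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class FactTable:
    # Hypothetical container for the central fact table.
    name: str
    measures: list[str]
    # Maps each foreign-key column to the dimension table it references.
    foreign_keys: dict[str, str]


def validate_star_schema(fact: FactTable, dimensions: list[Dimension]) -> list[str]:
    """Return a list of integrity problems; an empty list means no problem was found."""
    problems = []
    known = {d.name for d in dimensions}

    # Every foreign key in the fact table must point at a declared dimension.
    for column, target in fact.foreign_keys.items():
        if target not in known:
            problems.append(f"{fact.name}.{column} references unknown dimension {target}")

    # Every declared dimension should be reachable from the fact table.
    referenced = set(fact.foreign_keys.values())
    for d in dimensions:
        if d.name not in referenced:
            problems.append(f"dimension {d.name} is not referenced by {fact.name}")

    # A fact table without measures carries nothing to aggregate.
    if not fact.measures:
        problems.append(f"{fact.name} has no measures")
    return problems


# Hypothetical candidate schema with one dangling foreign key (dim_store is missing).
dims = [
    Dimension("dim_date", ["year", "month"]),
    Dimension("dim_product", ["category"]),
]
fact = FactTable(
    "fact_sales",
    ["quantity", "revenue"],
    {"date_key": "dim_date", "product_key": "dim_product", "store_key": "dim_store"},
)

for problem in validate_star_schema(fact, dims):
    print(problem)
```

A check of this kind could run after each generation step, so that a schema with dangling keys is rejected or sent back for revision before any DDL is produced.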
License
Copyright (c) 2025 American Scientific Research Journal for Engineering, Technology, and Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.