From Autonomic Computing to Self-Driving Databases: AI-Driven Autonomous Operations in Cloud Environments
DOI:
https://doi.org/10.15662/IJRAI.2021.0401004
Keywords:
Autonomous Databases, Autonomic Computing, Cloud Infrastructure, Artificial Intelligence, Self-Managing Systems, Database Operations, High Availability, Self-Driving Databases
Abstract
The increasing scale, complexity, and heterogeneity of modern data platforms have outpaced the capabilities of traditional, manually operated database administration models, which were originally designed for relatively static, centralized environments. As enterprises migrate mission-critical workloads to cloud and hybrid architectures, database operations must contend with elastic infrastructure, geographically distributed deployments, multi-tenant resource contention, and continuously evolving workload patterns. These shifts amplify the operational burden of performance tuning, high-availability management, security patching, compliance enforcement, and capacity planning, often beyond the practical limits of human-driven processes. In response, autonomous database operations (systems endowed with self-monitoring, self-analysis, self-optimization, and self-healing capabilities) have emerged as a compelling paradigm for sustaining reliability and performance at scale. This article traces the evolution of autonomous database systems through the foundational principles of autonomic computing, early self-managing database research, and the maturation of cloud-native infrastructure platforms. By synthesizing theoretical models such as the MAPE-K control loop with real-world implementations from IBM and Oracle, we examine how AI and machine learning techniques enable continuous, closed-loop operational decision-making across the monitoring, diagnosis, planning, and execution phases. We further analyze the architectural trade-offs, organizational impacts, and trust considerations inherent in delegating operational control to intelligent systems, and outline future research directions focused on explainability, governance, and resilience in AI-driven autonomous database platforms.