Principles and methods for automatic scaling of high‑load systems
DOI:
https://doi.org/10.32347/2707-501x.2023.52(3).217-226
Keywords:
adaptability, dynamic scaling, cost efficiency, information system, information technology, high performance, resource management, software
Abstract
This paper investigates a wide spectrum of use cases and the main approaches to Internet-traffic processing applied by industry giants such as Netflix, YouTube, Disney+, TikTok, Facebook, Xbox Live, PlayStation Downloads, and Amazon Prime. It demonstrates that the defining characteristic of high-load systems is the need for automated resource management to process large volumes of real-time requests while ensuring high availability, reliability, and stability under dynamically changing workloads. The impact of key performance metrics on other functional characteristics is analyzed. Based on this analysis, the set of key metrics that underpins system responsiveness under unpredictable conditions is systematized and structured. The findings indicate that system throughput depends on adaptive load balancing, whereas scalability determines the system's capacity to adapt. Thus, efficient resource management and adaptive scaling algorithms make it possible to reduce costs without losing performance. It is also shown that the efficiency and resilience of high-load systems depend significantly on architectural solutions. Vertical scaling increases the performance of individual system nodes, while horizontal scaling provides better fault tolerance. The main challenge in achieving economic efficiency lies in balancing scalability, performance, and fault tolerance, and combined scaling strategies can help align these objectives. The relevance of developing models and methods for optimizing the timing of load adjustments is highlighted, and directions for further research are outlined. The results of this investigation may be used to develop adaptive models and resource-management methods for high-load distributed systems that operate effectively under dynamic, unpredictable loads and increasing performance requirements.
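The abstract does not specify a particular scaling algorithm. Purely as an illustration of what an adaptive horizontal scaling rule can look like, the sketch below uses a proportional rule similar in shape to the one documented for the Kubernetes Horizontal Pod Autoscaler. The function name, its parameters, and the replica bounds are assumptions made for this example and are not taken from the paper.

```python
import math


def desired_replicas(current_replicas, observed_load, target_load,
                     min_replicas=1, max_replicas=10):
    """Hypothetical proportional scaling rule (illustrative only).

    observed_load and target_load are per-node utilization values in the
    same unit (for example, CPU percent). The node count grows or shrinks
    in proportion to how far the observed load is from the target.
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    # Scale the node count by the ratio of observed to target load,
    # rounding up so capacity is never underestimated.
    raw = math.ceil(current_replicas * observed_load / target_load)
    # Clamp to the configured bounds: the upper bound caps cost,
    # the lower bound preserves a minimum level of availability.
    return max(min_replicas, min(max_replicas, raw))
```

For instance, with 4 nodes running at 90% utilization against a 60% target, the rule asks for 6 nodes; at 30% utilization it would shrink the pool to 2. The upper bound reflects the cost side of the trade-off the abstract describes, while the lower bound reflects the fault-tolerance side.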
License

This work is licensed under a Creative Commons Attribution 4.0 International License.