Computer Informatization and Mechanical System
ISSN: 2434-1010

2025 Vol.8 Dec.N06

Title: Dynamic Scheduling Strategy for Online Experimental Resources in Multi-Agent Reinforcement Learning
Name: Xiaojun Cheng
Company: Jiangsu Union Technical Institute, Yancheng Mechanical and Electrical Branch, Yancheng, 224005, China
Abstract:
Conventional dynamic scheduling methods for online experimental resources mainly allocate resources in a fixed manner, using predefined allocation rules or task-based estimates of resource demand. Because they do not cluster experimental resources, they cannot fully account for the similarities and differences in functionality, performance, and other attributes among resources, which leads to poor scheduling stability. To address this, a dynamic scheduling strategy for online experimental resources based on multi-agent reinforcement learning is proposed. Multidimensional feature data from users and online experimental resources are collected and converted into feature vectors, from which similarity scores between users and resources are calculated to establish matching relationships. The fuzzy C-means clustering algorithm then partitions users and their related resources into clusters according to these similarity scores. Each cluster is defined as an agent that represents the relevant elements of online experimental resource scheduling. Resource demands and experiment progress are combined to form the state space. The action space defines each agent's resource request behavior, constrained by total resource availability and actual demand. The reward considers resource satisfaction, experiment progress, and resource wastage, with weighting coefficients balancing these components to guide agents toward effective strategies. Deep Deterministic Policy Gradient (DDPG) is used to train the multi-agent system; through the design of the loss function, agents are guided to optimize allocation rules according to expected reward, enabling dynamic resource scheduling. Experimental validation confirms the scheduling stability of the proposed method: in comparative tests, the average number of rescheduling instances is 3, indicating relatively ideal scheduling performance.
Keywords: Multi-agent; Reinforcement learning; Online experimental resources; Dynamic scheduling
DOI: 10.12250/jpciams2025091103
Citation form: Xiaojun Cheng. Dynamic Scheduling Strategy for Online Experimental Resources in Multi-Agent Reinforcement Learning[J]. Computer Informatization and Mechanical System, 2025, Vol.8, pp.
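
The abstract describes an action space constrained by total resource availability and actual demand, and a reward that weights resource satisfaction, experiment progress, and resource wastage. The Python sketch below shows one way these two pieces could look. It is an illustration under stated assumptions, not the formulation used in the paper: the names `RewardWeights`, `clip_request`, and `reward`, the weight values, and the exact definitions of satisfaction and wastage are all hypothetical, since the abstract gives none of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weighting coefficients; the abstract gives no values."""
    satisfaction: float = 0.5
    progress: float = 0.3
    wastage: float = 0.2


def clip_request(requested: float, demand: float, available: float) -> float:
    """Constrain an agent's resource request so that it is non-negative and
    exceeds neither its actual demand nor the resources still available."""
    return max(0.0, min(requested, demand, available))


def reward(allocated: float, demand: float, progress_gain: float,
           weights: RewardWeights = RewardWeights()) -> float:
    """Weighted sum of satisfaction and progress, minus a wastage penalty.

    Assumed definitions (not taken from the paper):
      satisfaction = share of the demand that was met, in [0, 1]
      wastage      = share of the allocation that exceeded demand, in [0, 1]
    """
    if demand <= 0:
        satisfaction = 1.0  # nothing was needed, so the demand is trivially met
    else:
        satisfaction = min(allocated, demand) / demand
    if allocated > 0:
        wastage = max(allocated - demand, 0.0) / allocated
    else:
        wastage = 0.0
    return (weights.satisfaction * satisfaction
            + weights.progress * progress_gain
            - weights.wastage * wastage)
```

In a DDPG setup, each agent's actor output would pass through something like `clip_request` before the environment allocates resources, and the scalar from `reward` would feed the critic's target. Over-allocation lowers the reward through the wastage term, while under-allocation lowers it through the satisfaction term.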

