{"id":389876,"date":"2025-09-01T18:26:19","date_gmt":"2025-09-01T18:26:19","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/389876\/"},"modified":"2025-09-01T18:26:19","modified_gmt":"2025-09-01T18:26:19","slug":"crfts-a-cluster-centric-and-reservation-based-fault-tolerant-scheduling-strategy-to-enhance-qos-in-cloud-computing","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/389876\/","title":{"rendered":"CRFTS: a cluster-centric and reservation-based fault-tolerant scheduling strategy to enhance QoS in cloud computing"},"content":{"rendered":"<p>This section explains the proposed CRFTS model, which uses a clustering approach for allocation and the advance reservation of VMs to handle the failure in a dynamic, computationally intensive cloud system. The proposed cluster-based allocation technique can be represented as a bipartite graph between the task set and VM set, as shown in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Fig1\" target=\"_blank\" rel=\"noopener\">1<\/a>. Additionally, the notions used are presented in Table\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Tab3\" target=\"_blank\" rel=\"noopener\">3<\/a>. The System Model and Problem Formulation are illustrated in the subsection below. In this section, we introduced the fault-tolerant-based scheduling algorithm. The incoming tasks are represented as set T= {t1, t2, t3\u2026 tn} and the VMs are represented as set V= {v1, v2, v3\u2026 vk}. At any decision point (t), only subset Tk of T, i.e., Tk \u2286 T, is known. Since the entire task set T is unknown, the scheduler must make decisions without prior knowledge of tasks, which models the problem as online scheduling. The proposed scheduling assigns each arriving task to a VM based only on the current system state.<\/p>\n<p><b id=\"Fig1\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 1<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/1\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig1\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Fig1_HTML.png\" alt=\"figure 1\" loading=\"lazy\" width=\"685\" height=\"363\"\/><\/a><\/p>\n<p>Clustered-allocation mapping strategy.<\/p>\n<p>System model<\/p>\n<p>The CRFTS system architecture includes several components, as illustrated in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a>. User tasks are submitted to the Application Layer. Besides, in the cloud, the tasks may come from any geographical location at any point in time. The Task Sorter sorts the incoming tasks in ascending order of task size. Similarly, the Host\/VM Layer contains a variety of resources with varying processing abilities. VM Sorter sorts the available VMs in ascending order of their processing abilities. Additionally, the sorting of both tasks and VMs is followed by a clustering mechanism where both tasks and VMs are separated into three different clusters, i.e., low, mid, and high clusters, as discussed in the algorithm section. The clustering will further ensure the best-suited task and VM mapping by limiting the task and VM domain of mapping. Furthermore, the VM allocator is responsible for choosing the relevant VM from the cluster for task completion.<\/p>\n<p>Each VM periodically sends a simulated heartbeat signal at a fixed interval \u03b4. The heartbeat controller records the last received time Hj\u200b of the signal. If the difference between current time \u03c4 and Hj\u200b exceeds a threshold \u201c\u0394\u201d, the VM is considered faulty and the tasks on it are marked as failed and rescheduled accordingly.<\/p>\n<p>The AR Module\/Fault Handler collaborates with the VM Allocator and the Heartbeat Monitor to detect and manage VM and task failures, initiating reservations to ensure the uninterrupted execution of tasks. This coordination guarantees fault-tolerant execution with minimal downtime. The VM allocator also communicates the generated schedule to the AR Module. In response, the AR Module\u2019s Time Manager component predicts the AR slot as given in Eq.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ3\" target=\"_blank\" rel=\"noopener\">3<\/a> based on the schedule produced by the VM allocator. The Reservation Producer component provides the alternative VM for the computed AR slot in case of a fault, thereby initiating the reservation.<\/p>\n<p><b id=\"Fig2\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 2<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/2\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig2\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Fig2_HTML.png\" alt=\"figure 2\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a><\/p>\n<p>System architecture of CRFTS.<\/p>\n<p>Problem formulation<\/p>\n<p>The section presents a foundation for the rest of the work, facilitating readers to understand the objectives, variables, constraints\/assumptions, and approach of the research done in the work.<\/p>\n<p>Task and VM Constraints\/Assumptions:<\/p>\n<p>\\(1MI~ \\leqslant ~Size{\\text{ }}({t_i}) \\leqslant 100MI\\)<\/p>\n<p>\\(1MIPS~ \\leqslant ~Speed{\\text{ }}({v_j}) \\leqslant 100MIPS\\)<\/p>\n<p>Objectives of the model: Minimize makespan while maximizing resource utilization and reliability.<\/p>\n<p>Allocation Variables: STij, FTij, Rj, P(ti,vj).<\/p>\n<p>Fault Tolerance Variables: AR, ARM, Status.<\/p>\n<p>The allocation and fault tolerance variables are calculated in the problem formulation section later. The task allocation problem is modelled mathematically as a mapping of every arriving task to the suitable VM cluster to optimize reliability. The proposed model works in five phases:<\/p>\n<p>Phase 1: task clustering<\/p>\n<p>This phase is responsible for organizing the tasks in three clusters, namely, Low task Cluster, Medium task Cluster, and High task Cluster.<\/p>\n<p>Phase 2: VM clustering<\/p>\n<p>This phase is responsible for organizing the VMs in three clusters, namely, Low VM Cluster, Medium VM Cluster, and High VM Cluster.<\/p>\n<p>Phase 3: task to Vm mapping without reservation<\/p>\n<p>In each cluster, the VMs and tasks may be placed in random order because the task and VM clusters are created based on task size and VM speed, respectively, and not based on the number. Mathematically, the mapping (\u00b5) can be represented as follows:<\/p>\n<p>\u00b5: T \u2192 V<\/p>\n<p>Every task has its Start time and Finish time, which are computed in Eqs.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ1\" target=\"_blank\" rel=\"noopener\">1<\/a>) and (<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ2\" target=\"_blank\" rel=\"noopener\">2<\/a>). To reserve the VM, the task execution time slot, i.e., the AR slot for every task on a suitable VM, is estimated. The AR slot for each task is calculated by taking the difference between ST and FT as computed in Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ3\" target=\"_blank\" rel=\"noopener\">3<\/a>). The ST is the time when the task starts executing on the allocated VM. The FT is the time when the task completes its execution on that VM. The execution time of the task is added to the ST of that task to compute the FT. Additionally, each VM has a load history (Rj) that is used to determine when it is ready to execute the new task. Whenever any ti will begin to execute on any vj after mapping according to the allocation procedure. The time at which the allocation of ti begins or starts on vj is known as STij. Rj is first assumed to be STij as seen in Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ1\" target=\"_blank\" rel=\"noopener\">1<\/a>)<\/p>\n<p>$$S{T_{ij}}={\\text{ }}{R_j}$$<\/p>\n<p>\n                    (1)\n                <\/p>\n<p>Once ti has completed its execution on vj, the processing time of ti on vj (p(ti, vj)) is added to the STij as indicated in Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ2\" target=\"_blank\" rel=\"noopener\">2<\/a>) to determine the FTij.<\/p>\n<p>$$F{T_{ij}}={\\text{ }}S{T_{ij}}+{\\text{ }}p\\left( {{t_i},{v_j}} \\right)$$<\/p>\n<p>\n                    (2)\n                <\/p>\n<p>$$AR\\,=\\,F{T_{ij}} &#8211; {\\text{ }}S{T_{ij}}$$<\/p>\n<p>\n                    (3)\n                <\/p>\n<p>Here, p(ti,vj) is computed by dividing the size of ti by the speed of vj.<\/p>\n<p>Phase 4: failure modelling and detection via heartbeat mechanism<\/p>\n<p>The model enables every VM to send a heartbeat signal at certain instances (Hj).<\/p>\n<p>Parameters defined for enabling the heartbeat mechanism:<\/p>\n<p>\u03b4: Heartbeat interval (VM sends the heartbeat every \u03b4 units of time), Hj\u200b: Last heartbeat timestamp from VM.<\/p>\n<p>T: Current time, \u0394: Timeout threshold.<\/p>\n<p>The heartbeat mechanism works in two steps:<\/p>\n<p>Step 1: Heartbeat Sending<\/p>\n<p>The phase initially tracks the time of sending the heartbeat by using the \u03b4 as:<\/p>\n<p>\\(T{\\text{ }}\\bmod {\\text{ }}\\,\\delta =\\,0\\)<\/p>\n<p>If the heartbeat sending time is detected, then VM sends the heartbeat time stamp to the heartbeat controller. After receiving a heartbeat, the controller updates Hj of the corresponding VM with the current time (\u03c4).<\/p>\n<p>\\(H\\left( {{v_j}} \\right){\\text{ }}=\\tau\\)<\/p>\n<p>Step 2: Heartbeat Scanning<\/p>\n<p>The heartbeat controller performs this phase, and it periodically scans the heartbeat of VMs. If the heartbeat of any VM is not found within the time, then it is declared as faulty and ejected from the cluster. The controller measures the past time since the most recent heartbeat was acknowledged from the VM and uses this interval as a pointer for detecting VM failure as:<\/p>\n<p>\\(Last\\_Breath\\,=\\,\\tau &#8211;H\\left( {{v_j}} \\right)\\)<\/p>\n<p>If the Last_Breath exceeds the timeout threshold, then the controller declares the VM as failed by indicating the following:<\/p>\n<p>\\(H\\left( {VM} \\right) = \\left\\{ \\begin{gathered} 1,~~\\,\\,if~heartbeat~received~from\\,~VM~within~Timeout \\hfill \\\\ 0,\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\;\\,\\,\\,Otherwise \\hfill \\\\ \\end{gathered} \\right.\\)<\/p>\n<p>Phase 5: fault tolerance via reservation<\/p>\n<p>In this phase, Task VM mapping is performed, and simultaneously, VMs are monitored for faults using the heartbeat mechanism. In case of faults, advance reservation is enabled and an alternative VM is provided to the affected task. If any VM leaves the system or fails at any point in time, and fault tolerance is not there, the associated tasks may suffer premature termination. The problem is addressed by accompanying the scheduling with a proactive, advance reservation fault tolerance technique. Let\u2019s suppose \u2018f\u2019 VMs are failing and these \u2018f\u2019 VMs are taken as a separate set of failed VMs (Vf) and are defined as:<\/p>\n<p>\\({V_f}={\\text{ }}\\{ {v_f}:{\\text{ }}{v_f} \\in V{\\text{ }}and{\\text{ }}\\left| {{V_f}} \\right|{\\text{ }}=f\\}\\)<\/p>\n<p>This failure of VMs will affect the corresponding tasks and cause them to fail. Let \u2018u\u2019 be the number of unsuccessful tasks. Similarly, these \u2018u\u2019 unsuccessful tasks are taken as a separate set of unsuccessful tasks (Tu) and are defined as:<\/p>\n<p>\\({T_u}={\\text{ }}\\{ {t_u}:{\\text{ }}{t_u} \\in T{\\text{ }}and{\\text{ }}\\left| {{T_u}} \\right|{\\text{ }}={\\text{ }}u{\\text{ }}\\&amp; {\\text{ }}u<\/p>\n<p>Now, these unexecuted tasks need to be reassigned to some alternative VM to complete their execution, which is done by the advance reservation technique. All tasks in Tu are migrated to another VM (Vj) having the least RT from (Vf) for the calculated AR slot such that Vj \u2209 Vf as discussed in Algorithm 5.<\/p>\n<p>The proposed clustering approach allocates tasks in three ways. First, sorting helps the model to achieve a one-to-one suitable order. Thereafter, clustering before allocation restricts the domain of allocation, which allows the model to assign the most appropriate VM to the task. Lastly, the task from each cluster is allocated to the corresponding VM clusters only, which helps the model to avoid under- and overutilization of the VM up to a certain degree. However, there are still chances of over- and underutilization of VMs, which will be handled by integrating the proposed approach with an efficient load-balancing technique that is under consideration for future work, which continuously monitors the under- and over-utilizing VMs and shifts the load between them.<\/p>\n<p>Performance metrics<\/p>\n<p>After the successful task execution, the reliability of the system is checked. Additionally, the makespan is taken as the highest or maximum among all TETj and can be expressed as Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ4\" target=\"_blank\" rel=\"noopener\">4<\/a>)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 32\" title=\"Poola, D., Salehi, M. A., Ramamohanarao, K. &amp; Buyya, R. A Taxonomy and Survey of Fault-Tolerant Workflow Management Systems in Cloud and Distributed Computing Environments. In Software Architecture for Big Data and the Cloud 285&#x2013;320 (Elsevier, 2017). &#010;                  https:\/\/doi.org\/10.1016\/B978-0-12-805467-3.00015-6&#010;                  &#010;                .\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR32\" id=\"ref-link-section-d19980398e2927\" target=\"_blank\" rel=\"noopener\">32<\/a>.<\/p>\n<p>$$Makespan\\,=\\,\\hbox{max} {\\text{ }}\\left( {F{T_{ij}}} \\right),\\forall {V_j}$$<\/p>\n<p>\n                    (4)\n                <\/p>\n<p>Progress percentage of makespan (Ppm): It is the percentage of progress in makespan offered in proposed CRFTS over the other existing approaches, i.e., HEFT, E-HEFT, and LB-HEFT approaches, and is calculated in the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ5\" target=\"_blank\" rel=\"noopener\">5<\/a>)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 33\" title=\"Elliott, J., Hoemmen, M. &amp; Mueller, F. Exploiting data representation for fault tolerance. J. Comput. Sci. 14, 51&#x2013;60. &#010;                  https:\/\/doi.org\/10.1016\/j.jocs.2015.12.002&#010;                  &#010;                 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR33\" id=\"ref-link-section-d19980398e2951\" target=\"_blank\" rel=\"noopener\">33<\/a>.<\/p>\n<p>$$P{p_m}\\left( \\% \\right){\\text{ }}=\\frac{{M~\\left( {Compared~Approach} \\right) &#8211; ~M\\left( {\\Pr oposed~Approach} \\right)}}{{M\\left( {Compared~Approach} \\right)}}*{\\text{ }}100$$<\/p>\n<p>\n                    (5)\n                <\/p>\n<p>Average of (Ppm): To calculate the deviation from the desired rate in percentage, divide the sum of every Ppi for each tested VM by their respective tested numbers, as shown in the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ6\" target=\"_blank\" rel=\"noopener\">6<\/a>)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 34\" title=\"Walia, G. K., Kumar, M. &amp; Gill, S. S. AI-Empowered fog\/edge resource management for IoT applications: A comprehensive review, research challenges, and future perspectives. IEEE Commun. Surv. Tutorials. 26 (1), 619&#x2013;669. &#010;                  https:\/\/doi.org\/10.1109\/COMST.2023.3338015&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR34\" id=\"ref-link-section-d19980398e2977\" target=\"_blank\" rel=\"noopener\">34<\/a>.<\/p>\n<p>$$Average{\\text{ }}of{\\text{ }}\\left( {P{p_m}} \\right){\\text{ }}=\\frac{{\\mathop {{\\text{ }}\\sum }\\nolimits_{{i=1}}^{t} Ppm~\\left( {each~tested~VM~EquationNumber} \\right)}}{t}$$<\/p>\n<p>\n                    (6)\n                <\/p>\n<p>The Average VM utilization of the system is calculated in the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ7\" target=\"_blank\" rel=\"noopener\">7<\/a>) as<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Hosseini Shirvani, M. &amp; Ramzanpoor, Y. Multi-objective QoS-aware optimization for deployment of IoT applications on cloud and fog computing infrastructure. Neural Comput. Appl. 35(26), 19581&#x2013;19626. &#010;                  https:\/\/doi.org\/10.1007\/s00521-023-08759-8&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR35\" id=\"ref-link-section-d19980398e2999\" target=\"_blank\" rel=\"noopener\">35<\/a>.<\/p>\n<p>$$UT=\\frac{{\\mathop \\sum \\nolimits_{1}^{k} \\left( {FT~ &#8211; ~p\\left( {tf~\\hbox{c}{\\kern-6.5pt}\\raise2.75pt\\hbox{\\vrule height .5pt width3.5pt depth0pt}{\\kern4pt}~Tf~,~Vj~\\hbox{c}{\\kern-6.5pt}\\raise2.75pt\\hbox{\\vrule height .5pt width3.5pt depth0pt}{\\kern4pt}~Vf~} \\right)} \\right)}}{{k~*Makespan}}~\\forall Vj~$$<\/p>\n<p>\n                    (7)\n                <\/p>\n<p>Progress percentage of UT (Ppu): It is the ratio defining the progress percentage of average utilization of the proposed CRFTS and other compared approaches and is calculated in the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ8\" target=\"_blank\" rel=\"noopener\">8<\/a>) as<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Yakubu, I. Z. &amp; Murali, M. An efficient meta-heuristic resource allocation with load balancing in IoT-Fog-cloud computing environment. J. Ambient Intell. Humaniz. Comput. 14(3), 2981&#x2013;2992. &#010;                  https:\/\/doi.org\/10.1007\/s12652-023-04544-6&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR36\" id=\"ref-link-section-d19980398e3024\" target=\"_blank\" rel=\"noopener\">36<\/a>.<\/p>\n<p>$$P{p_u}\\left( \\% \\right){\\text{ }}=\\left( {\\frac{{UT~\\left( {Compared~Approach} \\right) &#8211; ~UT\\left( {\\Pr oposed~Approach} \\right)}}{{UT\\left( {Compared~Approach} \\right)}}} \\right)*{\\text{ }}100$$<\/p>\n<p>\n                    (8)\n                <\/p>\n<p>Average of (Ppm): To calculate the deviation from the desired rate in percentage, divide the sum of every Ppu for each tested VM by the respective number of tests, as shown in the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ9\" target=\"_blank\" rel=\"noopener\">9<\/a>)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Zahraddeen Yakubu, I. &amp; Murali, M. An efficient IoT-Fog-Cloud resource allocation framework based on Two-Stage approach. IEEE Access. 12, 75384&#x2013;75395. &#010;                  https:\/\/doi.org\/10.1109\/ACCESS.2024.3405581&#010;                  &#010;                 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#ref-CR37\" id=\"ref-link-section-d19980398e3050\" target=\"_blank\" rel=\"noopener\">37<\/a>.<\/p>\n<p>$$Average{\\text{ }}of{\\text{ }}\\left( {P{p_u}} \\right){\\text{ }}=\\frac{{\\mathop \\sum \\nolimits_{{i=1}}^{t} Ppu~\\left( {each~tested~VM~EquationNumber} \\right)}}{t}$$<\/p>\n<p>\n                    (9)\n                <\/p>\n<p>Finally, reliability is computed by the Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ10\" target=\"_blank\" rel=\"noopener\">10<\/a>)<\/p>\n<p>$$R{\\text{ }}=\\frac{{\\left| {{T_u}} \\right|}}{{\\left| T \\right|}}*{\\text{ }}100$$<\/p>\n<p>\n                    (10)\n                <\/p>\n<p>The Model readily provides the reserved VM as an alternative VM for the impacted task to ensure the successful execution of every task. Thereby, confirming maximum reliability in the event of more than 50% of faulty VMs.<\/p>\n<p>Pseudocode and algorithm<\/p>\n<p>The proposed pseudocode works by taking some assumptions and parameters listed as follows:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>Firstly, the system\u2019s input parameters, such as the task number, task size, number of VMs, processing speed, and the times at which each resource is available, i.e., ready time, were established.<\/p>\n<\/li>\n<li>\n<p>Initially, the ready time of the VM is determined as the total execution time.<\/p>\n<\/li>\n<li>\n<p>The selected tasks, early start time, and actual finish time with calculated time slots are initialized in the ARM matrix with the status flag.<\/p>\n<\/li>\n<li>\n<p>As per the heterogeneity considered in the model, i.e., low task and low machine heterogeneity, the task size ranges from 1\u00a0million instructions to 100\u00a0million instructions, and the machine heterogeneity ranges from 1 MIPS to 10 MIPS.<\/p>\n<\/li>\n<\/ul>\n<p>The steps of pseudocode followed by the proposed algorithm for generating the schedule are as follows:<\/p>\n<p>Proposed pseudocode for CRFTS<\/p>\n<p>Let T and V be the set of tasks and VMs, respectively.<\/p>\n<ol class=\"u-list-style-none\">\n<li>\n                      1.<\/p>\n<p>Sort tasks and VMs:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>T= {t1, t2, t3, \u2026tn} such that S(ti) in T is in ascending order.<\/p>\n<\/li>\n<li>\n<p>V= {v1, v2, v3, \u2026vm} such that Sp(vm) in V is in ascending order.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n                      2.<\/p>\n<p>Task Clustering:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>Calculate mid, low_mid, and high_mid for task set (T).<\/p>\n<\/li>\n<li>\n<p>Set the boundaries of three task clusters as shown in the algorithm below.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n                      3.<\/p>\n<p>VM Clustering:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>Calculate mid, low_mid, and high_mid for VM set (V).<\/p>\n<\/li>\n<li>\n<p>Set the boundaries of three VM clusters as shown in the algorithm below.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n                      4.<\/p>\n<p>Task to VM mapping:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>Let ti and vj denote the ith task and jth VM.<\/p>\n<\/li>\n<li>\n<p>Map ti in each cluster to the corresponding vk in the same cluster.<\/p>\n<p>Ti \u25ca vk<\/p>\n<\/li>\n<li>\n<p>Estimate ST and FT for each tI as in Eqs.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ1\" target=\"_blank\" rel=\"noopener\">1<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ2\" target=\"_blank\" rel=\"noopener\">2<\/a>), respectively.<\/p>\n<\/li>\n<li>\n<p>Calculate the AR slot as in Eq.\u00a0(<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ3\" target=\"_blank\" rel=\"noopener\">3<\/a>)<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<ol class=\"u-list-style-none\">\n<li>\n                      5.<\/p>\n<p>Detecting failed VMs and failed tasks.<\/p>\n<\/li>\n<li>\n                      6.<\/p>\n<p>Remove failed VMs from the list of VMs.<\/p>\n<\/li>\n<li>\n                      7.<\/p>\n<p>VM Reservation:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>If (fault occurs)<\/p>\n<p>Reserve VM for the corresponding AR slot.<\/p>\n<\/li>\n<li>\n<p>Update ARM with recent task information.<\/p>\n<\/li>\n<li>\n<p>Turn the Status of the corresponding task to 1, implying VM is reserved for the AR slot.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p>Repeat steps 4 and 5 for all tasks in T.<\/p>\n<p>After the task set is empty, check the parameters of the model as in Eqs.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ4\" target=\"_blank\" rel=\"noopener\">4<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ5\" target=\"_blank\" rel=\"noopener\">5<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ6\" target=\"_blank\" rel=\"noopener\">6<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ7\" target=\"_blank\" rel=\"noopener\">7<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ8\" target=\"_blank\" rel=\"noopener\">8<\/a>, <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ9\" target=\"_blank\" rel=\"noopener\">9<\/a>, and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"equation anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Equ10\" target=\"_blank\" rel=\"noopener\">10<\/a>.<\/p>\n<p>Proposed algorithm (Task allocation and fault tolerance)Phase 1: task clustering<\/p>\n<p>The set of tasks (T) is taken as input by the Task Clustering algorithm and separates T into three different clusters as presented in Algorithm <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Figa\" target=\"_blank\" rel=\"noopener\">1<\/a> below:<\/p>\n<p><b id=\"Figa\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Algorithm 1<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/a\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Figa\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Figa_HTML.png\" alt=\"figure a\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a>Phase 2: VM clustering<\/p>\n<p>The set of VMs (V) is taken as input by the VM Clustering algorithm and separates V into three different clusters as presented in Algorithm <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Figb\" target=\"_blank\" rel=\"noopener\">2<\/a> below:<\/p>\n<p><b id=\"Figb\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Algorithm 2<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/b\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Figb\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Figb_HTML.png\" alt=\"figure b\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a>Phase 3: task to VM mapping<\/p>\n<p>The algorithm maps the tasks to the suitable VMs in the corresponding task and VM cluster. The algorithm works in two steps. In step 1, fresh VMs are allocated, and in step 2, the VMs with the minimum ready time are allocated to the next tasks. Algorithm <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Figc\" target=\"_blank\" rel=\"noopener\">3<\/a> is presented below:<\/p>\n<p><b id=\"Figc\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Algorithm 3<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/c\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Figc\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Figc_HTML.png\" alt=\"figure c\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a><\/p>\n<p>Task to VM mapping without reservation.<\/p>\n<p>Note 1: The Initial task to the VM mapping algorithm maps tasks to VMs without reservation. The algorithm considers all available VMs for allocation created in clustering.<\/p>\n<p>Phase 4: failure modelling and detection via heartbeat mechanism<\/p>\n<p>The set of VMs is taken, and this algorithm tracks every VM for its breaths to detect faulty VMs and is presented in Algorithm <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Figd\" target=\"_blank\" rel=\"noopener\">4<\/a> below:<\/p>\n<p><b id=\"Figd\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Algorithm 4<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/d\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Figd\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Figd_HTML.png\" alt=\"figure d\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a>Phase 5: fault tolerance via reservation<\/p>\n<p>The model uses this algorithm in case of a fault in any of the VMs. If any VM fails, the fault tolerance algorithm provides an alternative VM for the affected task based on the reservation. The model uses a structured data structure known as ARM (Advanced Reservation Matrix), used to keep track of the advance reservation task status in the fault tolerance phase. Each row of ARM indicates a particular task to VM reservation and includes the following:<\/p>\n<p>T_ID(\\(\\:{t}_{i}\\)): Unique identifier of the task<\/p>\n<p>VM_ID \\(\\:{(v}_{f})\\): The initially mapped failed VM<\/p>\n<p>VM_ID \\(\\:{(v}_{j}\\)): The reserved VM.<\/p>\n<p>AR_Slot: The advance reservation slot<\/p>\n<p>Status: A flag signifying the state of the reservation<\/p>\n<p>This algorithm is shown below:<\/p>\n<p><b id=\"Fige\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Algorithm 5<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/e\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fige\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Fige_HTML.png\" alt=\"figure e\" loading=\"lazy\" width=\"685\" height=\"335\"\/><\/a><\/p>\n<p>Mapping and fault tolerance algorithm using reservation.<\/p>\n<p>Note 2: When a failed VM and corresponding failed tasks are detected by Algorithm 3, CRFTS reserves the VM with the least ready time entirely for the failed task. This VM is then not considered for general task allocations until the failed task is executed completely.<\/p>\n<p>Figure\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-17609-7#Fig3\" target=\"_blank\" rel=\"noopener\">3<\/a> presents the proposed CRFTS algorithm\u2019s flowchart to aid in understanding the model more clearly. The flowchart includes a detailed representation of the entire procedure.<\/p>\n<p><b id=\"Fig3\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 3<\/b><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-17609-7\/figures\/3\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig3\" src=\"https:\/\/www.europesays.com\/uk\/wp-content\/uploads\/2025\/09\/41598_2025_17609_Fig3_HTML.png\" alt=\"figure 3\" loading=\"lazy\" width=\"685\" height=\"743\"\/><\/a>Computational time complexity of CRFTS<\/p>\n<p>CRFTS involves the following Steps:<\/p>\n<ol class=\"u-list-style-none\">\n<li>\n                    1.<\/p>\n<p>Initialization and Input Parsing<\/p>\n<p>This typically consists of getting the task and VM information. The complexity is O(1) for each input.<\/p>\n<p>Task to VM Mapping<\/p>\n<\/li>\n<li>\n                    2.<\/p>\n<p>Sorting Tasks or VMs:<\/p>\n<p>The algorithm involves sorting tasks or VMs based on Task size and VM Speed. Sorting can be done with the complexity of O(nlogn) and O(klogk), respectively, where n and k are the number of tasks and number of VMs, respectively.<\/p>\n<\/li>\n<li>\n                    3.<\/p>\n<p>Clustering:<\/p>\n<p>Task and VM clusters are obtained with the complexity of O(n) and O(k), respectively. Therefore, the overall complexity of the clustering phase will be O(n+k).<\/p>\n<\/li>\n<li>\n                    4.<\/p>\n<p>Allocating Tasks to VMs within Clusters:<\/p>\n<p>We need to analyze the phases involved in the allocation process. Mapping each task to the VM in the corresponding VM cluster, the complexity can be computed as follows:<\/p>\n<p>Reading the task and VM data:<\/p>\n<p>The complexity will be O(1)<\/p>\n<p>Iterating through task and VM clusters:<\/p>\n<p>We need to iterate through each task cluster to assign tasks to VMs within the corresponding VM cluster. Since there are 3 clusters, let each cluster contain a maximum of m\u200b tasks and v VMs, respectively. The complexity of this phase will be O(m+v).<\/p>\n<p>Allocating Tasks to VMs within Clusters:<\/p>\n<p>For each task in a cluster, we allocate it to a VM in the corresponding VM cluster. This typically involves:<\/p>\n<ul class=\"u-list-style-bullet\">\n<li>\n<p>Iterating through each task in a task cluster: O(m).<\/p>\n<\/li>\n<li>\n<p>Iterating through each VM in VM cluster: O(v).<\/p>\n<\/li>\n<li>\n<p>Since allocation is performed on two conditions. Therefore, as per the algorithm, the complexity of allocation will be O(1) for the first condition. O(v) for the second condition.<\/p>\n<\/li>\n<li>\n<p>The combined complexity will be computed as: O(m) + O(v)+ O(1) +O (v) = O(m+v).<\/p>\n<\/li>\n<\/ul>\n<p>Fault Tolerance Algorithm<\/p>\n<p>The fault tolerance mechanism involves the following steps:<\/p>\n<\/li>\n<li>\n                    5.<\/p>\n<p>Detection of faulty VMs:<\/p>\n<p>Checking the status of each VM to detect faults. This step involves iterating through all VMs and can be done in O(k).<\/p>\n<\/li>\n<li>\n                    6.<\/p>\n<p>Redistributing tasks from faulty VMs to healthy VMs:<\/p>\n<p>For f faulty VMs with a maximum of t tasks, the complexity will be O(f\u22c5t). In the worst case, f\u22c5t will be equal to n, so the complexity will be O(n).<\/p>\n<\/li>\n<li>\n                    7.<\/p>\n<p>Updating execution times:<\/p>\n<p>Updating the total execution times for VMs. The worst-case complexity will be O(f.t)= O(n).<\/p>\n<\/li>\n<\/ol>\n<p>Combined Total Complexity<\/p>\n<p>Combining these complexities, the overall complexity of CRFTS is dominated by the sorting phase. Thus, the overall time complexity can be expressed as:<\/p>\n<p>O(1) + O(nlogn) + O(klogk) + O(n+k) + O(1) + O(m+v) + O(m) + O(v) + O(1) + O(v) + O(m+v) + O(k) + O(n) + O(n).<\/p>\n<p>=(nlogn) + O(klogk)<\/p>\n<p>Since n will always be greater than k, the final complexity of CRFTS will be O(nlogn).<\/p>\n","protected":false},"excerpt":{"rendered":"This section explains the proposed CRFTS model, which uses a clustering approach for allocation and the advance reservation&hellip;\n","protected":false},"author":2,"featured_media":389877,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3164],"tags":[135084,3284,51200,8173,130748,3965,7462,3966,135087,135085,135086,59813,70,53,16,15],"class_list":{"0":"post-389876","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-computing","8":"tag-clustering","9":"tag-computing","10":"tag-energy-science-and-technology","11":"tag-engineering","12":"tag-fault-tolerance","13":"tag-humanities-and-social-sciences","14":"tag-mathematics-and-computing","15":"tag-multidisciplinary","16":"tag-qos-parameters","17":"tag-reliability","18":"tag-resource-reservation","19":"tag-scheduling","20":"tag-science","21":"tag-technology","22":"tag-uk","23":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/115130452757691033","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/389876","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=389876"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/389876\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/389877"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=389876"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=389876"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=389876"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}