An Evaluation of the Graceful Degradation Properties of Real-Time Schedulers

Michael J. Marucheck and Jay K. Strosnider
Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213

Abstract

Real-time, fault-tolerant systems require schedulers that provide graceful degradation during transient overloads resulting from fault-recovery workloads or other system uncertainties. We initially hypothesize that a scheduler ideally suited to this environment should dispatch tasks using only response-time criteria as long as all deadlines can be met, and that in the presence of overload the best a scheduler can do is semantic-driven load shedding. By temporarily eliminating less important tasks, the more intelligent scheduling algorithms were expected to service the more important tasks well without harming the non-critical tasks unnecessarily. Experimentally, we then show this hypothesis to be false. On-line tracking of system load does not provide enough information about future load to effectively trigger semantic-driven load shedding. Simpler algorithms can ensure predictable behavior, but sacrifice fault-free schedulability to do so. We show that a compromise can be reached by disallowing non-critical task fault recoveries. We propose a taxonomy for real-time fault-tolerance schedulers and show the trade-offs that arise from their structural differences. To complete the taxonomy, we extend the Myopic Slack Manager algorithm to support semantic priorities. A complete set of schedulers from this taxonomy is implemented on a real system, and their performance is measured quantitatively using synthetic task sets.

1 Introduction

Real-time scheduling algorithms differ from other scheduling algorithms in that they focus on predictability and worst-case performance guarantees. Getting the best average performance out of a system is not the highest concern; instead, the scheduler's goal is to ensure that each task always makes its deadline. In 1973, Liu and Layland [Liu, 1973] showed that under certain assumptions the Rate Monotonic (RM) and Earliest Deadline First (EDF) scheduling algorithms are optimal for static- and dynamic-priority systems, respectively. RM and EDF are optimal in the sense that if any static-/dynamic-priority scheduling algorithm can ensure that no task will miss a deadline, then RM/EDF can make the same assurance. Liu and Layland derived equations for testing the schedulability of any given task set in terms of the sum of the worst-case utilizations of the tasks. For RM, the sufficient condition is
$U_n \leq n\left(2^{1/n} - 1\right)$

(approximately 0.69 for large n). For EDF, the necessary and sufficient condition is $U_n \leq 1$. Lehoczky, Sha, and Ding [Lehoczky, 1989] later developed a necessary and sufficient condition for fixed-priority algorithms and found that for randomly generated task sets the expected maximum schedulable utilization was 88%. EDF enables 100% system utilization under Liu and Layland's assumptions. While RM cannot usually schedule a 100% load, it enjoys greater predictability and slightly smaller overhead. During transient overload conditions, RM guarantees that tasks will fail in a certain order, offering a level of predictability unavailable under EDF. Because of their optimality and relative simplicity, most new algorithms use one or both of these algorithms as a foundation. Liu and Layland's schedulability tests were developed under the assumption of zero overhead and perfect preemption. In practice, the overheads involved in on-line scheduling generally necessitate more sophisticated tests to ensure schedulability. One method for handling this, proposed in [Katcher, 1993], involves the use of scheduling models. These models provide an accounting system for all the various overheads and blocking time the system experiences. They can provide highly accurate necessary and sufficient conditions for schedulability even in the presence of overload and blocking. However, the ultimate test for schedulability requires observation of the running system.
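To make these tests concrete, the sketch below checks the Liu and Layland sufficient bound for RM, the EDF utilization bound, and an exact fixed-priority response-time test. The optional per-preemption overhead parameter is only a crude stand-in for the kind of overhead accounting done by the scheduling models of [Katcher, 1993]; the task set and all parameter values are illustrative assumptions, not values from this paper's experiments.

```python
# Sketch of the schedulability tests discussed above (illustrative only).
from math import ceil

def ll_rm_bound(tasks):
    """Liu-Layland sufficient test for RM: U_n <= n(2^(1/n) - 1)."""
    n = len(tasks)
    util = sum(c / t for c, t in tasks)
    return util <= n * (2 ** (1.0 / n) - 1)

def edf_bound(tasks):
    """Necessary and sufficient EDF test (deadlines equal to periods): U_n <= 1."""
    return sum(c / t for c, t in tasks) <= 1.0

def rm_response_time_test(tasks, overhead=0.0):
    """Exact fixed-priority test via response-time analysis.
    tasks is a list of (C, T) pairs with deadline assumed equal to period;
    overhead charges a crude per-preemption cost for each higher-priority release."""
    ordered = sorted(tasks, key=lambda ct: ct[1])   # RM: shorter period = higher priority
    for i, (c_i, t_i) in enumerate(ordered):
        r = c_i
        while True:
            interference = sum(ceil(r / t_j) * (c_j + overhead)
                               for c_j, t_j in ordered[:i])
            r_next = c_i + interference
            if r_next == r:       # fixed point: worst-case response time found
                break
            if r_next > t_i:      # deadline (= period) would be missed
                return False
            r = r_next
    return True

# Hypothetical task set: (worst-case execution time, period), utilization ~0.88
tasks = [(1, 4), (2, 6), (3, 10)]
print(ll_rm_bound(tasks))            # False: exceeds the sufficient bound (~0.78)
print(rm_response_time_test(tasks))  # True: the exact test still admits the set
print(edf_bound(tasks))              # True: utilization below 1
```

This example also illustrates the gap noted above between the sufficient utilization bound and exact fixed-priority analysis: a task set can fail the former and still be schedulable under RM.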
2 Overload Scheduling

Transient overload is defined as temporarily having so much work that some deadline(s) must be missed. This situation can arise from a number of sources, including software redundancy in fault-tolerant systems. As discussed in [Thuel, 1993], transient fault tolerance can be achieved through redundant hardware, redundant software, or a combination of both. [Randell, 1975] describes a method of using redundant software for fault recovery called the Recovery Block scheme. This scheme requires multiple versions of a software module and a single acceptance test. When a module finishes, its results are verified using the acceptance test. If the results are found to be faulty, another module is tried. If software redundancy is to be used in a real-time system, the designer must find a suitable balance between fault-free real-time schedulability and fault tolerance. Because transient faults can occur stochastically, the amount of time needed to recover from such a fault or sequence of faults varies.
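As an illustration of the Recovery Block structure described above, the following minimal sketch runs alternate versions of a module in order and keeps the first result that passes the single acceptance test. The module names and the acceptance test are hypothetical examples, not taken from [Randell, 1975].

```python
# Minimal sketch of the Recovery Block scheme: try each alternate module
# in turn and keep the first result that passes the acceptance test.

def recovery_block(alternates, acceptance_test):
    """Run alternates in order; return the first acceptable result."""
    for module in alternates:
        try:
            result = module()
        except Exception:
            continue            # a crashed alternate is treated as a fault
        if acceptance_test(result):
            return result       # result verified by the acceptance test
    raise RuntimeError("all alternates failed the acceptance test")

# Hypothetical example: a primary version and a simpler backup version.
def primary():
    return sum(x * x for x in range(10))   # full computation

def backup():
    return 285                              # precomputed fallback value

result = recovery_block([primary, backup],
                        acceptance_test=lambda r: 0 <= r < 1000)
print(result)
```

In a real-time system, each alternate executed this way consumes additional processor time that must be budgeted in the schedulability analysis, which is exactly the trade-off between fault-free schedulability and fault tolerance noted above.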