Publications

Comet: An active distributed key/value store

Roxana Geambasu, Amit Levy, Tadayoshi Kohno, Arvind Krishnamurthy, Henry M. Levy

Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI), October 2010

Stable Deterministic Multithreading through Schedule Memoization

Heming Cui, Jingyue Wu, Chia-Che Tsai, Junfeng Yang

Proceedings of the Ninth Symposium on Operating Systems Design and Implementation (OSDI '10), October 2010

Scalable and Systematic Detection of Buggy Inconsistencies in Source Code

Mark Gabel, Junfeng Yang, Yuan Yu, Moises Goldszmidt, Zhendong Su

Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA '10), October 2010

Bypassing Races in Live Applications with Execution Filters

Jingyue Wu, Heming Cui, Junfeng Yang

Proceedings of the Ninth Symposium on Operating Systems Design and Implementation (OSDI '10), October 2010

CPU Scheduling with Automatic Interactivity and Dependency Detection

Haoqiang Zheng

Ph.D. Thesis, Department of Computer Science, Columbia University, August 2010

Recent trends in virtualization and server consolidation have expanded the number of applications with different resource requirements and quality-of-service demands being run on the same system. Users expect computers not only to run these different applications, but to be able to run them all at the same time. A key challenge is how to ensure that the system provides acceptable interactive responsiveness to users while multiplexing resources among a diverse collection of applications. However, identifying interactive processes and scheduling them promptly are not easy because the latency sensitivity of a process in modern computer systems may vary dynamically based on the user request it is processing and the user requests that depend on this process directly and indirectly. Most commodity operating systems either allow users to specify the latency sensitivity of a process or try to detect interactive latency-sensitive processes based on processor resource usage and sleeping behavior. These are generally ineffective.

This dissertation introduces RSIO. It detects latency-sensitive processes by monitoring accesses to I/O channels and inferring when user interactions occur. RSIO provides a general mechanism for all user interactions, including direct interactions via local HCI devices such as mouse and keyboard, indirect interactions through middleware, and remote interactions through networks. It automatically and dynamically identifies processes involved in a user interaction and boosts their priorities at the time the interaction occurs to improve system response time. RSIO detects processes that directly handle a user interaction as well as those indirectly involved in processing the interaction, automatically accounting for dependencies and boosting their priorities accordingly. RSIO works with existing schedulers, processes that may mix interactive and batch activities, and requires no application modifications to identify periods of latency-sensitive application activity.

Even when a process is detected as latency-sensitive and its priority is boosted, the process may still not be scheduled promptly because of a problem known as priority inversion, which happens when a high priority process blocks waiting for the response from a low priority process. Without knowing the dependency among the processes, the CPU scheduler may schedule a medium priority process to run, and thus effectively delay the execution of the high priority process. We have developed SWAP to address the priority inversion problems caused by inter-process dependencies. SWAP can automatically determine possible resource dependencies among processes based on process system call history. Because some dependencies cannot be precisely determined, SWAP associates confidence levels with dependency information that are dynamically adjusted using feedback from process blocking behavior. SWAP can schedule processes using this imprecise dependency information in a manner that is compatible with existing scheduling mechanisms and ensures that actual scheduling behavior corresponds to the desired scheduling policy in the presence of process dependencies. Our results show that SWAP can provide substantial improvements in system performance in scheduling processes with dependencies.

As CPU schedulers are complicated to develop and increasingly important with the introduction of multi-core systems, we also introduce WARP, which is a new scheduler development and evaluation platform which facilitated our solutions. WARP is a trace-driven virtualized CPU scheduler execution environment that can dramatically simplify and speed the development and evaluation of CPU schedulers, including SWAP and RSIO. It is easy to use as it can run unmodified kernel scheduling code in user-space and can be used with standard user-space debugging and performance monitoring tools. We have implemented a WARP Linux prototype. Our results show that it can use application traces captured from its toolkit to accurately reflect the scheduling behavior of the real Linux operating system. Executing an application trace using WARP can be two orders of magnitude faster than running real applications.
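The priority inversion scenario described in the abstract above can be illustrated with a toy model. This is only a sketch of the general problem and of classic priority propagation across known dependencies; it is not SWAP's algorithm, which infers dependencies from system call history and attaches confidence levels to them. The process table layout, the function names, and the numeric priorities are all assumptions made for illustration.

```python
def pick_naive(procs):
    """Pick the highest-priority runnable process, ignoring dependencies."""
    runnable = [name for name, p in procs.items() if p["waits_on"] is None]
    return max(runnable, key=lambda name: procs[name]["prio"])


def pick_dependency_aware(procs):
    """Let each process inherit the priority of anything blocked on it."""
    effective = {name: p["prio"] for name, p in procs.items()}
    for name, p in procs.items():
        # Walk the chain of waits_on links, raising priorities along the way.
        target, seen = p["waits_on"], {name}
        while target is not None and target not in seen:
            effective[target] = max(effective[target], p["prio"])
            seen.add(target)
            target = procs[target]["waits_on"]
    runnable = [name for name, p in procs.items() if p["waits_on"] is None]
    return max(runnable, key=lambda name: effective[name])


# Hypothetical process table: "high" is blocked waiting on "low".
procs = {
    "high": {"prio": 3, "waits_on": "low"},
    "medium": {"prio": 2, "waits_on": None},
    "low": {"prio": 1, "waits_on": None},
}
naive_choice = pick_naive(procs)
aware_choice = pick_dependency_aware(procs)
```

In this example the naive picker chooses "medium", which delays "high" indirectly, while the dependency-aware picker chooses "low" so that "high" can be unblocked sooner.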

Linux-CR: Transparent Application Checkpoint-Restart in Linux

Oren Laadan, Serge E. Hallyn

Proceedings of the 12th Annual Linux Symposium, July 2010

KVM for ARM

Christoffer Dall, Jason Nieh

Proceedings of the 12th Annual Linux Symposium, July 2010

Guaranteeing Performance through Fairness in Peer-to-Peer File-Sharing and Streaming Systems

Alex Sherman

Ph.D. Thesis, Department of Computer Science, Columbia University, July 2010

Abstract

Over the past decade, Peer-to-Peer (P2P) file-sharing and streaming systems have evolved as a cheap and effective technology in distributing content to users. Guaranteeing a level of performance in P2P systems is, therefore, of utmost importance. However, P2P file-sharing and streaming applications suffer from a fundamental problem of unfairness, where many users have a tendency to free-ride by contributing little or no upload bandwidth while consuming much download bandwidth. By taking away an unfair share of resources, free-riders deteriorate the quality of service experienced by other users, by causing slower download times in P2P file-sharing networks and higher stream updates' miss rates in P2P streaming networks. Previous attempts at addressing fair bandwidth allocation in P2P, such as BitTorrent-like systems, suffer from slow peer discovery, inaccurate predictions of neighboring peers' bandwidth allocations, under-utilization of bandwidth, and complex parameter tuning.

We present FairTorrent, a new deficit-based distributed algorithm that accurately rewards peers in accordance with their contribution in a file-sharing P2P system. In a nutshell, a FairTorrent peer uploads the next data block to a peer to whom it owes the most data. FairTorrent is resilient to exploitation by free-riders and strategic peers, is simple to implement, requires no bandwidth over-allocation, no prediction of peers' rates, no centralized control, and no parameter tuning. We implemented FairTorrent in a BitTorrent client without modifications to the BitTorrent protocol, and evaluated its performance against other widely-used BitTorrent clients using various scenarios including live BitTorrent swarms. Our results show that FairTorrent provides up to two orders of magnitude better fairness, up to five times better download performance for contributing peers, and 60-100% better performance on average in live BitTorrent swarms. We show analytically that for a number of upload capacity distributions, in an n-node FairTorrent network, no peer is ever owed more than O(log n) data blocks with high probability.

Achieving fair bandwidth allocation in a P2P streaming scenario is even more difficult, as it comes with an additional constraint: each stream update must be received before its playback deadline. P2P live streaming systems require global resource over-provisioning to deliver adequate streaming performance. When there is not enough bandwidth to accommodate all users for a particular stream, such as due to free-riders or low-contributing peers, all users, including high-contributing peers, observe poor performance.

We present FairStream, a new P2P streaming system that delivers a good quality stream to peers that upload data at a rate above the stream rate, even in the presence of free-riders or malicious users. FairStream achieves this with three mechanisms. First, it provides a new peer reply policy framework that enables file sharing incentive mechanisms to be adapted for streaming. Second, it uses this framework to incorporate a deficit-based peer reply policy that enables each peer to reply first to the neighbor to whom it owes the most data as measured by a deficit counter. Third, it introduces a collusion-resistant mechanism to ensure effective data distribution of a stream despite a large fraction of free-riders who do not forward received data. We prove that FairStream is resilient to free-riders and rewards peers with streaming performance correlated with their contributions. We have also implemented FairStream as a BitTorrent client and evaluated its performance against other popular streaming systems. Our results on PlanetLab show that FairStream, similar to other systems, provides good quality streaming performance when resources are over-provisioned, but it also provides orders of magnitude better streaming performance for peers uploading above the stream rate when resources are constrained, in the presence of free-riders and low-contributing peers.
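The deficit rule stated in the abstract above (a peer uploads the next data block to the peer to whom it owes the most data) can be sketched as follows. This is a minimal illustration under stated assumptions, not FairTorrent's actual implementation: the class name, method names, sign convention for the deficit counter, and the alphabetical tie-break are all hypothetical.

```python
from collections import defaultdict


class DeficitPeer:
    """Toy model of one peer's deficit-based upload choice (hypothetical API)."""

    def __init__(self):
        # deficit[p] = blocks we sent to p minus blocks we received from p.
        # A negative value means we owe p data (assumed sign convention).
        self.deficit = defaultdict(int)

    def on_block_received(self, sender):
        self.deficit[sender] -= 1

    def on_block_sent(self, receiver):
        self.deficit[receiver] += 1

    def next_upload_target(self, interested_peers):
        """Return the interested peer we owe the most, or None if there is none."""
        if not interested_peers:
            return None
        # Lowest deficit means most owed; ties are broken by name (an assumption).
        return min(interested_peers, key=lambda p: (self.deficit[p], p))


peer = DeficitPeer()
for _ in range(3):
    peer.on_block_received("alice")
peer.on_block_received("bob")

# alice is owed 3 blocks, bob 1, carol 0, so alice is served first.
target = peer.next_upload_target(["alice", "bob", "carol"])
peer.on_block_sent(target)
```

A peer that never uploads never drives another peer's counter negative, so under this rule it is only served when no contributing neighbor is owed data, which is the intuition behind the free-rider resilience claimed in the abstract.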

Publications from 2010

Apiary: Easy-to-Use Desktop Application Fault Containment on Commodity Operating Systems

RSIO: Automatic User Interaction Detection and Scheduling
