Blogroll

Predicting Post-Operative Visual Acuity for LASIK Surgeries

Microsoft Research Publications - Fri, 04/01/2016 - 09:00
LASIK (Laser-Assisted in SItu Keratomileusis) surgeries have been quite popular for treatment of myopia (nearsightedness), hyperopia (farsightedness) and astigmatism over the past two decades. In the past decade, over 10 million LASIK procedures had been performed in the United States alone with an average cost of approximately $2000 USD per surgery. While 99% of such surgeries are successful, the commonest side effect is a residual refractive error and poor uncorrected visual acuity (UCVA). In this work, we aim at predicting the UCVA post LASIK surgery. We model the task as a regression problem and use the patient demography and pre-operative examination details as features. To the best of our knowledge, this is the first work to systematically explore this critical problem using machine learning methods. Further, LASIK surgery settings are often determined by practitioners using manually designed rules. We explore the possibility of determining such settings automatically to optimize for the best post-operative UCVA by including such settings as features in our regression model. Our experiments on a dataset of 791 surgeries provides an RMSE (root mean square error) of 0.102, 0.094 and 0.074 for the predicted post-operative UCVA after one day, one week and one month of the surgery respectively.
Categories: Microsoft

High-Density Image Storage Using Approximate Memory Cells

Microsoft Research Publications - Fri, 04/01/2016 - 09:00
This paper proposes tailoring image encoding for an approximate storage substrate, developing an approximation-aware encoding algorithm. We develop a methodology to determine relative importance of encoded bits and store them in an approximate storage substrate that we tune to match their error tolerance. We present a case study with the progressive transform codec (PTC), a precursor to JPEG XR, and a phase-change memory (PCM) storage substrate that is optimized to minimize errors via biasing and tuned via selective error correction to different error rate levels. This enables effective use of approximate storage for images, resulting in over 2.7x increase in density of pixels per silicon volume that is additive to PTC storage savings.
Categories: Microsoft

A DNA-Based Archival Storage System

Microsoft Research Publications - Fri, 04/01/2016 - 09:00
Demand for data storage is growing exponentially, but the capacity of existing storage media is not keeping up. Using DNA to archive data is an attractive possibility because it is extremely dense, with a raw limit of 1 exabyte per cubic millimeter, and long-lasting, with observed half-life of over 500 years. This paper presents an architecture for a DNA-backed archival storage system. It is structured as a key-value store, and leverages common biochemical techniques to provide random access. We also propose a new encoding scheme that offers controllable redundancy, trading off reliability for density.
Categories: Microsoft

Where Can I Buy a Boulder? Searching for Offline Retail Locations

Microsoft Research Publications - Fri, 04/01/2016 - 09:00
People commonly need to purchase things in person, from large garden supplies to home decor. Although modern search systems are very effective at finding online products, little research attention has been paid to helping users find places that sell a specific product offline. For instance, users searching for an apron are not typically directed to a nearby kitchen store by a standard search engine. In this paper, we investigate "where can I buy"-style queries related to in-person purchases of products and services. Answering these queries is challenging since little is known about the range of products sold in many stores, especially those which are smaller in size. To better understand this class of queries, we first present an in-depth analysis of typical offline purchase needs as observed by a major search engine, producing an ontology of such needs. We then propose ranking features for this new problem, and learn a ranking function that returns stores most likely to sell a queried item or service, even if there is very little information available online about some of the stores. Our final contribution is a new evaluation framework that combines distance with store relevance in measuring the effectiveness of such a search system. We evaluate our method using this approach and show that it outperforms a modern web search engine.
Categories: Microsoft

Exploring limits to prediction in complex social systems: Predicting cascade size on Twitter

Microsoft Research Publications - Fri, 04/01/2016 - 09:00
How predictable is success in complex social systems? In spite of a recent profusion of prediction studies that exploit online social and information network data, this question remains unanswered, in part because it has not been adequately specified. In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand; and inherent unpredictability of complex social systems on the other. We then use this model to motivate an illustrative empirical study of information cascade size prediction on Twitter. Despite an unprecedented volume of information about users, content, and past performance, our best performing models can explain less than half of the variance in cascade sizes. In turn, this result suggests that even with unlimited data predictive performance would be bounded well below deterministic accuracy. Finally, we explore this potential bound theoretically using simulations of a diffusion process on a random scale free network similar to Twitter. We show that although higher predictive power is possible in theory, such performance requires a homogeneous system and perfect ex-ante knowledge of it: even a small degree of uncertainty in estimating product quality or slight variation in quality across products leads to substantially more restrictive bounds on predictability. We conclude that realistic bounds on predictive accuracy are not dissimilar from those we have obtained empirically, and that such bounds for other complex social systems for which data is more difficult to obtain are likely even lower.
Categories: Microsoft

Toward Full Elasticity in Distributed Static Analysis

Microsoft Research Publications - Wed, 03/23/2016 - 09:00
In this paper we present the design and implementation of an elastic static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. This provides a degree of elasticity for CPU, memory and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented. We experimentally validate the analysis using a combination of both synthetic and real benchmarks. The results show that our analysis scales well in terms of memory pressure independently of the input size, as we add more machines. Despite using stock hardware and incurring a non-trivial communication overhead, our processing time for projects of close to 1M LOC is about 15 minutes. As the number of machines increases, we show that the analysis time does not suffer. Lastly, we demonstrate that querying the results can be performed with a median latency of well under 20 ms.
Categories: Microsoft

XFabric: A Reconfigurable In-Rack Network for Rack-Scale Computers

Microsoft Research Publications - Wed, 03/16/2016 - 09:00
Rack-scale computers are dense clusters with hundreds of micro-servers per rack. Designed for data center workloads, they can have significant power, cost and performance benefits over current racks. The rack network can be distributed, with small packet switches embedded on each processor as part of a system-on-chip (SoC) design. Ingress/egress traffic is forwarded by SoCs that have direct uplinks to the data center. Such fabrics are not fully provisioned and the chosen topology and uplink placement impacts performance for different workloads. XFabric is a rack-scale network that reconfigures the topology and uplink placement using a circuit-switched physical layer over which SoCs perform packet switching. To satisfy tight power and space requirements in the rack, XFabric does not use a single large circuit switch, instead relying on a set of independent smaller circuit switches. This introduces partial reconfigurability, as some ports in the rack cannot be connected by a circuit. XFabric optimizes the physical topology and manages uplinks, efficiently coping with partial reconfigurability. It significantly outperforms static topologies and has a performance similar to fully reconfigurable fabrics. We demonstrate the benefits of XFabric using flow-based simulations and a prototype built with electrical crosspoint switch ASICs.
Categories: Microsoft

Video Cube

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
VideoCube allows one to load an AVI movie file as a volume, and play back the movie sampling space and time in different ways. It also provides a single cutting plane for interactively viewing single spacetime slices of the video.
Categories: Microsoft

StereoMatcher

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
StereoMatcher is an implementation of some commonly used two-frame stereo matching algorithms. It also contains code to evaluate the quality of a computed depth map relative to a ground truth image.
Categories: Microsoft

Pan - Source

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
Pan is an experimental embedded language and compiler for image synthesis and manipulation, based on principles from functional programming.
Categories: Microsoft

Pan - Components

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
Pan is an experimental embedded language and compiler for image synthesis and manipulation, based on principles from functional programming.
Categories: Microsoft

Mppt

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
Mppt is an add-in for PowerPoint that allows a presenter to multicast PowerPoint slides, including animations and effects, to a group of viewers.
Categories: Microsoft

Mping

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
Mping is a simple command line application that sends and receives multicast packets.
Categories: Microsoft

JCLUSTER

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
JCLUSTER is a fast simple clustering program that produces hierarchical, binary branching, tree structured clusters.
Categories: Microsoft

Functional Reactive Animation

Microsoft Research Downloads - Wed, 03/16/2016 - 09:00
Fran is a Haskell library (or "embedded language") for interactive animations with 2D and 3D graphics and sound.
Categories: Microsoft

SizeCap: Coordinating Energy Storage Sizing and Power Capping for Fuel Cell Powered Data Centers

Microsoft Research Publications - Sat, 03/12/2016 - 10:00
Fuel cells are a promising power source for future data centers, offering high energy efficiency, low greenhouse gas emissions, and high reliability. However, due to mechanical limitations related to fuel delivery, fuel cells are slow to adjust to sudden increases in data center power demands, which can result in temporary power shortfalls. To mitigate the impact of power shortfalls, prior work has proposed to either perform power capping by throttling the servers, or by leveraging energy storage devices (ESDs) that can temporarily provide enough power to make up for the shortfall while the fuel cells ramp up power generation. Both approaches have disadvantages: power capping conservatively limits server performance and can lead to service level agreement (SLA) violations, while ESD-only solutions must significantly overprovision energy storage capacity to tolerate the shortfalls caused by worst-case (i.e., largest) power surges, which greatly increases the total cost of ownership (TCO). We propose SizeCap, the first ESD sizing framework for fuel cell powered data centers, which coordinates ESD sizing with power capping to allow data centers to employ a cost-effective solution to power shortfalls. SizeCap sizes the ESD just large enough to cover the majority of power surges, but not the worst-case surges that occur infrequently, to greatly reduce TCO. It then uses the smaller capacity ESD in conjunction with power capping to cover the power shortfalls caused by worst-case power surges. As part of our new flexible framework, we propose multiple power capping policies with different degrees of awareness of fuel cell and workload behavior, and evaluate their impact on workload performance and ESD size. Using traces from production data center systems, we demonstrate that SizeCap significantly reduces the ESD size (by 85% for a workload with infrequent yet large power surges, and by 50% for a workload with frequent power surges) without violating any SLAs.
Categories: Microsoft
Syndicate content

eXTReMe Tracker