Microsoft

Modeling Signals Embedded in a Euclidean Domain

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Graphs are often used to model signals defined on a set of points embedded in a Euclidean domain. Examples are distributed sensor readings, measures of congestion in a transportation network, samples in a feature space, and colors on 3D point clouds. However, it may be better to model such signals as samples of a Gaussian Process defined on the Euclidean domain. We show, on a 3D point cloud example, that Karhunen-Loève Transforms (KLTs) based on Gaussian Process models can have significantly higher energy compaction and coding gain than KLTs based on sparse graph models. The latter KLTs are known as Graph Transforms; we call the former Gaussian Process Transforms.
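
To make the coding-gain comparison concrete, here is a minimal sketch of how the high-rate coding gain of a KLT could be computed from a Gaussian Process covariance. It relies on the standard identity that the gain is the arithmetic mean of the eigenvalues divided by their geometric mean, which equals (trace / n) / det^(1/n), so no eigendecomposition is needed. The exponential kernel, the `length_scale` parameter, and all function names are assumptions for illustration; the abstract does not state which covariance model the authors use.

```python
import math

def exp_kernel_cov(points, length_scale=1.0, variance=1.0):
    """Assumed GP covariance: K[i][j] = variance * exp(-||p_i - p_j|| / length_scale)."""
    return [[variance * math.exp(-math.dist(p, q) / length_scale) for q in points]
            for p in points]

def cholesky(a):
    """Return lower-triangular L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def klt_coding_gain(cov):
    """Arithmetic mean of the eigenvalues over their geometric mean.

    trace(cov) is the sum of the eigenvalues and det(cov) is their product,
    so the gain is (trace / n) / det ** (1 / n).
    """
    n = len(cov)
    mean_variance = sum(cov[i][i] for i in range(n)) / n
    L = cholesky(cov)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return mean_variance / math.exp(log_det / n)

# Four hypothetical 3D points standing in for a tiny point cloud.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
gain = klt_coding_gain(exp_kernel_cov(points, length_scale=2.0))
```

A gain of 1 means the transform offers no advantage over coding the samples directly (uncorrelated, equal-variance signals); stronger correlation between nearby points pushes the gain above 1. The same function could be applied to a covariance derived from a sparse graph model to compare the two, under whatever graph construction one assumes.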

Categories: Microsoft

Functions of Code-Switching in Tweets: An Annotation Scheme and Some Initial Experiments

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Code-Switching (CS) is very common among multilinguals, who switch between two or more languages when communicating or having a dialogue with each other. People have not confined CS to spoken form but have also introduced it into written text. Owing to the popularity of social media, people use these platforms to perform CS in text form. This has given rise to a need for computational processing of code-switched data. In this study, we focus on CS between English and Hindi in a Twitter corpus, which consists of informal text. With the help of this data, we have done a detailed linguistic study of various aspects of CS. For understanding, processing, and generation of code-switched data, we need annotated code-switched data. Hence, in this paper, we present an annotation scheme for annotating the functions of CS in Hindi-English (Hi-En) code-switched tweets, and we also present some initial experiments. In this effort, we focus on CS in text data from social media, whereas earlier studies have focused on CS in spoken data from a small number of speakers.

Categories: Microsoft

The Bones of the System: A Case Study of Logging and Telemetry at Microsoft

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Large software organizations are transitioning to event data platforms as they culturally shift to better support data-driven decision making. This paper offers a case study at Microsoft during such a transition. Through qualitative interviews of 28 participants, and a quantitative survey of 1,823 respondents, we catalog a diverse set of activities that leverage event data sources, identify challenges in conducting these activities, and describe tensions that emerge in data-driven cultures as event data flow through these activities within the organization. We find that the use of event data spans every job role in our interviews and survey, that different perspectives on event data create tensions between roles or teams, and that professionals report social and technical challenges across activities.

Categories: Microsoft

Automated Synthesis and Analysis of Switching Gene Regulatory Networks

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Studying the gene regulatory networks (GRNs) that govern how cells change into specific cell types with unique roles throughout development is an active area of experimental research. The fate specification process can be viewed as a program prescribing the system dynamics, governed by a network of genetic interactions. To investigate the possibility that GRNs are not fixed but rather change their topology, for example as cells progress through commitment, we introduce the concept of Switching Gene Regulatory Networks (SGRNs) to enable the modelling and analysis of network reconfiguration. We define the synthesis problem of constructing SGRNs that are guaranteed to satisfy a set of constraints representing experimental observations of cell behaviour. We propose a solution to this problem that employs methods based upon Satisfiability Modulo Theories (SMT) solvers, and evaluate the feasibility and scalability of our approach by considering a set of synthetic benchmarks exhibiting possible biological behaviour of cell development. We outline how our approach is applied to a more realistic biological system, by considering a simplified network involved in the processes of neuron maturation and fate specification in the mammalian cortex.

Categories: Microsoft

Improved bounded-strength decoupling schemes for local Hamiltonians

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

We address the task of switching off the Hamiltonian of a system by removing all internal and system-environment couplings. We propose dynamical decoupling schemes, that use only bounded-strength controls, for quantum many-body systems with local system Hamiltonians and local environmental couplings. To do so, we introduce the combinatorial concept of balanced-cycle orthogonal arrays (BOAs) and show how to construct them from classical error-correcting codes. The derived decoupling schemes may be useful as a primitive for more complex schemes, e.g., for Hamiltonian simulation. For the case of n qubits and a 2-local Hamiltonian, the length of the resulting decoupling scheme scales as O(n log(n)), improving over the previously best-known schemes that scaled quadratically with n. More generally, using balanced-cycle orthogonal arrays constructed from families of BCH codes, we show that bounded-strength decoupling for any local Hamiltonian can be achieved.

Categories: Microsoft

Information Flows in Encrypted Databases

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

In encrypted databases, sensitive data is protected from an untrusted server by encrypting columns using partially homomorphic encryption schemes, and storing encryption keys in a trusted client. However, encrypting columns and protecting encryption keys does not ensure confidentiality: sensitive data can leak during query processing due to information flows through the trusted client. In this paper, we propose SecureSQL, an encrypted database that partitions query processing between an untrusted server and a trusted client while ensuring the absence of information flows. Our evaluation based on OLTP benchmarks suggests that SecureSQL can protect against explicit flows with low overheads (< 30%). However, protecting against implicit flows can be expensive because it precludes the use of key database optimizations and introduces additional round trips between client and server.

Categories: Microsoft

Networks of Gratitude: Structures of Thanks and User Expectations in Workplace Appreciation Systems

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Appreciation systems―platforms for users to exchange thanks and praise―are becoming common in the workplace, where employees share appreciation, managers are notified, and aggregate scores are sometimes made visible. Who do people thank on these systems, and what do they expect from each other and their managers? After introducing the design affordances of 13 appreciation systems, we discuss a system we call Gratia, in use at a large multinational company for over four years. Using logs of 422,000 appreciation messages and user surveys, we explore the social dynamics of use and ask if use of the system addresses the recognition problem. We find that while thanks is mostly exchanged among employees at the same level and different parts of the company, addressing the recognition problem, managers do not always act on that recognition in ways that employees expect.

Categories: Microsoft

BoolTraineR: training asynchronous Boolean models using single-cell expression data

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Categories: Microsoft

Synthesizing Signaling Pathways from Temporal Phosphoproteomic Data

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Categories: Microsoft

Universal Models of Multivariate Temporal Point Processes

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

With the rapidly increasing availability of event stream data, there is growing interest in multivariate temporal point process models to capture both qualitative and quantitative features of this type of data. Recent research on multivariate point processes has focused on inference and estimation problems for restricted classes of models such as continuous time Bayesian networks, Markov jump processes, Gaussian Cox processes, and Hawkes processes. In this paper, we study the expressive power and learnability of Graphical Event Models (GEMs), the analogue of directed graphical models for multivariate temporal point processes. In particular, we describe a set of GEMs and show that this class can universally approximate any smooth multivariate temporal point process. We also describe a universal learning algorithm for this class of GEMs and show, under a mild set of assumptions, learnability results for both the dependency structures and distributions in this class. Our consistency results demonstrate the possibility of learning about both qualitative and quantitative dependencies from rich event stream data.
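
For readers unfamiliar with the restricted model classes named above, the following minimal sketch evaluates the conditional intensity of a multivariate Hawkes process, one of those classes. It is not the GEM construction from the paper. The exponential decay kernel and the names `mu`, `alpha`, and `beta` are assumptions chosen for illustration: `mu[label]` is a base rate, and `alpha[src][dst]` is how strongly a past event of type `src` excites events of type `dst`.

```python
import math

def hawkes_intensity(t, label, history, mu, alpha, beta):
    """Conditional intensity of event type `label` at time t.

    history: list of (time, source_label) pairs for past events.
    Assumed form: mu[label] + sum over past events of
    alpha[source][label] * exp(-beta * (t - time)).
    """
    excitation = sum(
        alpha[src][label] * math.exp(-beta * (t - s))
        for s, src in history
        if s < t
    )
    return mu[label] + excitation

# Hypothetical two-type stream in which "a" events excite "b" events only.
mu = {"a": 0.5, "b": 0.2}
alpha = {"a": {"a": 0.0, "b": 1.0}, "b": {"a": 0.0, "b": 0.0}}
history = [(1.0, "a")]
rate_b = hawkes_intensity(1.0 + math.log(2.0), "b", history, mu, alpha, beta=1.0)
```

The nonzero entries of `alpha` play the role of a dependency structure (which event types influence which), while their magnitudes and the decay rate capture the quantitative side; this is the kind of qualitative and quantitative distinction the abstract refers to.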

Categories: Microsoft

Surviving an "Eternal September" — How an Online Community Managed a Surge of Newcomers

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

We present a qualitative analysis of interviews with participants in the NoSleep community within Reddit where millions of fans and writers of horror fiction congregate. We explore how the community handled a massive, sudden, and sustained increase in new members. Although existing theory and stories like Usenet's infamous "Eternal September" suggest that large influxes of newcomers can hurt online communities, our interviews suggest that NoSleep survived without major incident. We propose that three features of NoSleep allowed it to manage the rapid influx of newcomers gracefully: (1) an active and well-coordinated group of administrators, (2) a shared sense of community which facilitated community moderation, and (3) technological systems that mitigated norm violations. We also point to several important trade-offs and limitations.

Categories: Microsoft

Journeys & Notes — Designing Social Computing for Non-Places

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

In this work we present a mobile application we designed and engineered to enable people to log their travels near and far, leave notes behind, and build a community around spaces in between destinations. Our design explores new ground for location-based social computing systems, identifying opportunities where these systems can foster the growth of on-line communities rooted at non-places. In our work we develop, explore, and evaluate several innovative features designed around four usage scenarios: daily commuting, long-distance traveling, quantified traveling, and journaling. We present the results of two small-scale user studies, and one large-scale, world-wide deployment, synthesizing the results as potential opportunities and lessons learned in designing social computing for non-places.

Categories: Microsoft

Neurotics Can't Focus: An in situ Study of Online Multitasking in the Workplace

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

In HCI research, attention has focused on understanding external influences on workplace multitasking. We explore instead how multitasking might be influenced by individual factors: personality, stress, and sleep. Forty information workers' online activity was tracked over two work weeks. The median duration of online screen focus was 40 seconds. The personality trait of Neuroticism was associated with shorter online focus duration and Impulsivity-Urgency was associated with longer online focus duration. Stress and sleep duration showed trends to be inversely associated with online focus. Shorter focus duration was associated with lower assessed productivity at day's end. Factor analysis revealed a factor of lack of control which significantly predicts multitasking. Our results suggest that there could be a trait for distractibility where some individuals are susceptible to online attention shifting in the workplace. Our results have implications for information systems (e.g. educational systems, game design) where attention focus is key.

Categories: Microsoft

Email Duration, Batching and Self-interruption: Patterns of Email Use on Productivity and Stress

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

While email provides numerous benefits in the workplace, it is unclear how patterns of email use might affect key workplace indicators of productivity and stress. We investigate how three email use patterns: duration, interruption habit, and batching, relate to perceived workplace productivity and stress. We tracked email usage with computer logging, biosensors and daily surveys for 40 information workers in their in situ workplace environments for 12 workdays. We found that the longer daily time spent on email, the lower was perceived productivity and the higher the measured stress. People who primarily check email through self-interruptions report higher productivity with longer email duration compared to those who rely on notifications. Batching email is associated with higher rated productivity with longer email duration, but despite widespread claims, we found no evidence that batching email leads to lower stress. We discuss the implications of our results for improving organizational email practices.

Categories: Microsoft

Online Mobile Micro-Task Allocation in Spatial Crowdsourcing

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

With the rapid development of smartphones, spatial crowdsourcing platforms are becoming popular. A foundational research problem in spatial crowdsourcing is to allocate micro-tasks to suitable crowd workers. Most existing studies focus on offline scenarios, where all the spatiotemporal information of micro-tasks and crowd workers is given. However, they are impractical since micro-tasks and crowd workers in real applications appear dynamically and their spatiotemporal information cannot be known in advance. In this paper, to address the shortcomings of existing offline approaches, we first identify a more practical micro-task allocation problem, called the Global Online Micro-task Allocation in spatial crowdsourcing (GOMA) problem. We first extend the state-of-the-art algorithm for the online maximum weighted bipartite matching problem to the GOMA problem as the baseline algorithm. Although the baseline algorithm provides a theoretical guarantee for the worst case, its average performance in practice is not good enough since the worst case happens with a very low probability in the real world. Thus, we consider the average performance of online algorithms, a.k.a. the online random order model. We propose a two-phase-based framework, based on which we present the TGOA algorithm with a 1/4-competitive ratio under the online random order model. To improve its efficiency, we further design the TGOA-Greedy algorithm following the framework, which runs faster than the TGOA algorithm but has a lower competitive ratio of 1/8. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments on real and synthetic datasets.
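
To illustrate the online setting described above, here is a minimal sketch of a naive greedy matcher: tasks and workers arrive one at a time, and each arrival is immediately paired with the nearest waiting counterpart within a distance limit, or else waits. This is a simple illustration of the problem setup only; it is not the TGOA or TGOA-Greedy algorithm, and it carries no competitive-ratio guarantee. The arrival format, the `max_dist` parameter, and the nearest-neighbour rule are all assumptions.

```python
import math

def greedy_online_allocation(arrivals, max_dist):
    """Greedily match tasks and workers as they arrive.

    arrivals: iterable of (kind, ident, position), with kind "task" or
    "worker" and position a coordinate tuple. Returns (task, worker) pairs.
    """
    waiting = {"task": {}, "worker": {}}  # unmatched objects seen so far
    matches = []
    for kind, ident, pos in arrivals:
        other = "worker" if kind == "task" else "task"
        # Waiting counterparts that are close enough to be eligible.
        candidates = [
            (math.dist(pos, other_pos), other_id)
            for other_id, other_pos in waiting[other].items()
            if math.dist(pos, other_pos) <= max_dist
        ]
        if candidates:
            _, other_id = min(candidates)  # nearest eligible counterpart
            del waiting[other][other_id]
            pair = (ident, other_id) if kind == "task" else (other_id, ident)
            matches.append(pair)
        else:
            waiting[kind][ident] = pos  # nobody suitable yet, so wait
    return matches

# Hypothetical arrival sequence on a 2D plane.
arrivals = [
    ("task", "t1", (0.0, 0.0)),
    ("worker", "w1", (5.0, 5.0)),
    ("worker", "w2", (1.0, 0.0)),
    ("task", "t2", (5.0, 4.0)),
]
matches = greedy_online_allocation(arrivals, max_dist=2.0)
```

Because each decision is made without knowledge of future arrivals, an early greedy match can block a better one later; this is the gap that worst-case and random-order analyses of online algorithms try to quantify.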

Categories: Microsoft

Cinderella: Turning Shabby X.509 Certificates into Elegant Anonymous Credentials with the Magic of Verifiable Computation

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Despite advances in security engineering, authentication in applications such as email and the Web still primarily relies on the X.509 public key infrastructure introduced in 1988. This PKI has many issues but is nearly impossible to replace. Leveraging recent progress in verifiable computation, we propose a novel use of existing X.509 certificates and infrastructure. Instead of receiving & validating chains of certificates, our applications receive & verify proofs of their knowledge, their validity, and their compliance with application policies. This yields smaller messages (by omitting certificates), stronger privacy (by hiding certificate contents), and stronger integrity (by embedding additional checks, e.g. for revocation). X.509 certificate validation is famously complex and error-prone, as it involves parsing ASN.1 data structures and interpreting them against diverse application policies. To manage this diversity, we propose a new format for writing application policies by composing X.509 templates, and we provide a template compiler that generates C code for validating certificates within a given policy. We then use the Geppetto cryptographic compiler to produce a zero-knowledge verifiable computation scheme for that policy. To optimize the resulting scheme, we develop new C libraries for RSA-PKCS#1 signatures and ASN.1 parsing, carefully tailored for cryptographic verifiability. We evaluate our approach by providing two real-world applications of verifiable computation: a drop-in replacement for certificates within TLS; and access control for the Helios voting protocol. For TLS, we support fine-grained validation policies, with revocation checking and selective disclosure of certificate contents, effectively turning X.509 certificates into anonymous credentials. For Helios, we obtain additional privacy and verifiability guarantees for voters equipped with X.509 certificates, such as those readily available from some national ID cards.

Categories: Microsoft

Shifts to Suicidal Ideation from Mental Health Content in Social Media

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

History of mental illness is a major factor behind suicide risk and ideation. However, research efforts toward characterizing and forecasting this risk are limited due to the paucity of information regarding suicide ideation, exacerbated by the stigma of mental illness. This paper fills gaps in the literature by developing a statistical methodology to infer which individuals could undergo transitions from mental health discourse to suicidal ideation. We utilize semi-anonymous support communities on Reddit as unobtrusive data sources to infer the likelihood of these shifts. We develop language and interactional measures for this purpose, as well as a propensity score matching based statistical approach. Our approach allows us to derive distinct markers of shifts to suicidal ideation. These markers can be modeled in a prediction framework to identify individuals likely to engage in suicidal ideation in the future. We discuss societal and ethical implications of this research.

Categories: Microsoft

Understanding Conversational Programmers: A Perspective from the Software Industry

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Recent research suggests that some students learn to program with the goal of becoming conversational programmers: they want to develop programming literacy skills not to write code in the future but mainly to develop conversational skills and communicate better with developers and to improve their marketability. To investigate the existence of such a population of conversational programmers in practice, we surveyed professionals at a large multinational technology company who were not in software development roles. Based on 3151 survey responses from professionals who never or rarely wrote code, we found that a significant number of them (42.6%) had invested in learning programming on the job. While many of these respondents wanted to perform traditional end-user programming tasks (e.g., data analysis), we discovered that two top motivations for learning programming were to improve the efficacy of technical conversations and to acquire marketable skillsets. The main contribution of this work is in empirically establishing the existence and characteristics of conversational programmers in a large software development context.

Categories: Microsoft

Yaq: Efficient Queue Management for Cluster Scheduling

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

Categories: Microsoft

Microsoft Touch Develop and the BBC micro:bit

Microsoft Research Publications - Sun, 05/01/2016 - 09:00

The chance to influence the lives of a million children does not come often. Through a partnership between the BBC and several technology companies, a small instructional computing device called the BBC micro:bit will be given to a million children in the UK in 2016. Moreover, using the micro:bit will be part of the CS curriculum. We describe how Microsoft's Touch Develop programming platform works with the BBC micro:bit. We describe the design and architecture of the micro:bit and the software engineering hurdles that had to be overcome to ensure it was as accessible as possible to children and teachers. The combined hardware/software platform is evaluated and early anecdotal evidence is presented. A video about the micro:bit is available at http://aka.ms/bbcmicrobit.

Categories: Microsoft
