Microsoft
Password Guidance
This paper provides Microsoft’s recommendations for password management based on current research and lessons from our own experience as one of the largest Identity Providers (IdPs) in the world. It covers recommendations for end users and identity administrators. Microsoft sees over 10 million username/password pair attacks every day. This gives us a unique vantage point to understand the role of passwords in account takeover. The guidance in this paper is scoped to users of Microsoft’s identity platforms (Azure Active Directory, Active Directory, and Microsoft account) though it generalizes to other platforms.
Categories: Microsoft
Deep Roots: Improving CNN Efficiency with Hierarchical Filter Groups
We propose a new method for training computationally efficient and compact convolutional neural networks (CNNs) using a novel sparse connection structure that resembles a tree root. Our sparse connection structure facilitates a significant reduction in computational cost and number of parameters of state-of-the-art deep CNNs without compromising accuracy. We validate our approach by using it to train more efficient variants of state-of-the-art CNN architectures, evaluated on the CIFAR10 and ILSVRC datasets. Our results show similar or higher accuracy than the baseline architectures with much less compute, as measured by CPU and GPU timings. For example, for ResNet 50, our model has 40% fewer parameters, 45% fewer floating point operations, and is 31% (12%) faster on a CPU (GPU). For the deeper ResNet 200 our model has 25% fewer floating point operations and 44% fewer parameters, while maintaining state-of-the-art accuracy. For GoogLeNet, our model has 7% fewer parameters and is 21% (16%) faster on a CPU (GPU).
Categories: Microsoft
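As a rough illustration of why the filter groups named in the Deep Roots title reduce parameter counts, the sketch below counts the weights of a single convolutional layer with and without groups. The layer sizes (256 channels, 3x3 kernel, 8 groups) and the helper name `conv_params` are assumptions chosen for illustration, not values from the paper, and the paper's hierarchical arrangement of groups across layers is not modeled here.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weight count of a 2D convolution (bias ignored) split into `groups` filter groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output filter only sees c_in / groups input channels.
    return kernel * kernel * (c_in // groups) * c_out


# Hypothetical layer sizes, chosen only for illustration.
dense = conv_params(256, 256, 3)
grouped = conv_params(256, 256, 3, groups=8)
print(dense, grouped, dense // grouped)
```

With these assumed sizes the grouped layer needs one eighth of the weights of the dense layer, since the weight count scales inversely with the number of groups.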
Measuring Neural Net Robustness with Constraints
Despite having high accuracy, neural nets have been shown to be susceptible to adversarial examples, where a small perturbation to an input can cause it to become mislabeled. We propose metrics for measuring the robustness of a neural net and devise a novel algorithm for approximating these metrics based on an encoding of robustness as a linear program. We show how our metrics can be used to evaluate the robustness of deep neural nets with experiments on the MNIST and CIFAR-10 datasets. Our algorithm generates more informative estimates of robustness metrics compared to estimates based on existing algorithms. Furthermore, we show how existing approaches to improving robustness “overfit” to adversarial examples generated using a specific algorithm. Finally, we show that our techniques can be used to additionally improve neural net robustness, not only according to the metrics that we propose but also according to previously proposed metrics.
Categories: Microsoft
Things We Own Together: Sharing Possessions at Home
Sharing is an important facet of human relationships, yet there is a lack of research on how people share ownership of possessions. This paper reports on a study that investigates shared ownership of physical and digital possessions through interviews with couples and families in 13 households. We offer a more nuanced definition of shared ownership and show that certain practices, which are central to sharing physical objects, are not supported in the sharing of digital content. We suggest potential approaches to address this, focusing in particular on how the sharing of possessions plays a role in the building of relationships and is done against a backdrop of trust.
Categories: Microsoft
Modeling Signals Embedded in a Euclidean Domain
Graphs are often used to model signals defined on a set of points embedded in a Euclidean domain. Examples are distributed sensor readings, measures of congestion in a transportation network, samples in a feature space, and colors on 3D point clouds. However, it may be better to model such signals as samples of a Gaussian Process defined on the Euclidean domain. We show, on a 3D point cloud example, that Karhunen-Loève Transforms (KLTs) based on Gaussian Process models can have significantly higher energy compaction and coding gain than KLTs based on sparse graph models. The latter KLTs are known as Graph Transforms; we call the former Gaussian Process Transforms.
Categories: Microsoft
Functions of Code-Switching in Tweets: An Annotation Scheme and Some Initial Experiments
Code-Switching (CS) is very common among multilinguals, who switch between two or more languages when communicating with each other. CS is not confined to speech; people also code-switch in written text. Given the popularity of social media, people now code-switch in text on these platforms, which has created a need for computational processing of code-switched data. In this study, we focus on CS between English and Hindi in a Twitter corpus, which consists of informal text. Using this data, we have carried out a detailed linguistic study of various aspects of CS. Understanding, processing, and generating code-switched data requires annotated code-switched data. Hence, in this paper, we present an annotation scheme for the functions of CS in Hindi-English (Hi-En) code-switched tweets, along with some initial experiments. In this effort, we focus on CS in text data from social media, whereas earlier studies have focused on CS in spoken data from a small number of speakers.
Categories: Microsoft
The Bones of the System: A Case Study of Logging and Telemetry at Microsoft
Large software organizations are transitioning to event data platforms as they culturally shift to better support data-driven decision making. This paper offers a case study at Microsoft during such a transition. Through qualitative interviews of 28 participants, and a quantitative survey of 1,823 respondents, we catalog a diverse set of activities that leverage event data sources, identify challenges in conducting these activities, and describe tensions that emerge in data-driven cultures as event data flow through these activities within the organization. We find that the use of event data spans every job role in our interviews and survey, that different perspectives on event data create tensions between roles or teams, and that professionals report social and technical challenges across activities.
Categories: Microsoft
Automated Synthesis and Analysis of Switching Gene Regulatory Networks
Studying the gene regulatory networks (GRNs) that govern how cells change into specific cell types with unique roles throughout development is an active area of experimental research. The fate specification process can be viewed as a program prescribing the system dynamics, governed by a network of genetic interactions. To investigate the possibility that GRNs are not fixed but rather change their topology, for example as cells progress through commitment, we introduce the concept of Switching Gene Regulatory Networks (SGRNs) to enable the modelling and analysis of network reconfiguration. We define the synthesis problem of constructing SGRNs that are guaranteed to satisfy a set of constraints representing experimental observations of cell behaviour. We propose a solution to this problem that employs methods based upon Satisfiability Modulo Theories (SMT) solvers, and evaluate the feasibility and scalability of our approach by considering a set of synthetic benchmarks exhibiting possible biological behaviour of cell development. We outline how our approach is applied to a more realistic biological system, by considering a simplified network involved in the processes of neuron maturation and fate specification in the mammalian cortex.
Categories: Microsoft
Improved bounded-strength decoupling schemes for local Hamiltonians
We address the task of switching off the Hamiltonian of a system by removing all internal and system-environment couplings. We propose dynamical decoupling schemes, that use only bounded-strength controls, for quantum many-body systems with local system Hamiltonians and local environmental couplings. To do so, we introduce the combinatorial concept of balanced-cycle orthogonal arrays (BOAs) and show how to construct them from classical error-correcting codes. The derived decoupling schemes may be useful as a primitive for more complex schemes, e.g., for Hamiltonian simulation. For the case of n qubits and a 2-local Hamiltonian, the length of the resulting decoupling scheme scales as O(n log(n)), improving over the previously best-known schemes that scaled quadratically with n. More generally, using balanced-cycle orthogonal arrays constructed from families of BCH codes, we show that bounded-strength decoupling for any local Hamiltonian can be achieved.
Categories: Microsoft
Information Flows in Encrypted Databases
In encrypted databases, sensitive data is protected from an untrusted server by encrypting columns using partially homomorphic encryption schemes, and storing encryption keys in a trusted client. However, encrypting columns and protecting encryption keys does not ensure confidentiality: sensitive data can leak during query processing due to information flows through the trusted client. In this paper, we propose SecureSQL, an encrypted database that partitions query processing between an untrusted server and a trusted client while ensuring the absence of information flows. Our evaluation based on OLTP benchmarks suggests that SecureSQL can protect against explicit flows with low overheads (< 30%). However, protecting against implicit flows can be expensive because it precludes the use of key database optimizations and introduces additional round trips between client and server.
Categories: Microsoft
Networks of Gratitude: Structures of Thanks and User Expectations in Workplace Appreciation Systems
Appreciation systems―platforms for users to exchange thanks and praise―are becoming common in the workplace, where employees share appreciation, managers are notified, and aggregate scores are sometimes made visible. Who do people thank on these systems, and what do they expect from each other and their managers? After introducing the design affordances of 13 appreciation systems, we discuss a system we call Gratia, in use at a large multinational company for over four years. Using logs of 422,000 appreciation messages and user surveys, we explore the social dynamics of use and ask if use of the system addresses the recognition problem. We find that while thanks is mostly exchanged among employees at the same level and different parts of the company, addressing the recognition problem, managers do not always act on that recognition in ways that employees expect.
Categories: Microsoft
BoolTraineR: training asynchronous Boolean models using single-cell expression data
Categories: Microsoft
Universal Models of Multivariate Temporal Point Processes
With the rapidly increasing availability of event stream data, there is growing interest in multivariate temporal point process models to capture both qualitative and quantitative features of this type of data. Recent research on multivariate point processes has focused on inference and estimation problems for restricted classes of models such as continuous time Bayesian networks, Markov jump processes, Gaussian Cox processes, and Hawkes processes. In this paper, we study the expressive power and learnability of Graphical Event Models (GEMs), the analogue of directed graphical models for multivariate temporal point processes. In particular, we describe a set of GEMs and show that this class can universally approximate any smooth multivariate temporal point process. We also describe a universal learning algorithm for this class of GEMs and show, under a mild set of assumptions, learnability results for both the dependency structures and distributions in this class. Our consistency results demonstrate the possibility of learning about both qualitative and quantitative dependencies from rich event stream data.
Categories: Microsoft
Surviving an "Eternal September" — How an Online Community Managed a Surge of Newcomers
We present a qualitative analysis of interviews with participants in the NoSleep community within Reddit where millions of fans and writers of horror fiction congregate. We explore how the community handled a massive, sudden, and sustained increase in new members. Although existing theory and stories like Usenet's infamous "Eternal September" suggest that large influxes of newcomers can hurt online communities, our interviews suggest that NoSleep survived without major incident. We propose that three features of NoSleep allowed it to manage the rapid influx of newcomers gracefully: (1) an active and well-coordinated group of administrators, (2) a shared sense of community which facilitated community moderation, and (3) technological systems that mitigated norm violations. We also point to several important trade-offs and limitations.
Categories: Microsoft
Journeys & Notes — Designing Social Computing for Non-Places
In this work we present a mobile application we designed and engineered to enable people to log their travels near and far, leave notes behind, and build a community around spaces in between destinations. Our design explores new ground for location-based social computing systems, identifying opportunities where these systems can foster the growth of on-line communities rooted at non-places. In our work we develop, explore, and evaluate several innovative features designed around four usage scenarios: daily commuting, long-distance traveling, quantified traveling, and journaling. We present the results of two small-scale user studies, and one large-scale, world-wide deployment, synthesizing the results as potential opportunities and lessons learned in designing social computing for non-places.
Categories: Microsoft
Neurotics Can't Focus: An in situ Study of Online Multitasking in the Workplace
In HCI research, attention has focused on understanding external influences on workplace multitasking. We explore instead how multitasking might be influenced by individual factors: personality, stress, and sleep. Forty information workers' online activity was tracked over two work weeks. The median duration of online screen focus was 40 seconds. The personality trait of Neuroticism was associated with shorter online focus duration and Impulsivity-Urgency was associated with longer online focus duration. Stress and sleep duration showed trends to be inversely associated with online focus. Shorter focus duration was associated with lower assessed productivity at day's end. Factor analysis revealed a factor of lack of control which significantly predicts multitasking. Our results suggest that there could be a trait for distractibility where some individuals are susceptible to online attention shifting in the workplace. Our results have implications for information systems (e.g. educational systems, game design) where attention focus is key.
Categories: Microsoft