Microsoft
Deep Neural Decision Forests
Categories: Microsoft
The Global Patch Collider
This paper proposes a novel extremely efficient, fully-parallelizable, task-specific algorithm for the computation of global point-wise correspondences in images and videos. Our algorithm, the Global Patch Collider, is based on detecting unique collisions between image points using a collection of learned tree structures that act as conditional hash functions. In contrast to conventional approaches that rely on pairwise distance computation, our algorithm isolates distinctive pixel pairs that hit the same leaf during traversal through multiple learned tree structures. The split functions stored at the intermediate nodes of the trees are trained to ensure that only visually similar patches or their geometric or photometric transformed versions fall into the same leaf node. The matching process involves passing all pixel positions in the images under analysis through the tree structures. We then compute matches by isolating points that uniquely collide with each other ie. fell in the same empty leaf in multiple trees. Our algorithm is linear in the number of pixels but can be made constant time on a parallel computation architecture as the tree traversal for individual image points is decoupled. We demonstrate the efficacy of our method by using it to perform optical flow matching and stereo matching on some challenging benchmarks. Experimental results show that not only is our method extremely computationally efficient, but it is also able to match or outperform state of the art methods that are much more complex.
Categories: Microsoft
Rich Image Captioning in the Wild
We present an image caption system that addresses new challenges of automatically describing images in the wild. The challenges include high quality caption quality with respect to human judgments, out-of-domain data handling, and low latency required in many applications. Built on top of a state-of-the-art framework, we developed a deep vision model that detects a broad range of visual concepts, an entity recognition model that identifies celebrities and landmarks, and a confidence model for the caption output. Experimental results show that our caption engine outperforms previous state-of-the-art systems significantly on both in-domain dataset (i.e. MS COCO) and out-of-domain datasets.
Categories: Microsoft
Large Scale and Streaming Time Series Segmentation and Piece-Wise Approximation: Extended Version
Segmenting a time series or approximating it with piecewise linear function is often needed when handling data in the time domain to detect outliers, clean data, detect events and more. The data varies from ECG signals, traffic monitors to stock prices and sensor networks. Modern data-sets of this type are large and in many cases are infinite in the sense that the data is a stream rather than a finite sample. Therefore, in order to segment it, an algorithm has to scale gracefully with the size of the data. Dynamic Programming (DP) can findthe optimal segmentation, however, the DP approach has a complexity of O T 2 thus cannot handle datasets with millions of elements, nor can it handle streaming data. Therefore, various heuristics are used in practice to handle the data. This study shows that if the approximation measure has an inverse triangle inequality property (ITIP), the solution of the dynamic program can be computed in linear time and streaming data can be handled too. The ITIP is shown to hold in many cases of interest. The speedup due to the new algorithms is evaluated on a variety of data-sets to be in the range of 8 8200x over the DP solution without sacrificing accuracy. Confidence intervals for segmentations are derived as well.
Categories: Microsoft
FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors
We present a high speed, high security implementation of the recently proposed elliptic curve FourQ (ASIACRYPT 2015) for 32-bit ARM processors with NEON support. Exploiting the versatile and compact arithmetic of this curve, we design a vectorized implementation that achieves high-performance across a large variety of ARM architectures. Our software is fully protected against timing and cache attacks, and showcases the impressive speed of FourQ when compared with other curve-based alternatives. For example, one single variable-base scalar multiplication is computed in about 235,000 Cortex-A8 cycles or 132,000 Cortex-A15 cycles which, compared to the fastest genus 2 Kummer and Curve25519 implementations on the same platforms, offers speedups between 1.3x-1.7x and between 2.1x-2.4x, respectively. In comparison with the NIST standard curve K-283, we achieve speedups above 4x and 5.5x.
Categories: Microsoft
Biggish: A solution for the inverse privacy problem
An item of your personal information is inversely private if some party has access to it but you do not. Inverse privacy is ubiquitous. Each interaction you have with commercial and other institutions generates inversely private data. The inverse privacy problem is unjustified inaccessibility of your inversely private data to you. Elsewhere a subset of these authors determined that the problem has a market-based solution that provides consumers with large amounts of their personal data to be mined and processed to benefit them. Here we sketch a particular solution.
Categories: Microsoft
E-TIPSY: Search Query Corpus Annotated with Entities, Term Importance, POS Tags, and Syntactic Parses
We present E-TIPSY, a search query corpus annotated with named Entities, Term Importance, POS tags, and SYntactic parses. This corpus contains crowdsourced (gold) annotations of the three most important terms in each query. In addition, it contains automatically produced annotations of named entities, part-of-speech tags, and syntactic parses for the same queries. This corpus comes in two formats: (1) Sober Subset: annotations that two or more crowd workers agreed upon, and (2) Full Glass: all annotations. We analyze the strikingly low correlation between term importance and syntactic headedness, which invites research into effective ways of combining these different signals. Our corpus can serve as a benchmark for term importance methods aimed at improving search engine quality and as an initial step toward developing a dataset of gold linguistic analysis of web search queries. In addition, it can be used as a basis for linguistic inquiries into the kind of expressions used in search.
Categories: Microsoft
Speeding up the Number Theoretic Transform for Faster Ideal Lattice-Based Cryptography
The Number Theoretic Transform (NTT) provides efficient algorithms for cyclic and nega-cyclic convolutions, which have many applications in computer arithmetic, e.g., for multiplying large integers and large degree polynomials. It is commonly used in cryptographic schemes that are based on the hardness of the Ring Learning With Errors (R-LWE) problem to efficiently implement modular polynomial multiplication. We present a new modular reduction technique that is tailored for the special moduli required by the NTT. Based on this reduction, we speed up the NTT and propose faster, multi-purpose algorithms. We present two implementations of these algorithms: a portable C implementation and a high-speed implementation using assembly with AVX2 instructions. In comparison to state-of-the-art implementations of the NTT using the same parameters, our C and assembly implementations achieve factor-1.90 and factor-1.25 speedups, respectively. To demonstrate the improved efficiency in an application example, we benchmarked the algorithms in the context of the R-LWE key exchange protocol that has recently been proposed by Alkim, Ducas, Poppelmann and Schwabe. In this case, our C and assembly implementations compute the full key exchange 1.40 and 1.34 times faster, respectively. These results are achieved with full protection against timing attacks.
Categories: Microsoft
Information Dissemination in Heterogeneous-Intent Networks
Individual’s motivations for participating in social networks can vary greatly, and this heterogeneity is reflected in the wide-variety of behaviors and communication styles we see in large social networks. In this paper, we present a first study of the basic properties of communication in a heterogeneous-intent network and their implications. First, through both user studies and large-scale data analyses, we validate that individual’s specific intents affect their style of communication and their selection of messages. Secondly, we propose a simple structural model of information diffusion within a heterogeneous-intent network and present the results of experiments exploring its emergent properties, finding that it recreates aspects of empirically observed cascades that are not captured in previous models.
Categories: Microsoft
Training A Quantum Optimizer
We study a variant of the quantum approximate optimization algorithm [ E. Farhi, J. Goldstone, and S. Gutmann, arXiv:1411.4028] with slightly different parametrization and different objective: rather than looking for a state which approximately solves an optimization problem, our goal is to find a quantum algorithm that, given an instance of MAX-2-SAT, will produce a state with high overlap with the optimal state. Using a machine learning approach, we chose a ``training set" of instances and optimized the parameters to produce large overlap for the training set. We then tested these optimized parameters on a larger instance set. As a training set, we used a subset of the hard instances studied by E. Crosson, E. Farhi, C. Yen-Yu Lin, H.-H. Lin, and P. Shor (CFLLS) [arXiv:1401.7320]. When tested on the full set, the parameters that we find produce significantly larger overlap than the optimized annealing times of CFLLS. Testing on other random instances from 20 to 28 bits continues to show improvement over annealing, with the improvement being most notable on the hardest instances. Further tests on instances of MAX-3-SAT also showed improvement on the hardest instances. This algorithm may be a possible application for near-term quantum computers with limited coherence times.
Categories: Microsoft
The quality of online answers to parents who suspect that their child has an Autism Spectrum Disorder
The growing diagnosis and public awareness of Autism Spectrum Disorders (ASD) leads more parents to seek answers to their suspicions for ASD in their child on Internet forums. This study describes an analysis of the quality of content of 371 answers on Yahoo Answers (YA), a social question and answer forum, to parents querying whether their child has ASD. We contrasted the perceived quality of answers by clinicians with that of parents. The study tested the feasibility of automatically assisting parents in selecting answers with higher quality using a predictive model based on the text of answers and the attributes of answerers.
Categories: Microsoft
Towards an Open-Domain Framework for Distilling the Outcomes of Personal Experiences from Social Media Timelines
Millions of people share details about their real-world experiences on social media. This provides an opportunity to observe the outcomes of common and critical situations and actions for individual and societal benefit. In this paper, we discuss our efforts to design and build an open-domain framework for mining the outcomes of any given experience from social media timelines. Through a number of example situations and actions across multiple domains, we discuss the kinds of outcomes we are able to extract and their relevance.
Categories: Microsoft
Recommendations meet web browsing: Enhancing Collaborative
Collaborative filtering (CF) recommendation systems are one of the most popular and successful methods for recommending products to people. CF systems work by finding similarities between different people according to their past purchases, and using these similarities to suggest possible items of interest. Here we investigate how CF systems can be enhanced using Internet browsing data and search engine query logs, both of which represent a rich profile of individuals’ interests. We introduce two approaches to enhancing user modeling using this data. Our approaches preserve the privacy of individuals while significantly enhancing model accuracy. We present extensive experimentation based on one-class, implicit feedback matrix factorization. We do not assume the existence of explicit ratings, but rather rely on unweighted, positive signals of the kind available in most commercial contexts. We demonstrate the value of our approach on two real datasets each comprising of the activities of tens of thousands of individuals. The first dataset details the downloads of Windows Phone 8 mobile applications and the second - item views in an online retail store. Both datasets are enhanced using anonymized Internet browsing logs. Our results show that prediction accuracy is improved by up to 72%. This improvement is largest when building a model which can predict for the entire catalog of items, not just popular ones. Finally, we discuss additional benefits of our approach, which include: improved recommendations for users with few past purchases, and enabling recommendations based on short-term purchase intent.
Categories: Microsoft
Differences in physical status, mental state and online behavior of people in pro-anorexia web communities
Background: There is a debate about the effects of pro-anorexia (colloquially referred to as pro-ana) websites. Research suggests that the effect of these websites is not straightforward. Indeed, the actual function of these sites is disputed, with studies indicating both negative and positive effects. Aim: This is the first study which systematically examined the differences between pro-anorexia web communities in four main aspects: web language used (posts); web interests/search behaviors (queries); users' self-reported weight status and weight goals; and associated self-reported mood/pathology. Methods:We collected three primary sources of data, including messages posed on three pro-anawebsites, a survey completed by over 1000 participants of a pro-ana website, and the searches made on the Bing search engine of pro-anorexia users. These data were analyzed for content, reported demographics and pathology, and behavior over time. Results: Although members of the main pro-ana website investigated appear to be depressed, with high rates of self-harm and suicide attempts, users are significantly more interested in treatment, have wishes of procreation and reported the highest goal weights among the investigated sites. In contrast, users of other pro-ana websites investigated, are more interested in morbid themes including depression, self-harm and suicide. The percentage of severely malnourished website users, in general, appears to be small (20%). Conclusions: Our results indicate that a new strategy is required to facilitate the communication between mental health specialists and pro-ana web users, recognizing the differences in harm associated with different websites.
Categories: Microsoft
Factoring with qutrits: Shor's algorithm on ternary and metaplectic quantum architectures
We determine the cost of performing Shor's algorithm for integer factorization on a ternary quantum computer, using two natural models of universal fault tolerant computing on ternary quantum systems: (i) a model based on magic state distillation that assumes the availability of the ternary Clifford gates, projective measurements, classical control and (ii) a model based on a metaplectic topological quantum computer (MTQC). Arguably, a natural choice to implement Shor's algorithm on a ternary quantum computer is to translate the entire arithmetic into a ternary form. However, it is also possible to simply emulate the standard binary version of the algorithm by encoding each qubit in a three level system. In this paper we address this emulation approach and analyze the complexity of implementing Shor's period finding function in both models, (i) and (ii).We compare the costs in terms of magic state counts required in each mode and find that a binary emulation implementation of Shor's algorithm on a ternary quantum computer requires slightly smaller circuit depth than the corresponding implementation in the binary Clifford+T framework. The reason for this are simplifications for binary arithmetic that can be leveraged over ternary gate sets. We also highlight that magic state preparation on MTQC requires magic state preprocessor of asymptotically smaller size which gives the MTQC solution a significant advantage over the binary framework.
Categories: Microsoft
Peer-to-peer in the workplace: A view from the road
This paper contributes to the growing literature on peer-to-peer (P2P) applications through an ethnographic study of auto-rickshaw drivers in Bengaluru, India. We describe how the adoption of a P2P application, Ola, which connects passengers to rickshaws, changes drivers work practices. Ola is part of the ‘peer services’ phenomenon which enable new types of ad-hoc trade in labour, skills and goods. Auto-rickshaw drivers present an interesting case because prior to Ola few had used Smartphones or the Internet. Furthermore, as financially vulnerable workers in the informal sector, concerns about driver welfare become prominent. Whilst technologies may promise to improve livelihoods, they do not necessarily deliver [57]. We describe how Ola does lit-tle to change the uncertainty which characterizes an auto drivers’ day. This leads us to consider how a more equitable and inclusive system might be designed.
Categories: Microsoft
Meerkat and Periscope: I Stream, You Stream, Apps Stream for Live Streams
We conducted a mixed methods study of the use of the Meerkat and Periscope apps for live streaming video and audio broadcasts from a mobile device. We crowdsourced a task to describe the content, setting, and other characteristics of 767 live streams. We also interviewed 20 frequent streamers to explore their motivations and experiences. Together, the data provide a snapshot of early live streaming use practices. We found a diverse range of activities broadcast, which interviewees said were used to build their personal brand. They described live streaming as providing an authentic, unedited view into their lives. They liked how the interaction with viewers shaped the content of their stream. We found some evidence for multiple live streams from the same event, which represent an opportunity for multiple perspectives on events of shared public interest.
Categories: Microsoft
Making Sense of Temporal Queries with Interactive Visualization
As real-time monitoring and analysis become increasingly important, researchers and developers turn to data stream management systems (DSMS’s) for fast, efficient ways to pose temporal queries over their datasets. However, these systems are inherently complex, and even database experts find it difficult to understand the behavior of DSMS queries. To help analysts better understand these temporal queries, we developed StreamTrace, an interactive visualization tool that breaks down how a temporal query processes a given dataset, step-by-step. The design of StreamTrace is based on input from expert DSMS users; we evaluated the system with a lab study of programmers who were new to streaming queries. Results from the study demonstrate that StreamTrace can help users to verify that queries behave as expected and to isolate the regions of a query that may be causing unexpected results.
Categories: Microsoft
Reasoning in Vector Space: An Exploratory Study of Question Answering
Question answering tasks have shown remarkable progress with distributed vector representation. In this paper, we investigate the recently proposed Facebook bAbI tasks which consist of twenty different categories of questions that require complex reasoning. Because the previous work on bAbI are all end-to-end models, errors could come from either an imperfect understanding of semantics or in certain steps of the reasoning. For clearer analysis, we propose two vector space models inspired by Tensor Product Representation (TPR) to perform knowledge encoding and logical reasoning based on common-sense inference. They together achieve near-perfect accuracy on all categories including positional reasoning and path finding that have proved difficult for most of the previous approaches. We hypothesize that the difficulties in these categories are due to the multi-relations in contrast to uni-relational characteristic of other categories. Our exploration sheds light on designing more sophisticated dataset and moving one step toward integrating transparent and interpretable formalism of TPR into existing learning paradigms.
Categories: Microsoft