Microsoft

Emerging and Recurring Data-Driven Storytelling Techniques: Analysis of a Curated Collection of Recent Stories

Microsoft Research Publications - Sun, 04/03/2016 - 08:00

Storytelling with data is becoming an important component of many fields such as graphic design, the advocacy of causes, and journalism. New techniques for integrating data visualization into narrative stories have now become commonplace. Authors are enabling new reader experiences, such as linking textual narrative and data visualizations through dynamic queries embedded in the text. Novel means of communicating position and navigating within the narrative also have merged, such as utilizing scrolling to advance narration and initiate animations. We advance the study of narrative visualization through an analysis of a curated collection of recent data-driven stories shared on the web. Drawing from the results of this analysis, we present a set of techniques being employed in these examples, organized under four high-level categories that help authors to tell stories in creative ways: communicating narrative and explaining data, linking separated story elements, enhancing structure and navigation, and providing controlled exploration. We describe the benefits of each storytelling technique along with a number of example applications of the ideas through recent data-driven stories. Additionally, we discuss the trends we observed as well as how the field has evolved and grown. Finally, we conclude with a discussion of areas for future research.

Categories: Microsoft

Commonsense Causal Reasoning between Short Texts

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Categories: Microsoft

Learning to Verify the Heap

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

We present a data-driven verification framework to automatically prove memory safety and functional correctness of heap programs. For this, we introduce a novel statistical machine learning technique that maps observed program states to (possibly disjunctive) separation logic formulas describing the invariant shape of data structures at relevant program locations. We then attempt to verify these predictions using a theorem prover, where counterexamples to a predicted invariant are used as additional input to the shape predictor in a refinement loop. After obtaining valid shape invariants, we use a second learning algorithm to strengthen them with data invariants, again employing a refinement loop using the underlying theorem prover. We have implemented our techniques in Cricket, an extension of the GRASShopper verification tool. Cricket is able to automatically prove memory safety and correctness of implementations of a variety of classical list-manipulating algorithms such as insertionsort.

Categories: Microsoft

Analyzing Runtime and Size Complexity of Integer Programs

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

We present a modular approach to automatic complexity analysis of integer programs. Based on a novel alternation between finding symbolic time bounds for program parts and using these to infer bounds on the absolute values of program variables, we can restrict each analysis step to a small part of the program while maintaining a high level of precision. The bounds computed by our method are polynomial or exponential expressions that depend on the absolute values of input parameters. We show how to extend our approach to arbitrary cost measures, allowing to use our technique to find upper bounds for other expended resources, such as network requests or memory consumption. Our contributions are implemented in the open source tool KoAT, and extensive experiments show the performance and power of our implementation in comparison with other tools.

Categories: Microsoft

The Dialog State Tracking Challenge Series: A Review

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

In a spoken dialog system, dialog state tracking refers to the task of correctly inferring the state of the conversation – such as the user’s goal – given all of the dialog history up to that turn. Dialog state tracking is crucial to the success of a dialog system, yet until recently there were no common resources, hampering progress. The Dialog State Tracking Challenge series of 3 tasks introduced the first shared testbed and evaluation metrics for dialog state tracking, and has underpinned three key advances in dialog state tracking: the move from generative to discriminative models; the adoption of discriminative sequential techniques; and the incorporation of the speech recognition results directly into the dialog state tracker. This paper reviews this research area, covering both the challenge tasks themselves and summarizing the work they have enabled.

Categories: Microsoft

Multimodal Learning for Image Captioning and Visual Question Answering (talk at UC Berkeley, BVLC)

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Categories: Microsoft

Inferring Annotations For Device Drivers From Verification Histories

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Finding invariants is an important step in automated program analysis. Discovery of precise invariants, however, can be very difficult in practice. The problem can be simplified if one has access to a candidate set of predicates (or annotations) and the search for invariants is limited over the space defined by these annotations. We present an approach that infers program annotations automatically by leveraging the history of verifying related programs. Our algorithm extracts high-quality annotations from previous verification attempts, and then applies them for verifying new programs. We present a case study where we applied our techniques to Microsoft's Static Driver Verifier (SDV). SDV currently uses manually-tuned heuristics for obtaining a set of annotations. Our techniques can not only replace the need for this manual effort, they even out-perform these heuristics and improve the performance of SDV overall.

Categories: Microsoft

Visual Storytelling

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

We introduce the first dataset for sequential vision-to-language, and explore how this data may be used for the task of visual storytelling. The first release of this dataset, SIND1 v.1, includes 81,743 unique photos in 20,211 sequences, aligned to both descriptive (caption) and story language. We establish several strong baselines for the storytelling task, and motivate an automatic metric to benchmark progress. Modelling concrete description as well as figurative and social language, as provided in this dataset and the storytelling task, has the potential to move artificial intelligence from basic understandings of typical visual scenes towards more and more human-like understanding of grounded event structure and subjective expression.

Categories: Microsoft

HIGHWAY LONG SHORT-TERM MEMORY RNNS FOR DISTANT SPEECH RECOGNITION

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers. These direct links, called highway connections, enable unimpeded information flow across different layers and thus alleviate the gradient vanishing problem when building deeper LSTMs. We further introduce the latency-controlled bidirectional LSTMs (BLSTMs) which can exploit the whole history while keeping the latency under control. Efficient algorithms are proposed to train these novel networks using both frame and sequence discriminative criteria. Experiments on the AMI distant speech recognition (DSR) task indicate that we can train deeper LSTMs and achieve better improvement from sequence training with highway LSTMs (HLSTMs). Our novel model obtains 43 :9=47 :7% WER on AMI (SDM) dev and eval sets, outperforming all previous works. It beats the strong DNN and DLSTM baselines

Categories: Microsoft

SPEAKER-AWARE TRAINING OF LSTM-RNNS FOR ACOUSTIC MODELLING

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Long Short-Term Memory (LSTM) is a particular type of recurrent neural network (RNN) that can model long term temporal dynamics. Recently it has been shown that LSTM-RNNs can achieve higher recognition accuracy than deep feed-forword neural networks (DNNs) in acoustic modelling. However, speaker adaption for LSTM-RNN based acoustic models has not been well investigated. In this paper, we study the LSTM-RNN speaker-aware training that incorporates the speaker information during model training to normalise the speaker variability. We first present several speaker-aware training architectures, and then empirically evaluate three types of speaker representation: I-vectors, bottleneck speaker vectors and speaking rate. Furthermore, to factorize the variability in the acoustic signals caused by speakers and phonemes respectively, we investigate the speaker-aware and phone-aware joint training under the framework of multi-task learning. In AMI meeting speech transcription task, speaker-aware training of LSTM-RNNs reduces word error rates by 6.5% relative to a very strong LSTM-RNN baseline, which uses FMLLR features.

Categories: Microsoft

PREDICTION-ADAPTATION-CORRECTION RECURRENT NEURAL NETWORKS FOR LOW-RESOURCE LANGUAGE SPEECH RECOGNITION

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

In this paper, we investigate the use of prediction-adaptationcorrection recurrent neural networks (PAC-RNNs) for lowresource speech recognition. A PAC-RNN is comprised of a pair of neural networks in which a correction network uses auxiliary information given by a prediction network to help estimate the state probability. The information from the correction network is also used by the prediction network in a recurrent loop. Our model outperforms other state-of-theart neural networks (DNNs, LSTMs) on IARPA-Babel tasks. Moreover, transfer learning from a language that is similar to the target language can help improve performance further.

Categories: Microsoft

AN INVESTIGATION INTO USING PARALLEL DATA FOR FAR-FIELD SPEECH RECOGNITION

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Far-field speech recognition is an important yet challenging task due to low signal to noise ratio. In this paper, three novel deep neural network architectures are explored to improve the far-field speech recognition accuracy by exploiting the parallel far-field and closetalk recordings. All three novel architectures use multi-task learning for the model optimization but focus on three different ideas:

Categories: Microsoft

INTEGRATED ADAPTATION WITH MULTI-FACTOR JOINT-LEARNING FOR FAR-FIELD SPEECH RECOGNITION

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Although great progress has been made in automatic speech recognition (ASR), significant performance degradation still exists in distant talking scenarios due to significantly lower signal power. In this paper, a novel adaptation framework, named integrated adaptation with multi-factor joint-learning , is proposed to improve the recognition accuracy for distant speech recognition. We explore and extract speaker, phone and environment factor representations using deep neural networks (DNNs), which are integrated into the main ASR DNN to improve classification accuracy. In addition, the hidden activations in the main ASR DNN are used to improve the factor extraction, which in turn helps the ASR DNN. All the model parameters, including those in the ASR DNN and factor extractor DNNs, are jointly optimized under the multi-task learning framework. Further more, unlike prior techniques, our novel approach requires no explicit separate stages for factor extraction and adaptation. Experiments on the AMI single distant microphone (SDM) task show that the proposed architecture can significantly reduce word error rate (WER) and additional improvement can be achieved by combining it with the i-vector adaptation. Our best configuration obtained more than 15% and 10% relative reduction on WER over the baselines using the SDM and close-talk data generated alignments, respectively.

Categories: Microsoft

DEEP BEAMFORMING NETWORKS FOR MULTI-CHANNEL SPEECH RECOGNITION

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Despite the significant progress in speech recognition enabled by deep neural networks, poor performance persists in some scenarios. In this work, we focus on far-field speech recognition which remains challenging due to high levels of noise and reverberation in the captured speech signals. We propose to represent the stages of acoustic processing including beamforming, feature extraction, and acoustic modeling, as three components of a single unified computational network. The parameters of a frequency-domain beamformer are first estimated by a network based on features derived from the microphone channels. These filter coefficients are then applied to the array signals to form an enhanced signal. Conventional features are then extracted from this signal and passed to a second network that performs acoustic modeling for classification. The parameters of both the beamforming and acoustic modeling networks are trained jointly using back-propagation with a common crossentropy objective function. In experiments on the AMI meeting corpus, we observed improvements by pre-training each sub-network with a network-specific objective function before joint training of both networks. The proposed method obtained a 3.2% absolute word error rate reduction compared to a conventional pipeline of independent processing stages.

Categories: Microsoft

Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Attribute synonyms are important ingredients for keyword-based search systems. For instance, web search engines, recognize queries that seek the value of an entity on a specific attribute (referred to as e+a queries) and provide direct answers for them using a combination of knowledge bases, web tables and documents. However, users often refer to an attribute in their e+a query differently from how it is referred in the web table or text passage. In such cases, search engines may fail to return relevant answers. To address that problem, we propose to automatically discover all the alternate ways of referring to the attributes of a given class of entities (referred to as attribute synonyms) in order to improve search quality. The state-of-the-art approach that relies on attribute name co-occurrence in web tables suffers from low precision. Our main insight is to combine positive evidence of attribute synonymity from query click logs, with negative evidence from web table attribute name co-occurrences. We formalize the problem as an optimization problem on a graph, with the attribute names being the vertices and the positive and negative evidences from query logs and web table schemas as weighted edges. We develop a linear programming based algorithm to solve the problem that has bi-criteria approximation guarantees. Our experiments on real-life datasets show that our approach has significantly higher precision and recall compared with the state-of-the-art.

Categories: Microsoft

Optimizing Distributed Actor Systems for Dynamic Interactive Services

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Distributed actor systems are widely used for developing interactive scalable cloud services, such as social networks and on-line games. By modeling an application as a dynamic set of lightweight communicating “actors”, developers can easily build complex distributed applications, while the underlying runtime system deals with low-level complexities of a distributed environment. We present ActOp — a data-driven, application-independent runtime mechanism for optimizing end-to-end service latency of actor-based distributed applications. ActOp targets the two dominant factors affecting latency: the overhead of remote inter-actor communications across servers, and the intra-server queuing delay. ActOp automatically identifies frequently communicating actors and migrates them to the same server transparently to the running application. The migration decisions are driven by a novel scalable distributed graph partitioning algorithm which does not rely on a single server to store the whole communication graph, thereby enabling efficient actor placement even for applications with rapidly changing graphs (e.g., chat services). Further, each server autonomously reduces the queuing delay by learning an internal queuing model and configuring threads according to instantaneous request rate and application demands. We prototype ActOp by integrating it with Orleans – a popular open-source actor system [4, 13]. Experiments with realistic workloads show latency improvements of up to 75% for the 99th percentile, up to 63% for the mean, with up to 2x increase in peak system throughput.

Categories: Microsoft

Efficient Queue Management for Cluster Scheduling

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Job scheduling in Big Data clusters is crucial both for cluster operators’ return on investment and for overall user experience. In this context, we observe several anomalies in how modern cluster schedulers manage queues, and argue that maintaining queues of tasks at worker nodes has significant benefits. On one hand, centralized approaches do not use worker-side queues. Given the inherent feedback delays that these systems incur, they achieve suboptimal cluster utilization, particularly for workloads dominated by short tasks. On the other hand, distributed schedulers typically do employ worker-side queuing, and achieve higher cluster utilization. However, they fail to place tasks at the best possible machine, since they lack cluster-wide information, leading to worse job completion time, especially for heterogeneous workloads. To the best of our knowledge, this is the first work to provide principled solutions to the above problems by introducing queue management techniques, such as appropriate queue sizing, prioritization of task execution via queue reordering, starvation freedom, and careful placement of tasks to queues. We instantiate our techniques by extending both a centralized (YARN) and a distributed (Mercury) scheduler, and evaluate their performance on a wide variety of synthetic and production workloads derived from Microsoft clusters. Our centralized implementation, Yaq-c, achieves 1.7x improvement on median job completion time compared to YARN, and our distributed one, Yaq-d, achieves 9.3x improvement over an implementation of Sparrow’s batch sampling on Mercury.

Categories: Microsoft

Did You Say U2 or Youtube? Inferring Implicit Transcripts from Voice Search Logs

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Categories: Microsoft

Author2Vec: Learning Author Representations by Combining Content and Link Information

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

In this paper, we consider the problem of learning representations for authors from bibliographic co-authorship networks. Existing methods for deep learning on graphs, such as DeepWalk, suffer from link sparsity problem as they focus on modeling the link information only. We hypothesize that capturing both the content and link information in a unified way will help mitigate the sparsity problem. To this end, we present a novel model `Author2Vec' 1, which learns lowdimensional author representations such that authors who write similar content and share similar network structure are closer in vector space. Such embeddings are useful in a variety of applications such as link prediction, node classification, recommendation and visualization. The author embeddings we learn are empirically shown to outperform DeepWalk by 2.35% and 0.83% for link prediction and clustering task respectively.

Categories: Microsoft

TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services

Microsoft Research Publications - Fri, 04/01/2016 - 08:00

Categories: Microsoft

Navigation

Geo Tracker

User login

Microsoft

Emerging and Recurring Data-Driven Storytelling Techniques: Analysis of a Curated Collection of Recent Stories

Commonsense Causal Reasoning between Short Texts

Learning to Verify the Heap

Analyzing Runtime and Size Complexity of Integer Programs

The Dialog State Tracking Challenge Series: A Review

Multimodal Learning for Image Captioning and Visual Question Answering (talk at UC Berkeley, BVLC)

Inferring Annotations For Device Drivers From Verification Histories

Visual Storytelling

HIGHWAY LONG SHORT-TERM MEMORY RNNS FOR DISTANT SPEECH RECOGNITION

SPEAKER-AWARE TRAINING OF LSTM-RNNS FOR ACOUSTIC MODELLING

PREDICTION-ADAPTATION-CORRECTION RECURRENT NEURAL NETWORKS FOR LOW-RESOURCE LANGUAGE SPEECH RECOGNITION

AN INVESTIGATION INTO USING PARALLEL DATA FOR FAR-FIELD SPEECH RECOGNITION

INTEGRATED ADAPTATION WITH MULTI-FACTOR JOINT-LEARNING FOR FAR-FIELD SPEECH RECOGNITION

DEEP BEAMFORMING NETWORKS FOR MULTI-CHANNEL SPEECH RECOGNITION

Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora

Optimizing Distributed Actor Systems for Dynamic Interactive Services

Efficient Queue Management for Cluster Scheduling

Did You Say U2 or Youtube? Inferring Implicit Transcripts from Voice Search Logs

Author2Vec: Learning Author Representations by Combining Content and Link Information

TPC: Target-Driven Parallelism Combining Prediction and Correction to Reduce Tail Latency in Interactive Services

Search

Twitter

RSS Feedburner

Add This

Dilbert...

SkyDrive

Security